WorldWideScience

Sample records for ensemble prediction system

  1. Reliability of windstorm predictions in the ECMWF ensemble prediction system

    Science.gov (United States)

    Becker, Nico; Ulbrich, Uwe

    2016-04-01

    Windstorms caused by extratropical cyclones are one of the most dangerous natural hazards in the European region. Therefore, reliable predictions of such storm events are needed. Case studies have shown that ensemble prediction systems (EPS) are able to provide useful information about windstorms between two and five days prior to the event. In this work, ensemble predictions with the European Centre for Medium-Range Weather Forecasts (ECMWF) EPS are evaluated over a four-year period. Within the 50 ensemble members, which are initialized every 12 hours and are run for 10 days, windstorms are identified and tracked in time and space. By using a clustering approach, different predictions of the same storm are identified in the different ensemble members and compared to reanalysis data. The occurrence probability of the predicted storms is estimated by fitting a bivariate normal distribution to the storm track positions. Our results show, for example, that predicted storm clusters with occurrence probabilities of more than 50% have a matching observed storm in 80% of all cases at a lead time of two days. The predicted occurrence probabilities are reliable up to 3 days lead time. At longer lead times the occurrence probabilities are overestimated by the EPS.
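
    The notion of "reliable" occurrence probabilities used above can be illustrated with a reliability table: forecasts are grouped by predicted probability, and each group's mean forecast probability is compared with the frequency at which the event was actually observed. The following is a minimal sketch, not the authors' verification code; the function name, the bin count and the toy data are assumptions made for illustration.

```python
def reliability_table(forecasts, n_bins=5):
    """Bin (probability, occurred) pairs and compare forecast vs. observed frequency.

    forecasts: iterable of (p, occurred) with 0 <= p <= 1 and occurred a bool.
    Returns (bin_lower, bin_upper, mean_forecast_prob, observed_freq, count)
    for every non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, occurred in forecasts:
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, 1.0 if occurred else 0.0))
    table = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        obs_freq = sum(o for _, o in members) / len(members)
        table.append((i / n_bins, (i + 1) / n_bins, mean_p, obs_freq, len(members)))
    return table


# Hypothetical toy data: (predicted occurrence probability, storm observed?)
toy = [(0.1, False), (0.1, False), (0.3, False), (0.3, True),
       (0.7, True), (0.7, True), (0.7, False), (0.9, True)]
for lower, upper, mean_p, freq, n in reliability_table(toy):
    print(f"[{lower:.1f}, {upper:.1f}): forecast {mean_p:.2f}, observed {freq:.2f}, n={n}")
```

    A forecast system is reliable in this sense when the observed frequency in each bin is close to the mean forecast probability; observed frequencies below the forecast probabilities correspond to the overestimation reported at longer lead times.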

  2. Decadal climate predictions improved by ocean ensemble dispersion filtering

    Science.gov (United States)

    Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.

    2017-06-01

    Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in-between these two time scales. In recent years, more precise initialization techniques of coupled Earth system models and increased ensemble sizes have improved decadal predictions. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, results in more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean. Plain Language Summary: Decadal predictions aim to predict the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. The ocean memory due to its heat capacity holds big potential skill. In recent years, more precise initialization techniques of coupled Earth system models (incl. atmosphere and ocean) have improved decadal predictions. Ensembles are another important aspect. Applying slightly perturbed predictions to trigger the famous butterfly effect results in an ensemble. Instead of evaluating one prediction, but the whole ensemble with its
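
    The core idea of the ensemble dispersion filter, shifting each member's ocean state toward the ensemble mean, can be sketched in a few lines. This is a toy illustration only: the abstract does not specify how strongly the state is shifted, so the `strength` parameter, the function name and the sample values are assumptions.

```python
def dispersion_filter(members, strength=1.0):
    """Shift every member's state toward the ensemble mean.

    members: list of equal-length lists (one state vector per member).
    strength: 0 leaves members unchanged, 1 replaces them by the ensemble mean
              (hypothetical tuning parameter, not taken from the paper).
    """
    n = len(members)
    mean = [sum(values) / n for values in zip(*members)]
    return [[x + strength * (m - x) for x, m in zip(member, mean)]
            for member in members]


# Hypothetical toy "ocean temperatures" at two grid points for three members.
ensemble = [[10.0, 4.0], [12.0, 5.0], [14.0, 6.0]]
print(dispersion_filter(ensemble, strength=0.5))
# -> [[11.0, 4.5], [12.0, 5.0], [13.0, 5.5]]
```

    In a real prediction system such a step would be applied to the model's ocean fields at the seasonal intervals mentioned in the abstract, with the model integrated freely in between.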

  3. Developing an Ensemble Prediction System based on COSMO-DE

    Science.gov (United States)

    Theis, S.; Gebhardt, C.; Buchhold, M.; Ben Bouallègue, Z.; Ohl, R.; Paulat, M.; Peralta, C.

    2010-09-01

    The numerical weather prediction model COSMO-DE is a configuration of the COSMO model with a horizontal grid size of 2.8 km. It has been running operationally at DWD since 2007, covers the area of Germany, and produces forecasts with a lead time of 0-21 hours. The model COSMO-DE is convection-permitting, which means that it does without a parametrisation of deep convection and simulates deep convection explicitly. One aim is an improved forecast of convective heavy rain events. Convection-permitting models are in operational use at several weather services, but currently not in ensemble mode. It is expected that an ensemble system could reveal the advantages of a convection-permitting model even better. The probabilistic approach is necessary because the explicit simulation of convective processes for more than a few hours can no longer be viewed as a deterministic forecast. This is due to the chaotic behaviour and short life cycle of the processes which are now simulated explicitly. In the framework of the project COSMO-DE-EPS, DWD is developing and implementing an ensemble prediction system (EPS) for the model COSMO-DE. The project COSMO-DE-EPS comprises the generation of ensemble members, the verification and visualization of the ensemble forecasts, and statistical postprocessing. A pre-operational mode of the EPS with 20 ensemble members is foreseen to start in 2010. Operational use is envisaged to start in 2012, after an upgrade to 40 members and inclusion of statistical postprocessing. The presentation introduces the project COSMO-DE-EPS and describes the design of the ensemble as it is planned for the pre-operational mode. In particular, the currently implemented method for the generation of ensemble members will be explained and discussed. The method includes variations of initial conditions, lateral boundary conditions, and model physics. At present, pragmatic methods are applied which resemble the basic ideas of a multi-model approach

  4. A short-range ensemble prediction system for southern Africa

    CSIR Research Space (South Africa)

    Park, R

    2012-10-01

    R Park, WA Landman and F Engelbrecht. CSIR, PO Box 395, Pretoria, South Africa, 0001. www.csir.co.za. Introduction: This research has been conducted in order to develop a short-range ensemble...

  5. A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico

    KAUST Repository

    Hoteit, Ibrahim

    2013-09-01

    This paper describes the development of an advanced ensemble Kalman filter (EnKF)-based ocean data assimilation system for prediction of the evolution of the loop current in the Gulf of Mexico (GoM). The system integrates the Data Assimilation Research Testbed (DART) assimilation package with the Massachusetts Institute of Technology ocean general circulation model (MITgcm). The MITgcm/DART system supports the assimilation of a wide range of ocean observations and uses an ensemble approach to solve the nonlinear assimilation problems. The GoM prediction system was implemented with an eddy-resolving 1/10th degree configuration of the MITgcm. Assimilation experiments were performed over a 6-month period between May and October during a strong loop current event in 1999. The model was sequentially constrained with weekly satellite sea surface temperature and altimetry data. Experimental results suggest that the ensemble-based assimilation system shows a high predictive skill in the GoM, with estimated ensemble spread mainly concentrated around the front of the loop current. Further analysis of the system estimates demonstrates that the ensemble assimilation accurately reproduces the observed features without imposing any negative impact on the dynamical balance of the system. Results from sensitivity experiments with respect to the ensemble filter parameters are also presented and discussed. © 2013 Elsevier B.V.

  6. NCAR's Experimental Real-time Convection-allowing Ensemble Prediction System

    Science.gov (United States)

    Schwartz, C. S.; Romine, G. S.; Sobash, R.; Fossell, K.

    2016-12-01

    Since April 2015, the National Center for Atmospheric Research's (NCAR's) Mesoscale and Microscale Meteorology (MMM) Laboratory, in collaboration with NCAR's Computational Information Systems Laboratory (CISL), has been producing daily, real-time, 10-member, 48-hr ensemble forecasts with 3-km horizontal grid spacing over the conterminous United States (http://ensemble.ucar.edu). These computationally-intensive, next-generation forecasts are produced on the Yellowstone supercomputer, have been embraced by both amateur and professional weather forecasters, are widely used by NCAR and university researchers, and receive considerable attention on social media. Initial conditions are supplied by NCAR's Data Assimilation Research Testbed (DART) software and the forecast model is NCAR's Weather Research and Forecasting (WRF) model; both WRF and DART are community tools. This presentation will focus on cutting-edge research results leveraging the ensemble dataset, including winter weather predictability, severe weather forecasting, and power outage modeling. Additionally, the unique design of the real-time analysis and forecast system and computational challenges and solutions will be described.

  7. Various multistage ensembles for prediction of heating energy consumption

    Directory of Open Access Journals (Sweden)

    Radisa Jovanovic

    2015-04-01

    Feedforward neural network models are created for prediction of the daily heating energy consumption of the NTNU university campus Gloshaugen, using actual measured data for training and testing. An improvement in prediction accuracy is proposed by using a neural network ensemble. Previously trained feedforward neural networks are first separated into clusters using the k-means algorithm, and then the best network of each cluster is chosen as a member of the ensemble. Two conventional averaging methods for obtaining the ensemble output are applied: simple and weighted. In order to achieve better prediction results, a multistage ensemble is investigated. As the second level, an adaptive neuro-fuzzy inference system with various clustering and membership functions is used to aggregate the selected ensemble members. A feedforward neural network in the second stage is also analyzed. It is shown that an ensemble of neural networks can predict heating energy consumption with better accuracy than the best trained single neural network, while the best results are achieved with the multistage ensemble.
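
    The two conventional ways of combining member outputs mentioned above, simple and weighted averaging, can be sketched as follows. The abstract does not state how the weights are chosen; weighting each member by the inverse of its validation error is an assumption made for this sketch, as are the function names and sample numbers.

```python
def simple_average(predictions):
    """Average the member predictions with equal weights."""
    return sum(predictions) / len(predictions)


def weighted_average(predictions, validation_errors):
    """Average member predictions with weights proportional to 1 / validation error.

    Inverse-error weighting is an assumption of this sketch, not the paper's rule.
    """
    raw = [1.0 / e for e in validation_errors]
    total = sum(raw)
    return sum(w / total * p for w, p in zip(raw, predictions))


# Hypothetical daily heating-energy predictions from three trained networks
preds = [100.0, 110.0, 130.0]
# Hypothetical validation errors of the same three networks
errors = [1.0, 2.0, 4.0]
print(simple_average(preds))
print(weighted_average(preds, errors))
```

    With these numbers the weighted average lies closer to the prediction of the network with the smallest validation error than the simple average does.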

  8. Three-model ensemble wind prediction in southern Italy

    Science.gov (United States)

    Torcasio, Rosa Claudia; Federico, Stefano; Calidonna, Claudia Roberta; Avolio, Elenio; Drofa, Oxana; Landi, Tony Christian; Malguzzi, Piero; Buzzi, Andrea; Bonasoni, Paolo

    2016-03-01

    Quality of wind prediction is of great importance since a good wind forecast allows the prediction of available wind power, improving the penetration of renewable energies into the energy market. Here, a 1-year (1 December 2012 to 30 November 2013) three-model ensemble (TME) experiment for wind prediction is considered. The models employed, run operationally at the National Research Council - Institute of Atmospheric Sciences and Climate (CNR-ISAC), are RAMS (Regional Atmospheric Modelling System), BOLAM (BOlogna Limited Area Model), and MOLOCH (MOdello LOCale in H coordinates). The area considered for the study is southern Italy and the measurements used for the forecast verification are those of the GTS (Global Telecommunication System). Comparison with observations is made every 3 h up to 48 h of forecast lead time. Results show that the three-model ensemble outperforms the forecast of each individual model. The RMSE improvement compared to the best model is between 22 and 30%, depending on the season. It is also shown that the three-model ensemble outperforms the IFS (Integrated Forecasting System) of the ECMWF (European Centre for Medium-Range Weather Forecasts) for the surface wind forecasts. Notably, the three-model ensemble forecast performs better than each unbiased model, showing the added value of the ensemble technique. Finally, the sensitivity of the three-model ensemble RMSE to the length of the training period is analysed.
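
    One common way to build such a multi-model ensemble is to remove each model's mean bias over a training period and then average the corrected forecasts. The abstract does not spell out the combination rule, so the sketch below is an assumed, simplified variant; all names and numbers are hypothetical.

```python
import math


def rmse(forecasts, observations):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observations))
                     / len(observations))


def mean_bias(forecasts, observations):
    """Mean of forecast minus observation."""
    return sum(f - o for f, o in zip(forecasts, observations)) / len(observations)


def unbiased_ensemble_mean(models_train, obs_train, models_test):
    """Remove each model's training-period mean bias, then average the models."""
    biases = [mean_bias(m, obs_train) for m in models_train]
    corrected = [[f - b for f in m] for m, b in zip(models_test, biases)]
    return [sum(vals) / len(corrected) for vals in zip(*corrected)]


# Hypothetical wind speeds (m/s): three models over a short training period
obs_train = [5.0, 6.0, 7.0, 8.0]
models_train = [[6.0, 7.0, 8.0, 9.0],      # biased high by 1
                [3.0, 4.0, 5.0, 6.0],      # biased low by 2
                [5.5, 5.5, 7.5, 7.5]]      # no mean bias
# Hypothetical forecasts and observations for the verification period
obs_test = [4.0, 9.0]
models_test = [[5.5, 9.5], [1.5, 7.5], [4.3, 8.7]]

ensemble = unbiased_ensemble_mean(models_train, obs_train, models_test)
print("ensemble RMSE:", rmse(ensemble, obs_test))
for name, m in zip(["model 1", "model 2", "model 3"], models_test):
    print(name, "RMSE:", rmse(m, obs_test))
```

    In this toy case the errors of the individual models partly cancel, so the bias-corrected ensemble mean has a lower RMSE than any single model.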

  9. An evaluation of the Canadian global meteorological ensemble prediction system for short-term hydrological forecasting

    Directory of Open Access Journals (Sweden)

    F. Anctil

    2009-11-01

    Hydrological forecasting consists in the assessment of future streamflow. Current deterministic forecasts do not give any information concerning the uncertainty, which might be limiting in a decision-making process. Ensemble forecasts are expected to fill this gap.

    In July 2007, the Meteorological Service of Canada improved its ensemble prediction system, which has been operational since 1998. It uses the GEM model to generate a 20-member ensemble on a 100 km grid at mid-latitudes. This improved system is used for the first time for hydrological ensemble predictions. Five watersheds in Quebec (Canada) are studied: Chaudière, Châteauguay, Du Nord, Kénogami and Du Lièvre. An interesting 17-day rainfall event in October 2007 has been selected. Forecasts are produced at a 3 h time step for a 3-day forecast horizon. The deterministic forecast is also available and is compared with the ensemble ones. In order to correct the bias of the ensemble, an updating procedure has been applied to the output data. Results showed that ensemble forecasts are more skilful than the deterministic ones, as measured by the Continuous Ranked Probability Score (CRPS), especially for 72 h forecasts. However, the hydrological ensemble forecasts are underdispersed, a situation that improves with the increasing length of the prediction horizon. We conjecture that this is due in part to the fact that uncertainty in the initial conditions of the hydrological model is not taken into account.
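
    The CRPS used for this comparison can be estimated directly from ensemble members with the standard empirical formula CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|. For a single deterministic forecast it reduces to the absolute error, which is what makes the two forecast types comparable. The sketch below is a minimal illustration with hypothetical streamflow values, not the study's verification code.

```python
def crps_ensemble(members, observation):
    """Empirical CRPS of an ensemble forecast for one scalar observation.

    CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|  (lower is better).
    """
    n = len(members)
    term_obs = sum(abs(x - observation) for x in members) / n
    term_spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return term_obs - 0.5 * term_spread


# Hypothetical streamflow forecasts (m^3/s) and the value later observed
observed = 2.0
print(crps_ensemble([1.0, 2.0, 3.0], observed))  # three-member ensemble
print(crps_ensemble([2.5], observed))            # deterministic forecast: absolute error
```

    Averaging this score over many forecast dates gives the kind of summary measure used to compare ensemble and deterministic forecasts.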

  10. Towards an Australian ensemble streamflow forecasting system for flood prediction and water management

    Science.gov (United States)

    Bennett, J.; David, R. E.; Wang, Q.; Li, M.; Shrestha, D. L.

    2016-12-01

    Flood forecasting in Australia has historically relied on deterministic forecasting models run only when floods are imminent, with considerable forecaster input and interpretation. These now co-exist with a continually available 7-day streamflow forecasting service (also deterministic) aimed at operational water management applications such as environmental flow releases. The 7-day service is not optimised for flood prediction. We describe progress on developing a system for ensemble streamflow forecasting that is suitable for both flood prediction and water management applications. Precipitation uncertainty is handled through post-processing of Numerical Weather Prediction (NWP) output with a Bayesian rainfall post-processor (RPP). The RPP corrects biases, downscales NWP output, and produces reliable ensemble spread. Ensemble precipitation forecasts are used to force a semi-distributed conceptual rainfall-runoff model. Uncertainty in precipitation forecasts is insufficient to reliably describe streamflow forecast uncertainty, particularly at shorter lead-times. We characterise hydrological prediction uncertainty separately with a 4-stage error model. The error model relies on data transformation to ensure residuals are homoscedastic and symmetrically distributed. To ensure streamflow forecasts are accurate and reliable, the residuals are modelled using a mixture-Gaussian distribution with distinct parameters for the rising and falling limbs of the forecast hydrograph. In a case study of the Murray River in south-eastern Australia, we show ensemble predictions of floods generally have lower errors than deterministic forecasting methods. We also discuss some of the challenges in operationalising short-term ensemble streamflow forecasts in Australia, including meeting the needs for accurate predictions across all flow ranges and comparing forecasts generated by event and continuous hydrological models.

  11. The Development of Storm Surge Ensemble Prediction System and Case Study of Typhoon Meranti in 2016

    Science.gov (United States)

    Tsai, Y. L.; Wu, T. R.; Terng, C. T.; Chu, C. H.

    2017-12-01

    Taiwan, located in a zone of potentially severe storm generation, is under the threat of storm surge and associated inundation. Ensemble prediction can help forecasters understand the characteristics of storm surge under uncertainty in track and intensity. In addition, it can support deterministic forecasting. In this study, the kernel of the ensemble prediction system is COMCOT-SURGE (COrnell Multi-grid COupled Tsunami Model - Storm Surge). COMCOT-SURGE solves the nonlinear shallow water equations in open-ocean and coastal regions with a nested-grid scheme and adopts a wet-dry-cell treatment to calculate the potential inundation area. To account for tide-surge interaction, the global TPXO 7.1 tide model provides the tidal boundary conditions. After a series of validations and case studies, COMCOT-SURGE has become an official operational system of the Central Weather Bureau (CWB) in Taiwan. The strongest typhoon of 2016, Typhoon Meranti, is chosen as a case study. We adopt twenty ensemble members from the CWB WRF Ensemble Prediction System (CWB WEPS), which differ in their microphysics, boundary layer, cumulus, and surface parameters. In the box-and-whisker results, the maximum observed storm surges fell between the first and third quartiles at more than 70% of gauge locations, e.g. Toucheng, Chengkung, and Jiangjyun. In conclusion, ensemble prediction can effectively help forecasters predict storm surge, especially under uncertainty in storm track and intensity.
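
    The box-and-whisker check described above, whether the observed maximum surge lies between the first and third quartiles of the ensemble at each gauge, can be sketched as follows. The gauge names, surge values and the quartile convention (`method="inclusive"`) are assumptions for illustration; the abstract does not state how the quartiles were computed.

```python
import statistics


def in_interquartile_range(members, observed):
    """True if the observation lies between the ensemble's first and third quartile."""
    q1, _, q3 = statistics.quantiles(members, n=4, method="inclusive")
    return q1 <= observed <= q3


def iqr_hit_rate(ensemble_by_gauge, observed_by_gauge):
    """Fraction of gauges whose observation falls inside the ensemble IQR."""
    hits = [in_interquartile_range(ensemble_by_gauge[g], observed_by_gauge[g])
            for g in observed_by_gauge]
    return sum(hits) / len(hits)


# Hypothetical maximum surge heights (m) from a five-member ensemble at two gauges
ensemble_by_gauge = {
    "gauge_a": [0.2, 0.4, 0.6, 0.8, 1.0],
    "gauge_b": [0.2, 0.4, 0.6, 0.8, 1.0],
}
observed_by_gauge = {"gauge_a": 0.5, "gauge_b": 0.9}
print(iqr_hit_rate(ensemble_by_gauge, observed_by_gauge))
```

    Here the observation at the first gauge falls inside the interquartile range and the one at the second gauge does not, giving a hit rate of one half.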

  12. Development of a regional ensemble prediction method for probabilistic weather prediction

    International Nuclear Information System (INIS)

    Nohara, Daisuke; Tamura, Hidetoshi; Hirakuchi, Hiromaru

    2015-01-01

    A regional ensemble prediction method has been developed to provide probabilistic weather prediction using a numerical weather prediction model. To obtain perturbations consistent with the synoptic weather pattern, both initial and lateral boundary perturbations were given by differences between the control and ensemble members of the Japan Meteorological Agency (JMA)'s operational one-week ensemble forecast. The method provides multiple ensemble members with a horizontal resolution of 15 km for 48 hours, based on a downscaling of the JMA's operational global forecast accompanied by the perturbations. The ensemble prediction was examined for the heavy snowfall event in the Kanto area on January 14, 2013. The results showed that the predictions represent different features of the high-resolution spatiotemporal distribution of precipitation, affected by the intensity and location of the extra-tropical cyclone in each ensemble member. Although the ensemble prediction has model bias in the mean values and variances of some variables such as wind speed and solar radiation, it has the potential to add probabilistic information to a deterministic prediction. (author)

  13. A polynomial chaos ensemble hydrologic prediction system for efficient parameter inference and robust uncertainty assessment

    Science.gov (United States)

    Wang, S.; Huang, G. H.; Baetz, B. W.; Huang, W.

    2015-11-01

    This paper presents a polynomial chaos ensemble hydrologic prediction system (PCEHPS) for an efficient and robust uncertainty assessment of model parameters and predictions, in which possibilistic reasoning is infused into probabilistic parameter inference with simultaneous consideration of randomness and fuzziness. The PCEHPS is developed through a two-stage factorial polynomial chaos expansion (PCE) framework, which consists of an ensemble of PCEs to approximate the behavior of the hydrologic model, significantly speeding up the exhaustive sampling of the parameter space. Multiple hypothesis testing is then conducted to construct an ensemble of reduced-dimensionality PCEs with only the most influential terms, which is meaningful for achieving uncertainty reduction and further acceleration of parameter inference. The PCEHPS is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability. A detailed comparison between the HYMOD hydrologic model, the ensemble of PCEs, and the ensemble of reduced PCEs is performed in terms of accuracy and efficiency. Results reveal temporal and spatial variations in parameter sensitivities due to the dynamic behavior of hydrologic systems, and the effects (magnitude and direction) of parametric interactions depending on different hydrological metrics. The case study demonstrates that the PCEHPS is capable not only of capturing both expert knowledge and probabilistic information in the calibration process, but also of running more than 10 times faster than the hydrologic model without compromising the predictive accuracy.

  14. The Hydrologic Ensemble Prediction Experiment (HEPEX)

    Science.gov (United States)

    Wood, A. W.; Thielen, J.; Pappenberger, F.; Schaake, J. C.; Hartman, R. K.

    2012-12-01

    The Hydrologic Ensemble Prediction Experiment was established in March 2004 at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF). With support from the US National Weather Service (NWS) and the European Commission (EC), the HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support in emergency management and water resources sectors. The strategy to meet this goal includes meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. HEPEX has organized about a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Today, the HEPEX mission is to demonstrate the added value of hydrological ensemble prediction systems (HEPS) for emergency management and water resources sectors to make decisions that have important consequences for economy, public health, safety, and the environment. HEPEX is now organised around six major themes that represent core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.

  15. Can decadal climate predictions be improved by ocean ensemble dispersion filtering?

    Science.gov (United States)

    Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.

    2017-12-01

    Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in-between these two time scales. The ocean memory due to its heat capacity holds big potential skill on the decadal scale. In recent years, more precise initialization techniques of coupled Earth system models (incl. atmosphere and ocean) have improved decadal predictions. Ensembles are another important aspect. Applying slightly perturbed predictions results in an ensemble. Using and evaluating the whole ensemble or its ensemble average, instead of one prediction, improves a prediction system. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, results in more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean. This study is part of MiKlip (fona-miklip.de), a major project on decadal climate prediction in Germany. We focus on the Max-Planck-Institute Earth System Model using the low-resolution version (MPI-ESM-LR) and MiKlip's basic initialization strategy as in the decadal climate forecast published in 2017: http

  16. A short-range multi-model ensemble weather prediction system for South Africa

    CSIR Research Space (South Africa)

    Landman, S

    2010-09-01

    prediction system (EPS) at the South African Weather Service (SAWS) are examined. The ensemble consists of different forecasts from the 12-km LAM of the UK Met Office Unified Model (UM) and the Conformal-Cubic Atmospheric Model (CCAM) covering the South...

  17. Ensemble method for dengue prediction.

    Science.gov (United States)

    Buczak, Anna L; Baugher, Benjamin; Moniz, Linda J; Bagley, Thomas; Babin, Steven M; Guven, Erhan

    2018-01-01

    In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru.

  18. Ensemble method for dengue prediction.

    Directory of Open Access Journals (Sweden)

    Anna L Buczak

    In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru.

  19. A comparison between the ECMWF and COSMO Ensemble Prediction Systems applied to short-term wind power forecasting on real data

    DEFF Research Database (Denmark)

    Alessandrini, S.; Sperati, S.; Pinson, Pierre

    2013-01-01

    together with a single forecast power value for each future time horizon. A comparison between two different ensemble forecasting models, ECMWF EPS (Ensemble Prediction System in use at the European Centre for Medium-Range Weather Forecasts) and COSMO-LEPS (Limited-area Ensemble Prediction System developed...... ahead forecast horizon. A statistical calibration of the ensemble wind speed members based on the use of past wind speed measurements is explained. The two models are compared using common verification indices and diagrams. The higher horizontal resolution model (COSMO-LEPS) shows slightly better...

  20. On the proper use of Ensembles for Predictive Uncertainty assessment

    Science.gov (United States)

    Todini, Ezio; Coccia, Gabriele; Ortiz, Enrique

    2015-04-01

    uncertainty of the ensemble mean and that of the ensemble spread. The results of this new approach are illustrated by using data and forecasts from an operational real time flood forecasting.

  1. A new ensemble model for short term wind power prediction

    DEFF Research Database (Denmark)

    Madsen, Henrik; Albu, Razvan-Daniel; Felea, Ioan

    2012-01-01

    In this study, a non-linear ensemble system is used to develop a new model for predicting wind speed on short time scales. Short-term wind power prediction has become an extremely important field of research for the energy sector. Regardless of the recent advancements in the research...... of prediction models, it was observed that different models have different capabilities and that no single model is suitable under all situations. The idea behind EPS (ensemble prediction systems) is to take advantage of the unique features of each subsystem to detect diverse patterns that exist in the dataset...

  2. Ocean Predictability and Uncertainty Forecasts Using Local Ensemble Transfer Kalman Filter (LETKF)

    Science.gov (United States)

    Wei, M.; Hogan, P. J.; Rowley, C. D.; Smedstad, O. M.; Wallcraft, A. J.; Penny, S. G.

    2017-12-01

    Ocean predictability and uncertainty are studied with an ensemble system that has been developed based on the US Navy's operational HYCOM using the Local Ensemble Transfer Kalman Filter (LETKF) technology. One of the advantages of this method is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates operational observations using an ensemble method. The background covariance during this assimilation process is implicitly supplied by the ensemble, avoiding the difficult task of developing tangent linear and adjoint models out of HYCOM, with its complicated hybrid isopycnal vertical coordinate, for 4D-Var. The flow-dependent background covariance from the ensemble will be an indispensable part of the next generation hybrid 4D-Var/ensemble data assimilation system. The predictability and uncertainty for the ocean forecasts are studied initially for the Gulf of Mexico. The results are compared with another ensemble system using the Ensemble Transfer (ET) method, which has been used in the Navy's operational center. The advantages and disadvantages are discussed.

  3. The state of the art of flood forecasting - Hydrological Ensemble Prediction Systems

    Science.gov (United States)

    Thielen-Del Pozo, J.; Pappenberger, F.; Salamon, P.; Bogner, K.; Burek, P.; de Roo, A.

    2010-09-01

    Flood forecasting systems form a key part of 'preparedness' strategies for disastrous floods and provide hydrological services, civil protection authorities and the public with information on upcoming events. Provided the warning lead time is sufficiently long, adequate preparatory actions can be taken to efficiently reduce the impacts of the flooding. Because of the specific characteristics of each catchment, varying data availability and end-user demands, the design of the best flood forecasting system may differ from catchment to catchment. However, despite the differences in concept and data needs, there is one underlying issue that spans across all systems. There has been a growing awareness and acceptance that uncertainty is a fundamental issue of flood forecasting and needs to be dealt with at the different spatial and temporal scales as well as the different stages of the flood generating processes. Today, operational flood forecasting centres change increasingly from single deterministic forecasts to probabilistic forecasts with various representations of the different contributions of uncertainty. The move towards these so-called Hydrological Ensemble Prediction Systems (HEPS) in flood forecasting represents the state of the art in forecasting science, following on the success of the use of ensembles for weather forecasting (Buizza et al., 2005) and paralleling the move towards ensemble forecasting in other related disciplines such as climate change predictions. The use of HEPS has been internationally fostered by initiatives such as "The Hydrologic Ensemble Prediction Experiment" (HEPEX), created with the aim to investigate how best to produce, communicate and use hydrologic ensemble forecasts in hydrological short-, medium- and long-term prediction of hydrological processes. The advantages of quantifying the different contributions of uncertainty as well as the overall uncertainty to obtain reliable and useful flood forecasts also for extreme events

  4. Relative effects of statistical preprocessing and postprocessing on a regional hydrological ensemble prediction system

    Science.gov (United States)

    Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso

    2018-03-01

    The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (day 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1- to 7-day weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12,362 km². Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases

  5. Towards a GME ensemble forecasting system: Ensemble initialization using the breeding technique

    Directory of Open Access Journals (Sweden)

    Jan D. Keller

    2008-12-01

    The quantitative forecast of precipitation requires a probabilistic background, particularly with regard to forecast lead times of more than 3 days. As only ensemble simulations can provide useful information on the underlying probability density function, we built a new ensemble forecasting system (GME-EFS) based on the GME model of the German Meteorological Service (DWD). For the generation of appropriate initial ensemble perturbations we chose the breeding technique developed by Toth and Kalnay (1993, 1997), which develops perturbations by estimating the regions of largest model error induced uncertainty. This method is applied and tested in the framework of quasi-operational forecasts for a three-month period in 2007. The performance of the resulting ensemble forecasts is compared to the operational ensemble prediction systems ECMWF EPS and NCEP GFS by means of ensemble spread of free atmosphere parameters (geopotential and temperature) and ensemble skill of precipitation forecasting. This comparison indicates that the GME ensemble forecasting system (GME-EFS) provides reasonable forecasts with spread skill score comparable to that of the NCEP GFS. An analysis with the continuous ranked probability score exhibits a lack of resolution for the GME forecasts compared to the operational ensembles. However, with significant enhancements during the 3-month test period, the first results of our work with the GME-EFS indicate possibilities for further development as well as the potential for later operational usage.
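The continuous ranked probability score (CRPS) mentioned in the abstract above can be estimated directly from ensemble members. The following is a minimal sketch of one common empirical estimator, not the verification code used in the study; the function name `ensemble_crps` and the sample values are illustrative assumptions.

```python
def ensemble_crps(members, observation):
    """Empirical CRPS of an ensemble forecast against one scalar observation.

    Uses the estimator mean|x_i - y| - 0.5 * mean|x_i - x_j|, where the
    second mean runs over all member pairs (including i == j).
    Lower values indicate a better forecast.
    """
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    # Average distance of the members from the observation.
    term_obs = sum(abs(x - observation) for x in members) / m
    # Average distance between members, which rewards appropriate spread.
    term_spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return term_obs - 0.5 * term_spread


# Hypothetical 3-member precipitation forecast (mm) and an observed value.
score = ensemble_crps([1.0, 2.0, 3.0], 2.0)
print(round(score, 4))
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why the CRPS is often described as a probabilistic generalization of the mean absolute error.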

  6. Multi-Model Ensemble Wake Vortex Prediction

    Science.gov (United States)

    Koerner, Stephan; Holzaepfel, Frank; Ahmad, Nash'at N.

    2015-01-01

    Several multi-model ensemble methods are investigated for predicting wake vortex transport and decay. This study is a joint effort between National Aeronautics and Space Administration and Deutsches Zentrum fuer Luft- und Raumfahrt to develop a multi-model ensemble capability using their wake models. An overview of different multi-model ensemble methods and their feasibility for wake applications is presented. The methods include Reliability Ensemble Averaging, Bayesian Model Averaging, and Monte Carlo Simulations. The methodologies are evaluated using data from wake vortex field experiments.

  7. Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models.

    Directory of Open Access Journals (Sweden)

    Nikola Simidjievski

    Ensembles are a well-established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase in the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles by sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to that of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.

  8. SVM and SVM Ensembles in Breast Cancer Prediction.

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  9. SVM and SVM Ensembles in Breast Cancer Prediction.

    Directory of Open Access Journals (Sweden)

    Min-Wei Huang

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  10. Skill forecasting from different wind power ensemble prediction methods

    International Nuclear Information System (INIS)

    Pinson, Pierre; Nielsen, Henrik A; Madsen, Henrik; Kariniotakis, George

    2007-01-01

    This paper presents an investigation of alternative approaches to providing uncertainty estimates associated with point predictions of wind generation. Focus is given to skill forecasts in the form of prediction risk indices, aiming at giving a comprehensive signal on the expected level of forecast uncertainty. Ensemble predictions of wind generation are used as input. A proposal for the definition of prediction risk indices is given. Such skill forecasts are based on the dispersion of ensemble members for a single prediction horizon, or over a set of successive look-ahead times. It is shown on the test case of a Danish offshore wind farm how prediction risk indices may be related to several levels of forecast uncertainty (and energy imbalances). Wind power ensemble predictions are derived from the transformation of ECMWF and NCEP ensembles of meteorological variables to power, as well as by a lagged average approach alternative. The ability of risk indices calculated from the various types of ensemble forecasts to resolve among situations with different levels of uncertainty is discussed.
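The abstract above describes risk indices built from the dispersion of ensemble members, either at one horizon or across successive look-ahead times. The sketch below illustrates that idea only; it assumes the population standard deviation as the dispersion measure and a plain average across horizons, neither of which is confirmed as the paper's definition. The names `horizon_spread` and `risk_index` are hypothetical.

```python
from statistics import fmean, pstdev


def horizon_spread(members):
    """Dispersion of ensemble members at one look-ahead time.

    Assumption: population standard deviation is used as the spread measure.
    """
    return pstdev(members)


def risk_index(ensemble_by_horizon):
    """Average spread over a set of successive look-ahead times.

    `ensemble_by_horizon` is a list with one list of member values per horizon.
    """
    return fmean(horizon_spread(members) for members in ensemble_by_horizon)


# Hypothetical wind power ensemble (MW): tight agreement at the first horizon,
# wide disagreement at the second.
forecast = [
    [40.0, 40.0, 40.0],
    [20.0, 40.0, 40.0, 40.0, 50.0, 50.0, 70.0, 90.0],
]
print(risk_index(forecast))
```

A larger index signals a situation in which the members disagree more, which is the kind of signal the abstract relates to higher expected forecast uncertainty.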

  11. Evaluation of the NMC regional ensemble prediction system during the Beijing 2008 Olympic Games

    Science.gov (United States)

    Li, Xiaoli; Tian, Hua; Deng, Guo

    2011-10-01

    Based on the B08RDP (Beijing 2008 Olympic Games Mesoscale Ensemble Prediction Research and Development Project) that was launched by the World Weather Research Programme (WWRP) in 2004, a regional ensemble prediction system (REPS) at a 15-km horizontal resolution was developed at the National Meteorological Center (NMC) of the China Meteorological Administration (CMA). Supplementing the forecasters' subjective affirmation of the promising performance of the REPS during the 2008 Beijing Olympic Games (BOG), this paper focuses on the objective verification of the REPS for precipitation forecasts during the BOG period. By use of a set of advanced probabilistic verification scores, the value of the REPS compared to the quasi-operational global ensemble prediction system (GEPS) is assessed for a 36-day period (21 July-24 August 2008). The evaluation here involves different aspects of the REPS and GEPS, including their general forecast skills, specific attributes (reliability and resolution), and related economic values. The results indicate that the REPS generally performs significantly better for the short-range precipitation forecasts than the GEPS, and for light to heavy rainfall events, the REPS provides more skillful forecasts for accumulated 6- and 24-h precipitation. By further identifying the performance of the REPS through the attribute-focused measures, it is found that the advantages of the REPS over the GEPS come from better reliability (smaller biases and better dispersion) and increased resolution. Also, evaluation of a decision-making score reveals that a much larger group of users benefits from using the REPS forecasts than using the single model (the control run) forecasts, especially for the heavy rainfall events.

  12. Analysis of the regional MiKlip decadal prediction system over Europe: skill, added value of regionalization, and ensemble size dependency

    Science.gov (United States)

    Reyers, Mark; Moemken, Julia; Pinto, Joaquim; Feldmann, Hendrik; Kottmeier, Christoph; MiKlip Module-C Team

    2017-04-01

    Decadal climate predictions can provide a useful basis for decision making support systems for the public and private sectors. Several generations of decadal hindcasts and predictions have been generated throughout the German research program MiKlip. Together with the global climate predictions computed with MPI-ESM, the regional climate model (RCM) COSMO-CLM is used for regional downscaling by MiKlip Module-C. The RCMs provide climate information on spatial and temporal scales closer to the needs of potential users. In this study, two downscaled hindcast generations are analysed (named b0 and b1). The respective global generations are both initialized by nudging them towards different reanalysis anomaly fields. An ensemble of five starting years (1961, 1971, 1981, 1991, and 2001), each comprising ten ensemble members, is used for both generations in order to quantify the regional decadal prediction skill for precipitation and near-surface temperature and wind speed over Europe. All datasets (including hindcasts, observations, reanalysis, and historical MPI-ESM runs) are pre-processed in an analogous manner by (i) removing the long-term trend and (ii) re-gridding to a common grid. Our analysis shows that there is potential for skillful decadal predictions over Europe in the regional MiKlip ensemble, but the skill is not systematic and depends on the PRUDENCE region and the variable. Further, the differences between the two hindcast generations are mostly small. As we used detrended time series, the predictive skill found in our study can probably be attributed to reasonable predictions of anomalies which are associated with the natural climate variability. In a sensitivity study, it is shown that the results may strongly change when the long-term trend is kept in the datasets, as here the skill of predicting the long-term trend (e.g. for temperature) also plays a major role. The regionalization of the global ensemble provides an added value for decadal predictions for

  13. Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

    Directory of Open Access Journals (Sweden)

    Matthew B Biggs

    2017-03-01

    Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.

  14. Prediction and Monitoring of Monsoon Intraseasonal Oscillations over Indian Monsoon Region in an Ensemble Prediction System using CFSv2

    Science.gov (United States)

    Borah, Nabanita; Sukumarpillai, Abhilash; Sahai, Atul Kumar; Chattopadhyay, Rajib; Joseph, Susmitha; De, Soumyendu; Nath Goswami, Bhupendra; Kumar, Arun

    2014-05-01

    An ensemble prediction system (EPS) is devised for the extended range prediction (ERP) of monsoon intraseasonal oscillations (MISO) of the Indian summer monsoon (ISM) using the NCEP Climate Forecast System model version 2 at T126 horizontal resolution. The EPS is formulated by producing 11-member ensembles through the perturbation of atmospheric initial conditions. The hindcast experiments were conducted at 5-day intervals for 45 days lead time starting from 16th May to 28th September during 2001-2012. The general simulation of ISM characteristics and the ERP skill of the proposed EPS at pentad mean scale are evaluated in the present study. Though the EPS underestimates both the mean and variability of ISM rainfall, it simulates the northward propagation of MISO reasonably well. It is found that the signal-to-noise ratio becomes unity by about 18 days and the predictability error saturates by about 25 days. Though useful deterministic forecasts could be generated up to 2nd pentad lead, significant correlations are observed even up to 4th pentad lead. The skill in predicting large-scale MISO, which is assessed by comparing the predicted and observed MISO indices, is found to be ~17 days. It is noted that the prediction skill of actual rainfall is closely related to the prediction of the amplitude of large-scale MISO as well as the initial conditions related to the different phases of MISO. Categorical prediction skill reveals that the break phase is more skillfully predicted, followed by active and then normal. The categorical probability skill scores suggest that useful probabilistic forecasts could be generated even up to 4th pentad lead.

  15. Assessing uncertainties in flood forecasts for decision making: prototype of an operational flood management system integrating ensemble predictions

    Directory of Open Access Journals (Sweden)

    J. Dietrich

    2009-08-01

    Ensemble forecasts aim at framing the uncertainties of the potential future development of the hydro-meteorological situation. A probabilistic evaluation can be used to communicate forecast uncertainty to decision makers. Here an operational system for ensemble based flood forecasting is presented, which combines forecasts from the European COSMO-LEPS, SRNWP-PEPS and COSMO-DE prediction systems. A multi-model lagged average super-ensemble is generated by recombining members from different runs of these meteorological forecast systems. A subset of the super-ensemble is selected based on a priori model weights, which are obtained from ensemble calibration. Flood forecasts are simulated by the conceptual rainfall-runoff-model ArcEGMO. Parameter uncertainty of the model is represented by a parameter ensemble, which is a priori generated from a comprehensive uncertainty analysis during model calibration. The use of a computationally efficient hydrological model within a flood management system allows us to compute the hydro-meteorological model chain for all members of the sub-ensemble. The model chain is not re-computed before new ensemble forecasts are available, but the probabilistic assessment of the output is updated when new information from deterministic short range forecasts or from assimilation of measured data becomes available. For hydraulic modelling, with the desired result of a probabilistic inundation map with high spatial resolution, a replacement model can help to overcome computational limitations. A prototype of the developed framework has been applied for a case study in the Mulde river basin. However, these techniques, in particular the probabilistic assessment and the derivation of decision rules, are still in their infancy. Further research is necessary and promising.

  16. The Operational Hydro-meteorological Ensemble Prediction System at Meteo-France and its representation interface for the French Service for Flood Prediction (SCHAPI): description and undergoing developments.

    Science.gov (United States)

    Rousset-Regimbeau, F.; Martin, E.; Thirel, G.; Habets, F.; Coustau, M.; Roquelaure, S.; De Saint Aubin, C.; Ardilouze, C.

    2012-04-01

    The coupled physically-based hydro-meteorological model SAFRAN-ISBA-MODCOU (SIM) has been developed at Meteo-France for many years. This fully distributed catchment model has been used in a pre-operational mode since 2005 for producing mid-range ensemble streamflow forecasts based on the 51-member 10-day ECMWF EPS. Improvements have been made during the past few years. First, a statistical adaptation has been performed to improve the meteorological ensemble predictions from the ECMWF. It has been developed over a 3-year archive, and assessed over a 1-year period. Its impact on the performance of the streamflow forecasts has been calculated over 8 months of predictions. Then, a past discharges assimilation system has been implemented in order to improve the initial states of these ensemble streamflow forecasts. It has been developed in the framework of a PhD thesis, and it is now evaluated in real-time conditions. Moreover, an improvement of the physics of the ISBA model (the exponential profile of the hydraulic conductivity in the soil) was implemented. Finally, this system provides ensemble 10-day streamflow predictions to the French National Service for Flood Prediction (SCHAPI). A collaboration between Meteo-France and SCHAPI led to the development of a new website. This website shows the streamflow predictions for about 200 selected river stations over France (selected regarding their interest for flood warning), as well as alerts for high flows (two levels of high flows corresponding to the levels of risk of the French flood warning system). It aims at providing the French hydrological forecasters with a real-time tool for mid-range flood awareness.

  17. A Hierarchical Method for Transient Stability Prediction of Power Systems Using the Confidence of a SVM-Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Yanzhen Zhou

    2016-09-01

    Machine learning techniques have been widely used in transient stability prediction of power systems. When using the post-fault dynamic responses, it is difficult to draw a definite conclusion about how long the duration of response data used should be in order to balance the accuracy and speed. Besides, previous studies have the problem of lacking consideration for the confidence level. To solve these problems, a hierarchical method for transient stability prediction based on the confidence of an ensemble classifier using multiple support vector machines (SVMs) is proposed. Firstly, multiple datasets are generated by bootstrap sampling, then features are randomly picked to compress the datasets. Secondly, the confidence indices are defined and multiple SVMs are built based on these generated datasets. By synthesizing the probabilistic outputs of multiple SVMs, the prediction results and confidence of the ensemble classifier are obtained. Finally, different ensemble classifiers with different response times are built to construct different layers of the proposed hierarchical scheme. The simulation results show that the proposed hierarchical method can balance the accuracy and rapidity of the transient stability prediction. Moreover, the hierarchical method can reduce the misjudgments of unstable instances and cooperate with the time domain simulation to ensure the security and stability of power systems.
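The layered, confidence-gated scheme described in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' method: the per-model probabilities are supplied directly rather than produced by trained SVMs, and the averaging rule, the confidence formula, the 0.8 threshold and all names are assumptions made for the example.

```python
def ensemble_vote(probabilities):
    """Combine per-model probabilities of the 'unstable' class.

    Assumption: the ensemble output is the plain average, and confidence is
    the distance of that average from 0.5, rescaled to the range [0, 1].
    """
    p = sum(probabilities) / len(probabilities)
    label = "unstable" if p >= 0.5 else "stable"
    confidence = abs(p - 0.5) * 2.0
    return label, confidence


def hierarchical_predict(layers, threshold=0.8):
    """Walk through layers ordered by increasing response time.

    Each layer is a list of member probabilities. The first layer whose
    ensemble confidence reaches `threshold` decides; otherwise the case is
    left undetermined (for example, to be handed to a slower analysis).
    """
    for index, probabilities in enumerate(layers):
        label, confidence = ensemble_vote(probabilities)
        if confidence >= threshold:
            return index, label
    return None, "undetermined"


# Hypothetical outputs: the early layer is unsure, the later layer is not.
layers = [[0.60, 0.40, 0.55], [0.95, 0.90, 0.97]]
print(hierarchical_predict(layers))
```

Deferring low-confidence cases to a later layer trades speed for accuracy, which matches the balance between accuracy and rapidity that the abstract reports.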

  18. An ensemble classifier to predict track geometry degradation

    International Nuclear Information System (INIS)

    Cárdenas-Gallo, Iván; Sarmiento, Carlos A.; Morales, Gilberto A.; Bolivar, Manuel A.; Akhavan-Tabatabaei, Raha

    2017-01-01

    Railway operations are inherently complex and a source of several problems. In particular, track geometry defects are one of the leading causes of train accidents in the United States. This paper presents a solution approach which entails the construction of an ensemble classifier to forecast the degradation of track geometry. Our classifier is constructed by solving the problem from three different perspectives: deterioration, regression and classification. We considered a different model from each perspective and our results show that using an ensemble method improves the predictive performance. - Highlights: • We present an ensemble classifier to forecast the degradation of track geometry. • Our classifier considers three perspectives: deterioration, regression and classification. • We construct and test three models and our results show that using an ensemble method improves the predictive performance.

  19. Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall over Northern Tropical Africa

    Science.gov (United States)

    Vogel, Peter; Knippertz, Peter; Fink, Andreas H.; Schlueter, Andreas; Gneiting, Tilmann

    2018-04-01

    Accumulated precipitation forecasts are of high socioeconomic importance for agriculturally dominated societies in northern tropical Africa. In this study, we analyze the performance of nine operational global ensemble prediction systems (EPSs) relative to climatology-based forecasts for 1- to 5-day accumulated precipitation based on the monsoon seasons 2007-2014 for three regions within northern tropical Africa. To assess the full potential of raw ensemble forecasts across spatial scales, we apply state-of-the-art statistical postprocessing methods in the form of Bayesian Model Averaging (BMA) and Ensemble Model Output Statistics (EMOS), and verify against station and spatially aggregated, satellite-based gridded observations. Raw ensemble forecasts are uncalibrated, unreliable, and underperform relative to climatology, independently of region, accumulation time, monsoon season, and ensemble. Differences between raw ensemble and climatological forecasts are large, and partly stem from poor prediction for low precipitation amounts. BMA and EMOS postprocessed forecasts are calibrated, reliable, and strongly improve on the raw ensembles, but - somewhat disappointingly - typically do not outperform climatology. Most EPSs exhibit slight improvements over the period 2007-2014, but overall have little added value compared to climatology. We suspect that the parametrization of convection is a potential cause for the sobering lack of ensemble forecast skill in a region dominated by mesoscale convective systems.

  20. Singular vectors, predictability and ensemble forecasting for weather and climate

    International Nuclear Information System (INIS)

    Palmer, T N; Zanna, Laure

    2013-01-01

    The local instabilities of a nonlinear dynamical system can be characterized by the leading singular vectors of its linearized operator. The leading singular vectors are perturbations with the greatest linear growth and are therefore key in assessing the system’s predictability. In this paper, the analysis of singular vectors for the predictability of weather and climate and ensemble forecasting is discussed. An overview of the role of singular vectors in informing about the error growth rate in numerical models of the atmosphere is given. This is followed by their use in the initialization of ensemble weather forecasts. Singular vectors for the ocean and coupled ocean–atmosphere system in order to understand the predictability of climate phenomena such as ENSO and meridional overturning circulation are reviewed and their potential use to initialize seasonal and decadal forecasts is considered. As stochastic parameterizations are being implemented, some speculations are made about the future of singular vectors for the predictability of weather and climate for theoretical applications and at the operational level. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Lyapunov analysis: from dynamical systems theory to applications’. (review)
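
    As a rough illustration of why the leading singular vector identifies the fastest-growing perturbation, the sketch below applies power iteration to a small linear propagator. The 2 x 2 matrix is a hypothetical example chosen because it is non-normal: both eigenvalues equal 1, yet some perturbations grow about tenfold. Nothing here is taken from the review itself.

```python
import math


def leading_singular_vector(matrix, iterations=50):
    """Power iteration on M^T M; returns (growth factor, unit perturbation)."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # w = M v: evolve the perturbation with the linear propagator.
        w = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # u = M^T w: apply the adjoint, then renormalise.
        u = [sum(matrix[i][j] * w[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    evolved = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    growth = math.sqrt(sum(x * x for x in evolved))
    return growth, v


# Hypothetical non-normal propagator: eigenvalues are 1, growth is not.
propagator = [[1.0, 10.0], [0.0, 1.0]]
growth, perturbation = leading_singular_vector(propagator)
```

    The returned growth factor is the largest singular value of the propagator, and the returned unit vector is the initial perturbation that achieves it.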

  1. Ensemble prediction of air quality using the WRF/CMAQ model system for health effect studies in China

    Science.gov (United States)

    Hu, Jianlin; Li, Xun; Huang, Lin; Ying, Qi; Zhang, Qiang; Zhao, Bin; Wang, Shuxiao; Zhang, Hongliang

    2017-11-01

    Accurate exposure estimates are required for health effect analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used to provide spatial distribution, chemical composition, particle size fractions, and source origins of air pollutants. The accuracy of air quality predictions in China is greatly affected by the uncertainties of emission inventories. The Community Multiscale Air Quality (CMAQ) model with meteorological inputs from the Weather Research and Forecasting (WRF) model was used in this study to simulate air pollutants in China in 2013. Four simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance of each simulation was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 generally meet the model performance criteria, but performance differences exist in different regions, for different pollutants, and among inventories. Ensemble predictions were calculated by linearly combining the results from different inventories to minimize the sum of the squared errors between the ensemble results and the observations in all cities. The ensemble concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFEs) of the ensemble annual PM2.5 in the 60 cities are -0.11 and 0.24, respectively, which are better than the MFB (-0.25 to -0.16) and MFE (0.26-0.31) of individual simulations. The ensemble annual daily maximum 1 h O3 (O3-1h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB of 0.06-0.19 and
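
    The linear combination step mentioned in this abstract can be sketched as an ordinary least-squares problem. The sketch assumes unconstrained weights with no intercept, which may differ from the constraints used in the paper; the function names and the concentration values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ensemble_weights(predictions, observations):
    """Weights minimising the sum of squared errors (normal equations)."""
    k = len(predictions)
    gram = [[sum(p * q for p, q in zip(predictions[a], predictions[b]))
             for b in range(k)] for a in range(k)]
    rhs = [sum(p * o for p, o in zip(predictions[a], observations))
           for a in range(k)]
    return solve_linear_system(gram, rhs)


def combine(predictions, weights):
    """Weighted sum of the member predictions at every site."""
    return [sum(w * p[i] for w, p in zip(weights, predictions))
            for i in range(len(predictions[0]))]


def sum_squared_errors(predicted, observations):
    return sum((p - o) ** 2 for p, o in zip(predicted, observations))


# Hypothetical city-level concentrations from two inventories (made-up values).
observed = [10.0, 20.0, 30.0, 40.0]
simulations = [[12.0, 25.0, 33.0, 47.0], [7.0, 16.0, 27.0, 33.0]]
weights = ensemble_weights(simulations, observed)
ensemble = combine(simulations, weights)
```

    Because using a single simulation on its own is one of the combinations the fit may choose, the fitted ensemble cannot have a larger squared error on the fitting data than any individual simulation.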

  2. The Hydrologic Ensemble Prediction Experiment (HEPEX)

    Science.gov (United States)

    Wood, Andy; Wetterhall, Fredrik; Ramos, Maria-Helena

    2015-04-01

    The Hydrologic Ensemble Prediction Experiment was established in March 2004 at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF), and co-sponsored by the US National Weather Service (NWS) and the European Commission (EC). The HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support. HEPEX pursues this goal through research efforts and practical implementations involving six core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. HEPEX has grown through meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. In the last decade, HEPEX has organized over a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Through these interactions and an active online blog (www.hepex.org), HEPEX has built a strong and active community of nearly 400 researchers and practitioners around the world. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.

  3. Limited-area short-range ensemble predictions targeted for heavy rain in Europe

    Directory of Open Access Journals (Sweden)

    K. Sattler

    2005-01-01

    Inherent uncertainties in short-range quantitative precipitation forecasts (QPF) from the high-resolution, limited-area numerical weather prediction model DMI-HIRLAM (LAM) are addressed using two different approaches to creating a small ensemble of LAM simulations, with focus on prediction of extreme rainfall events over European river basins. The first ensemble type is designed to represent uncertainty in the atmospheric state of the initial condition and at the lateral LAM boundaries. The global ensemble prediction system (EPS) from ECMWF serves as host model to the LAM and provides the state perturbations, from which a small set of significant members is selected. The significance is estimated on the basis of accumulated precipitation over a target area of interest, which contains the river basin(s) under consideration. The selected members provide the initial and boundary data for the ensemble integration in the LAM. A second ensemble approach tries to address a portion of the model-inherent uncertainty responsible for errors in the forecasted precipitation field by utilising different parameterisation schemes for condensation and convection in the LAM. Three periods around historical heavy rain events that caused or contributed to disastrous river flooding in Europe are used to study the performance of the LAM ensemble designs. The three cases exhibit different dynamic and synoptic characteristics and provide an indication of the ensemble qualities in different weather situations. Precipitation analyses from the Deutscher Wetterdienst (DWD) are used as the verifying reference and a comparison of daily rainfall amounts is referred to the respective river basins of the historical cases.

  4. The North American Multi-Model Ensemble (NMME): Phase-1 Seasonal to Interannual Prediction, Phase-2 Toward Developing Intra-Seasonal Prediction

    Science.gov (United States)

    Kirtman, Ben P.; Min, Dughong; Infanti, Johnna M.; Kinter, James L., III; Paolino, Daniel A.; Zhang, Qin; vandenDool, Huug; Saha, Suranjana; Mendez, Malaquias Pena; Becker, Emily

    2013-01-01

    The recent US National Academies report "Assessment of Intraseasonal to Interannual Climate Prediction and Predictability" was unequivocal in recommending the need for the development of a North American Multi-Model Ensemble (NMME) operational predictive capability. Indeed, this effort is required to meet the specific tailored regional prediction and decision support needs of a large community of climate information users. The multi-model ensemble approach has proven extremely effective at quantifying prediction uncertainty due to uncertainty in model formulation, and has proven to produce better prediction quality (on average) than any single model ensemble. This multi-model approach is the basis for several international collaborative prediction research efforts and an operational European system, and there are numerous examples of how this multi-model ensemble approach yields superior forecasts compared to any single model. Based on two NOAA Climate Test Bed (CTB) NMME workshops (February 18, and April 8, 2011) a collaborative and coordinated implementation strategy for a NMME prediction system has been developed and is currently delivering real-time seasonal-to-interannual predictions on the NOAA Climate Prediction Center (CPC) operational schedule. The hindcast and real-time prediction data are readily available (e.g., http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/) and in graphical format from CPC (http://origin.cpc.ncep.noaa.gov/products/people/wd51yf/NMME/index.html). Moreover, the NMME forecasts are already being used as guidance for operational forecasters. This paper describes the new NMME effort, presents an overview of the multi-model forecast quality, and the complementary skill associated with individual models.

  5. Development of the Ensemble Navy Aerosol Analysis Prediction System (ENAAPS) and its application of the Data Assimilation Research Testbed (DART) in support of aerosol forecasting

    Directory of Open Access Journals (Sweden)

    J. I. Rubin

    2016-03-01

    An ensemble-based forecast and data assimilation system has been developed for use in Navy aerosol forecasting. The system makes use of an ensemble of the Navy Aerosol Analysis Prediction System (ENAAPS) at 1 × 1°, combined with an ensemble adjustment Kalman filter from NCAR's Data Assimilation Research Testbed (DART). The base ENAAPS-DART system discussed in this work utilizes the Navy Operational Global Analysis Prediction System (NOGAPS) meteorological ensemble to drive offline NAAPS simulations coupled with the DART ensemble Kalman filter architecture to assimilate bias-corrected MODIS aerosol optical thickness (AOT) retrievals. This work outlines the optimization of the 20-member ensemble system, including consideration of meteorology and source-perturbed ensemble members as well as covariance inflation. Additional tests with 80 meteorological and source members were also performed. An important finding of this work is that an adaptive covariance inflation method, which has not been previously tested for aerosol applications, was found to perform better than a temporally and spatially constant covariance inflation. Problems were identified with the constant inflation in regions with limited observational coverage. The second major finding of this work is that combined meteorology and aerosol source ensembles are superior to either in isolation and that both are necessary to produce a robust system with sufficient spread in the ensemble members as well as realistic correlation fields for spreading observational information. The inclusion of aerosol source ensembles improves correlation fields for large aerosol source regions, such as smoke and dust in Africa, by statistically separating freshly emitted from transported aerosol species. However, the source ensembles have limited efficacy during long-range transport. Conversely, the meteorological ensemble generates sufficient spread at the synoptic scale to enable observational impact
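
    For readers unfamiliar with covariance inflation, the following sketch shows the simplest variant, a constant multiplicative inflation applied to a scalar ensemble. It is only meant to show what "inflation" does to the spread; the adaptive scheme the abstract favours adjusts the factor in space and time and is not reproduced here. The function name and the numbers are hypothetical.

```python
import math
import statistics


def inflate_ensemble(members, inflation_factor):
    """Scale deviations from the mean so the variance grows by the factor."""
    mean = statistics.fmean(members)
    scale = math.sqrt(inflation_factor)
    # The ensemble mean is preserved; only the spread around it changes.
    return [mean + scale * (x - mean) for x in members]


# Hypothetical prior ensemble of a single state variable.
prior = [1.0, 2.0, 3.0, 4.0]
inflated = inflate_ensemble(prior, 1.21)
```

    A factor of 1.21 multiplies the ensemble variance by 1.21 (standard deviation by 1.1) while leaving the mean unchanged.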

  6. Ensemble prediction of air quality using the WRF/CMAQ model system for health effect studies in China

    Directory of Open Access Journals (Sweden)

    J. Hu

    2017-11-01

    Accurate exposure estimates are required for health effect analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used to provide spatial distribution, chemical composition, particle size fractions, and source origins of air pollutants. The accuracy of air quality predictions in China is greatly affected by the uncertainties of emission inventories. The Community Multiscale Air Quality (CMAQ) model with meteorological inputs from the Weather Research and Forecasting (WRF) model was used in this study to simulate air pollutants in China in 2013. Four simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance of each simulation was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 generally meet the model performance criteria, but performance differences exist in different regions, for different pollutants, and among inventories. Ensemble predictions were calculated by linearly combining the results from different inventories to minimize the sum of the squared errors between the ensemble results and the observations in all cities. The ensemble concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFEs) of the ensemble annual PM2.5 in the 60 cities are −0.11 and 0.24, respectively, which are better than the MFB (−0.25 to −0.16) and MFE (0.26–0.31) of individual simulations. The ensemble annual daily maximum 1 h O3 (O3-1h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB
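
    The bias and error statistics quoted in this abstract follow definitions that are standard in air quality model evaluation. A minimal sketch of the mean fractional bias and mean fractional error is given below; the function names and sample values are hypothetical.

```python
def mean_fractional_bias(modelled, observed):
    """MFB = (2/N) * sum((m - o) / (m + o)); bounded between -2 and +2."""
    n = len(observed)
    return 2.0 / n * sum((m - o) / (m + o) for m, o in zip(modelled, observed))


def mean_fractional_error(modelled, observed):
    """MFE = (2/N) * sum(|m - o| / (m + o)); bounded between 0 and 2."""
    n = len(observed)
    return 2.0 / n * sum(abs(m - o) / (m + o) for m, o in zip(modelled, observed))


# Made-up concentrations: one over-prediction and one under-prediction.
modelled = [3.0, 1.0]
observed = [1.0, 3.0]
bias = mean_fractional_bias(modelled, observed)
error = mean_fractional_error(modelled, observed)
```

    In this toy case the two errors cancel in the bias but not in the error, which is why evaluations such as this one report both statistics.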

  7. Seasonal-to-decadal predictions with the ensemble Kalman filter and the Norwegian Earth System Model: a twin experiment

    Directory of Open Access Journals (Sweden)

    Francois Counillon

    2014-03-01

    Here, we firstly demonstrate the potential of an advanced flow dependent data assimilation method for performing seasonal-to-decadal prediction and secondly, reassess the use of sea surface temperature (SST) for initialisation of these forecasts. We use the Norwegian Climate Prediction Model (NorCPM), which is based on the Norwegian Earth System Model (NorESM) and uses the deterministic ensemble Kalman filter to assimilate observations. NorESM is a fully coupled system based on the Community Earth System Model version 1, which includes an ocean, an atmosphere, a sea ice and a land model. A numerically efficient coarse resolution version of NorESM is used. We employ a twin experiment methodology to provide an upper estimate of predictability in our model framework (i.e. without considering model bias) of NorCPM that assimilates synthetic monthly SST data (EnKF-SST). The accuracy of EnKF-SST is compared to an unconstrained ensemble run (FREE) and ensemble predictions made with near perfect (i.e. microscopic SST perturbation) initial conditions (PERFECT). We perform 10 cycles, each consisting of a 10-yr assimilation phase, followed by a 10-yr prediction. The results indicate that EnKF-SST improves sea level, ice concentration, 2 m atmospheric temperature, precipitation and 3-D hydrography compared to FREE. Improvements for the hydrography are largest near the surface and are retained for longer periods at depth. Benefits in salinity are retained for longer periods compared to temperature. Near-surface improvements are largest in the tropics, while improvements at intermediate depths are found in regions of large-scale currents, regions of deep convection, and at the Mediterranean Sea outflow. However, the benefits are often small compared to PERFECT, in particular at depth, suggesting that more observations should be assimilated in addition to SST. The EnKF-SST system is also tested for standard ocean circulation indices and demonstrates decadal

  8. HBC-Evo: predicting human breast cancer by exploiting amino acid sequence-based feature spaces and evolutionary ensemble system.

    Science.gov (United States)

    Majid, Abdul; Ali, Safdar

    2015-01-01

    We developed a genetic programming (GP)-based evolutionary ensemble system for the early diagnosis, prognosis and prediction of human breast cancer. This system has effectively exploited the diversity in feature and decision spaces. First, individual learners are trained in different feature spaces using physicochemical properties of protein amino acids. Their predictions are then stacked to develop the best solution during the GP evolution process. Finally, results for the HBC-Evo system are obtained with an optimal threshold, which is computed using particle swarm optimization. Our novel approach has demonstrated promising results compared to state-of-the-art approaches.

  9. Distinguishing high and low flow domains in urban drainage systems 2 days ahead using numerical weather prediction ensembles

    Science.gov (United States)

    Courdent, Vianney; Grum, Morten; Mikkelsen, Peter Steen

    2018-01-01

    Precipitation constitutes a major contribution to the flow in urban storm- and wastewater systems. Forecasts of the anticipated runoff flows, created from radar extrapolation and/or numerical weather predictions, can potentially be used to optimize operation in both wet and dry weather periods. However, flow forecasts are inevitably uncertain and their use will ultimately require a trade-off between the value of knowing what will happen in the future and the probability and consequence of being wrong. In this study we examine how ensemble forecasts from the HIRLAM-DMI-S05 numerical weather prediction (NWP) model subject to three different ensemble post-processing approaches can be used to forecast flow exceedance in a combined sewer for a wide range of ratios between the probability of detection (POD) and the probability of false detection (POFD). We use a hydrological rainfall-runoff model to transform the forecasted rainfall into forecasted flow series and evaluate three different approaches to establishing the relative operating characteristics (ROC) diagram of the forecast, which is a plot of POD against POFD for each fraction of concordant ensemble members and can be used to select the weight of evidence that matches the desired trade-off between POD and POFD. In the first approach, the rainfall input to the model is calculated for each of 25 ensemble members as a weighted average of rainfall from the NWP cells over the catchment where the weights are proportional to the areal intersection between the catchment and the NWP cells. In the second approach, a total of 2825 flow ensembles are generated using rainfall input from the neighbouring NWP cells up to approximately 6 cells in all directions from the catchment. In the third approach, the first approach is extended spatially by successively increasing the area covered and for each spatial increase and each time step selecting only the cell with the highest intensity resulting in a total of 175 ensemble
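
    The ROC construction described above, one point per fraction of concordant ensemble members, can be sketched as follows. The sketch assumes that, for every forecast occasion, we already know how many members exceeded the flow threshold and whether the exceedance was observed; the function name and the sample data are hypothetical.

```python
def roc_points(exceeding_member_counts, observed_events, n_members):
    """Return (fraction, POD, POFD) for each required number of members."""
    points = []
    for required in range(1, n_members + 1):
        hits = misses = false_alarms = correct_negatives = 0
        for count, event in zip(exceeding_member_counts, observed_events):
            # Issue a warning when enough members agree on an exceedance.
            warned = count >= required
            if warned and event:
                hits += 1
            elif event:
                misses += 1
            elif warned:
                false_alarms += 1
            else:
                correct_negatives += 1
        pod = hits / (hits + misses)
        pofd = false_alarms / (false_alarms + correct_negatives)
        points.append((required / n_members, pod, pofd))
    return points


# Made-up sample: 4-member ensemble, six forecast occasions.
counts = [4, 3, 1, 0, 2, 0]
events = [True, True, False, False, True, False]
points = roc_points(counts, events, n_members=4)
```

    Requiring more members to agree lowers both POD and POFD, so each fraction corresponds to a different trade-off between missed events and false alarms. The sample must contain at least one event and one non-event, otherwise a denominator is zero.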

  10. Skill forecasting from ensemble predictions of wind power

    DEFF Research Database (Denmark)

    Pinson, Pierre; Nielsen, Henrik Aalborg; Madsen, Henrik

    2009-01-01

    Optimal management and trading of wind generation calls for the provision of uncertainty estimates along with the commonly provided short-term wind power point predictions. Alternative approaches for the use of probabilistic forecasting are introduced. More precisely, focus is given to prediction...... risk indices aiming to give a comprehensive signal on the expected level of forecast uncertainty. Ensemble predictions of wind generation are used as input. A proposal for the definition of prediction risk indices is given. Such skill forecasts are based on the spread of ensemble forecasts (i.e. a set...... of alternative scenarios for the coming period) for a single prediction horizon or over a look-ahead period. It is shown on the test case of a Danish offshore wind farm how these prediction risk indices may be related to several levels of forecast uncertainty (and potential energy imbalances). Wind power...

  11. River Flow Prediction Using the Nearest Neighbor Probabilistic Ensemble Method

    Directory of Open Access Journals (Sweden)

    H. Sanikhani

    2016-02-01

    Introduction: In recent years, researchers have become interested in probabilistic forecasting of hydrologic variables such as river flow. A probabilistic approach aims at quantifying the prediction reliability through a probability distribution function or a prediction interval for the unknown future value. The evaluation of the uncertainty associated with the forecast is seen as fundamental information, not only to correctly assess the prediction, but also to compare forecasts from different methods and to evaluate actions and decisions conditionally on the expected values. Several probabilistic approaches have been proposed in the literature, including (1) methods that use resampling techniques to assess parameter and model uncertainty, such as the Metropolis algorithm or the Generalized Likelihood Uncertainty Estimation (GLUE) methodology for an application to runoff prediction, (2) methods based on processing the forecast errors of past data to produce the probability distributions of future values and (3) methods that evaluate how the uncertainty propagates from the rainfall forecast to the river discharge prediction, as the Bayesian forecasting system. Materials and Methods: In this study, two different probabilistic methods are used for river flow prediction. Then the uncertainty related to the forecast is quantified. One approach is based on linear predictors and the other on nearest neighbors. The nonlinear probabilistic ensemble can be used for nonlinear time series analysis using locally linear predictors, while NNPE utilizes a method adapted for one-step-ahead nearest neighbor methods. In this regard, daily river discharge (twelve years) of Dizaj and Mashin Stations on Baranduz-Chay basin in West Azerbaijan and Zard-River basin in Khouzestan provinces were used, respectively. The first six years of data were used for fitting the model. The next three years were used for calibration and the remaining three years for testing the models
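
    The general idea behind a one-step-ahead nearest neighbor ensemble can be sketched in a few lines: find the past states most similar to the current one and use what followed them as the ensemble. This is a generic analog sketch under that assumption, not the NNPE method of the paper; the embedding dimension, the Euclidean distance and the toy series are all hypothetical choices.

```python
import math


def nearest_neighbour_ensemble(series, embedding_dim, k):
    """Successors of the k past windows closest to the most recent window."""
    query = series[-embedding_dim:]
    candidates = []
    # Only windows that have an observed successor are eligible.
    for start in range(len(series) - embedding_dim):
        window = series[start:start + embedding_dim]
        successor = series[start + embedding_dim]
        candidates.append((math.dist(window, query), successor))
    candidates.sort(key=lambda pair: pair[0])
    return [successor for _, successor in candidates[:k]]


# Toy periodic "discharge" series; real daily flows would replace it.
flows = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0]
ensemble = nearest_neighbour_ensemble(flows, embedding_dim=2, k=2)
lower, upper = min(ensemble), max(ensemble)
```

    The spread of the returned successors gives a simple prediction interval for the next value; in the toy series every analog is followed by the same value, so the interval collapses to a point.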

  12. Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 1: Optimization criteria

    Directory of Open Access Journals (Sweden)

    D. Brochero

    2011-11-01

    Hydrological Ensemble Prediction Systems (HEPS), obtained by forcing rainfall-runoff models with Meteorological Ensemble Prediction Systems (MEPS), have been recognized as useful approaches to quantify uncertainties of hydrological forecasting systems. This task is complex both in terms of the coupling of information and computational time, which may create an operational barrier. The main objective of the current work is to assess the degree of simplification (reduction of the number of hydrological members) that can be achieved with a HEPS configured using 16 lumped hydrological models driven by the 50 weather ensemble forecasts from the European Centre for Medium-range Weather Forecasts (ECMWF). Here, Backward Greedy Selection (BGS) is proposed to assess the weight that each model must represent within a subset that offers similar or better performance than a reference set of 800 hydrological members. These hydrological models' weights represent the participation of each hydrological model within a simplified HEPS which would issue real-time forecasts in a relatively short computational time. The methodology uses a variation of the k-fold cross-validation, allowing an optimal use of the information, and employs a multi-criterion framework that represents the combination of resolution, reliability, consistency, and diversity. Results show that the degree of reduction of members can be established in terms of maximum number of members required (complexity of the HEPS) or the maximization of the relationship between the different scores (performance).
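
    Backward greedy selection itself is simple to sketch: start from the full set of members and repeatedly drop the one whose removal gives the best score. The sketch below assumes a single criterion, the mean squared error of the ensemble mean, in place of the multi-criterion framework and cross-validation used in the paper; all names and values are hypothetical.

```python
def ensemble_mean_mse(members, observations):
    """Mean squared error of the ensemble mean against the observations."""
    total = 0.0
    for t, obs in enumerate(observations):
        mean_t = sum(member[t] for member in members) / len(members)
        total += (mean_t - obs) ** 2
    return total / len(observations)


def backward_greedy_selection(members, observations, target_size):
    """Indices of the members kept after greedy backward elimination."""
    selected = list(range(len(members)))
    while len(selected) > target_size:
        best_score, best_drop = None, None
        for candidate in selected:
            # Score the ensemble that remains if this candidate is removed.
            trial = [members[i] for i in selected if i != candidate]
            score = ensemble_mean_mse(trial, observations)
            if best_score is None or score < best_score:
                best_score, best_drop = score, candidate
        selected.remove(best_drop)
    return selected


# Made-up flows: member 2 is an obvious outlier.
observed = [1.0, 2.0, 3.0]
members = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [5.0, 5.0, 5.0], [0.9, 1.9, 3.1]]
kept = backward_greedy_selection(members, observed, target_size=2)
```

    In this toy case the outlier is removed first, and the procedure then keeps the two members whose errors cancel in the mean rather than the member that is individually perfect, which illustrates why member diversity matters for the selection.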

  13. Ensemble prediction of floods – catchment non-linearity and forecast probabilities

    Directory of Open Access Journals (Sweden)

    C. Reszler

    2007-07-01

    Quantifying the uncertainty of flood forecasts by ensemble methods is becoming increasingly important for operational purposes. The aim of this paper is to examine how the ensemble distribution of precipitation forecasts propagates in the catchment system, and to interpret the flood forecast probabilities relative to the forecast errors. We use the 622 km² Kamp catchment in Austria as an example where a comprehensive data set, including a 500 yr and a 1000 yr flood, is available. A spatially-distributed continuous rainfall-runoff model is used along with ensemble and deterministic precipitation forecasts that combine rain gauge data, radar data and the forecast fields of the ALADIN and ECMWF numerical weather prediction models. The analyses indicate that, for long lead times, the variability of the precipitation ensemble is amplified as it propagates through the catchment system as a result of non-linear catchment response. In contrast, for lead times shorter than the catchment lag time (e.g. 12 h and less), the variability of the precipitation ensemble is decreased as the forecasts are mainly controlled by observed upstream runoff and observed precipitation. Assuming that all ensemble members are equally likely, the statistical analyses for five flood events at the Kamp showed that the ensemble spread of the flood forecasts is always narrower than the distribution of the forecast errors. This is because the ensemble forecasts focus on the uncertainty in forecast precipitation as the dominant source of uncertainty, and other sources of uncertainty are not accounted for. However, a number of analyses, including Relative Operating Characteristic diagrams, indicate that the ensemble spread is a useful indicator to assess potential forecast errors for lead times larger than 12 h.

  14. Identifying and Assessing Gaps in Subseasonal to Seasonal Prediction Skill using the North American Multi-model Ensemble

    Science.gov (United States)

    Pegion, K.; DelSole, T. M.; Becker, E.; Cicerone, T.

    2016-12-01

    Predictability represents the upper limit of prediction skill if we had an infinite member ensemble and a perfect model. It is an intrinsic limit of the climate system associated with the chaotic nature of the atmosphere. Producing a forecast system that can make predictions very near to this limit is the ultimate goal of forecast system development. Estimates of predictability together with calculations of current prediction skill are often used to define the gaps in our prediction capabilities on subseasonal to seasonal timescales and to inform the scientific issues that must be addressed to build the next forecast system. Quantification of the predictability is also important for providing a scientific basis for relaying to stakeholders what kind of climate information can be provided to inform decision-making and what kind of information is not possible given the intrinsic predictability of the climate system. One challenge with predictability estimates is that different prediction systems can give different estimates of the upper limit of skill. How do we know which estimate of predictability is most representative of the true predictability of the climate system? Previous studies have used the spread-error relationship and the autocorrelation to evaluate the fidelity of the signal and noise estimates. Using a multi-model ensemble prediction system, we can quantify whether these metrics accurately indicate an individual model's ability to properly estimate the signal, noise, and predictability. We use this information to identify the best estimates of predictability for 2-meter temperature, precipitation, and sea surface temperature from the North American Multi-model Ensemble and compare with current skill to indicate the regions with potential for improving skill.

  15. Prediction of drug synergy in cancer using ensemble-based machine learning techniques

    Science.gov (United States)

    Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder

    2018-04-01

    Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic successes. Different drug-drug interactions can be examined using the drug synergy score. This requires efficient regression-based machine learning approaches that minimize the prediction errors. Numerous machine learning techniques such as neural networks, support vector machines, random forests, LASSO, Elastic Nets, etc., have been used in the past to meet this requirement. However, these techniques individually do not provide significant accuracy in drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques have been implemented by considering the drug synergy data. Based on the accuracy of each model, four techniques with high accuracy are selected to develop an ensemble-based machine learning model. These models are Random forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. adding more weights to the model with a higher prediction score) of predicted data by selected models. The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms others in terms of accuracy, root mean square error and coefficient of correlation.
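
    The "biased weighted aggregation" mentioned in this abstract can be illustrated with a minimal sketch in which each model's weight is proportional to its prediction score. The abstract does not say how the weights are derived from the scores, so the proportional rule, the function name and the numbers below are assumptions.

```python
def weighted_ensemble(predictions, scores):
    """Combine model predictions with weights proportional to their scores."""
    total = sum(scores)
    weights = [score / total for score in scores]
    n_samples = len(predictions[0])
    return [sum(w * model[i] for w, model in zip(weights, predictions))
            for i in range(n_samples)]


# Two hypothetical models predicting synergy scores for two drug pairs.
model_predictions = [[10.0, 20.0], [20.0, 40.0]]
model_scores = [1.0, 3.0]   # the second model is trusted three times as much
combined = weighted_ensemble(model_predictions, model_scores)
```

    With weights of 0.25 and 0.75 the combined prediction lies closer to the higher-scoring model, which is the intended bias.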

  16. Ensemble ecosystem modeling for predicting ecosystem response to predator reintroduction.

    Science.gov (United States)

    Baker, Christopher M; Gordon, Ascelin; Bode, Michael

    2017-04-01

    Introducing a new or extirpated species to an ecosystem is risky, and managers need quantitative methods that can predict the consequences for the recipient ecosystem. Proponents of keystone predator reintroductions commonly argue that the presence of the predator will restore ecosystem function, but this has not always been the case, and mathematical modeling has an important role to play in predicting how reintroductions will likely play out. We devised an ensemble modeling method that integrates species interaction networks and dynamic community simulations and used it to describe the range of plausible consequences of 2 keystone-predator reintroductions: wolves (Canis lupus) to Yellowstone National Park and dingoes (Canis dingo) to a national park in Australia. Although previous methods for predicting ecosystem responses to such interventions focused on predicting changes around a given equilibrium, we used Lotka-Volterra equations to predict changing abundances through time. We applied our method to interaction networks for wolves in Yellowstone National Park and for dingoes in Australia. Our model replicated the observed dynamics in Yellowstone National Park and produced a larger range of potential outcomes for the dingo network. However, we also found that changes in small vertebrates or invertebrates gave a good indication about the potential future state of the system. Our method allowed us to predict when the systems were far from equilibrium. Our results showed that the method can also be used to predict which species may increase or decrease following a reintroduction and can identify species that are important to monitor (i.e., species whose changes in abundance give extra insight into broad changes in the system). Ensemble ecosystem modeling can also be applied to assess the ecosystem-wide implications of other types of interventions including assisted migration, biocontrol, and invasive species eradication. © 2016 Society for Conservation Biology.
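
    A heavily simplified sketch of ensemble ecosystem modeling with Lotka-Volterra dynamics is shown below. It assumes a two-species predator-prey system, uniformly sampled interaction strengths and forward Euler integration; the paper works with full interaction networks and its own parameter sampling, so every name, range and value here is hypothetical.

```python
import random


def simulate(growth, interactions, initial, dt=0.01, steps=1000):
    """Forward Euler integration of dn_i/dt = n_i * (r_i + sum_j a_ij * n_j)."""
    abundances = list(initial)
    trajectory = [tuple(abundances)]
    size = len(abundances)
    for _ in range(steps):
        rates = [growth[i] + sum(interactions[i][j] * abundances[j]
                                 for j in range(size)) for i in range(size)]
        abundances = [n * (1.0 + dt * r) for n, r in zip(abundances, rates)]
        trajectory.append(tuple(abundances))
    return trajectory


def build_ensemble(size, seed=0):
    """One simulated trajectory per randomly drawn parameter set."""
    rng = random.Random(seed)
    runs = []
    for _ in range(size):
        predation = rng.uniform(0.1, 0.5)     # effect of predator on prey
        conversion = rng.uniform(0.05, 0.2)   # effect of prey on predator
        interactions = [[-0.1, -predation], [conversion, -0.05]]
        runs.append(simulate([1.0, -0.5], interactions, [5.0, 1.0]))
    return runs


ensemble = build_ensemble(20)
final_prey = [run[-1][0] for run in ensemble]
prey_range = (min(final_prey), max(final_prey))
```

    The range of final prey abundances across the ensemble plays the role of the "range of plausible consequences" in the abstract: each member is one parameterisation that could describe the system after the predator is introduced.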

  17. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process, and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier-stacking-based evolutionary ensemble system, "Can-Evo-Ens", for predicting amino acid sequences associated with breast cancer. In this paper, we first selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of the base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system achieved the highest value of Area Under the ROC Curve (AUC), 99.95%, for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and the conventional ensemble approaches AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Demonstrating the value of larger ensembles in forecasting physical systems

    Directory of Open Access Journals (Sweden)

    Reason L. Machete

    2016-12-01

    Ensemble simulation propagates a collection of initial states forward in time in a Monte Carlo fashion. Depending on the fidelity of the model and the properties of the initial ensemble, the goal of ensemble simulation can range from merely quantifying variations in the sensitivity of the model all the way to providing actionable probability forecasts of the future. Whatever the goal is, success depends on the properties of the ensemble, and there is a longstanding discussion in meteorology as to the size of initial condition ensemble most appropriate for Numerical Weather Prediction. In terms of resource allocation: how is one to divide finite computing resources between model complexity, ensemble size, data assimilation and other components of the forecast system? One wishes to avoid undersampling information available from the model's dynamics, yet one also wishes to use the highest fidelity model available. Arguably, a higher fidelity model can better exploit a larger ensemble; nevertheless it is often suggested that a relatively small ensemble, say ~16 members, is sufficient and that larger ensembles are not an effective investment of resources. This claim is shown to be dubious when the goal is probabilistic forecasting, even in settings where the forecast model is informative but imperfect. Probability forecasts for a ‘simple’ physical system are evaluated at different lead times; ensembles of up to 256 members are considered. The pure density estimation context (where ensemble members are drawn from the same underlying distribution as the target) differs from the forecasting context, where one is given a high fidelity (but imperfect) model. In the forecasting context, the information provided by additional members depends also on the fidelity of the model, the ensemble formation scheme (data assimilation), the ensemble interpretation and the nature of the observational noise. The effect of increasing the ensemble size is quantified by

  19. An assessment of the ECMWF tropical cyclone ensemble forecasting system and its use for insurance loss predictions

    Science.gov (United States)

    Aemisegger, F.; Martius, O.; Wüest, M.

    2010-09-01

    Tropical cyclones (TC) are amongst the most impressive and destructive weather systems of Earth's atmosphere. The costs related to such intense natural disasters have been rising in recent years and may potentially continue to increase in the near future due to changes in magnitude, timing, duration or location of tropical storms. This is a challenging situation for numerical weather prediction, which should provide a decision basis for short term protective measures through high quality medium range forecasts on the one hand. On the other hand, the insurance system bears great responsibility in elaborating proactive plans in order to face these extreme events that individuals cannot manage independently. Real-time prediction and early warning systems are needed in the insurance sector in order to face an imminent hazard and minimise losses. Early loss estimates are important in order to allocate capital and to communicate to investors. The ECMWF TC identification algorithm delivers information on the track and intensity of storms based on the ensemble forecasting system. This provides a physically based framework to assess the uncertainty in the forecast of a specific event. The performance of the ECMWF TC ensemble forecasts is evaluated in terms of cyclone intensity and location in this study and the value of such a physically-based quantification of uncertainty in the meteorological forecast for the estimation of insurance losses is assessed. An evaluation of track and intensity forecasts of hurricanes in the North Atlantic during the years 2005 to 2009 is carried out. Various effects are studied like the differences in forecasts over land or sea, as well as links between storm intensity and forecast error statistics. The value of the ECMWF TC forecasting system for the global re-insurer Swiss Re was assessed by performing insurance loss predictions using their in-house loss model for several case studies of particularly devastating events. 
The generally known

  20. Skill prediction of local weather forecasts based on the ECMWF ensemble

    Directory of Open Access Journals (Sweden)

    C. Ziehmann

    2001-01-01

    Ensemble prediction has become an essential part of numerical weather forecasting. In this paper we investigate the ability of ensemble forecasts to provide an a priori estimate of the expected forecast skill. Several quantities derived from the local ensemble distribution are investigated for a two-year data set of European Centre for Medium-Range Weather Forecasts (ECMWF) temperature and wind speed ensemble forecasts at 30 German stations. The results indicate that the population of the ensemble mode provides useful information about the uncertainty in temperature forecasts. The ensemble entropy is a similarly good measure. This is not true for the spread if it is simply calculated as the variance of the ensemble members with respect to the ensemble mean. The number of clusters in the C regions is almost unrelated to the local skill. For wind forecasts, the results are less promising.
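
    As a rough illustration of the three local ensemble quantities named in this abstract (variance spread, mode population, and entropy), the sketch below computes them for a small set of temperature members. The binning scheme, the 1-degree bin width, the sample values, and the function name `ensemble_diagnostics` are assumptions for the example; the abstract does not give the paper's exact definitions.

```python
import math
from collections import Counter

def ensemble_diagnostics(members, bin_width=1.0):
    """Return variance spread, mode population, and entropy of a binned ensemble."""
    n = len(members)
    mean = sum(members) / n
    variance = sum((m - mean) ** 2 for m in members) / n
    # Assign each member to a bin of the given width (assumed discretization).
    bins = Counter(math.floor(m / bin_width) for m in members)
    mode_population = max(bins.values())  # members in the most populated bin
    entropy = -sum((c / n) * math.log(c / n) for c in bins.values())
    return {"variance": variance,
            "mode_population": mode_population,
            "entropy": entropy}

# Hypothetical 8-member temperature ensembles (degrees Celsius).
sharp = [12.1, 12.3, 12.4, 12.2, 12.6, 12.5, 12.8, 12.0]
broad = [8.2, 10.5, 12.1, 13.7, 15.3, 9.9, 16.4, 11.0]

sharp_diag = ensemble_diagnostics(sharp)
broad_diag = ensemble_diagnostics(broad)
print(sharp_diag)
print(broad_diag)
```

    A highly populated mode and a low entropy both indicate that the members agree, which is the situation in which a forecast would be expected to be more skilful.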

  1. Ensemble-based Regional Climate Prediction: Political Impacts

    Science.gov (United States)

    Miguel, E.; Dykema, J.; Satyanath, S.; Anderson, J. G.

    2008-12-01

    Accurate forecasts of regional climate, including temperature and precipitation, have significant implications for human activities, not just economically but socially. Sub Saharan Africa is a region that has displayed an exceptional propensity for devastating civil wars. Recent research in political economy has revealed a strong statistical relationship between year-to-year fluctuations in precipitation and civil conflict in this region in the 1980s and 1990s. Investigating how climate change may modify the regional risk of civil conflict in the future requires a probabilistic regional forecast that explicitly accounts for the community's uncertainty in the evolution of rainfall under anthropogenic forcing. We approach the regional climate prediction aspect of this question through the application of a recently demonstrated method called generalized scalar prediction (Leroy et al. 2009), which predicts arbitrary scalar quantities of the climate system. This prediction method can predict change in any variable or linear combination of variables of the climate system averaged over a wide range of spatial scales, from regional to hemispheric to global. Generalized scalar prediction utilizes an ensemble of model predictions to represent the community's uncertainty range in climate modeling in combination with a time series of any type of observational data that exhibits sensitivity to the scalar of interest. It is not necessary to prioritize models in deriving the final prediction. We present the results of the application of generalized scalar prediction for regional forecasts of temperature and precipitation in Sub Saharan Africa. We utilize the climate predictions along with the established statistical relationship between year-to-year rainfall variability in Sub Saharan Africa to investigate the potential impact of climate change on civil conflict within that region.

  2. Verification of an ensemble prediction system for storm surge forecast in the Adriatic Sea

    Science.gov (United States)

    Mel, Riccardo; Lionello, Piero

    2014-12-01

    In the Adriatic Sea, storm surges present a significant threat to Venice and to the flat coastal areas of the northern coast of the basin. Sea level forecast is of paramount importance for the management of daily activities and for operating the movable barriers that are presently being built for the protection of the city. In this paper, an EPS (ensemble prediction system) for operational forecasting of storm surge in the northern Adriatic Sea is presented and applied to a 3-month-long period (October-December 2010). The sea level EPS is based on the HYPSE (hydrostatic Padua Sea elevation) model, which is a standard single-layer nonlinear shallow water model, whose forcings (mean sea level pressure and surface wind fields) are provided by the ensemble members of the ECMWF (European Center for Medium-Range Weather Forecasts) EPS. Results are verified against observations at five tide gauges located along the Croatian and Italian coasts of the Adriatic Sea. Forecast uncertainty increases with the predicted value of the storm surge and with the forecast lead time. The EMF (ensemble mean forecast) provided by the EPS has a rms (root mean square) error lower than the DF (deterministic forecast), especially for short (up to 3 days) lead times. Uncertainty for short lead times of the forecast and for small storm surges is mainly caused by uncertainty of the initial condition of the hydrodynamical model. Uncertainty for large lead times and large storm surges is mainly caused by uncertainty in the meteorological forcings. The EPS spread increases with the rms error of the forecast. For large lead times the EPS spread and the forecast error substantially coincide. However, the EPS spread in this study, which does not account for uncertainty in the initial condition, underestimates the error during the early part of the forecast and for small storm surge values. On the contrary, it overestimates the rms error for large surge values. 
The PF (probability forecast) of the EPS

  3. Ensemble atmospheric dispersion calculations for decision support systems

    International Nuclear Information System (INIS)

    Borysiewicz, M.; Potempski, S.; Galkowski, A.; Zelazny, R.

    2003-01-01

    This document describes two approaches to long-range atmospheric dispersion of pollutants based on the ensemble concept. In the first part of the report, some experiences related to the exercises undertaken under the ENSEMBLE project of the European Union are presented. The second part is devoted to the implementation of the mesoscale numerical prediction model RAMS and the atmospheric dispersion model HYPACT on a Beowulf cluster and their use for ensemble forecasting and long-range atmospheric ensemble dispersion calculations based on available meteorological data from NCEO, NOAA (USA). (author)

  4. Benchmarking ensemble streamflow prediction skill in the UK

    Science.gov (United States)

    Harrigan, Shaun; Prudhomme, Christel; Parry, Simon; Smith, Katie; Tanguy, Maliko

    2018-03-01

    ; correlation between catchment base flow index (BFI) and ESP skill was very strong (Spearman's rank correlation coefficient = 0.90 at 1-month lead time). This was in contrast to the more highly responsive catchments in the north and west which were generally not skilful at seasonal lead times. Overall, this work provides scientific justification for when and where use of such a relatively simple forecasting approach is appropriate in the UK. This study, furthermore, creates a low cost benchmark against which potential skill improvements from more sophisticated hydro-meteorological ensemble prediction systems can be judged.

  5. Flood Forecasting Based on TIGGE Precipitation Ensemble Forecast

    Directory of Open Access Journals (Sweden)

    Jinyin Ye

    2016-01-01

    TIGGE (THORPEX International Grand Global Ensemble) was a major part of the THORPEX (Observing System Research and Predictability Experiment). It integrates ensemble precipitation products from all the major forecast centers in the world and provides systematic evaluation of the multimodel ensemble prediction system. Development of a meteorologic-hydrologic coupled flood forecasting model and early warning model based on the TIGGE precipitation ensemble forecast can provide flood probability forecasts, extend the lead time of the flood forecast, and gain more time for decision-makers to make the right decision. In this study, precipitation ensemble forecast products from ECMWF, NCEP, and CMA are used to drive the distributed hydrologic model TOPX. We focus on the Yi River catchment and aim to build a flood forecast and early warning system. The results show that the meteorologic-hydrologic coupled model can satisfactorily predict the flow process of four flood events. The predicted occurrence time of peak discharges is close to the observations. However, the magnitude of the peak discharges is significantly different due to the varying performance of the ensemble prediction systems. The coupled forecasting model can accurately predict the occurrence of the peak time and the corresponding risk probability of peak discharge based on the probability distribution of peak time and flood warning, which can provide users a strong theoretical foundation and valuable information as a promising new approach.

  6. Improving Robustness of Hydrologic Ensemble Predictions Through Probabilistic Pre- and Post-Processing in Sequential Data Assimilation

    Science.gov (United States)

    Wang, S.; Ancell, B. C.; Huang, G. H.; Baetz, B. W.

    2018-03-01

    Data assimilation using the ensemble Kalman filter (EnKF) has been increasingly recognized as a promising tool for probabilistic hydrologic predictions. However, little effort has been made to conduct the pre- and post-processing of assimilation experiments, posing a significant challenge in achieving the best performance of hydrologic predictions. This paper presents a unified data assimilation framework for improving the robustness of hydrologic ensemble predictions. Statistical pre-processing of assimilation experiments is conducted through the factorial design and analysis to identify the best EnKF settings with maximized performance. After the data assimilation operation, statistical post-processing analysis is also performed through the factorial polynomial chaos expansion to efficiently address uncertainties in hydrologic predictions, as well as to explicitly reveal potential interactions among model parameters and their contributions to the predictive accuracy. In addition, the Gaussian anamorphosis is used to establish a seamless bridge between data assimilation and uncertainty quantification of hydrologic predictions. Both synthetic and real data assimilation experiments are carried out to demonstrate feasibility and applicability of the proposed methodology in the Guadalupe River basin, Texas. Results suggest that statistical pre- and post-processing of data assimilation experiments provide meaningful insights into the dynamic behavior of hydrologic systems and enhance robustness of hydrologic ensemble predictions.

  7. A deep learning-based multi-model ensemble method for cancer prediction.

    Science.gov (United States)

    Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong

    2018-01-01

    Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. A Prediction Method of Airport Noise Based on Hybrid Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Tao XU

    2014-05-01

    Using historical monitoring data to build and train a prediction model for airport noise has become a common method in recent years. However, single models built in different ways perform differently in storage, efficiency and accuracy. In order to predict the noise accurately in complex environments around an airport, this paper presents a prediction method based on hybrid ensemble learning. The proposed method ensembles three algorithms: an artificial neural network as an active learner, nearest neighbor as a passive learner and nonlinear regression as a synthesized learner. The experimental results show that the three learners can meet forecast demands in online, near-line and off-line settings, respectively, and that the accuracy of prediction is improved by integrating these three learners’ results.

  9. A Diagnostics Tool to detect ensemble forecast system anomaly and guide operational decisions

    Science.gov (United States)

    Park, G. H.; Srivastava, A.; Shrestha, E.; Thiemann, M.; Day, G. N.; Draijer, S.

    2017-12-01

    The hydrologic community is moving toward using ensemble forecasts to take uncertainty into account during the decision-making process. The New York City Department of Environmental Protection (DEP) implements several types of ensemble forecasts in their decision-making process: ensemble products for a statistical model (Hirsch and enhanced Hirsch); the National Weather Service (NWS) Advanced Hydrologic Prediction Service (AHPS) forecasts based on the classical Ensemble Streamflow Prediction (ESP) technique; and the new NWS Hydrologic Ensemble Forecasting Service (HEFS) forecasts. To remove structural error and apply the forecasts to additional forecast points, the DEP post processes both the AHPS and the HEFS forecasts. These ensemble forecasts provide mass quantities of complex data, and drawing conclusions from these forecasts is time-consuming and difficult. The complexity of these forecasts also makes it difficult to identify system failures resulting from poor data, missing forecasts, and server breakdowns. To address these issues, we developed a diagnostic tool that summarizes ensemble forecasts and provides additional information such as historical forecast statistics, forecast skill, and model forcing statistics. This additional information highlights the key information that enables operators to evaluate the forecast in real-time, dynamically interact with the data, and review additional statistics, if needed, to make better decisions. We used Bokeh, a Python interactive visualization library, and a multi-database management system to create this interactive tool. This tool compiles and stores data into HTML pages that allows operators to readily analyze the data with built-in user interaction features. This paper will present a brief description of the ensemble forecasts, forecast verification results, and the intended applications for the diagnostic tool.

  10. On Ensemble Nonlinear Kalman Filtering with Symmetric Analysis Ensembles

    KAUST Repository

    Luo, Xiaodong

    2010-09-19

    The ensemble square root filter (EnSRF) [1, 2, 3, 4] is a popular method for data assimilation in high dimensional systems (e.g., geophysics models). Essentially the EnSRF is a Monte Carlo implementation of the conventional Kalman filter (KF) [5, 6]. It differs from the KF mainly at the prediction steps, where ensembles of the system state, rather than its means and covariance matrices, are propagated forward. In doing this, the EnSRF is computationally more efficient than the KF, since propagating a covariance matrix forward in high dimensional systems is prohibitively expensive. In addition, the EnSRF is also very convenient to implement. By propagating the ensembles of the system state, the EnSRF can be directly applied to nonlinear systems without any change in comparison to the assimilation procedures in linear systems. However, by adopting the Monte Carlo method, the EnSRF also incurs certain sampling errors. One way to alleviate this problem is to introduce certain symmetry to the ensembles, which can reduce the sampling errors and spurious modes in evaluation of the means and covariances of the ensembles [7]. In this contribution, we present two methods to produce symmetric ensembles. One is based on the unscented transform [8, 9], which leads to the unscented Kalman filter (UKF) [8, 9] and its variant, the ensemble unscented Kalman filter (EnUKF) [7]. The other is based on Stirling’s interpolation formula (SIF), which results in the divided difference filter (DDF) [10]. Here we propose a simplified divided difference filter (sDDF) in the context of ensemble filtering. The similarity and difference between the sDDF and the EnUKF will be discussed. Numerical experiments will also be conducted to investigate the performance of the sDDF and the EnUKF, and compare them to a well‐established EnSRF, the ensemble transform Kalman filter (ETKF) [2].
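
    The benefit of symmetry mentioned in this abstract can be illustrated with a scalar toy example: if every perturbation is paired with its mirror image about the prescribed mean, the sample mean is recovered without sampling error and odd moments cancel. The sketch below is not the unscented transform, the sDDF, or any filter from the paper; the function names and the scalar Gaussian setting are assumptions for illustration.

```python
import random

def plain_ensemble(mean, std, size, rng):
    """Ordinary Monte Carlo ensemble: independent Gaussian draws."""
    return [rng.gauss(mean, std) for _ in range(size)]

def symmetric_ensemble(mean, std, n_pairs, rng):
    """Ensemble built from mirrored pairs mean + d and mean - d."""
    members = []
    for _ in range(n_pairs):
        d = rng.gauss(0.0, std)
        members.extend([mean + d, mean - d])
    return members

def sample_mean(xs):
    return sum(xs) / len(xs)

def third_moment_about(xs, center):
    return sum((x - center) ** 3 for x in xs) / len(xs)

rng = random.Random(42)
true_mean, true_std = 2.0, 1.0
plain = plain_ensemble(true_mean, true_std, 20, rng)
symmetric = symmetric_ensemble(true_mean, true_std, 10, rng)

print("plain mean error:    ", abs(sample_mean(plain) - true_mean))
print("symmetric mean error:", abs(sample_mean(symmetric) - true_mean))
print("symmetric 3rd moment:", third_moment_about(symmetric, true_mean))
```

    Both ensembles have 20 members, so any difference in the mean error comes from the pairing alone and not from ensemble size.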

  11. Probabilistic Predictions of PM2.5 Using a Novel Ensemble Design for the NAQFC

    Science.gov (United States)

    Kumar, R.; Lee, J. A.; Delle Monache, L.; Alessandrini, S.; Lee, P.

    2017-12-01

    Poor air quality (AQ) in the U.S. is estimated to cause about 60,000 premature deaths with costs of $100B-150B annually. To reduce such losses, the National AQ Forecasting Capability (NAQFC) at the National Oceanic and Atmospheric Administration (NOAA) produces forecasts of ozone, particulate matter less than 2.5 μm in diameter (PM2.5), and other pollutants so that advance notice and warning can be issued to help individuals and communities limit exposure and reduce air pollution-caused health problems. The current NAQFC, based on the U.S. Environmental Protection Agency Community Multi-scale AQ (CMAQ) modeling system, provides only deterministic AQ forecasts and does not quantify the uncertainty associated with the predictions, which could be large due to the chaotic nature of the atmosphere and nonlinearity in atmospheric chemistry. This project aims to take NAQFC a step further in the direction of probabilistic AQ prediction by exploring and quantifying the potential value of ensemble predictions of PM2.5, and perturbing three key aspects of PM2.5 modeling: the meteorology, emissions, and CMAQ secondary organic aerosol formulation. This presentation focuses on the impact of meteorological variability, which is represented by three members of NOAA's Short-Range Ensemble Forecast (SREF) system that were down-selected by hierarchical cluster analysis. These three SREF members provide the physics configurations and initial/boundary conditions for the Weather Research and Forecasting (WRF) model runs that generate required output variables for driving CMAQ that are missing in operational SREF output. We conducted WRF runs for Jan, Apr, Jul, and Oct 2016 to capture seasonal changes in meteorology. Estimated emissions of trace gases and aerosols via the Sparse Matrix Operator Kernel (SMOKE) system were developed using the WRF output. WRF and SMOKE output drive a 3-member CMAQ mini-ensemble of once-daily, 48-h PM2.5 forecasts for the same four months.
The CMAQ mini-ensemble

  12. Ensemble-based prediction of RNA secondary structures.

    Science.gov (United States)

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role in the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  13. Prediction of Coal Face Gas Concentration by Multi-Scale Selective Ensemble Hybrid Modeling

    Directory of Open Access Journals (Sweden)

    WU Xiang

    2014-06-01

    A selective ensemble hybrid modeling prediction method based on wavelet transformation is proposed to improve the fitting and generalization capability of existing prediction models of coal face gas concentration, which has strong stochastic volatility. The Mallat algorithm was employed for the multi-scale decomposition and single-scale reconstruction of the gas concentration time series. Each subsequence was then predicted by sparsely weighted multiple unstable ELM (extreme learning machine) predictors within the SERELM (sparse ensemble regressors of ELM) method. Finally, the predicted values of these models were superimposed to obtain the predicted values of the original sequence. The proposed method takes advantage of the multi-scale analysis of the wavelet transformation, the accuracy and speed of ELM prediction, and the generalization ability of the L1-regularized selective ensemble learning method. The results show that the forecast accuracy increases substantially with the proposed method. The average relative error is 0.65%, the maximum relative error is 4.16% and the probability of a relative error less than 1% reaches 0.785.
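
    The decompose, reconstruct, and superimpose structure described in this abstract can be sketched with the simplest wavelet. The example below uses an unnormalized Haar wavelet, which is an assumption (the abstract does not name the wavelet), and it omits the ELM predictors entirely; the sample values and function names are hypothetical. It only shows that the single-scale reconstructed subsequences add back up to the original series, which is what makes superimposing per-subsequence predictions meaningful.

```python
def haar_step(values):
    """One decomposition level: pairwise averages and half-differences."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail

def repeat_each(values):
    """Upsample an approximation by repeating every value twice."""
    return [v for v in values for _ in range(2)]

def spread_detail(detail):
    """Upsample a detail sequence as +d, -d for each coefficient."""
    out = []
    for d in detail:
        out.extend([d, -d])
    return out

def single_scale_components(signal, levels):
    """Return full-length detail components (fine to coarse) plus the trend."""
    components = []
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        component = spread_detail(detail)
        for _ in range(level - 1):
            component = repeat_each(component)
        components.append(component)
    trend = approx
    for _ in range(levels):
        trend = repeat_each(trend)
    components.append(trend)
    return components

# Hypothetical gas concentration readings; length must be divisible by 2**levels.
gas = [0.42, 0.45, 0.51, 0.48, 0.60, 0.58, 0.47, 0.44]
components = single_scale_components(gas, levels=2)
rebuilt = [sum(values) for values in zip(*components)]
print(rebuilt)
```

    In the scheme the abstract describes, each component would be forecast by its own predictor and the forecasts summed in the same way.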

  14. Impacts of calibration strategies and ensemble methods on ensemble flood forecasting over Lanjiang basin, Southeast China

    Science.gov (United States)

    Liu, Li; Xu, Yue-Ping

    2017-04-01

    Ensemble flood forecasting driven by numerical weather prediction products is becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system based on the Variable Infiltration Capacity (VIC) model and quantitative precipitation forecasts from the TIGGE dataset is constructed for Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by a parallel-programmed ɛ-NSGAII multi-objective algorithm, and two separately parameterized models are determined to simulate daily flows and peak flows, coupled with a modular approach. The results indicate that the ɛ-NSGAII algorithm permits more efficient optimization and rational determination of parameter settings. It is demonstrated that the multimodel ensemble streamflow mean has better skill than the best single-model ensemble mean (ECMWF) and that the multimodel ensembles weighted on members and skill scores outperform other multimodel ensembles. For a typical flood event, it is shown that the flood can be predicted 3-4 days in advance, but the flows in the rising limb can be captured only 1-2 days ahead due to the flash feature. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from either single-model or multimodel ensembles generally underestimate the peaks, as the extreme values are smoothed out by the ensemble process.
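
    The final point of this abstract, that ensemble means smooth out extreme values, follows directly from averaging members whose peaks do not coincide in time. A minimal sketch with made-up hydrographs (the flow values and the name `ensemble_mean` are assumptions, not data from the study):

```python
def ensemble_mean(members):
    """Time-step-wise mean across ensemble members of equal length."""
    return [sum(step) / len(step) for step in zip(*members)]

# Hypothetical daily flows (m^3/s) from three members with differently timed peaks.
members = [
    [10.0, 50.0, 20.0, 10.0],
    [10.0, 20.0, 50.0, 10.0],
    [10.0, 15.0, 25.0, 48.0],
]

mean_flow = ensemble_mean(members)
member_peaks = [max(m) for m in members]

print("member peaks:      ", member_peaks)
print("ensemble-mean peak:", max(mean_flow))
```

    Here every member peaks near 50, yet the peak of the mean hydrograph is far lower, because the members disagree on timing rather than on magnitude.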

  15. An Integrated Ensemble-Based Operational Framework to Predict Urban Flooding: A Case Study of Hurricane Sandy in the Passaic and Hackensack River Basins

    Science.gov (United States)

    Saleh, F.; Ramaswamy, V.; Georgas, N.; Blumberg, A. F.; Wang, Y.

    2016-12-01

    Advances in computational resources and modeling techniques are opening the path to effectively integrate existing complex models. In the context of flood prediction, recent extreme events have demonstrated the importance of integrating components of the hydrosystem to better represent the interactions amongst different physical processes and phenomena. As such, there is a pressing need to develop holistic and cross-disciplinary modeling frameworks that effectively integrate existing models and better represent the operative dynamics. This work presents a novel Hydrologic-Hydraulic-Hydrodynamic Ensemble (H3E) flood prediction framework that operationally integrates existing predictive models representing coastal (New York Harbor Observing and Prediction System, NYHOPS), hydrologic (US Army Corps of Engineers Hydrologic Modeling System, HEC-HMS) and hydraulic (2-dimensional River Analysis System, HEC-RAS) components. The state-of-the-art framework is forced with 125 ensemble meteorological inputs from numerical weather prediction models including the Global Ensemble Forecast System, the European Centre for Medium-Range Weather Forecasts (ECMWF), the Canadian Meteorological Centre (CMC), the Short Range Ensemble Forecast (SREF) and the North American Mesoscale Forecast System (NAM). The framework produces, within a 96-hour forecast horizon, on-the-fly Google Earth flood maps that provide critical information for decision makers and emergency preparedness managers. The utility of the framework was demonstrated by retrospectively forecasting an extreme flood event, hurricane Sandy in the Passaic and Hackensack watersheds (New Jersey, USA). Hurricane Sandy caused significant damage to a number of critical facilities in this area including the New Jersey Transit's main storage and maintenance facility. The results of this work demonstrate that ensemble based frameworks provide improved flood predictions and useful information about associated uncertainties, thus

  16. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods.

    Science.gov (United States)

    Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng

    2017-05-18

    Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models ( http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/ ).

  17. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Hotspot residues are important in determining protein-protein interactions, and they perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, seldom from whole protein sequences. Therefore, determining hotspots from whole protein sequences using sequence information alone is an urgent need. To address the issue of hotspot prediction from whole protein sequences, we propose an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique is adopted to project the encoding instances into a reduced space. Next, several better random projections are obtained by training an IBk classifier on the training dataset and are then applied to the test dataset. The ensemble of random projection classifiers is thus obtained. Experimental results show that although the performance of our method is not yet good enough for real applications of hotspots, it is very promising for the determination of hotspot residues from whole sequences.

  18. Multiple-Swarm Ensembles: Improving the Predictive Power and Robustness of Predictive Models and Its Use in Computational Biology.

    Science.gov (United States)

    Alves, Pedro; Liu, Shuang; Wang, Daifeng; Gerstein, Mark

    2018-01-01

    Machine learning is an integral part of computational biology, and has already shown its use in various applications, such as prognostic tests. In the last few years in the non-biological machine learning community, ensembling techniques have shown their power in data mining competitions such as the Netflix challenge; however, such methods have not found wide use in computational biology. In this work, we endeavor to show how ensembling techniques can be applied to practical problems, including problems in the field of bioinformatics, and how they often outperform other machine learning techniques in both predictive power and robustness. Furthermore, we develop a methodology of ensembling, Multi-Swarm Ensemble (MSWE) by using multiple particle swarm optimizations and demonstrate its ability to further enhance the performance of ensembles.

  19. Revisiting the synoptic-scale predictability of severe European winter storms using ECMWF ensemble reforecasts

    Directory of Open Access Journals (Sweden)

    F. Pantillon

    2017-10-01

    New insights into the synoptic-scale predictability of 25 severe European winter storms of the 1995–2015 period are obtained using the homogeneous ensemble reforecast dataset from the European Centre for Medium-Range Weather Forecasts. The predictability of the storms is assessed with different metrics including (a) the track and intensity to investigate the storms' dynamics and (b) the Storm Severity Index to estimate the impact of the associated wind gusts. The storms are well predicted by the whole ensemble up to 2–4 days ahead. At longer lead times, the number of members predicting the observed storms decreases and the ensemble average is not clearly defined for the track and intensity. The Extreme Forecast Index and Shift of Tails are therefore computed from the deviation of the ensemble from the model climate. Based on these indices, the model has some skill in forecasting the area covered by extreme wind gusts up to 10 days, which indicates a clear potential for early warnings. However, large variability is found between the individual storms. The poor predictability of outliers appears related to their physical characteristics such as explosive intensification or small size. Longer datasets with more cases would be needed to further substantiate these points.

  20. Enhancing Predictive Accuracy of Cardiac Autonomic Neuropathy Using Blood Biochemistry Features and Iterative Multitier Ensembles.

    Science.gov (United States)

    Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F

    2016-01-01

    Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry for improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% has been obtained by the AIME combining decorate on top tier with bagging on middle tier based on random forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.

  1. Visualizing uncertainties in a storm surge ensemble data assimilation and forecasting system

    KAUST Repository

    Hollt, Thomas

    2015-01-15

    We present a novel integrated visualization system that enables the interactive visual analysis of ensemble simulations and estimates of the sea surface height and other model variables that are used for storm surge prediction. Coastal inundation, caused by hurricanes and tropical storms, poses large risks for today's societies. High-fidelity numerical models of water levels driven by hurricane-force winds are required to predict these events, posing a challenging computational problem, and even though computational models continue to improve, uncertainties in storm surge forecasts are inevitable. Today, this uncertainty is often exposed to the user by running the simulation many times with different parameters or inputs following a Monte-Carlo framework in which uncertainties are represented as stochastic quantities. This results in multidimensional, multivariate and multivalued data, so-called ensemble data. While the resulting datasets are very comprehensive, they are also huge in size and thus hard to visualize and interpret. In this paper, we tackle this problem by means of an interactive and integrated visual analysis system. By harnessing the power of modern graphics processing units for visualization as well as computation, our system allows the user to browse through the simulation ensembles in real time, view specific parameter settings or simulation models and move between different spatial and temporal regions without delay. In addition, our system provides advanced visualizations to highlight the uncertainty or show the complete distribution of the simulations at user-defined positions over the complete time series of the prediction. We highlight the benefits of our system by presenting its application in a real-world scenario using a simulation of Hurricane Ike.

  2. Ensemble learned vaccination uptake prediction using web search queries

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official...... vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields comparable performance. To our knowledge, this is the first study to predict vaccination uptake...

  3. An ensemble method to predict target genes and pathways in uveal melanoma

    Directory of Open Access Journals (Sweden)

    Wei Chao

    2018-04-01

    This work proposes to predict target genes and pathways for uveal melanoma (UM) based on an ensemble method and pathway analyses. Methods: The ensemble method integrated a correlation method (Pearson correlation coefficient, PCC), a causal inference method (IDA) and a regression method (Lasso) using the Borda count election method. Subsequently, to validate the performance of the PIL method, the predicted miRNA targets were compared with a database of confirmed targets. Finally, pathway enrichment analysis was conducted on target genes in the top 1,000 miRNA-mRNA interactions to identify target pathways for UM patients. Results: Thirty-eight of the predicted interactions matched confirmed interactions, indicating that the ensemble method was a suitable and feasible approach to predict miRNA targets. We obtained 50 seed miRNA-mRNA interactions of UM patients and extracted target genes from these interactions, such as ASPG, BSDC1 and C4BP. The 601 target genes in the top 1,000 miRNA-mRNA interactions were enriched in 12 target pathways, of which Phototransduction was the most significant one. Conclusion: The target genes and pathways might provide a new way to reveal the molecular mechanism of UM and assist in targeted treatment and prevention of this malignant tumor.
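
    As an illustration of the Borda count step mentioned in this abstract, the following is a minimal sketch of rank aggregation, not the authors' implementation. The function name `borda_count`, the toy labels `pair_A` to `pair_D` and the three example rankings are all hypothetical; the sketch assumes every method ranks the same set of candidate interactions, best first.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate several rankings (best item first) with the Borda count.

    In a ranking of n items, the top item earns n-1 points, the next n-2,
    and so on down to 0. Items are then ordered by their total points.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            points[item] += n - 1 - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(points, key=lambda item: (-points[item], item))


# Hypothetical rankings of four candidate interactions by three methods.
pcc_rank = ["pair_A", "pair_B", "pair_C", "pair_D"]
ida_rank = ["pair_B", "pair_A", "pair_D", "pair_C"]
lasso_rank = ["pair_B", "pair_C", "pair_A", "pair_D"]

consensus = borda_count([pcc_rank, ida_rank, lasso_rank])
print(consensus)
```

    In this toy case `pair_B` collects the most points (8) and leads the consensus ranking even though one method placed it second.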

  4. ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers

    Directory of Open Access Journals (Sweden)

    Sangeeta Lal

    2017-01-01

    Background: Software developers insert log statements in the source code to record program execution information. However, optimizing the number of log statements in the source code is challenging. Machine learning based within-project logging prediction tools, proposed in previous studies, may not be suitable for new or small software projects. For such software projects, we can use cross-project logging prediction. Aim: The aim of the study presented here is to investigate cross-project logging prediction methods and techniques. Method: The proposed method is ECLogger, which is a novel, ensemble-based, cross-project, catch-block logging prediction model. Nine base classifiers were used and combined using ensemble techniques. The performance of ECLogger was evaluated on three open-source Java projects: Tomcat, CloudStack and Hadoop. Results: ECLogger Bagging, ECLogger AverageVote, and ECLogger MajorityVote show a considerable improvement in the average Logged F-measure (LF) on 3, 5, and 4 source -> target project pairs, respectively, compared to the baseline classifiers. ECLogger AverageVote performs best and shows improvements of 3.12% (average LF) and 6.08% (average accuracy, ACC). Conclusion: Classifiers based on ensemble techniques such as bagging, average vote, and majority vote outperform the baseline classifier. Overall, the ECLogger AverageVote model performs best. The results show that the CloudStack project is more generalizable than the other projects.
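
    The difference between the majority-vote and average-vote combiners named in this abstract can be sketched as follows. This is a generic illustration, not ECLogger's code: the function names, the 0.5 threshold and the toy probabilities from three hypothetical base classifiers are assumptions.

```python
def majority_vote(votes):
    """votes: one list of 0/1 labels per classifier, one label per sample.

    A sample is labelled 1 when more than half of the classifiers vote 1.
    """
    n_classifiers = len(votes)
    return [int(2 * sum(column) > n_classifiers) for column in zip(*votes)]


def average_vote(probabilities, threshold=0.5):
    """probabilities: one list of P(label = 1) per classifier.

    A sample is labelled 1 when the mean probability reaches the threshold.
    """
    n_classifiers = len(probabilities)
    return [
        int(sum(column) / n_classifiers >= threshold)
        for column in zip(*probabilities)
    ]


# Hypothetical P("catch block should be logged") for three samples.
probabilities = [
    [0.90, 0.40, 0.20],  # classifier 1
    [0.60, 0.45, 0.10],  # classifier 2
    [0.20, 0.95, 0.70],  # classifier 3
]
# Hard 0/1 votes obtained by thresholding each classifier separately.
hard_votes = [[int(p >= 0.5) for p in row] for row in probabilities]

print(majority_vote(hard_votes))
print(average_vote(probabilities))
```

    The two combiners disagree on the second sample: two classifiers vote 0 with low confidence, while one confident classifier pulls the mean probability above the threshold.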

  5. Ensemble Methods

    Science.gov (United States)

    Re, Matteo; Valentini, Giorgio

    2012-03-01

    Ensemble methods are statistical and computational learning procedures reminiscent of the human social learning behavior of seeking several opinions before making any crucial decision. The idea of combining the opinions of different "experts" to obtain an overall “ensemble” decision has been rooted in our culture since at least the classical age of ancient Greece, and it was formalized during the Enlightenment with the Condorcet Jury Theorem [45], which proved that the judgment of a committee is superior to those of individuals, provided the individuals have reasonable competence. Ensembles are sets of learning machines that combine in some way their decisions, or their learning algorithms, or different views of data, or other specific characteristics to obtain more reliable and more accurate predictions in supervised and unsupervised learning problems [48,116]. A simple example is represented by the majority vote ensemble, by which the decisions of different learning machines are combined, and the class that receives the majority of “votes” (i.e., the class predicted by the majority of the learning machines) is the class predicted by the overall ensemble [158]. In the literature, a plethora of terms other than ensembles has been used, such as fusion, combination, aggregation, and committee, to indicate sets of learning machines that work together to solve a machine learning problem [19,40,56,66,99,108,123], but in this chapter we maintain the term ensemble in its widest meaning, in order to include the whole range of combination methods. Nowadays, ensemble methods represent one of the main current research lines in machine learning [48,116], and the interest of the research community in ensemble methods is witnessed by conferences and workshops specifically devoted to ensembles, first of all the multiple classifier systems (MCS) conference organized by Roli, Kittler, Windeatt, and other researchers of this area [14,62,85,149,173]. Several theories have been
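
    The Condorcet Jury Theorem cited above can be checked numerically. The sketch below assumes an odd number of voters who decide independently and are each correct with the same probability p; the function name `majority_accuracy` is hypothetical. Under these assumptions the probability that a strict majority is correct is a binomial tail sum.

```python
from math import comb


def majority_accuracy(n_voters, p):
    """Probability that a strict majority of n independent voters is correct.

    Assumes each voter is right with the same probability p and that
    n_voters is odd, so ties cannot occur.
    """
    smallest_majority = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p**k * (1 - p) ** (n_voters - k)
        for k in range(smallest_majority, n_voters + 1)
    )


for n in (1, 3, 11, 51):
    print(n, round(majority_accuracy(n, 0.6), 4))
```

    With p = 0.6 the committee accuracy rises as members are added (0.648 for three voters), whereas for p below 0.5 the same formula shows the committee doing worse than an individual, which is the "reasonable competence" condition in the text.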

  6. Predicting lymphatic filariasis transmission and elimination dynamics using a multi-model ensemble framework

    Directory of Open Access Journals (Sweden)

    Morgan E. Smith

    2017-03-01

    Mathematical models of parasite transmission provide powerful tools for assessing the impacts of interventions. Owing to complexity and uncertainty, no single model may capture all features of transmission and elimination dynamics. Multi-model ensemble modelling offers a framework to help overcome biases of single models. We report on the development of a first multi-model ensemble of three lymphatic filariasis (LF) models (EPIFIL, LYMFASIM, and TRANSFIL), and evaluate its predictive performance in comparison with that of the constituents using calibration and validation data from three case study sites, one each from the three major LF endemic regions: Africa, Southeast Asia and Papua New Guinea (PNG). We assessed the performance of the respective models for predicting the outcomes of annual MDA strategies for various baseline scenarios thought to exemplify the current endemic conditions in the three regions. The results show that the constructed multi-model ensemble outperformed the single models when evaluated across all sites. Single models that best fitted calibration data tended to do less well in simulating the out-of-sample, or validation, intervention data. Scenario modelling results demonstrate that the multi-model ensemble is able to compensate for variance between single models in order to produce more plausible predictions of intervention impacts. Our results highlight the value of an ensemble approach to modelling parasite control dynamics. However, its optimal use will require further methodological improvements as well as consideration of the organizational mechanisms required to ensure that modelling results and data are shared effectively between all stakeholders.

  7. Using synchronization in multi-model ensembles to improve prediction

    Science.gov (United States)

    Hiemstra, P.; Selten, F.

    2012-04-01

    In recent decades, many climate models have been developed to understand and predict the behavior of the Earth's climate system. Although these models are all based on the same basic physical principles, they still show different behavior. This is caused, for example, by the choice of how to parametrize sub-grid scale processes. One method to combine these imperfect models is to run a multi-model ensemble. The models are given identical initial conditions and are integrated forward in time. A multi-model estimate can for example be a weighted mean of the ensemble members. We propose to go a step further, and try to obtain synchronization between the imperfect models by connecting the multi-model ensemble, and exchanging information. The combined multi-model ensemble is also known as a supermodel. The supermodel has learned from observations how to optimally exchange information between the ensemble members. In this study we focused on the density and formulation of the connections within the supermodel. The main question was whether we could obtain synchronization between two climate models when connecting only a subset of their state spaces. Limiting the connected subspace has two advantages: 1) it limits the transfer of data (bytes) between the ensemble members, which can be a limiting factor in large scale climate models, and 2) learning the optimal connection strategy from observations is easier. To answer the research question, we connected two identical quasi-geostrophic (QG) atmospheric models to each other, where the models have different initial conditions. The QG model is a qualitatively realistic simulation of the winter flow on the Northern hemisphere, has three layers and uses a spectral implementation. We connected the models in the original spherical harmonic state space, and in linear combinations of these spherical harmonics, i.e. Empirical Orthogonal Functions (EOFs). We show that when connecting through spherical harmonics, we only need to connect 28% of

  8. Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing

    KAUST Repository

    Toye, Habib

    2017-05-26

    We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation and of the Data Assimilation Research Testbed (DART) for ensemble data assimilation. DART has been configured to integrate all members of an ensemble adjustment Kalman filter (EAKF) in parallel, based on which we adapted the ensemble operations in DART to use an invariant ensemble, i.e., an ensemble Optimal Interpolation (EnOI) algorithm. This approach requires only a single forward model integration in the forecast step and therefore saves substantial computational cost. To deal with the strong seasonal variability of the Red Sea, the EnOI ensemble is then seasonally selected from a climatology of long-term model outputs. Observations of remote sensing sea surface height (SSH) and sea surface temperature (SST) are assimilated every 3 days. Real-time atmospheric fields from the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) are used as forcing in different assimilation experiments. We investigate the behaviors of the EAKF and (seasonal-) EnOI and compare their performances for assimilating and forecasting the circulation of the Red Sea. We further assess the sensitivity of the assimilation system to various filtering parameters (ensemble size, inflation) and atmospheric forcing.
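
    To make the EnOI idea concrete, the sketch below applies one textbook-style analysis step, x_a = x_f + K (y - H x_f) with K = alpha B H^T (alpha H B H^T + R)^-1, where B is the sample covariance of a fixed (invariant) ensemble. It is not DART's or the authors' code. To stay dependency-free it assumes a single observation of one state component, so the matrix inverse reduces to a scalar division; the function name, the scaling factor `alpha` and the toy numbers are hypothetical.

```python
def enoi_update(forecast, static_ensemble, obs_value, obs_index, obs_var, alpha=1.0):
    """One EnOI analysis step for a single observed state component.

    forecast        : list of floats, the single forecast state x_f
    static_ensemble : list of member states (lists) taken from a climatology
    obs_value       : observed value y of component obs_index
    obs_var         : observation error variance R
    alpha           : scaling of the static covariance (a tuning assumption)
    """
    n_members = len(static_ensemble)
    n_state = len(forecast)
    means = [sum(m[i] for m in static_ensemble) / n_members for i in range(n_state)]
    anomalies = [[m[i] - means[i] for i in range(n_state)] for m in static_ensemble]
    # Column of the sample covariance B linking each component to the observed one.
    cov_with_obs = [
        sum(a[i] * a[obs_index] for a in anomalies) / (n_members - 1)
        for i in range(n_state)
    ]
    innovation = obs_value - forecast[obs_index]
    denominator = alpha * cov_with_obs[obs_index] + obs_var
    gain = [alpha * c / denominator for c in cov_with_obs]
    return [x + k * innovation for x, k in zip(forecast, gain)]


# Toy two-variable state; the static ensemble has perfectly correlated components.
static_ensemble = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
analysis = enoi_update(
    forecast=[2.0, 4.0],
    static_ensemble=static_ensemble,
    obs_value=3.0,
    obs_index=0,
    obs_var=1.0,
)
print(analysis)
```

    Only one forecast state is propagated and corrected; the ensemble serves purely to supply the covariance, which is why the unobserved second component is also adjusted.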

  9. Global Ensemble Forecast System (GEFS) [1 Deg.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...

  10. Sea surface temperature predictions using a multi-ocean analysis ensemble scheme

    Science.gov (United States)

    Zhang, Ying; Zhu, Jieshun; Li, Zhongxian; Chen, Haishan; Zeng, Gang

    2017-08-01

    This study examined the global sea surface temperature (SST) predictions by a so-called multiple-ocean analysis ensemble (MAE) initialization method which was applied in the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2). Different from most operational climate prediction practices which are initialized by a specific ocean analysis system, the MAE method is based on multiple ocean analyses. In the paper, the MAE method was first justified by analyzing the ocean temperature variability in four ocean analyses which all are/were applied for operational climate predictions either at the European Centre for Medium-range Weather Forecasts or at NCEP. It was found that these systems exhibit substantial uncertainties in estimating the ocean states, especially at the deep layers. Further, a set of MAE hindcasts was conducted based on the four ocean analyses with CFSv2, starting from each April during 1982-2007. The MAE hindcasts were verified against a subset of hindcasts from the NCEP CFS Reanalysis and Reforecast (CFSRR) Project. Comparisons suggested that MAE shows better SST predictions than CFSRR over most regions where ocean dynamics plays a vital role in SST evolutions, such as the El Niño and Atlantic Niño regions. Furthermore, significant improvements were also found in summer precipitation predictions over the equatorial eastern Pacific and Atlantic oceans, for which the local SST prediction improvements should be responsible. The prediction improvements by MAE imply a problem for most current climate predictions which are based on a specific ocean analysis system. That is, their predictions would drift towards states biased by errors inherent in their ocean initialization system, and thus have large prediction errors. In contrast, MAE arguably has an advantage by sampling such structural uncertainties, and could efficiently cancel these errors out in their predictions.

  11. Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life

    International Nuclear Information System (INIS)

    Hu Chao; Youn, Byeng D.; Wang Pingfeng; Taek Yoon, Joung

    2012-01-01

    Prognostics aims at determining whether a failure of an engineered system (e.g., a nuclear power plant) is impending and estimating the remaining useful life (RUL) before the failure occurs. The traditional data-driven prognostic approach is to construct multiple candidate algorithms using a training data set, evaluate their respective performance using a testing data set, and select the one with the best performance while discarding all the others. This approach has three shortcomings: (i) the selected standalone algorithm may not be robust; (ii) it wastes the resources for constructing the algorithms that are discarded; (iii) it requires the testing data in addition to the training data. To overcome these drawbacks, this paper proposes an ensemble data-driven prognostic approach which combines multiple member algorithms with a weighted-sum formulation. Three weighting schemes, namely the accuracy-based weighting, diversity-based weighting and optimization-based weighting, are proposed to determine the weights of member algorithms. The k-fold cross validation (CV) is employed to estimate the prediction error required by the weighting schemes. The results obtained from three case studies suggest that the ensemble approach with any weighting scheme gives more accurate RUL predictions compared to any sole algorithm when member algorithms producing diverse RUL predictions have comparable prediction accuracy and that the optimization-based weighting scheme gives the best overall performance among the three weighting schemes.
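
    A weighted-sum ensemble with accuracy-based weights can be sketched as follows. The abstract does not give the exact formula, so this example assumes one common form in which each weight is proportional to the inverse of the member's cross-validation error; the function names and all numbers are hypothetical.

```python
def accuracy_based_weights(cv_errors):
    """Weights proportional to 1 / CV error, normalised to sum to one.

    Assumes every error is strictly positive.
    """
    inverse_errors = [1.0 / error for error in cv_errors]
    total = sum(inverse_errors)
    return [value / total for value in inverse_errors]


def ensemble_rul(member_predictions, weights):
    """Weighted-sum combination of the members' RUL predictions."""
    return sum(w * p for w, p in zip(weights, member_predictions))


# Hypothetical k-fold CV errors of three member algorithms (same units as RUL).
cv_errors = [10.0, 20.0, 40.0]
weights = accuracy_based_weights(cv_errors)

# Hypothetical RUL predictions (in cycles) of the three members for one unit.
predicted_rul = ensemble_rul([100.0, 120.0, 90.0], weights)
print(weights, predicted_rul)
```

    The member with the smallest cross-validation error receives the largest weight (4/7 here), and no member is discarded outright.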

  12. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    Science.gov (United States)

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  13. Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods.

    Science.gov (United States)

    Notaro, Marco; Schubach, Max; Robinson, Peter N; Valentini, Giorgio

    2017-10-12

    The prediction of human gene-abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene-disease associations has been widely investigated, the related problem of gene-phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a "flat" learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
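
    The true path rule mentioned above requires that a gene annotated with a term is also annotated with all of that term's ancestors, so a term should never score higher than its parents. The sketch below shows one simple top-down correction of flat scores that enforces this; it is an illustration under stated assumptions, not the algorithms of the paper or of its R package. The toy ontology, scores and function name are hypothetical.

```python
def top_down_correction(flat_scores, parents, order):
    """Cap each term's score by the corrected scores of its parents.

    flat_scores : dict mapping term -> score in [0, 1] from a flat classifier
    parents     : dict mapping term -> list of parent terms (roots may be absent)
    order       : terms listed so that every parent precedes its children
    """
    corrected = {}
    for term in order:
        score = flat_scores[term]
        for parent in parents.get(term, []):
            score = min(score, corrected[parent])
        corrected[term] = score
    return corrected


# Toy DAG: "root" has children "a" and "b"; "c" is a child of both.
parents = {"a": ["root"], "b": ["root"], "c": ["a", "b"]}
flat_scores = {"root": 0.9, "a": 0.4, "b": 0.8, "c": 0.7}

corrected = top_down_correction(flat_scores, parents, ["root", "a", "b", "c"])
print(corrected)
```

    Term `c` is lowered from 0.7 to 0.4 because its parent `a` scores only 0.4; thresholding the corrected scores can then never select a term while rejecting one of its ancestors.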

  14. Wind power application research on the fusion of the determination and ensemble prediction

    Science.gov (United States)

    Lan, Shi; Lina, Xu; Yuzhu, Hao

    2017-07-01

    The fused product of wind speed for the wind farm is designed through the use of wind speed products of ensemble prediction from the European Centre for Medium-Range Weather Forecasts (ECMWF) and professional numerical model products on wind power based on Mesoscale Model5 (MM5) and Beijing Rapid Update Cycle (BJ-RUC), which are suitable for short-term wind power forecasting and electric dispatch. The single-valued forecast is formed by calculating the different ensemble statistics of the Bayesian probabilistic forecasting representing the uncertainty of ECMWF ensemble prediction. An autoregressive integrated moving average (ARIMA) model is used to improve the time resolution of the single-valued forecast, and the optimal wind speed forecasting curve and the confidence interval are then derived from Bayesian model averaging (BMA) and the deterministic numerical model prediction. The result shows that the fusion forecast clearly improves accuracy relative to the existing numerical forecasting products. Compared with the existing 0-24 h deterministic forecast in the validation period, the mean absolute error (MAE) is decreased by 24.3 % and the correlation coefficient (R) is increased by 12.5 %. In comparison with the ECMWF ensemble forecast, the MAE is reduced by 11.7 %, and R is increased by 14.5 %. Additionally, the MAE did not increase with increasing forecast lead time.

  15. HEPS4Power - Extended-range Hydrometeorological Ensemble Predictions for Improved Hydropower Operations and Revenues

    Science.gov (United States)

    Bogner, Konrad; Monhart, Samuel; Liniger, Mark; Spririg, Christoph; Jordan, Fred; Zappa, Massimiliano

    2015-04-01

    In recent years, substantial progress has been made in the operational prediction of floods and hydrological drought with up to ten days lead time. Both the public and the private sectors currently use probabilistic runoff forecasts to monitor water resources and take action when critical conditions are expected. The use of extended-range predictions with lead times exceeding 10 days is not yet established. The hydropower sector in particular might benefit greatly from using hydrometeorological forecasts for the next 15 to 60 days to optimize the operations and the revenues from their watersheds, dams, captions, turbines and pumps. The new Swiss Competence Centers in Energy Research (SCCER) aim at boosting research related to energy issues in Switzerland. The objective of HEPS4POWER is to demonstrate that operational extended-range hydrometeorological forecasts have the potential to become very valuable tools for fine tuning the production of energy from hydropower systems. The project team covers a specific system-oriented value chain starting from the collection and forecast of meteorological data (MeteoSwiss), leading to the operational application of state-of-the-art hydrological models (WSL) and terminating with the experience in data presentation and power production forecasts for end-users (e-dric.ch). The first task of HEPS4POWER will be the downscaling and post-processing of ensemble extended-range meteorological forecasts (EPS). The goal is to provide well-tailored probabilistic forecasts that are statistically reliable and localized at catchment or even station level. The hydrology-related task will consist of feeding the post-processed meteorological forecasts into a HEPS using a multi-model approach by implementing models of different complexity. Also in the case of the hydrological ensemble predictions, post-processing techniques need to be tested in order to improve the quality of the

  16. A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico

    KAUST Repository

    Hoteit, Ibrahim; Hoar, Timothy J.; Gopalakrishnan, Ganesh; Collins, Nancy S.; Anderson, Jeffrey L.; Cornuelle, Bruce D.; Köhl, Armin; Heimbach, Patrick

    2013-01-01

    Research Testbed (DART) assimilation package with the Massachusetts Institute of Technology ocean general circulation model (MITgcm). The MITgcm/DART system supports the assimilation of a wide range of ocean observations and uses an ensemble approach

  17. Assessing probabilistic predictions of ENSO phase and intensity from the North American Multimodel Ensemble

    Science.gov (United States)

    Tippett, Michael K.; Ranganathan, Meghana; L'Heureux, Michelle; Barnston, Anthony G.; DelSole, Timothy

    2017-05-01

    Here we examine the skill of three-, five-, and seven-category monthly ENSO probability forecasts (1982-2015) from single and multi-model ensemble integrations of the North American Multimodel Ensemble (NMME) project. Three-category forecasts are typical and provide probabilities for the ENSO phase (El Niño, La Niña or neutral). Additional forecast categories indicate the likelihood of ENSO conditions being weak, moderate or strong. The level of skill observed for differing numbers of forecast categories can help to determine the appropriate degree of forecast precision. However, the dependence of the skill score itself on the number of forecast categories must be taken into account. For reliable forecasts with the same quality, the ranked probability skill score (RPSS) is fairly insensitive to the number of categories, while the logarithmic skill score (LSS) is an information measure and increases as categories are added. The ignorance skill score decreases to zero as forecast categories are added, regardless of skill level. For all models, forecast formats and skill scores, the northern spring predictability barrier explains much of the dependence of skill on target month and forecast lead. RPSS values for monthly ENSO forecasts show little dependence on the number of categories. However, the LSS of multimodel ensemble forecasts with five and seven categories shows statistically significant advantages over the three-category forecasts for the targets and leads that are least affected by the spring predictability barrier. These findings indicate that current prediction systems are capable of providing more detailed probabilistic forecasts of ENSO phase and amplitude than are typically provided.
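
    The ranked probability skill score named in the abstract above can be illustrated with a minimal sketch. It assumes ordered categories (for example La Niña, neutral, El Niño), an unnormalized RPS, and a fixed climatological reference forecast; the function names and the sample probabilities are hypothetical and are not taken from the NMME study.

```python
from itertools import accumulate


def rps(probs, observed_index):
    """Ranked probability score of one forecast over ordered categories (0 is perfect)."""
    cum_forecast = list(accumulate(probs))
    # Cumulative observation: 0 below the observed category, 1 from it onward.
    cum_observed = [1.0 if k >= observed_index else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rpss(forecasts, observations, climatology):
    """Skill relative to a fixed climatological forecast (1 is perfect, 0 is no skill)."""
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_reference = sum(rps(climatology, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference


# Illustrative three-category forecasts; index 0 = La Niña, 1 = neutral, 2 = El Niño.
forecasts = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
observations = [2, 0]
climatology = [1 / 3, 1 / 3, 1 / 3]
print(round(rpss(forecasts, observations, climatology), 3))
```

    Because the score compares cumulative probabilities, a forecast that misses by one category is penalized less than one that misses by two, which is what distinguishes the RPS from a category-by-category Brier score.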

  18. A short-term ensemble wind speed forecasting system for wind power applications

    Science.gov (United States)

    Baidya Roy, S.; Traiteur, J. J.; Callicutt, D.; Smith, M.

    2011-12-01

    This study develops an adaptive, blended forecasting system to provide accurate wind speed forecasts 1 hour ahead of time for wind power applications. The system consists of an ensemble of 21 forecasts with different configurations of the Weather Research and Forecasting Single Column Model (WRFSCM) and a persistence model. The ensemble is calibrated against observations for a 2 month period (June-July, 2008) at a potential wind farm site in Illinois using the Bayesian Model Averaging (BMA) technique. The forecasting system is evaluated against observations for August 2008 at the same site. The calibrated ensemble forecasts significantly outperform the forecasts from the uncalibrated ensemble while significantly reducing forecast uncertainty under all environmental stability conditions. The system also generates significantly better forecasts than persistence, autoregressive (AR) and autoregressive moving average (ARMA) models during the morning transition and the diurnal convective regimes. This forecasting system is computationally more efficient than traditional numerical weather prediction models and can generate a calibrated forecast, including model runs and calibration, in approximately 1 minute. Currently, hour-ahead wind speed forecasts are almost exclusively produced using statistical models. However, numerical models have several distinct advantages over statistical models including the potential to provide turbulence forecasts. Hence, there is an urgent need to explore the role of numerical models in short-term wind speed forecasting. This work is a step in that direction and is likely to trigger a debate within the wind speed forecasting community.

  19. Dynamical predictive power of the generalized Gibbs ensemble revealed in a second quench.

    Science.gov (United States)

    Zhang, J M; Cui, F C; Hu, Jiangping

    2012-04-01

    We show that a quenched and relaxed completely integrable system is hardly distinguishable from the corresponding generalized Gibbs ensemble in a dynamical sense. To be specific, the response of the quenched and relaxed system to a second quench can be accurately reproduced by using the generalized Gibbs ensemble as a substitute. Remarkably, as demonstrated with the transverse Ising model and the hard-core bosons in one dimension, not only the steady values but even the transient, relaxation dynamics of the physical variables can be accurately reproduced by using the generalized Gibbs ensemble as a pseudoinitial state. This result is an important complement to the previously established result that a quenched and relaxed system is hardly distinguishable from the generalized Gibbs ensemble in a static sense. The relevance of the generalized Gibbs ensemble in the nonequilibrium dynamics of completely integrable systems is then greatly strengthened.

  20. Predicting gene function using hierarchical multi-label decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Kocev Dragi

    2010-01-01

    Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology, and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

  1. Simultaneous calibration of ensemble river flow predictions over an entire range of lead times

    Science.gov (United States)

    Hemri, S.; Fundel, F.; Zappa, M.

    2013-10-01

    Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff are typically biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions, statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess raw ensemble runoff forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well-calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage over univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
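
    The univariate step described above (a BMA mixture of normal distributions in Box-Cox space) might be sketched as follows. This is only an illustration under stated assumptions: the weights, the common standard deviation `sd` and the Box-Cox parameter `lam` are taken as already estimated, no bias correction is applied, and all names and numbers are hypothetical rather than the authors' implementation.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value; lam == 0 gives the log transform."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bma_cdf(threshold, member_forecasts, weights, sd, lam):
    """P(runoff <= threshold) under a weighted mixture of normals in Box-Cox space."""
    z = box_cox(threshold, lam)
    return sum(
        w * normal_cdf(z, box_cox(f, lam), sd)
        for f, w in zip(member_forecasts, weights)
    )


# Illustrative runoff forecasts (m^3/s) from three members with assumed BMA weights.
members = [40.0, 55.0, 90.0]
weights = [0.5, 0.3, 0.2]
print(bma_cdf(60.0, members, weights, sd=0.4, lam=0.2))
```

    Working in the transformed space keeps each mixture component approximately normal even though runoff itself is strongly skewed; probabilities for a runoff threshold are obtained by transforming the threshold rather than the distribution.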

  2. Urban runoff forecasting with ensemble weather predictions

    DEFF Research Database (Denmark)

    Pedersen, Jonas Wied; Courdent, Vianney Augustin Thomas; Vezzaro, Luca

    This research shows how ensemble weather forecasts can be used to generate urban runoff forecasts up to 53 hours into the future. The results highlight systematic differences between ensemble members that need to be accounted for when these forecasts are used in practice.

  3. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkyu, E-mail: sangkyu.lee@mail.mcgill.ca; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam [Medical Physics Unit, McGill University, Montreal, Quebec H3G1A4 (Canada); Faria, Sergio; Kopek, Neil; Brisebois, Pascale [Department of Radiation Oncology, Montreal General Hospital, Montreal, H3G1A4 (Canada); Bradley, Jeffrey D.; Robinson, Clifford [Radiation Oncology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110 (United States)

    2015-05-15

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. A Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 non-small-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentrations of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: The RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0

  4. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    International Nuclear Information System (INIS)

    Lee, Sangkyu; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam; Faria, Sergio; Kopek, Neil; Brisebois, Pascale; Bradley, Jeffrey D.; Robinson, Clifford

    2015-01-01

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. A Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 non-small-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentrations of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: The RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0

  5. A novel least squares support vector machine ensemble model for NOx emission prediction of a coal-fired boiler

    International Nuclear Information System (INIS)

    Lv, You; Liu, Jizhen; Yang, Tingting; Zeng, Deliang

    2013-01-01

    Real operation data of power plants tend to be concentrated in some local areas because of the operators’ habits and control system design. In this paper, a novel least squares support vector machine (LSSVM)-based ensemble learning paradigm is proposed to predict NOx emission of a coal-fired boiler using real operation data. In view of the plant data characteristics, a soft fuzzy c-means cluster algorithm is proposed to decompose the original data and guarantee the diversity of individual learners. Subsequently, the base LSSVM is trained on each individual subset to solve the subtask. Finally, partial least squares (PLS) is applied as the combination strategy to eliminate the collinear and redundant information of the base learners. Considering that the fuzzy membership also has an effect on the ensemble output, the membership degree is added as one of the variables of the combiner. The single LSSVM and other ensemble models using different decomposition and combination strategies are also established for comparison. The results show that the new soft FCM-LSSVM-PLS ensemble method can predict NOx emission accurately. Moreover, because of the divide-and-conquer framework, the total time consumed in searching the parameters and training also decreases markedly. - Highlights: • A novel LSSVM ensemble model to predict NOx emissions is presented. • LSSVM is used as the base learner and PLS is employed as the combiner. • The model is applied to process data from a 660 MW coal-fired boiler. • The generalization ability of the model is enhanced. • The time consumed in training and searching the parameters decreases sharply

  6. Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach.

    Science.gov (United States)

    Awad, Aya; Bader-El-Den, Mohamed; McNicholas, James; Briggs, Jim

    2017-12-01

    Mortality prediction of hospitalized patients is an important problem. Over the past few decades, several severity scoring systems and machine learning mortality prediction models have been developed for predicting hospital mortality. By contrast, early mortality prediction for intensive care unit patients remains an open challenge. Most research has focused on severity of illness scoring systems or data mining (DM) models designed for risk estimation at least 24 or 48h after ICU admission. This study highlights the main data challenges in early mortality prediction in ICU patients and introduces a new machine learning based framework for Early Mortality Prediction for Intensive Care Unit patients (EMPICU). The proposed method is evaluated on the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database. Mortality prediction models are developed for patients aged 16 or above in Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery Recovery Unit (CSRU). We employ the ensemble learning Random Forest (RF), the predictive Decision Trees (DT), the probabilistic Naive Bayes (NB) and the rule-based Projective Adaptive Resonance Theory (PART) models. The primary outcome was hospital mortality. The explanatory variables included demographic, physiological, vital signs and laboratory test variables. Performance measures were calculated using cross-validated area under the receiver operating characteristic curve (AUROC) to minimize bias. 11,722 patients with single ICU stays are considered. The proposed EMPICU framework outperformed standard scoring systems (SOFA, SAPS-I, APACHE-II, NEWS and qSOFA) in terms of AUROC and time (i.e. at 6h compared to 48h or more after admission). The results show that although there are many values missing in the first few hours of ICU admission

  7. Numerical climate modeling and verification of selected areas for heat waves of Pakistan using ensemble prediction system

    International Nuclear Information System (INIS)

    Amna, S; Samreen, N; Khalid, B; Shamim, A

    2013-01-01

    Depending on the topography, temperature varies widely across Pakistan. Heat waves are weather-related events that have a significant impact on people, including on socioeconomic activities and health, and this impact changes with the climatic conditions of the area. Climate forecasting is of prime importance for anticipating future climatic changes in order to mitigate them. The study used the Ensemble Prediction System (EPS) to model seasonal weather hindcasts for three selected areas, i.e., Islamabad, Jhelum and Muzaffarabad. This research was carried out in order to suggest the most suitable climate model for Pakistan. Real-time and simulated data of five General Circulation Models, i.e., ECMWF, ERA-40, MPI, Meteo France and UKMO, were acquired for the selected areas from the Pakistan Meteorological Department. The data comprised statistical temperature records of 32 years for the months of June, July and August. This study was based on EPS to calculate probabilistic forecasts produced by single ensembles. Verification was carried out to assess the quality of the forecasts using the standard probabilistic measures of Brier Score, Brier Skill Score, Cross Validation and the Relative Operating Characteristic curve. The results showed ECMWF to be the most suitable model for Islamabad and Jhelum, and Meteo France for Muzaffarabad. Other models have significant results by omitting particular initial conditions.
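
    The Brier Score and Brier Skill Score used for verification above can be computed as in the following minimal sketch. It assumes a binary event (for example, seasonal temperature above a heat-wave threshold), event probabilities taken as the fraction of ensemble members exceeding the threshold, and the sample base rate as the reference forecast; the names and numbers are hypothetical and do not come from the study.

```python
def event_probability(members, threshold):
    """Forecast probability as the fraction of ensemble members above the threshold."""
    return sum(m > threshold for m in members) / len(members)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to always forecasting the observed base rate (1 is perfect)."""
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / reference


# Illustrative probabilities for four summers and whether the event occurred.
probs = [0.9, 0.1, 0.8, 0.2]
outcomes = [1, 0, 1, 0]
print(brier_score(probs, outcomes), brier_skill_score(probs, outcomes))
```

    A positive skill score means the ensemble probabilities beat the climatological base rate; a value at or below zero means they add nothing over it.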

  8. Evaluation of the Plant-Craig stochastic convection scheme in an ensemble forecasting system

    Science.gov (United States)

    Keane, R. J.; Plant, R. S.; Tennant, W. J.

    2015-12-01

    The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic element only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.

  9. Global Ensemble Forecast System (GEFS) [2.5 Deg.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...

  10. The Ensembl genome database project.

    Science.gov (United States)

    Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

    2002-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

  11. Evaluation of medium-range ensemble flood forecasting based on calibration strategies and ensemble methods in Lanjiang Basin, Southeast China

    Science.gov (United States)

    Liu, Li; Gao, Chao; Xuan, Weidong; Xu, Yue-Ping

    2017-11-01

    Ensemble flood forecasts by hydrological models using numerical weather prediction products as forcing data are becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system comprising an automatically calibrated Variable Infiltration Capacity model and quantitative precipitation forecasts from the TIGGE dataset is constructed for Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by the parallel-programmed ε-NSGA II multi-objective algorithm. According to the solutions by ε-NSGA II, two differently parameterized models are determined to simulate daily flows and peak flows at each of the three hydrological stations. Then a simple yet effective modular approach is proposed to combine these daily and peak flows at the same station into one composite series. Five ensemble methods and various evaluation metrics are adopted. The results show that ε-NSGA II can provide an objective determination of parameter estimates, and the parallel program permits a more efficient simulation. It is also demonstrated that the forecasts from ECMWF have more favorable skill scores than other Ensemble Prediction Systems. The multimodel ensembles have advantages over all the single-model ensembles, and the multimodel methods weighted on members and skill scores outperform other methods. Furthermore, the overall performance at the three stations can be satisfactory up to ten days; however, the hydrological errors can degrade the skill score by approximately 2 days, and the influence persists until a lead time of 10 days with a weakening trend. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from single models or multimodels are generally underestimated, indicating that the ensemble mean can bring overall improvement in forecasting of flows. For

  12. Ensemble Streamflow Prediction in Korea: Past and Future 5 Years

    Science.gov (United States)

    Jeong, D.; Kim, Y.; Lee, J.

    2005-05-01

    The Ensemble Streamflow Prediction (ESP) approach was first introduced in 2000 by the Hydrology Research Group (HRG) at Seoul National University as an alternative probabilistic forecasting technique for improving the 'Water Supply Outlook' that is issued every month by the Ministry of Construction and Transportation in Korea. That study motivated the Korea Water Resources Corporation (KOWACO) to establish their seasonal probabilistic forecasting system for the 5 major river basins using the ESP approach. In cooperation with the HRG, the KOWACO developed monthly optimal multi-reservoir operating systems for the Geum river basin in 2004, which coupled the ESP forecasts with an optimization model using sampling stochastic dynamic programming (SSDP). User interfaces for both ESP and SSDP have also been designed to make the developed computer systems more practical. Further projects to develop ESP systems for the other 3 major river basins (i.e. the Nakdong, Han and Seomjin river basins) were also completed by the HRG and KOWACO at the end of December 2004. Therefore, the ESP system has become the most important mid- and long-term streamflow forecast technique in Korea. In addition to the practical aspects, recent research experience on ESP has raised questions about ways of improving the accuracy of ESP in Korea. Jeong and Kim (2002) performed an error analysis on its resulting probabilistic forecasts and found that the modeling error is dominant in the dry season, while the meteorological error is dominant in the flood season. To address the first issue, Kim et al. (2004) tested various combinations and/or combining techniques and showed that the ESP probabilistic accuracy could be improved considerably during the dry season when the hydrologic models were combined and/or corrected. In addition, an attempt was also made to improve the ESP accuracy for the flood season using climate forecast information. This ongoing project handles three types of climate

  13. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Science.gov (United States)

    Drzewiecki, Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas, both for the accuracy of imperviousness coverage evaluation at individual points in time and for the accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results showed that, in the case of sub-pixel evaluation, the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, the results suggest that the Cubist algorithm may be advised for Landsat-based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. For individual time point assessments, heterogeneous model ensembles performed at least as well as the best individual models. In the case of imperviousness change assessment, the ensembles always outperformed single-model approaches. This means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
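
    The remark above that more correlated errors can give a better prediction of change follows from the variance of a difference: if the errors at the two dates have standard deviations s1 and s2 and correlation rho, the error of the predicted change has variance s1^2 + s2^2 - 2*rho*s1*s2. A minimal sketch with purely illustrative numbers (not values from the study):

```python
import math


def change_error_sd(sd_t1, sd_t2, rho):
    """Standard deviation of the error in (prediction at t2 - prediction at t1)."""
    return math.sqrt(sd_t1 ** 2 + sd_t2 ** 2 - 2.0 * rho * sd_t1 * sd_t2)


# Hypothetical model A: more accurate per date, weakly correlated errors.
model_a = change_error_sd(0.10, 0.10, rho=0.5)
# Hypothetical model B: less accurate per date, strongly correlated errors.
model_b = change_error_sd(0.12, 0.12, rho=0.9)
print(round(model_a, 4), round(model_b, 4))
```

    In this illustration model B is worse at each single date yet yields the smaller error for the change, because a larger share of its error cancels when the two predictions are subtracted.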

  14. Spam comments prediction using stacking with ensemble learning

    Science.gov (United States)

    Mehmood, Arif; On, Byung-Won; Lee, Ingyu; Ashraf, Imran; Choi, Gyu Sang

    2018-01-01

    Deceptive comments about products or services mislead people in their decision making. Current methods for predicting deceptive comments focus on feature design with a single trained model. Indigenous features can capture some linguistic phenomena but struggle to reveal the latent semantic meaning of the comments. We propose a prediction model based on general document features, using stacking with ensemble learning. Term Frequency/Inverse Document Frequency (TF/IDF) features are inputs to a stack of Random Forest and Gradient Boosted Trees, and the outputs of the base learners are combined by a decision tree for the final training of the model. The results show that our approach achieves an accuracy of 92.19%, which outperforms the state-of-the-art method.

  15. Enhancing COSMO-DE ensemble forecasts by inexpensive techniques

    Directory of Open Access Journals (Sweden)

    Zied Ben Bouallègue

    2013-02-01

    COSMO-DE-EPS, a convection-permitting ensemble prediction system based on the high-resolution numerical weather prediction model COSMO-DE, has been pre-operational since December 2010, providing probabilistic forecasts which cover Germany. This ensemble system comprises 20 members based on variations of the lateral boundary conditions, the physics parameterizations and the initial conditions. In order to increase the sample size in a computationally inexpensive way, COSMO-DE-EPS is combined with alternative ensemble techniques: the neighborhood method and the time-lagged approach. Their impact on the quality of the resulting probabilistic forecasts is assessed. Objective verification is performed over a six-month period; scores based on the Brier score and its decomposition are shown for June 2011. The combination of the ensemble system with the alternative approaches improves probabilistic forecasts of precipitation, in particular for high precipitation thresholds. Moreover, combining COSMO-DE-EPS with only the time-lagged approach improves the skill of area probabilities for precipitation and does not deteriorate the skill of 2 m temperature and wind gust forecasts.
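
    The two inexpensive techniques named above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the configuration used for COSMO-DE-EPS: all members are weighted equally, earlier runs are pooled without any lag-dependent weighting, the neighborhood is a square window on a regular grid, and every name and value is hypothetical.

```python
def exceedance_probability(values, threshold):
    """Fraction of values above the threshold."""
    return sum(v > threshold for v in values) / len(values)


def time_lagged_probability(current_run, earlier_runs, threshold):
    """Pool members of the latest run with members of earlier runs valid at the same time."""
    pooled = list(current_run)
    for run in earlier_runs:
        pooled.extend(run)
    return exceedance_probability(pooled, threshold)


def neighborhood_probability(member_fields, i, j, radius, threshold):
    """Fraction of (member, nearby grid point) pairs above the threshold around (i, j)."""
    hits = total = 0
    for field in member_fields:
        rows, cols = len(field), len(field[0])
        for r in range(max(0, i - radius), min(rows, i + radius + 1)):
            for c in range(max(0, j - radius), min(cols, j + radius + 1)):
                hits += field[r][c] > threshold
                total += 1
    return hits / total


# Illustrative precipitation values (mm) at one grid point, threshold 5 mm.
current = [1.0, 6.0, 7.0, 2.0]
earlier = [[0.0, 8.0, 3.0, 1.0]]
print(time_lagged_probability(current, earlier, 5.0))

# Two tiny 2x2 member fields; only one member has rain next to point (0, 0).
fields = [[[0.0, 6.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
print(neighborhood_probability(fields, 0, 0, radius=1, threshold=5.0))
```

    Both functions enlarge the sample from which a probability is estimated without running additional model integrations, which is the sense in which the techniques are inexpensive.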

  16. Ensemble system for Part-of-Speech tagging

    OpenAIRE

    Dell'Orletta, Felice

    2009-01-01

    The paper contains a description of the Felice-POS-Tagger and of its performance in Evalita 2009. Felice-POS-Tagger is an ensemble system that combines six different POS taggers. When evaluated on the official test set, the ensemble system outperforms each of the single tagger components and achieves the highest accuracy score in Evalita 2009 POS Closed Task. It is shown first that the errors made from the different taggers are complementary, and then how to use this complementary behavior to the...

  17. The prediction of surface temperature in the new seasonal prediction system based on the MPI-ESM coupled climate model

    Science.gov (United States)

    Baehr, J.; Fröhlich, K.; Botzet, M.; Domeisen, D. I. V.; Kornblueh, L.; Notz, D.; Piontek, R.; Pohlmann, H.; Tietsche, S.; Müller, W. A.

    2015-05-01

    A seasonal forecast system is presented, based on the global coupled climate model MPI-ESM as used for CMIP5 simulations. We describe the initialisation of the system and analyse its predictive skill for surface temperature. The presented system is initialised in the atmospheric, oceanic, and sea ice component of the model from reanalysis/observations with full field nudging in all three components. For the initialisation of the ensemble, bred vectors with a vertically varying norm are implemented in the ocean component to generate initial perturbations. In a set of ensemble hindcast simulations, starting each May and November between 1982 and 2010, we analyse the predictive skill. Bias-corrected ensemble forecasts for each start date reproduce the observed surface temperature anomalies at 2-4 months lead time, particularly in the tropics. Niño3.4 sea surface temperature anomalies show a small root-mean-square error and predictive skill up to 6 months. Away from the tropics, predictive skill is mostly limited to the ocean, and to regions which are strongly influenced by ENSO teleconnections. In summary, the presented seasonal prediction system based on a coupled climate model shows predictive skill for surface temperature at seasonal time scales comparable to other seasonal prediction systems using different underlying models and initialisation strategies. As the same model underlying our seasonal prediction system—with a different initialisation—is presently also used for decadal predictions, this is an important step towards seamless seasonal-to-decadal climate predictions.

  18. Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics

    Science.gov (United States)

    Lazarus, S. M.; Holman, B. P.; Splitt, M. E.

    2017-12-01

    A computationally efficient method is developed that performs gridded post processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations are generated to provide physically consistent high resolution winds over a coastal domain characterized by an intricate land / water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicate the post processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.

  19. Time-dependent generalized Gibbs ensembles in open quantum systems

    Science.gov (United States)

    Lange, Florian; Lenarčič, Zala; Rosch, Achim

    2018-04-01

    Generalized Gibbs ensembles have been used as powerful tools to describe the steady state of integrable many-particle quantum systems after a sudden change of the Hamiltonian. Here, we demonstrate numerically that they can be used for a much broader class of problems. We consider integrable systems in the presence of weak perturbations which both break integrability and drive the system to a state far from equilibrium. Under these conditions, we show that the steady state and the time evolution on long timescales can be accurately described by a (truncated) generalized Gibbs ensemble with time-dependent Lagrange parameters, determined from simple rate equations. We compare the numerically exact time evolutions of density matrices for small systems with a theory based on block-diagonal density matrices (diagonal ensemble) and a time-dependent generalized Gibbs ensemble containing only a small number of approximately conserved quantities, using the one-dimensional Heisenberg model with perturbations described by Lindblad operators as an example.

  20. Ensemble Classifiers for Predicting HIV-1 Resistance from Three Rule-Based Genotypic Resistance Interpretation Systems.

    Science.gov (United States)

    Raposo, Letícia M; Nobre, Flavio F

    2017-08-30

    Resistance to antiretrovirals (ARVs) is a major problem faced by HIV-infected individuals. Different rule-based algorithms were developed to infer HIV-1 susceptibility to antiretrovirals from genotypic data. However, there is discordance between them, resulting in difficulties for clinical decisions about which treatment to use. Here, we developed ensemble classifiers integrating three interpretation algorithms: Agence Nationale de Recherche sur le SIDA (ANRS), Rega, and the genotypic resistance interpretation system from Stanford HIV Drug Resistance Database (HIVdb). Three approaches were applied to develop a classifier with a single resistance profile: stacked generalization, a simple plurality vote scheme and the selection of the interpretation system with the best performance. The strategies were compared with the Friedman's test and the performance of the classifiers was evaluated using the F-measure, sensitivity and specificity values. We found that the three strategies had similar performances for the selected antiretrovirals. For some cases, the stacking technique with naïve Bayes as the learning algorithm showed a statistically superior F-measure. This study demonstrates that ensemble classifiers can be an alternative tool for clinical decision-making since they provide a single resistance profile from the most commonly used resistance interpretation systems.
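
    Of the three strategies named in this abstract, the simple plurality vote is the easiest to sketch. The label set ("S", "I", "R") and the tie-breaking rule below are assumptions made for illustration; the abstract does not say how ties among the three interpretation systems are resolved.

```python
from collections import Counter


def plurality_vote(calls, tie_break="I"):
    """Return the most frequent label; fall back to tie_break when no label leads.

    calls: one label per interpretation system, e.g. ["R", "R", "S"].
    tie_break: hypothetical fallback label, an assumption of this sketch.
    """
    ranked = Counter(calls).most_common()
    top_count = ranked[0][1]
    winners = [label for label, count in ranked if count == top_count]
    return winners[0] if len(winners) == 1 else tie_break


# Hypothetical calls from three systems for one drug and one sequence.
consensus = plurality_vote(["R", "R", "S"])
```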

  1. Adiabatic passage and ensemble control of quantum systems

    International Nuclear Information System (INIS)

    Leghtas, Z; Sarlette, A; Rouchon, P

    2011-01-01

    This paper considers population transfer between eigenstates of a finite quantum ladder controlled by a classical electric field. Using an appropriate change of variables, we show that this setting can be set in the framework of adiabatic passage, which is known to facilitate ensemble control of quantum systems. Building on this insight, we present a mathematical proof of robustness for a control protocol (chirped pulse) practised by experimentalists to drive an ensemble of quantum systems from the ground state to the most excited state. We then propose new adiabatic control protocols using a single chirped and amplitude-shaped pulse, to robustly perform any permutation of eigenstate populations, on an ensemble of systems with unknown coupling strengths. These adiabatic control protocols are illustrated by simulations on a four-level ladder.

  2. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.

    Science.gov (United States)

    Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif

    2017-01-01

    Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
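
    The core idea of SMOTE, creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified pure-Python illustration with hypothetical parameter names and data, not the implementation used in the study.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class feature tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical two-feature minority samples.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote(minority_samples, n_new=5)
```

    Because every synthetic point is a convex combination of two existing minority samples, the oversampled class stays inside the region already occupied by real minority data.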

  3. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.

    Directory of Open Access Journals (Sweden)

    Manal Alghamdi

    Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.

  4. The GMAO Hybrid Ensemble-Variational Atmospheric Data Assimilation System: Version 2.0

    Science.gov (United States)

    Todling, Ricardo; El Akkraoui, Amal

    2018-01-01

    should point out that Release 1.0 of this document was made available to GMAO in mid-2013, when we introduced Hybrid 3D-Var capability to GEOS ADAS. This initial version of the documentation included a considerably different state-of-science introductory section but many of the same detailed description of the mechanisms of GEOS EnADAS. We are glad to report that a few of the desirable Future Works listed in Release 1.0 have now been added to the present version of GEOS EnADAS. These include the ability to exercise an Ensemble Prediction System that uses the ensemble analyses of GEOS EnADAS and (a very early, but functional version of) a tool to support Ensemble Forecast Sensitivity and Observation Impact applications.

  5. Invariant methods for an ensemble-based sensitivity analysis of a passive containment cooling system of an AP1000 nuclear power plant

    International Nuclear Information System (INIS)

    Di Maio, Francesco; Nicola, Giancarlo; Borgonovo, Emanuele; Zio, Enrico

    2016-01-01

    Sensitivity Analysis (SA) is performed to gain fundamental insights on a system behavior that is usually reproduced by a model and to identify the most relevant input variables whose variations affect the system model functional response. For the reliability analysis of passive safety systems of Nuclear Power Plants (NPPs), models are Best Estimate (BE) Thermal Hydraulic (TH) codes, that predict the system functional response in normal and accidental conditions and, in this paper, an ensemble of three alternative invariant SA methods is innovatively set up for a SA on the TH code input variables. The ensemble aggregates the input variables ranking orders provided by Pearson correlation ratio, Delta method and Beta method. The capability of the ensemble is shown on a BE–TH code of the Passive Containment Cooling System (PCCS) of an Advanced Pressurized water reactor AP1000, during a Loss Of Coolant Accident (LOCA), whose output probability density function (pdf) is approximated by a Finite Mixture Model (FMM), on the basis of a limited number of simulations. - Highlights: • We perform the reliability analysis of a passive safety system of Nuclear Power Plant (NPP). • We use a Thermal Hydraulic (TH) code for predicting the NPP response to accidents. • We propose an ensemble of Invariant Methods for the sensitivity analysis of the TH code. • The ensemble aggregates the rankings of Pearson correlation, Delta and Beta methods. • The approach is tested on a Passive Containment Cooling System of an AP1000 NPP.
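
    One simple way to aggregate the variable rankings produced by several sensitivity methods is to order the variables by their mean rank. The abstract does not state which aggregation rule the authors use, so the mean-rank rule and the variable names below are assumptions made for illustration.

```python
def aggregate_rankings(rankings):
    """Order variables by mean rank across methods (rank 1 = most important).

    rankings: list of lists, each ordering the same variables from most to
    least important. Ties keep the order of the first ranking, because
    sorted() is stable.
    """
    variables = rankings[0]
    mean_rank = {
        v: sum(r.index(v) + 1 for r in rankings) / len(rankings)
        for v in variables
    }
    return sorted(variables, key=lambda v: mean_rank[v])


# Hypothetical rankings of three input variables from three SA methods.
method_rankings = [
    ["pressure", "flow", "temperature"],
    ["flow", "pressure", "temperature"],
    ["pressure", "temperature", "flow"],
]
ensemble_ranking = aggregate_rankings(method_rankings)
```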

  6. Distinguishing high and low flow domains in urban drainage systems 2 days ahead using numerical weather prediction ensembles

    DEFF Research Database (Denmark)

    Courdent, Vianney Augustin Thomas; Grum, Morten; Mikkelsen, Peter Steen

    2018-01-01

    Precipitation constitutes a major contribution to the flow in urban storm- and wastewater systems. Forecasts of the anticipated runoff flows, created from radar extrapolation and/or numerical weather predictions, can potentially be used to optimize operation in both wet and dry weather periods...... to transform the forecasted rainfall into forecasted flow series and evaluate three different approaches to establishing the relative operating characteristics (ROC) diagram of the forecast, which is a plot of POD against POFD for each fraction of concordant ensemble members and can be used to select...... itself from earlier research in being the first application to urban hydrology, with fast runoff and small catchments that are highly sensitive to local extremes. Furthermore, no earlier reference has been found on the highly efficient third approach using only neighbouring cells with the highest threat...
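
    The ROC construction described in this abstract, POD against POFD for each fraction of concordant ensemble members, can be sketched as below. The input layout and the sample values are hypothetical; POD is hits / (hits + misses) and POFD is false alarms / (false alarms + correct negatives).

```python
def roc_points(fractions, observed, thresholds):
    """Return (threshold, POD, POFD) for each required fraction of agreeing members.

    fractions: per-forecast share of ensemble members predicting the event.
    observed: per-forecast booleans, True when the event occurred.
    POD or POFD is None when its denominator is zero.
    """
    points = []
    for t in thresholds:
        hits = misses = false_alarms = correct_negatives = 0
        for fraction, event in zip(fractions, observed):
            warned = fraction >= t
            if warned and event:
                hits += 1
            elif warned:
                false_alarms += 1
            elif event:
                misses += 1
            else:
                correct_negatives += 1
        events = hits + misses
        non_events = false_alarms + correct_negatives
        pod = hits / events if events else None
        pofd = false_alarms / non_events if non_events else None
        points.append((t, pod, pofd))
    return points


# Hypothetical forecasts: share of members predicting high flow, and what happened.
curve = roc_points([0.9, 0.4, 0.6, 0.1], [True, True, False, False], [0.3, 0.5])
```

    Lowering the required fraction raises POD at the cost of a higher POFD, which is why the diagram can be used to select an operating threshold.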

  7. Cortical ensemble activity increasingly predicts behaviour outcomes during learning of a motor task

    Science.gov (United States)

    Laubach, Mark; Wessberg, Johan; Nicolelis, Miguel A. L.

    2000-06-01

    When an animal learns to make movements in response to different stimuli, changes in activity in the motor cortex seem to accompany and underlie this learning. The precise nature of modifications in cortical motor areas during the initial stages of motor learning, however, is largely unknown. Here we address this issue by chronically recording from neuronal ensembles located in the rat motor cortex, throughout the period required for rats to learn a reaction-time task. Motor learning was demonstrated by a decrease in the variance of the rats' reaction times and an increase in the time the animals were able to wait for a trigger stimulus. These behavioural changes were correlated with a significant increase in our ability to predict the correct or incorrect outcome of single trials based on three measures of neuronal ensemble activity: average firing rate, temporal patterns of firing, and correlated firing. This increase in prediction indicates that an association between sensory cues and movement emerged in the motor cortex as the task was learned. Such modifications in cortical ensemble activity may be critical for the initial learning of motor tasks.

  8. An ensemble machine learning approach to predict survival in breast cancer.

    Science.gov (United States)

    Djebbari, Amira; Liu, Ziying; Phan, Sieu; Famili, Fazel

    2008-01-01

    Current breast cancer predictive signatures are not unique. Can we use this fact to our advantage to improve prediction? From the machine learning perspective, it is well known that combining multiple classifiers can improve classification performance. We propose an ensemble machine learning approach which consists of choosing feature subsets and learning predictive models from them. We then combine models based on certain model fusion criteria and we also introduce a tuning parameter to control sensitivity. Our method significantly improves classification performance with a particular emphasis on sensitivity which is critical to avoid misclassifying poor prognosis patients as good prognosis.

  9. EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to Improve Prediction Accuracy.

    Science.gov (United States)

    Zhou, Xianxiao; Wang, Minghui; Katsyv, Igor; Irie, Hanna; Zhang, Bin

    2018-04-24

    Availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications of existing drugs. However, the prediction accuracy of the existing methods, such as the Kolmogorov-Smirnov statistic, remains low. Here we present a novel high-performance drug repositioning approach that improves over the state-of-the-art methods. We first designed an expression weighted cosine method (EWCos) to minimize the influence of the uninformative expression changes and then developed an ensemble approach termed EMUDRA (Ensemble of Multiple Drug Repositioning Approaches) to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. We predicted using EMUDRA and experimentally validated an antibiotic rifabutin as an inhibitor of cell growth in triple negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs. The EMUDRA R package is available at doi:10.7303/syn11510888. Contact: bin.zhang@mssm.edu or zhangb@hotmail.com. Supplementary data are available at Bioinformatics online.

  10. A multi-model ensemble approach to seabed mapping

    Science.gov (United States)

    Diesing, Markus; Stephens, David

    2015-06-01

    Seabed habitat mapping based on swath acoustic data and ground-truth samples is an emergent and active marine science discipline. Significant progress could be achieved by transferring techniques and approaches that have been successfully developed and employed in such fields as terrestrial land cover mapping. One such promising approach is the multiple classifier system, which aims at improving classification performance by combining the outputs of several classifiers. Here we present results of a multi-model ensemble applied to multibeam acoustic data covering more than 5000 km² of seabed in the North Sea with the aim to derive accurate spatial predictions of seabed substrate. A suite of six machine learning classifiers (k-Nearest Neighbour, Support Vector Machine, Classification Tree, Random Forest, Neural Network and Naïve Bayes) was trained with ground-truth sample data classified into seabed substrate classes and their prediction accuracy was assessed with an independent set of samples. The three and five best performing models were combined into classifier ensembles. Both ensembles led to increased prediction accuracy as compared to the best performing single classifier. The improvements were, however, not statistically significant at the 5% level. Although the three-model ensemble did not perform significantly better than its individual component models, we noticed that the five-model ensemble did perform significantly better than three of the five component models. A classifier ensemble might therefore be an effective strategy to improve classification performance. Another advantage is the fact that the agreement in predicted substrate class between the individual models of the ensemble could be used as a measure of confidence. We propose a simple and spatially explicit measure of confidence that is based on model agreement and prediction accuracy.
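
    The idea of using model agreement as a spatially explicit confidence measure can be sketched per grid cell as follows. This illustration uses agreement only; the authors' measure also accounts for prediction accuracy, which is omitted here, and the class names and data layout are hypothetical.

```python
from collections import Counter


def combine_with_agreement(predictions_by_model):
    """Return (majority class, share of models agreeing) for every grid cell.

    predictions_by_model: one list of predicted classes per model, all
    covering the same cells in the same order. Ties go to the class
    predicted by the earliest model in the list.
    """
    combined = []
    for cell_votes in zip(*predictions_by_model):
        label, votes = Counter(cell_votes).most_common(1)[0]
        combined.append((label, votes / len(cell_votes)))
    return combined


# Hypothetical substrate predictions of three models for three cells.
models = [
    ["sand", "mud", "gravel"],
    ["sand", "mud", "sand"],
    ["sand", "sand", "mud"],
]
substrate_map = combine_with_agreement(models)
```

    Cells where all models agree receive a confidence of 1.0, while cells with split votes are flagged as less certain.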

  11. Wave ensemble forecast in the Western Mediterranean Sea, application to an early warning system.

    Science.gov (United States)

    Pallares, Elena; Hernandez, Hector; Moré, Jordi; Espino, Manuel; Sairouni, Abdel

    2015-04-01

    The Western Mediterranean Sea is a highly heterogeneous and variable area, as is reflected in the wind field, the current field, and the waves, mainly in the first kilometers offshore. As a result of this variability, wave forecasting in these regions is quite complicated, usually with some accuracy problems during energetic storm events. Moreover, it is in these areas that most of the economic activities take place, including fisheries, sailing, tourism, coastal management and offshore renewable energy platforms. In order to introduce an indicator of the probability of occurrence of the different sea states and give more detailed information about the forecast to the end users, an ensemble wave forecast system is considered. Ensemble prediction systems have been used in recent decades for meteorological forecasting: to deal with the uncertainties of the initial conditions and the different parametrizations used in the models, which may introduce some errors in the forecast, a set of different perturbed meteorological simulations is considered as possible future scenarios and compared with the deterministic forecast. In the present work, the SWAN wave model (v41.01) has been implemented for the Western Mediterranean Sea, forced with wind fields produced by the deterministic Global Forecast System (GFS) and the Global Ensemble Forecast System (GEFS). The wind fields include a deterministic forecast (also named control), between 11 and 21 ensemble members, and some intelligent members obtained from the ensemble, such as the mean of all the members. Four buoys located in the study area, moored in coastal waters, have been used to validate the results. The outputs include all the time series, with a forecast horizon of 8 days and represented in spaghetti diagrams, the spread of the system and the probability at different thresholds. The main goal of this exercise is to be able to determine the degree of the uncertainty of the wave forecast, meaningful

  12. Development of multimodel ensemble based district level medium ...

    Indian Academy of Sciences (India)

    tively by computing the anomaly correlation coefficient between the predicted rainfall and observed rainfall. High resolution (lat./long.) gridded data ..... particularly in the prediction of intensity and mesoscale rainfall features causing inland flooding. During recent years, Ensemble Prediction System (EPS) has emerged as ...

  13. EnsembleGASVR: A novel ensemble method for classifying missense single nucleotide polymorphisms

    KAUST Repository

    Rapakoulia, Trisevgeni; Theofilatos, Konstantinos A.; Kleftogiannis, Dimitrios A.; Likothanasis, Spiridon D.; Tsakalidis, Athanasios K.; Mavroudi, Seferina P.

    2014-01-01

    do not support their predictions with confidence scores. Results: To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR facilitates a two-step algorithm, which in its first step applies a novel

  14. Ensemble Bayesian forecasting system Part I: Theory and algorithms

    Science.gov (United States)

    Herr, Henry D.; Krzysztofowicz, Roman

    2015-05-01

    The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of

  15. Adaptive Encoding of Outcome Prediction by Prefrontal Cortex Ensembles Supports Behavioral Flexibility.

    Science.gov (United States)

    Del Arco, Alberto; Park, Junchol; Wood, Jesse; Kim, Yunbok; Moghaddam, Bita

    2017-08-30

    The prefrontal cortex (PFC) is thought to play a critical role in behavioral flexibility by monitoring action-outcome contingencies. How PFC ensembles represent shifts in behavior in response to changes in these contingencies remains unclear. We recorded single-unit activity and local field potentials in the dorsomedial PFC (dmPFC) of male rats during a set-shifting task that required them to update their behavior, among competing options, in response to changes in action-outcome contingencies. As behavior was updated, a subset of PFC ensembles encoded the current trial outcome before the outcome was presented. This novel outcome-prediction encoding was absent in a control task, in which actions were rewarded pseudorandomly, indicating that PFC neurons are not merely providing an expectancy signal. In both control and set-shifting tasks, dmPFC neurons displayed postoutcome discrimination activity, indicating that these neurons also monitor whether a behavior is successful in generating rewards. Gamma-power oscillatory activity increased before the outcome in both tasks but did not differentiate between expected outcomes, suggesting that this measure is not related to set-shifting behavior but reflects expectation of an outcome after action execution. These results demonstrate that PFC neurons support flexible rule-based action selection by predicting outcomes that follow a particular action. SIGNIFICANCE STATEMENT Tracking action-outcome contingencies and modifying behavior when those contingencies change is critical to behavioral flexibility. We find that ensembles of dorsomedial prefrontal cortex neurons differentiate between expected outcomes when action-outcome contingencies change. This predictive mode of signaling may be used to promote a new response strategy at the service of behavioral flexibility.

  16. Seasonal prediction of East Asian summer rainfall using a multi-model ensemble system

    Science.gov (United States)

    Ahn, Joong-Bae; Lee, Doo-Young; Yoo, Jin‑Ho

    2015-04-01

    Using the retrospective forecasts of seven state-of-the-art coupled models and their multi-model ensemble (MME) for boreal summers, the prediction skills of climate models in the western tropical Pacific (WTP) and East Asian region are assessed. The prediction of summer rainfall anomalies in East Asia is difficult, while the WTP has a strong correlation between model prediction and observation. We focus on developing a new approach to further enhance the seasonal prediction skill for summer rainfall in East Asia and investigate the influence of convective activity in the WTP on East Asian summer rainfall. By analyzing the characteristics of the WTP convection, two distinct patterns associated with El Niño-Southern Oscillation developing and decaying modes are identified. Based on the multiple linear regression method, the East Asia Rainfall Index (EARI) is developed by using the interannual variability of the normalized Maritime continent-WTP Indices (MPIs), as potentially useful predictors for rainfall prediction over East Asia, obtained from the above two main patterns. For East Asian summer rainfall, the EARI has superior performance to the East Asia summer monsoon index or each MPI. Therefore, the regressed rainfall from EARI also shows a strong relationship with the observed East Asian summer rainfall pattern. In addition, we evaluate the prediction skill of the East Asia reconstructed rainfall obtained by hybrid dynamical-statistical approach using the cross-validated EARI from the individual models and their MME. The results show that the rainfalls reconstructed from simulations capture the general features of observed precipitation in East Asia quite well. This study convincingly demonstrates that rainfall prediction skill is considerably improved by using a hybrid dynamical-statistical approach compared to the dynamical forecast alone. Acknowledgements This work was carried out with the support of Rural Development Administration Cooperative Research

  17. The Experimental Regional Ensemble Forecast System (ExREF): Its Use in NWS Forecast Operations and Preliminary Verification

    Science.gov (United States)

    Reynolds, David; Rasch, William; Kozlowski, Daniel; Burks, Jason; Zavodsky, Bradley; Bernardet, Ligia; Jankov, Isidora; Albers, Steve

    2014-01-01

    The Experimental Regional Ensemble Forecast (ExREF) system is a tool for the development and testing of new Numerical Weather Prediction (NWP) methodologies. ExREF is run in near-realtime by the Global Systems Division (GSD) of the NOAA Earth System Research Laboratory (ESRL) and its products are made available through a website, an ftp site, and via the Unidata Local Data Manager (LDM). The ExREF domain covers most of North America and has 9-km horizontal grid spacing. The ensemble has eight members, all employing WRF-ARW. The ensemble uses a variety of initial conditions from LAPS and the Global Forecasting System (GFS) and multiple boundary conditions from the GFS ensemble. Additionally, a diversity of physical parameterizations is used to increase ensemble spread and to account for the uncertainty in forecasting extreme precipitation events. ExREF has been a component of the Hydrometeorology Testbed (HMT) NWP suite in the 2012-2013 and 2013-2014 winters. A smaller domain covering just the West Coast was created to minimize bandwidth consumption for the NWS. This smaller domain has been and is being distributed to the National Weather Service (NWS) Weather Forecast Office and California Nevada River Forecast Center in Sacramento, California, where it is ingested into the Advanced Weather Interactive Processing System (AWIPS I and II) to provide guidance on the forecasting of extreme precipitation events. This paper will review the cooperative effort employed by NOAA ESRL, NASA SPoRT (Short-term Prediction Research and Transition Center), and the NWS to facilitate the ingest and display of ExREF data utilizing the AWIPS I and II D2D and GFE (Graphical Forecast Editor) software. Within GFE is a very useful verification software package called BoiVer that allows the NWS to utilize the River Forecast Center's 4 km gridded QPE to compare with all operational NWP models' 6-hr QPF along with the ExREF mean 6-hr QPF so the forecasters can build confidence in the use of the

  18. Intelligent and robust prediction of short term wind power using genetic programming based ensemble of neural networks

    International Nuclear Information System (INIS)

    Zameer, Aneela; Arshad, Junaid; Khan, Asifullah; Raja, Muhammad Asif Zahoor

    2017-01-01

    Highlights: • Genetic programming based ensemble of neural networks is employed for short term wind power prediction. • Proposed predictor shows resilience against abrupt changes in weather. • Genetic programming evolves nonlinear mapping between meteorological measures and wind-power. • Proposed approach gives mathematical expressions of wind power to its independent variables. • Proposed model shows relatively accurate and steady wind-power prediction performance. - Abstract: The inherent instability of wind power production leads to critical problems for smooth power generation from wind turbines, which then requires an accurate forecast of wind power. In this study, an effective short term wind power prediction methodology is presented, which uses an intelligent ensemble regressor that comprises Artificial Neural Networks and Genetic Programming. In contrast to existing series based combination of wind power predictors, whereby the error or variation in the leading predictor is propagated down the stream to the next predictors, the proposed intelligent ensemble predictor avoids this shortcoming by introducing a Genetic Programming based semi-stochastic combination of neural networks. It is observed that the decision of the individual base regressors may vary due to the frequent and inherent fluctuations in the atmospheric conditions and thus meteorological properties. The novelty of the reported work lies in creating an ensemble to generate an intelligent, collective and robust decision space, thereby avoiding large errors due to the sensitivity of the individual wind predictors. The proposed ensemble based regressor, Genetic Programming based ensemble of Artificial Neural Networks, has been implemented and tested on data taken from five different wind farms located in Europe. Obtained numerical results of the proposed model in terms of various error measures are compared with the recent artificial intelligence based strategies to demonstrate the

  19. Momentum distribution functions in ensembles: the inequivalence of microcanonical and canonical ensembles in a finite ultracold system.

    Science.gov (United States)

    Wang, Pei; Xianlong, Gao; Li, Haibin

    2013-08-01

    It is demonstrated in many thermodynamics textbooks that the equivalence of the different ensembles is achieved in the thermodynamic limit. In the present work we discuss the inequivalence of microcanonical and canonical ensembles in a finite ultracold system at low energies. We calculate the microcanonical momentum distribution function (MDF) in a system of identical fermions (bosons). We find that the microcanonical MDF deviates from the canonical one, which is the Fermi-Dirac (Bose-Einstein) function, in a finite system at low energies where the single-particle density of states and its inverse are finite.

  20. Combining 2-m temperature nowcasting and short range ensemble forecasting

    Directory of Open Access Journals (Sweden)

    A. Kann

    2011-12-01

    During recent years, numerical ensemble prediction systems have become an important tool for estimating the uncertainties of dynamical and physical processes as represented in numerical weather models. The latest generation of limited area ensemble prediction systems (LAM-EPSs) allows for probabilistic forecasts at high resolution in both space and time. However, these systems still suffer from systematic deficiencies. Especially for nowcasting (0–6 h) applications the ensemble spread is smaller than the actual forecast error. This paper tries to generate probabilistic short range 2-m temperature forecasts by combining a state-of-the-art nowcasting method and a limited area ensemble system, and compares the results with statistical methods. The Integrated Nowcasting Through Comprehensive Analysis (INCA) system, which has been in operation at the Central Institute for Meteorology and Geodynamics (ZAMG) since 2006 (Haiden et al., 2011), provides short range deterministic forecasts at high temporal (15 min–60 min) and spatial (1 km) resolution. An INCA Ensemble (INCA-EPS) of 2-m temperature forecasts is constructed by applying a dynamical approach, a statistical approach, and a combined dynamic-statistical method. The dynamical method takes uncertainty information (i.e. ensemble variance) from the operational limited area ensemble system ALADIN-LAEF (Aire Limitée Adaptation Dynamique Développement InterNational Limited Area Ensemble Forecasting), which is running operationally at ZAMG (Wang et al., 2011). The purely statistical method assumes a well-calibrated spread-skill relation and applies ensemble spread according to the skill of the INCA forecast of the most recent past. The combined dynamic-statistical approach adapts the ensemble variance gained from ALADIN-LAEF with non-homogeneous Gaussian regression (NGR), which yields a statistical correction of the first and second moment (mean bias and dispersion) for Gaussian distributed continuous
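
    The NGR step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common NGR form in which the predictive distribution is Gaussian with mean a + b * (ensemble mean) and variance c + d * (ensemble variance), scored with the closed-form Gaussian CRPS. The coefficients, the ensemble values, and the function names are hypothetical and are not taken from the paper; in practice the coefficients would be fitted on past forecast-observation pairs.

```python
import math

def ngr_predictive(ens, a, b, c, d):
    """Return (mu, sigma) of the NGR Gaussian: mu = a + b*mean, sigma^2 = c + d*var."""
    n = len(ens)
    mean = sum(ens) / n
    var = sum((x - mean) ** 2 for x in ens) / (n - 1)  # sample variance of the members
    return a + b * mean, math.sqrt(c + d * var)

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of the distribution N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Hypothetical 2-m temperature ensemble (deg C) and illustrative coefficients.
ensemble = [1.2, 1.8, 2.1, 2.4, 3.0]
mu, sigma = ngr_predictive(ensemble, a=-0.3, b=1.0, c=0.25, d=1.5)
score = crps_gaussian(mu, sigma, y=1.5)
print(f"mu={mu:.2f} sigma={sigma:.3f} CRPS={score:.3f}")
```

    A lower CRPS indicates a sharper and better calibrated forecast, which is why it is a common target when fitting the four coefficients.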

  1. DEWEPS - Development and Evaluation of new Wind forecasting tools with an Ensemble Prediction System

    Energy Technology Data Exchange (ETDEWEB)

    Moehrlen, C.; Joergensen, Jess

    2012-02-15

    There is an ongoing trend of increased privatization in the handling of renewable energy. This trend is required to ensure an efficient energy system, where improvements that make economic sense are prioritised. The reason why centralized forecasting can be a challenge in that matter is that the TSOs tend to optimize on physical error rather than cost. Consequently, the market is likely to speculate against the TSO, which in turn increases the cost of balancing. A privatized pool of wind and/or solar power is more difficult to speculate against, because the optimization criterion is unpredictable due to subjective risk considerations that may be taken into account at any time. Although there is an additional level of costs for the trading of the private volume, it can be argued that competition will accelerate efficiency from an economic perspective. The amount of power put into the market will become less predictable when the wind power spot market bid takes place on the basis of a risk consideration in addition to the forecast information itself. The scope of this project is to contribute to more efficient wind power integration targeted both to centralised and decentralised cost efficient IT solutions, which will complement each other in market based energy systems. The DEWEPS project resulted in an extension of the number of Ensemble forecasts, an incremental trade strategy for balancing unpredictable power production, and an IT platform for efficient handling of power generation units. Together, these three elements contribute to less need for reserves, more capacity in the market, and thus more competition. (LN)

  2. Ensemble of classifiers based network intrusion detection system performance bound

    CSIR Research Space (South Africa)

    Mkuzangwe, Nenekazi NP

    2017-11-01

    This paper provides a performance bound of a network intrusion detection system (NIDS) that uses an ensemble of classifiers. Currently researchers rely on implementing the ensemble of classifiers based NIDS before they can determine the performance...

  3. Ensemble-based flash-flood modelling: Taking into account hydrodynamic parameters and initial soil moisture uncertainties

    Science.gov (United States)

    Edouard, Simon; Vincendon, Béatrice; Ducrocq, Véronique

    2018-05-01

    Intense precipitation events in the Mediterranean often lead to devastating flash floods (FF). FF modelling is affected by several kinds of uncertainties, and Hydrological Ensemble Prediction Systems (HEPS) are designed to take those uncertainties into account. The major source of uncertainty comes from rainfall forcing, and convective-scale meteorological ensemble prediction systems can manage it for forecasting purposes. But other sources are related to the hydrological modelling part of the HEPS. This study focuses on the uncertainties arising from the hydrological model parameters and initial soil moisture, with the aim of designing an ensemble-based version of a hydrological model dedicated to simulations of fast-responding Mediterranean rivers, the ISBA-TOP coupled system. The first step consists in identifying the parameters that have the strongest influence on FF simulations by assuming perfect precipitation. A sensitivity study is carried out first using a synthetic framework and then for several real events and several catchments. Perturbation methods varying the most sensitive parameters as well as initial soil moisture allow designing an ensemble-based version of ISBA-TOP. The first results of this system on some real events are presented. The direct perspective of this work will be to drive this ensemble-based version with the members of a convective-scale meteorological ensemble prediction system to design a complete HEPS for FF forecasting.

  4. Predicting artificially drained areas by means of selective model ensemble

    DEFF Research Database (Denmark)

    Møller, Anders Bjørn; Beucher, Amélie; Iversen, Bo Vangsø

    . The approaches employed include decision trees, discriminant analysis, regression models, neural networks and support vector machines amongst others. Several models are trained with each method, using variously the original soil covariates and principal components of the covariates. With a large ensemble...... out since the mid-19th century, and it has been estimated that half of the cultivated area is artificially drained (Olesen, 2009). A number of machine learning approaches can be used to predict artificially drained areas in geographic space. However, instead of choosing the most accurate model....... The study aims firstly to train a large number of models to predict the extent of artificially drained areas using various machine learning approaches. Secondly, the study will develop a method for selecting the models, which give a good prediction of artificially drained areas, when used in conjunction...

  5. Learning to REDUCE: A Reduced Electricity Consumption Prediction Ensemble

    Energy Technology Data Exchange (ETDEWEB)

    Aman, Saima; Chelmis, Charalampos; Prasanna, Viktor

    2016-02-12

    Utilities use Demand Response (DR) to balance supply and demand in the electric grid by involving customers in efforts to reduce electricity consumption during peak periods. To implement and adapt DR under dynamically changing conditions of the grid, reliable prediction of reduced consumption is critical. However, despite the wealth of research on electricity consumption prediction and DR being long in practice, the problem of reduced consumption prediction remains largely unaddressed. In this paper, we identify unique computational challenges associated with the prediction of reduced consumption and contrast them with those of normal consumption and DR baseline prediction. We propose a novel ensemble model that leverages different sequences of daily electricity consumption on DR event days as well as contextual attributes for reduced consumption prediction. We demonstrate the success of our model on a large, real-world, high resolution dataset from a university microgrid comprising over 950 DR events across a diverse set of 32 buildings. Our model achieves an average error of 13.5%, an 8.8% improvement over the baseline. Our work is particularly relevant for buildings where electricity consumption is not tied to strict schedules. Our results and insights should prove useful to the researchers and practitioners working in the sustainable energy domain.

  6. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
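
    As a rough illustration of bootstrap aggregation (bagging) of regression trees, the sketch below bags one-split trees (stumps) on a binary outcome, so the averaged prediction can be read as an estimated probability. The ages, outcomes, and function names are hypothetical; this is a simplified stand-in and not the tree-growing procedure or the cohort data used in the study.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # all x identical in this sample: predict the overall mean
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    thr, lm, rm = stump
    return lm if x <= thr else rm

def bagged_stumps(xs, ys, n_models, rng):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of all bootstrap models."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Hypothetical patient ages and 30-day outcomes (1 = died, 0 = survived).
ages = [45, 52, 58, 63, 67, 71, 76, 80, 84, 90]
died = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
models = bagged_stumps(ages, died, n_models=50, rng=random.Random(7))
print(round(predict_bagged(models, 60), 2), round(predict_bagged(models, 85), 2))
```

    Averaging over resamples smooths the hard step of a single tree, which is the usual motivation for bagging unstable learners.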

  7. Potentialities of ensemble strategies for flood forecasting over the Milano urban area

    Science.gov (United States)

    Ravazzani, Giovanni; Amengual, Arnau; Ceppi, Alessandro; Homar, Víctor; Romero, Romu; Lombardi, Gabriele; Mancini, Marco

    2016-08-01

    Analysis of ensemble forecasting strategies, which can provide a tangible backing for flood early warning procedures and mitigation measures over the Mediterranean region, is one of the fundamental motivations of the international HyMeX programme. Here, we examine two severe hydrometeorological episodes that affected the Milano urban area and for which the complex flood protection system of the city did not completely succeed. Indeed, flood damage has increased exponentially during the last 60 years, due to industrial and urban developments. Thus, the improvement of the Milano flood control system needs a synergism between structural and non-structural approaches. First, we examine how land-use changes due to urban development have altered the hydrological response to intense rainfalls. Second, we test a flood forecasting system which comprises the Flash-flood Event-based Spatially distributed rainfall-runoff Transformation, including Water Balance (FEST-WB) and the Weather Research and Forecasting (WRF) models. Accurate forecasts of deep moist convection and extreme precipitation are difficult to produce because of uncertainties arising from the numerical weather prediction (NWP) physical parameterizations and high sensitivity to misrepresentation of the atmospheric state; however, two hydrological ensemble prediction systems (HEPS) have been designed to explicitly cope with uncertainties in the initial and lateral boundary conditions (IC/LBCs) and physical parameterizations of the NWP model. No substantial differences in skill have been found between the two ensemble strategies when considering an enhanced diversity of IC/LBCs for the perturbed initial conditions ensemble. Furthermore, no additional benefits have been found by considering more frequent LBCs in a mixed physics ensemble, as ensemble spread seems to be reduced. These findings could help to design the most appropriate ensemble strategies before these hydrometeorological extremes, given the computational

  8. A Unified Air-Sea Interface in Fully Coupled Atmosphere-Wave-Ocean Models for Data Assimilation and Ensemble Prediction

    Science.gov (United States)

    Chen, Shuyi; Curcic, Milan; Donelan, Mark; Campbell, Tim; Smith, Travis; Chen, Sue; Allard, Rick; Michalakes, John

    2014-05-01

    The goals of this study are to 1) better understand the physical processes controlling air-sea interaction and their impact on coastal marine and storm predictions, 2) explore the use of coupled atmosphere-ocean observations in model verification and data assimilation, and 3) develop a physically based and computationally efficient coupling at the air-sea interface that is flexible for use in a multi-model system and portable for transition to the next generation research and operational coupled atmosphere-wave-ocean-land models. We have developed a unified air-sea interface module that couples multiple atmosphere, wave, and ocean models using the Earth System Modeling Framework (ESMF). This standardized coupling framework allows researchers to develop and test air-sea coupling parameterizations and coupled data assimilation, and to better facilitate research-to-operation activities. It also allows for future ensemble forecasts using coupled models that can be used for coupled data assimilation and assessment of uncertainties in coupled model predictions. The current component models include two atmospheric models (WRF and COAMPS), two ocean models (HYCOM and NCOM), and two wave models (UMWM and SWAN). The coupled modeling systems have been tested and evaluated using the coupled air-sea observations (e.g., GPS dropsondes and AXBTs, drifters and floats) collected in recent field campaigns in the Gulf of Mexico and tropical cyclones in the Atlantic and Pacific basins. This talk will provide an overview of the unified air-sea interface model and fully coupled atmosphere-wave-ocean model predictions over various coastal regions and tropical cyclones in the Pacific and Atlantic basins including an example from coupled ensemble prediction of Superstorm Sandy (2012).

  9. A comparison of the performance of the 3-D super-ensemble and an ensemble Kalman filter for short-range regional ocean prediction

    Directory of Open Access Journals (Sweden)

    Baptiste Mourre

    2014-01-01

    This study compares the ability of two approaches integrating models and data to forecast the Ligurian Sea regional oceanographic conditions in the short-term range (0–72 hours) when constrained by a common observation dataset. The post-processing 3-D super-ensemble (3DSE) algorithm, which uses observations to optimally combine multi-model forecasts into a single prediction of the oceanic variable, is first considered. The 3DSE predictive skills are compared to those of the Regional Ocean Modeling System model in which observations are assimilated through a more conventional ensemble Kalman filter (EnKF) approach. Assimilated measurements include sea surface temperature maps, and temperature and salinity subsurface observations from a fleet of five underwater gliders. Retrospective analyses are carried out to produce daily predictions during the 11-d period of the REP10 sea trial experiment. The forecast skill evaluation based on a distributed multi-sensor validation dataset indicates an overall superior performance of the EnKF, both at the surface and at depth. While the 3DSE and EnKF perform comparably well in the area spanned by the incorporated measurements, the 3DSE accuracy is found to rapidly decrease outside this area. In particular, the univariate formulation of the method combined with the absence of regular surface salinity measurements produces large errors in the 3DSE salinity forecast. On the contrary, the EnKF leads to more homogeneous forecast errors over the modelling domain for both temperature and salinity. The EnKF is found to consistently improve the predictions with respect to the control solution without assimilation and to be positively skilled when compared to the climatological estimate. For typical regional oceanographic applications with scarce subsurface observations, the lack of physical spatial and multivariate error covariances applicable to the individual model weights in the 3DSE formulation constitutes a major

  10. Uncertainty analysis of neural network based flood forecasting models: An ensemble based approach for constructing prediction interval

    Science.gov (United States)

    Kasiviswanathan, K.; Sudheer, K.

    2013-05-01

    Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists, owing to their potential for accurate prediction of flood flows as compared to conceptual or physics based hydrologic models. The ANN approximates the non-linear functional relationship between the complex hydrologic variables in arriving at the river flow forecast values. Despite a large number of applications, there is still some criticism that the ANN's point prediction lacks reliability because the uncertainty of predictions is not quantified, which limits its use in practical applications. A major concern in application of traditional uncertainty analysis techniques on neural network framework is its parallel computing architecture with large degrees of freedom, which makes the uncertainty assessment a challenging task. Very limited studies have considered assessment of predictive uncertainty of ANN based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: at stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain an optimal set of weights and biases, and during stage 2, the optimal variability of ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the 2nd stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) maximum number of measured data points falling within the estimated prediction interval, and (iii) minimum width of the prediction interval. The method is illustrated using a real world case study of an Indian basin. The method was able to produce an ensemble that has an average prediction interval width of 23.03 m³/s, with 97.17% of the total validation data points (measured) lying within the interval. The derived
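
    The two interval statistics quoted in this abstract (the share of measured points inside the prediction interval and the average interval width) can be computed as in the sketch below. The interval is taken here from nearest-rank empirical quantiles of the ensemble members, which is an assumption made for illustration; the flows and function names are hypothetical and do not come from the study.

```python
import math

def prediction_interval(members, lower_q=0.025, upper_q=0.975):
    """Empirical interval from the ensemble members using nearest-rank quantiles."""
    s = sorted(members)
    n = len(s)
    lo = s[max(0, math.ceil(lower_q * n) - 1)]
    hi = s[min(n - 1, math.ceil(upper_q * n) - 1)]
    return lo, hi

def interval_scores(ensembles, observations):
    """Return (coverage in percent, average interval width) over all time steps."""
    intervals = [prediction_interval(m) for m in ensembles]
    inside = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, observations))
    coverage = 100.0 * inside / len(observations)
    avg_width = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return coverage, avg_width

# Hypothetical ensemble flow forecasts (m^3/s) for four time steps, with observed flows.
ensembles = [
    [100, 110, 120, 130],
    [200, 220, 240, 260],
    [50, 55, 60, 65],
    [300, 310, 330, 350],
]
observed = [115, 270, 52, 340]
coverage, avg_width = interval_scores(ensembles, observed)
print(coverage, avg_width)
```

    The two numbers pull in opposite directions: widening every interval raises coverage but also raises the average width, which is why the abstract treats them as separate objectives.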

  11. Ensemble-based Kalman Filters in Strongly Nonlinear Dynamics

    Institute of Scientific and Technical Information of China (English)

    Zhaoxia PU; Joshua HACKER

    2009-01-01

    This study examines the effectiveness of ensemble Kalman filters in data assimilation with the strongly nonlinear dynamics of the Lorenz-63 model, and in particular their use in predicting the regime transition that occurs when the model jumps from one basin of attraction to the other. Four configurations of the ensemble-based Kalman filtering data assimilation techniques, namely the ensemble Kalman filter, ensemble adjustment Kalman filter, ensemble square root filter and ensemble transform Kalman filter, are evaluated on their ability to predict the regime transition (also called phase transition) and are compared in terms of their sensitivity to both observational and sampling errors. The sensitivity of each ensemble-based filter to the size of the ensemble is also examined.
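
    For orientation, a minimal sketch of one forecast-analysis cycle with the Lorenz-63 model follows. It uses the stochastic (perturbed-observation) form of the ensemble Kalman filter with a single observation of x, forward-Euler time stepping, and the classic parameter values sigma = 10, rho = 28, beta = 8/3. The ensemble size, observation error variance, step counts, and function names are assumptions chosen for illustration; the other three filter variants named in the abstract are not shown.

```python
import random

def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 state (x, y, z) by one forward-Euler step."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def enkf_update_x(ensemble, obs, obs_var, rng):
    """Perturbed-observation EnKF analysis using one scalar observation of x."""
    n = len(ensemble)
    means = [sum(m[k] for m in ensemble) / n for k in range(3)]
    # Sample covariance of each state component with the observed variable x.
    cov_with_x = [
        sum((m[k] - means[k]) * (m[0] - means[0]) for m in ensemble) / (n - 1)
        for k in range(3)
    ]
    # Kalman gain for a scalar observation: cov(state, x) / (var(x) + obs_var).
    gain = [c / (cov_with_x[0] + obs_var) for c in cov_with_x]
    analysis = []
    for m in ensemble:
        innovation = obs + rng.gauss(0.0, obs_var ** 0.5) - m[0]
        analysis.append(tuple(m[k] + gain[k] * innovation for k in range(3)))
    return analysis

rng = random.Random(0)
truth = (1.0, 1.0, 20.0)
ensemble = [tuple(v + rng.gauss(0.0, 1.0) for v in truth) for _ in range(20)]
for _ in range(25):  # forecast step: propagate the truth and every member
    truth = lorenz63_step(truth)
    ensemble = [lorenz63_step(m) for m in ensemble]
obs_var = 0.5
observation = truth[0] + rng.gauss(0.0, obs_var ** 0.5)
ensemble = enkf_update_x(ensemble, observation, obs_var, rng)  # analysis step
print(sum(m[0] for m in ensemble) / len(ensemble), truth[0])
```

    Because the gain is built from sample covariances, the unobserved components y and z are also corrected, and the quality of that correction depends on ensemble size, which is the sampling-error sensitivity the abstract refers to.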

  12. On the incidence of meteorological and hydrological processors: Effect of resolution, sharpness and reliability of hydrological ensemble forecasts

    Science.gov (United States)

    Abaza, Mabrouk; Anctil, François; Fortin, Vincent; Perreault, Luc

    2017-12-01

    Meteorological and hydrological ensemble prediction systems are imperfect. Their outputs could often be improved through the use of a statistical processor, opening up the question of the necessity of using both processors (meteorological and hydrological), only one of them, or none. This experiment compares the predictive distributions from four hydrological ensemble prediction systems (H-EPS) utilising the Ensemble Kalman filter (EnKF) probabilistic sequential data assimilation scheme. They differ in the inclusion or not of the Distribution Based Scaling (DBS) method for post-processing meteorological forecasts and the ensemble Bayesian Model Averaging (ensemble BMA) method for hydrological forecast post-processing. The experiment is implemented on three large watersheds and relies on the combination of two meteorological reforecast products: the 4-member Canadian reforecasts from the Canadian Centre for Meteorological and Environmental Prediction (CCMEP) and the 10-member American reforecasts from the National Oceanic and Atmospheric Administration (NOAA), leading to 14 members at each time step. Results show that all four tested H-EPS lead to resolution and sharpness values that are quite similar, with an advantage to DBS + EnKF. The ensemble BMA is unable to compensate for any bias left in the precipitation ensemble forecasts. On the other hand, it succeeds in calibrating ensemble members that are otherwise under-dispersed. If reliability is preferred over resolution and sharpness, DBS + EnKF + ensemble BMA performs best, making use of both processors in the H-EPS system. Conversely, for enhanced resolution and sharpness, DBS is the preferred method.

  13. Ensemble approach combining multiple methods improves human transcription start site prediction

    LENUS (Irish Health Repository)

    Dineen, David G

    2010-11-30

    Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and result in different prediction sets. Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.

  14. Stochastic Prediction of Wind Generating Resources Using the Enhanced Ensemble Model for Jeju Island’s Wind Farms in South Korea

    OpenAIRE

    Deockho Kim; Jin Hur

    2017-01-01

    Due to the intermittency of wind power generation, it is very hard to manage its system operation and planning. In order to incorporate higher wind power penetrations into power systems that maintain secure and economic power system operation, an accurate and efficient estimation of wind power outputs is needed. In this paper, we propose the stochastic prediction of wind generating resources using an enhanced ensemble model for Jeju Island’s wind farms in South Korea. When selecting the poten...

  15. Experimental real-time multi-model ensemble (MME) prediction of ...

    Indian Academy of Sciences (India)

    calibration (training) has to be of good quality. Otherwise, it might degrade the MME results. Early works by ... ECMWF ensemble data (Evans et al 2000), and they showed the superiority of the multi-model system over the ..... eral idea of the quality of rainfall forecasts in terms of error statistics for monsoon for the member.

  16. Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting

    Directory of Open Access Journals (Sweden)

    Federico Divina

    2018-04-01

    The ability to predict short-term electric energy demand would provide several benefits, both at the economic and environmental level. For example, it would allow for an efficient use of resources in order to face the actual demand, reducing the costs associated with production as well as the emission of CO2. To this aim, in this paper we propose a strategy based on ensemble learning in order to tackle the short-term load forecasting problem. In particular, our approach is based on a stacking ensemble learning scheme, where the predictions produced by three base learning methods are used by a top level method in order to produce final predictions. We tested the proposed scheme on a dataset reporting the energy consumption in Spain over more than nine years. The obtained experimental results show that an approach for short-term electricity consumption forecasting based on ensemble learning can help in combining predictions produced by weaker learning methods in order to obtain superior results. In particular, the system produces a lower error with respect to the existing state-of-the-art techniques used on the same dataset. More importantly, this case study has shown that using an ensemble scheme can achieve very accurate predictions, and thus that it is a suitable approach for addressing the short-term load forecasting problem.
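
    The stacking scheme described here (three base forecasts combined by a top-level method) can be sketched as follows. The three base learners (last value, mean of the last three values, value one period earlier), the linear meta-learner without intercept, the synthetic hourly load series, and all function names are assumptions chosen to keep the example self-contained; they are not the learners or the Spanish dataset used in the paper.

```python
import math
import random

def base_predictions(series, t, period=24):
    """Three simple base forecasts for time t, using only values before t."""
    last = series[t - 1]
    recent_mean = sum(series[t - 3:t]) / 3.0
    seasonal = series[t - period]
    return [last, recent_mean, seasonal]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_meta(features, targets, ridge=1e-6):
    """Least-squares combination weights (small ridge term for numerical stability)."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, targets)) for i in range(k)]
    return solve(xtx, xty)

def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

# Synthetic hourly load with a daily cycle (hypothetical data).
rng = random.Random(42)
load = [100 + 20 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 2) for t in range(240)]
train_t, test_t = range(24, 200), range(200, 240)

# Level 1: train the meta-learner on the base forecasts of the training period.
weights = fit_meta([base_predictions(load, t) for t in train_t],
                   [load[t] for t in train_t])

# Level 2: combine the base forecasts on the held-out period.
stacked = [sum(w * p for w, p in zip(weights, base_predictions(load, t))) for t in test_t]
actual = [load[t] for t in test_t]
for name, k in (("last", 0), ("recent mean", 1), ("seasonal", 2)):
    print(name, round(mae([base_predictions(load, t)[k] for t in test_t], actual), 2))
print("stacked", round(mae(stacked, actual), 2))
```

    The meta-learner is fitted only on forecasts made for past time steps, so the held-out comparison is not contaminated by the data used to choose the weights.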

  17. HIGH-RESOLUTION ATMOSPHERIC ENSEMBLE MODELING AT SRNL

    Energy Technology Data Exchange (ETDEWEB)

    Buckley, R.; Werth, D.; Chiswell, S.; Etherton, B.

    2011-05-10

    The High-Resolution Mid-Atlantic Forecasting Ensemble (HME) is a federated effort to improve operational forecasts related to precipitation, convection and boundary layer evolution, and fire weather utilizing data and computing resources from a diverse group of cooperating institutions in order to create a mesoscale ensemble from independent members. Collaborating organizations involved in the project include universities, National Weather Service offices, and national laboratories, including the Savannah River National Laboratory (SRNL). The ensemble system is produced from an overlapping numerical weather prediction model domain and parameter subsets provided by each contributing member. The coordination, synthesis, and dissemination of the ensemble information are performed by the Renaissance Computing Institute (RENCI) at the University of North Carolina-Chapel Hill. This paper discusses background related to the HME effort, SRNL participation, and example results available from the RENCI website.

  18. Verification of Global Radiation Forecasts from the Ensemble Prediction System at DMI

    DEFF Research Database (Denmark)

    Lundholm, Sisse Camilla

    To comply with an increasing demand for sustainable energy sources, a solar heating unit is being developed at the Technical University of Denmark. To make optimal use, environmentally and economically, this heating unit is equipped with an intelligent control system using forecasts of the heat...... consumption of the house and the amount of available solar energy. In order to make the most of this solar heating unit, accurate forecasts of the available solar radiation are essential. However, because of its sensitivity to local meteorological conditions, the solar radiation received at the surface...... of the Earth can be highly fluctuating and challenging to forecast accurately. To comply with the accuracy requirements to forecasts of both global, direct, and diffuse radiation, the uncertainty of these forecasts is of interest. Forecast uncertainties can become accessible by running an ensemble of forecasts...

  19. Quantum Control of Open Systems and Dense Atomic Ensembles

    Science.gov (United States)

    DiLoreto, Christopher

    . This effect motivates the need for using multi-directional basis sets in theoretical analysis of dense quantum systems. My results demonstrate the shortcomings of short-pulse techniques used in many recent studies. Based on my numerical studies, I hypothesize that the dense ensemble can be modelled by an effective single quantum system that has a decoherence rate that changes over time. My effective single particle model provides a way in which computational time can be reduced, and also a model in which the underlying physical processes involved in the system's evolution are much easier to understand. I then use this model to provide an elegant theoretical explanation for an unusual experimental result called "transverse optical magnetism". My effective single particle model's predictions match very well with experimental data.

  20. Ensemble Linear Neighborhood Propagation for Predicting Subchloroplast Localization of Multi-Location Proteins.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2016-12-02

    In the postgenomic era, the number of unreviewed protein sequences is remarkably larger and grows tremendously faster than that of reviewed ones. However, existing methods for protein subchloroplast localization often ignore the information from these unlabeled proteins. This paper proposes a multi-label predictor based on ensemble linear neighborhood propagation (LNP), namely, LNP-Chlo, which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins for predicting localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers' convenience, the online Web server LNP-Chlo is freely available at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/ .

  1. An ensemble prediction approach to weekly Dengue cases forecasting based on climatic and terrain conditions

    Directory of Open Access Journals (Sweden)

    Sougata Deb

    2017-11-01

    Introduction: Dengue fever has been one of the most concerning endemic diseases of recent times. Every year, 50-100 million people are infected by the dengue virus across the world. Historically, it has been most prevalent in Southeast Asia and the Pacific Islands. In recent years, frequent dengue epidemics have started occurring in Latin America as well. This study focused on assessing the impact of different short- and long-term lagged climatic predictors on dengue cases. Additionally, it assessed the impact of building an ensemble model, using multiple time series and regression models, on prediction accuracy. Materials and Methods: Experimental data were based on two Latin American cities, viz. San Juan (Puerto Rico) and Iquitos (Peru). Due to weather and geographic differences, San Juan recorded higher dengue incidences than Iquitos. Using lagged cross-correlations, this study confirmed the impact of temperature and vegetation on the number of dengue cases for both cities, though in varied degrees and time lags. An ensemble of multiple predictive models using an elaborate set of derived predictors was built and validated. Results: The proposed ensemble prediction achieved a mean absolute error of 21.55, 4.26 points lower than the 25.81 obtained by a standard negative binomial model. Changes in climatic conditions and urbanization were found to be strong predictors, as established empirically in other studies. Some of the predictors were new and informative and have not been explored in any other relevant studies yet. Discussion and Conclusions: Two original contributions were made in this research: first, a focused and extensive feature engineering aligned with the mosquito lifecycle; second, a novel covariate pattern-matching based prediction approach using the past time series trend of the predictor variables. Increased accuracy of the proposed model over the benchmark model proved the appropriateness of the analytical approach
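
    The error measure quoted in this record, and the idea of combining several models into one prediction, can be illustrated with a minimal sketch. The case counts and the two member models below are invented for illustration, and plain averaging is an assumption: the abstract does not say how the members were combined.

```python
from statistics import fmean

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return fmean(abs(o - p) for o, p in zip(observed, predicted))

def ensemble_mean(member_predictions):
    """Average the members' predictions week by week (equal weights assumed)."""
    return [fmean(values) for values in zip(*member_predictions)]

# Hypothetical weekly case counts and two hypothetical member models.
observed = [10, 20, 30, 40]
model_a = [14, 18, 36, 34]
model_b = [8, 26, 28, 44]

combined = ensemble_mean([model_a, model_b])
mae_a = mean_absolute_error(observed, model_a)          # 4.5
mae_b = mean_absolute_error(observed, model_b)          # 3.5
mae_combined = mean_absolute_error(observed, combined)  # 1.5
```

    In this toy case the members err in opposite directions, so their average has a lower error than either member. That is the effect an ensemble relies on; it is not guaranteed for arbitrary members.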

  2. Finding diversity for building one-day ahead Hydrological Ensemble Prediction System based on artificial neural network stacks

    Science.gov (United States)

    Brochero, Darwin; Anctil, Francois; Gagné, Christian; López, Karol

    2013-04-01

    In this study, we addressed the application of Artificial Neural Networks (ANN) in the context of Hydrological Ensemble Prediction Systems (HEPS). Such systems have become popular in the past years as a tool to include the forecast uncertainty in the decision making process. HEPS considers fundamentally the uncertainty cascade model [4] for uncertainty representation. Analogously, the machine learning community has proposed models of multiple classifier systems that take into account the variability in datasets, input space, model structures, and parametric configuration [3]. This approach is based primarily on the well-known "no free lunch theorem" [1]. Consequently, we propose a framework based on two separate but complementary topics: data stratification and input variable selection (IVS). Thus, we promote an ANN prediction stack in which each predictor is trained based on input spaces defined by the IVS application on different stratified sub-samples. All this, added to the inherent variability of classical ANN optimization, leads us to our ultimate goal: diversity in the prediction, defined as the complementarity of the individual predictors. The stratification application on the 12 basins used in this study, which originate from the second and third workshop of the MOPEX project [2], shows that the informativeness of the data is far more important than the quantity used for ANN training. Additionally, the input space variability leads to ANN stacks that outperform an ANN stack model trained with 100% of the available information but with a random selection of dataset used in the early stopping method (scenario R100P). The results show that from a deterministic view, the main advantage focuses on the efficient selection of the training information, which is an equally important concept for the calibration of conceptual hydrological models. On the other hand, the diversity achieved is reflected in a substantial improvement in the scores that define the

  3. Universal critical wrapping probabilities in the canonical ensemble

    Directory of Open Access Journals (Sweden)

    Hao Hu

    2015-09-01

    Universal dimensionless quantities, such as Binder ratios and wrapping probabilities, play an important role in the study of critical phenomena. We study the finite-size scaling behavior of the wrapping probability for the Potts model in the random-cluster representation, under the constraint that the total number of occupied bonds is fixed, so that the canonical ensemble applies. We derive that, in the limit L→∞, the critical values of the wrapping probability are different from those of the unconstrained model, i.e. the model in the grand-canonical ensemble, but still universal, for systems with 2y_t − d > 0, where y_t = 1/ν is the thermal renormalization exponent and d is the spatial dimension. Similar modifications apply to other dimensionless quantities, such as Binder ratios. For systems with 2y_t − d ≤ 0, these quantities share the same critical universal values in the two ensembles. It is also derived that new finite-size corrections are induced. These findings apply more generally to systems in the canonical ensemble, e.g. the dilute Potts model with a fixed total number of vacancies. Finally, we formulate an efficient cluster-type algorithm for the canonical ensemble, and confirm these predictions by extensive simulations.

  4. Development of web-based services for an ensemble flood forecasting and risk assessment system

    Science.gov (United States)

    Yaw Manful, Desmond; He, Yi; Cloke, Hannah; Pappenberger, Florian; Li, Zhijia; Wetterhall, Fredrik; Huang, Yingchun; Hu, Yuzhong

    2010-05-01

    Flooding is a widespread and devastating natural disaster worldwide. Floods that took place in the last decade in China were ranked the worst amongst recorded floods worldwide in terms of the number of human fatalities and economic losses (Munich Re-Insurance). Rapid economic development and population expansion into low-lying flood plains have worsened the situation. Current conventional flood prediction systems in China are suited neither to the perceptible climate variability nor to the rapid pace of urbanization sweeping the country. Flood prediction, from short-term (a few hours) to medium-term (a few days), needs to be revisited and adapted to changing socio-economic and hydro-climatic realities. The latest technology requires implementation of multiple numerical weather prediction systems. The availability of twelve global ensemble weather prediction systems through the 'THORPEX Interactive Grand Global Ensemble' (TIGGE) offers a good opportunity for an effective state-of-the-art early forecasting system. A prototype of a Novel Flood Early Warning System (NEWS) using the TIGGE database is tested in the Huai River basin in east-central China. It is the first early flood warning system in China that uses the massive TIGGE database cascaded with river catchment models, the Xinanjiang hydrologic model and a 1-D hydraulic model, to predict river discharge and flood inundation. The NEWS algorithm is also designed to provide web-based services to a broad spectrum of end-users. The latter presents challenges, as both databases and proprietary codes reside in different locations and converge at dissimilar times. NEWS will thus make use of a ready-to-run grid system that makes distributed computing and data resources available in a seamless and secure way. The ability to run on different operating systems and to provide an interface or front end that is accessible to a broad spectrum of end-users is an additional requirement. The aim is to achieve robust interoperability

  5. Sub-Ensemble Coastal Flood Forecasting: A Case Study of Hurricane Sandy

    Directory of Open Access Journals (Sweden)

    Justin A. Schulte

    2017-12-01

    In this paper, it is proposed that coastal flood ensemble forecasts be partitioned into sub-ensemble forecasts using cluster analysis in order to produce representative statistics and to measure forecast uncertainty arising from the presence of clusters. After clustering the ensemble members, the ability to predict the cluster into which the observation will fall can be measured using a cluster skill score. Additional sub-ensemble and composite skill scores are proposed for assessing the forecast skill of a clustered ensemble forecast. A recently proposed method for statistically increasing the number of ensemble members is used to improve sub-ensemble probabilistic estimates. Through the application of the proposed methodology to Sandy coastal flood reforecasts, it is demonstrated that statistics computed using only ensemble members belonging to a specific cluster are more representative than those computed using all ensemble members simultaneously. A cluster skill-cluster uncertainty index relationship is identified, which is the cluster analog of the documented spread-skill relationship. Two sub-ensemble skill scores are shown to be positively correlated with cluster forecast skill, suggesting that skillfully forecasting the cluster into which the observation will fall is important to overall forecast skill. The identified relationships also suggest that the number of ensemble members within each cluster can be used as guidance for assessing the potential for forecast error. The inevitable existence of ensemble member clusters in tidally dominated total water level prediction systems suggests that clustering is a necessary post-processing step for producing representative and skillful total water level forecasts.
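
    The partitioning idea in this record can be sketched in a few lines. The abstract does not name the clustering algorithm, so the largest-gap split below is a deliberately simple stand-in, and the peak water levels are invented for illustration.

```python
from statistics import fmean

def split_at_largest_gap(values):
    """Split 1-D values into two clusters at the widest gap between sorted neighbours."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two ensemble members")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

def nearest_cluster(clusters, observation):
    """Index of the cluster whose mean is closest to the observation."""
    means = [fmean(c) for c in clusters]
    return min(range(len(clusters)), key=lambda i: abs(means[i] - observation))

# Hypothetical peak total water levels (m) from seven ensemble members.
peak_levels = [1.9, 2.1, 2.0, 3.4, 3.6, 2.2, 3.5]
low, high = split_at_largest_gap(peak_levels)

full_mean = fmean(peak_levels)                       # lies between the two clusters
observed_cluster = nearest_cluster([low, high], 3.45)
```

    Here the full-ensemble mean falls in the empty gap between the two groups, a value that no member forecast, whereas each cluster mean describes a scenario the ensemble actually produced. This is the sense in which per-cluster statistics are called more representative.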

  6. Real-Time Ensemble Forecasting of Coronal Mass Ejections Using the WSA-ENLIL+Cone Model

    Science.gov (United States)

    Mays, M. L.; Taktakishvili, A.; Pulkkinen, A. A.; Odstrcil, D.; MacNeice, P. J.; Rastaetter, L.; LaSota, J. A.

    2014-12-01

    complete a parametric event case study of the sensitivity of the CME arrival time prediction to free parameters for ambient solar wind model and CME. The parameter sensitivity study suggests future directions for the system, such as running ensembles using various magnetogram inputs to the WSA model.

  7. The influence of the new ECMWF Ensemble Prediction System resolution on wind power forecast accuracy and uncertainty estimation

    DEFF Research Database (Denmark)

    Alessandrini, S.; Pinson, Pierre; Sperati, S.

    2011-01-01

    The importance of wind power forecasting (WPF) is nowadays commonly recognized because it represents a useful tool to reduce problems of grid integration and to facilitate energy trading. If on one side the prediction accuracy is fundamental to these scopes, on the other it has become also clear...... by a recalibration procedure that allowed obtaining a more uniform distribution among the 51 intervals, making the ensemble spread large enough to include the observations. After that it was observed that the EPS power spread seemed to have enough correlation with the error calculated on the deterministic forecast...

  8. On the v-representability of ensemble densities of electron systems

    Science.gov (United States)

    Gonis, A.; Däne, M.

    2018-05-01

    Analogously to the case at zero temperature, where the density of the ground state of an interacting many-particle system determines uniquely (within an arbitrary additive constant) the external potential acting on the system, the thermal average of the density over an ensemble defined by the Boltzmann distribution at the minimum of the thermodynamic potential, or the free energy, determines the external potential uniquely (and not just modulo a constant) acting on a system described by this thermodynamic potential or free energy. The paper describes a formal procedure that generates the domain of a constrained search over general ensembles (at zero or elevated temperatures) that lead to a given density, including as a special case a density thermally averaged at a given temperature, and in the case of a v-representable density determines the external potential leading to the ensemble density. As an immediate consequence of the general formalism, the concept of v-representability is extended beyond the hitherto discussed case of ground state densities to encompass excited states as well. Specific application to thermally averaged densities solves the v-representability problem in connection with the Mermin functional in a manner analogous to that in which this problem was recently settled with respect to the Hohenberg and Kohn functional. The main formalism is illustrated with numerical results for ensembles of one-dimensional, non-interacting systems of particles under a harmonic potential.

  9. Problems of a Statistical Ensemble Theory for Systems Far from Equilibrium

    Science.gov (United States)

    Ebeling, Werner

    The development of a general statistical physics of nonequilibrium systems was one of the main unfinished tasks of statistical physics of the 20th century. The aim of this work is the study of a special class of nonequilibrium systems where the formulation of an ensemble theory of some generality is possible. These are the so-called canonical-dissipative systems, where the driving terms are determined by invariants of motion. We construct canonical-dissipative systems which are ergodic on certain surfaces on the phase plane. These systems may be described by a non-equilibrium microcanonical ensemble, corresponding to an equal distribution on the target surface. Next we construct and solve Fokker-Planck equations; this leads to a kind of canonical-dissipative ensemble. In the last part we discuss the theoretical problem of how to define bifurcations in the framework of nonequilibrium statistics, and several possible applications.

  10. On the forecast skill of a convection-permitting ensemble

    Science.gov (United States)

    Schellander-Gorgas, Theresa; Wang, Yong; Meier, Florian; Weidle, Florian; Wittmann, Christoph; Kann, Alexander

    2017-01-01

    The 2.5 km convection-permitting (CP) ensemble AROME-EPS (Applications of Research to Operations at Mesoscale - Ensemble Prediction System) is evaluated by comparison with the regional 11 km ensemble ALADIN-LAEF (Aire Limitée Adaption dynamique Développement InterNational - Limited Area Ensemble Forecasting) to show whether a benefit is provided by a CP EPS. The evaluation focuses on the abilities of the ensembles to quantitatively predict precipitation during a 3-month convective summer period over areas consisting of mountains and lowlands. The statistical verification uses surface observations and 1 km × 1 km precipitation analyses, and the verification scores involve state-of-the-art statistical measures for deterministic and probabilistic forecasts as well as novel spatial verification methods. The results show that the convection-permitting ensemble with higher-resolution AROME-EPS outperforms its mesoscale counterpart ALADIN-LAEF for precipitation forecasts. The positive impact is larger for the mountainous areas than for the lowlands. In particular, the diurnal precipitation cycle is improved in AROME-EPS, which leads to a significant improvement of scores at the concerned times of day (up to approximately one-third of the scored verification measure). Moreover, there are advantages for higher precipitation thresholds at small spatial scales, which are due to the improved simulation of the spatial structure of precipitation.

  11. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    Science.gov (United States)

    Galelli, S.; Castelletti, A.

    2013-07-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfit of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and (iii) allows the relative importance of the input variables to be inferred, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variables it provides can be given a physically meaningful interpretation.

  12. Ensembl 2004.

    Science.gov (United States)

    Birney, E; Andrews, D; Bevan, P; Caccamo, M; Cameron, G; Chen, Y; Clarke, L; Coates, G; Cox, T; Cuff, J; Curwen, V; Cutts, T; Down, T; Durbin, R; Eyras, E; Fernandez-Suarez, X M; Gane, P; Gibbins, B; Gilbert, J; Hammond, M; Hotz, H; Iyer, V; Kahari, A; Jekosch, K; Kasprzyk, A; Keefe, D; Keenan, S; Lehvaslaiho, H; McVicker, G; Melsopp, C; Meidl, P; Mongin, E; Pettett, R; Potter, S; Proctor, G; Rae, M; Searle, S; Slater, G; Smedley, D; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Storey, R; Ureta-Vidal, A; Woodwark, C; Clamp, M; Hubbard, T

    2004-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage and visualization and installations exist around the world both in companies and at academic sites. With a total of nine genome sequences available from Ensembl and more genomes to follow, recent developments have focused mainly on closer integration between genomes and external data.

  13. The use of different ensemble forecasting systems for wind power prediction on a real case in the South of Italy

    DEFF Research Database (Denmark)

    Alessandrini, Stefano; Sperati, Simone; Pinson, Pierre

    2012-01-01

    Short-term forecasting applied to wind energy is becoming increasingly important due to the constant growth of this renewable source, whose uncertainty requires a constant effort to meet the needs of the national electrical systems and their operators. In this regard, the probabilistic approach...... calibration performed on the wind speed EPS members allows an improvement from an over-confident situation observable from the rank histograms (in which the measurements almost always fell outside the bounds of the probability distribution) to a consistent ensemble spread. After that it is possible to convert...... the data to wind energy: the spread calculated on wind power can then be used as an accuracy predictor due to its level of correlation with the deterministic WPF error. In this presentation we investigate the performances for both wind power and accuracy prediction of the new EPS used at the ECMWF, whose...
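
    The rank histogram mentioned in this record can be computed as in the sketch below, which uses invented three-member wind speed forecasts. For each forecast case, the observation's rank is the number of members below it. An over-confident (under-dispersive) ensemble piles observations into the first and last bins. Ties between members and observations are ignored here for brevity.

```python
def rank_histogram(ensembles, observations):
    """Count, for each case, how many members fall below the observation."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)  # n members define n + 1 rank bins
    for members, obs in zip(ensembles, observations):
        if len(members) != n_members:
            raise ValueError("all cases must have the same number of members")
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts

# Hypothetical wind speed forecasts (m/s), three members per case.
ensembles = [
    [5.0, 6.0, 7.0],
    [4.0, 5.0, 6.0],
    [8.0, 9.0, 10.0],
    [3.0, 4.0, 5.0],
]
observations = [9.0, 2.0, 8.5, 6.5]

counts = rank_histogram(ensembles, observations)   # [1, 1, 0, 2]
outside = (counts[0] + counts[-1]) / sum(counts)   # 0.75 of cases outside the ensemble range
```

    A well-calibrated ensemble would give roughly equal counts in every bin. Calibration of the kind described above aims to move the histogram from this U shape toward a flat one.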

  14. Neural Network Ensemble Based Approach for 2D-Interval Prediction of Solar Photovoltaic Power

    Directory of Open Access Journals (Sweden)

    Mashud Rana

    2016-10-01

    Solar energy generated from photovoltaic (PV) systems is one of the most promising types of renewable energy. However, it is highly variable as it depends on the solar irradiance and other meteorological factors. This variability creates difficulties for the large-scale integration of PV power in the electricity grid and requires accurate forecasting of the electricity generated by PV systems. In this paper we consider 2D-interval forecasts, where the goal is to predict summary statistics for the distribution of the PV power values in a future time interval. 2D-interval forecasts have been recently introduced, and they are more suitable than point forecasts for applications where the predicted variable has a high variability. We propose a method called NNE2D that combines variable selection based on mutual information and an ensemble of neural networks to compute 2D-interval forecasts, where the two interval boundaries are expressed in terms of percentiles. NNE2D was evaluated for univariate prediction of Australian solar PV power data for two years. The results show that it is a promising method, outperforming persistence baselines and other methods used for comparison in terms of accuracy and coverage probability.
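
    To make the target of a 2D-interval forecast concrete, the sketch below computes percentile boundaries for the values observed in one time window and the coverage of a predicted interval. It does not reproduce NNE2D: the 10th/90th percentiles, the nearest-rank definition, the PV values and the predicted bounds are all assumptions chosen for illustration.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile of values for an integer q in 1..100."""
    if not values or not 1 <= q <= 100:
        raise ValueError("need non-empty values and 1 <= q <= 100")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based nearest rank
    return ordered[rank - 1]

def percentile_bounds(values, lower_q=10, upper_q=90):
    """Lower and upper boundary summarising the values in one time window."""
    return percentile(values, lower_q), percentile(values, upper_q)

def coverage(bounds, actual_values):
    """Fraction of actual values that fall inside the given bounds."""
    lower, upper = bounds
    inside = sum(1 for v in actual_values if lower <= v <= upper)
    return inside / len(actual_values)

# Hypothetical PV power readings (kW) within one future time window.
window = [0.0, 1.2, 2.5, 3.1, 2.8, 1.9, 0.4, 3.0, 2.2, 1.0]

target_bounds = percentile_bounds(window)               # (0.0, 3.0)
predicted_bounds = (0.4, 3.0)                           # stand-in for a model's output
observed_coverage = coverage(predicted_bounds, window)  # 0.8
```

    A forecasting model would be trained to output `predicted_bounds` for each future window. Accuracy compares those bounds with `target_bounds`, and coverage checks how many actual readings the predicted interval contains.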

  15. A novel hybrid ensemble learning paradigm for nuclear energy consumption forecasting

    International Nuclear Information System (INIS)

    Tang, Ling; Yu, Lean; Wang, Shuai; Li, Jianping; Wang, Shouyang

    2012-01-01

    Highlights: ► A hybrid ensemble learning paradigm integrating EEMD and LSSVR is proposed. ► The hybrid ensemble method is useful to predict time series with high volatility. ► The ensemble method can be used for both one-step and multi-step ahead forecasting. - Abstract: In this paper, a novel hybrid ensemble learning paradigm integrating ensemble empirical mode decomposition (EEMD) and least squares support vector regression (LSSVR) is proposed for nuclear energy consumption forecasting, based on the principle of “decomposition and ensemble”. This hybrid ensemble learning paradigm is formulated specifically to address difficulties in modeling nuclear energy consumption, which has inherently high volatility, complexity and irregularity. In the proposed hybrid ensemble learning paradigm, EEMD, as a competitive decomposition method, is first applied to decompose original data of nuclear energy consumption (i.e. a difficult task) into a number of independent intrinsic mode functions (IMFs) of original data (i.e. some relatively easy subtasks). Then LSSVR, as a powerful forecasting tool, is implemented to predict all extracted IMFs independently. Finally, these predicted IMFs are aggregated into an ensemble result as final prediction, using another LSSVR. For illustration and verification purposes, the proposed learning paradigm is used to predict nuclear energy consumption in China. Empirical results demonstrate that the novel hybrid ensemble learning paradigm can outperform some other popular forecasting models in both level prediction and directional forecasting, indicating that it is a promising tool to predict complex time series with high volatility and irregularity.

  16. Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    Science.gov (United States)

    Ye, Qiang

    2010-01-01

    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…

  17. An ensemble method for predicting subnuclear localizations from primary protein structures.

    Directory of Open Access Journals (Sweden)

    Guo Sheng Han

    BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, have respective limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of a high-dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method

  18. Security Enrichment in Intrusion Detection System Using Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Uma R. Salunkhe

    2017-01-01

    In the era of the Internet, with an increasing number of people as its end users, a large number of attack categories are introduced daily. Hence, effective detection of various attacks with the help of Intrusion Detection Systems (IDS) is an emerging trend in research these days. Existing studies show the effectiveness of machine learning approaches in handling Intrusion Detection Systems. In this work, we aim to enhance the detection rate of an Intrusion Detection System by using machine learning techniques. We propose a novel classifier-ensemble-based IDS that is constructed using a hybrid approach which combines data-level and feature-level approaches. Classifier ensembles combine the opinions of different experts and improve the intrusion detection rate. Experimental results show the improved detection rates of our system compared to the reference technique.
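
    How a classifier ensemble "combines the opinions of different experts" can be sketched with plain majority voting. The abstract does not say which combination rule the proposed IDS uses, so majority voting is an assumption here, and the three classifiers' outputs are invented.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most common label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]

def combine(classifier_outputs):
    """Combine per-classifier label lists into one label per sample."""
    return [majority_vote(sample_votes) for sample_votes in zip(*classifier_outputs)]

# Hypothetical labels from three base classifiers for four connections.
classifier_outputs = [
    ["attack", "normal", "attack", "normal"],
    ["attack", "attack", "normal", "normal"],
    ["normal", "attack", "attack", "normal"],
]

combined_labels = combine(classifier_outputs)
```

    Each base classifier misses one of the first three attacks, but a different one each time, so the vote recovers all three. The gain depends on the members making different errors.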

  19. Operational water management of Rijnland water system and pilot of ensemble forecasting system for flood control

    Science.gov (United States)

    van der Zwan, Rene

    2013-04-01

    The Rijnland water system is situated in the western part of the Netherlands, and is a low-lying area of which 90% is below sea-level. The area covers 1,100 square kilometres, where 1.3 million people live, work, travel and enjoy leisure. The District Water Control Board of Rijnland is responsible for flood defence, water quantity and quality management. This includes design and maintenance of flood defence structures, control of regulating structures for an adequate water level management, and waste water treatment. For water quantity management Rijnland uses, besides an online monitoring network for collecting water level and precipitation data, a real time control decision support system. This decision support system consists of deterministic hydro-meteorological forecasts with a 24-hr forecast horizon, coupled with a control module that provides optimal operation schedules for the storage basin pumping stations. The uncertainty of the rainfall forecast is not forwarded in the hydrological prediction. At this moment 65% of the pumping capacity of the storage basin pumping stations can be automatically controlled by the decision control system. Within 5 years, after renovation of two other pumping stations, the total capacity of 200 m3/s will be automatically controlled. In critical conditions there is a need of both a longer forecast horizon and a probabilistic forecast. Therefore ensemble precipitation forecasts of the ECMWF are already consulted off-line during dry-spells, and Rijnland is running a pilot operational system providing 10-day water level ensemble forecasts. The use of EPS during dry-spells and the findings of the pilot will be presented. Challenges and next steps towards on-line implementation of ensemble forecasts for risk-based operational management of the Rijnland water system will be discussed. An important element in that discussion is the question: will policy and decision makers, operator and citizens adapt this Anticipatory Water

  20. Diversity in random subspacing ensembles

    NARCIS (Netherlands)

    Tsymbal, A.; Pechenizkiy, M.; Cunningham, P.; Kambayashi, Y.; Mohania, M.K.; Wöß, W.

    2004-01-01

    Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. It was shown experimentally and theoretically that in order for an ensemble to be effective, it should consist of classifiers having diversity in their predictions. A number of ways are

  1. Predicting Power Outages Using Multi-Model Ensemble Forecasts

    Science.gov (United States)

    Cerrai, D.; Anagnostou, E. N.; Yang, J.; Astitha, M.

    2017-12-01

    Every year, power outages affect millions of people in the United States, harming the economy and disrupting everyday life. An Outage Prediction Model (OPM) has been developed at the University of Connecticut to help utilities restore power quickly and limit the adverse consequences of outages on the population. The OPM, operational since 2015, combines several non-parametric machine learning (ML) models that use historical weather storm simulations and high-resolution weather forecasts, satellite remote sensing data, and infrastructure and land cover data to predict the number and spatial distribution of power outages. A new methodology, developed to improve the outage model performance by combining weather- and soil-related variables from three different weather models (WRF 3.7, WRF 3.8 and RAMS/ICLAMS), will be presented in this study. First, we will present a performance evaluation of each model variable, by comparing historical weather analyses with station data or reanalysis over the entire storm data set. Then, each variable of the new outage model version is extracted from the best performing weather model for that variable, and sensitivity tests are performed to identify the most efficient variable combination for outage prediction. Although the final variable combination is extracted from different weather models, this ensemble based on multi-weather forcing and multi-statistical model power outage prediction outperforms the currently operational OPM version that is based on a single weather forcing variable (WRF 3.7), because each model component is the closest to the actual atmospheric state.
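
    The per-variable selection step described in this record amounts to picking, for each variable, the weather model with the lowest evaluation error. A minimal sketch follows; the variable names and error values are invented, and only the three model names come from the abstract.

```python
def best_model_per_variable(errors):
    """Map each variable to the model with the smallest error for it."""
    return {
        variable: min(by_model, key=by_model.get)
        for variable, by_model in errors.items()
    }

# Hypothetical evaluation errors (e.g. RMSE against station data).
errors = {
    "wind_speed":    {"WRF 3.7": 2.1,  "WRF 3.8": 1.8,  "RAMS/ICLAMS": 2.4},
    "precipitation": {"WRF 3.7": 5.0,  "WRF 3.8": 5.6,  "RAMS/ICLAMS": 4.2},
    "soil_moisture": {"WRF 3.7": 0.07, "WRF 3.8": 0.09, "RAMS/ICLAMS": 0.08},
}

selection = best_model_per_variable(errors)
```

    The resulting mapping defines which model supplies each input to the outage model, so the inputs can come from different weather models within the same storm.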

  2. Predictor-Year Subspace Clustering Based Ensemble Prediction of Indian Summer Monsoon

    Directory of Open Access Journals (Sweden)

    Moumita Saha

    2016-01-01

    Forecasting the Indian summer monsoon is a challenging task due to its complex and nonlinear behavior. A large number of global climatic variables, with interaction patterns that vary over the years, influence the monsoon. Various statistical and neural prediction models have been proposed for forecasting the monsoon, but many of them fail to capture variability over years. The skill of the monsoon's predictor variables also evolves over time. In this article, we propose a joint clustering of monsoon years and predictors for understanding and predicting the monsoon. This is achieved by a subspace clustering algorithm. It groups the years based on the prevailing global climatic conditions using a statistical clustering technique and subsequently, for each such group, identifies significant climatic predictor variables which assist in better prediction. A prediction model is built for each cluster using a random forest of regression trees. Prediction of the aggregate and regional monsoon is attempted. A mean absolute error of 5.2% is obtained for forecasting the aggregate Indian summer monsoon. Errors in predicting the regional monsoons are also comparable, given the high variation of regional precipitation. The proposed joint-clustering based ensemble model is observed to be superior to existing monsoon prediction models, and it also surpasses general non-clustering based prediction models.

  3. A Hybrid Computer-aided-diagnosis System for Prediction of Breast Cancer Recurrence (HPBCR) Using Optimized Ensemble Learning.

    Science.gov (United States)

    Mohebian, Mohammad R; Marateb, Hamid R; Mansourian, Marjan; Mañanas, Miguel Angel; Mokarian, Fariborz

    2017-01-01

    Cancer is a collection of diseases that involves growing abnormal cells with the potential to invade or spread to other parts of the body. Breast cancer is the second leading cause of cancer death among women. A method for 5-year breast cancer recurrence prediction is presented in this manuscript. Clinicopathologic characteristics of 579 breast cancer patients (recurrence prevalence of 19.3%) were analyzed and discriminative features were selected using statistical feature selection methods. They were further refined by Particle Swarm Optimization (PSO) as the inputs of the classification system with ensemble learning (Bagged Decision Tree: BDT). The proper combination of selected categorical features and also the weight (importance) of the selected interval-measurement-scale features were identified by the PSO algorithm. The performance of HPBCR (hybrid predictor of breast cancer recurrence) was assessed using the holdout and 4-fold cross-validation. Three other classifiers, namely support vector machines, DT, and a multilayer perceptron neural network, were used for comparison. The selected features were diagnosis age, tumor size, lymph node involvement ratio, number of involved axillary lymph nodes, progesterone receptor expression, having hormone therapy and type of surgery. The minimum sensitivity, specificity, precision and accuracy of HPBCR were 77%, 93%, 95% and 85%, respectively, across all cross-validation folds and the hold-out test fold. HPBCR outperformed the other tested classifiers. It showed excellent agreement with the gold standard (i.e. the oncologist's opinion after blood tumor marker and imaging tests, and tissue biopsy). This algorithm is thus a promising online tool for the prediction of breast cancer recurrence.

  4. Ensemble Prediction Model with Expert Selection for Electricity Price Forecasting

    Directory of Open Access Journals (Sweden)

    Bijay Neupane

    2017-01-01

    Forecasting of electricity prices is important in deregulated electricity markets for all of the stakeholders: energy wholesalers, traders, retailers and consumers. Electricity price forecasting is an inherently difficult problem due to its special characteristics of dynamicity and non-stationarity. In this paper, we present a robust price forecasting mechanism that shows resilience towards the aggregate demand response effect and provides highly accurate forecasted electricity prices to the stakeholders in a dynamic environment. We employ an ensemble prediction model in which a group of different algorithms participates in forecasting 1-h ahead the price for each hour of a day. We propose two different strategies, namely, the Fixed Weight Method (FWM) and the Varying Weight Method (VWM), for selecting each hour's expert algorithm from the set of participating algorithms. In addition, we utilize a carefully engineered set of features selected from a pool of features extracted from the past electricity price data, weather data and calendar data. The proposed ensemble model offers better results than the Autoregressive Integrated Moving Average (ARIMA) method, the Pattern Sequence-based Forecasting (PSF) method and our previous work using Artificial Neural Networks (ANN) alone on the datasets for New York, Australian and Spanish electricity markets.

  5. Improving wave forecasting by integrating ensemble modelling and machine learning

    Science.gov (United States)

    O'Donncha, F.; Zhang, Y.; James, S. C.

    2017-12-01

    Modern smart-grid networks use technologies to instantly relay information on supply and demand to support effective decision making. Integration of renewable-energy resources with these systems demands accurate forecasting of energy production (and demand) capacities. For wave-energy converters, this requires wave-condition forecasting to enable estimates of energy production. Current operational wave forecasting systems exhibit substantial errors with wave-height RMSEs of 40 to 60 cm being typical, which limits the reliability of energy-generation predictions thereby impeding integration with the distribution grid. In this study, we integrate physics-based models with statistical learning aggregation techniques that combine forecasts from multiple, independent models into a single "best-estimate" prediction of the true state. The Simulating Waves Nearshore physics-based model is used to compute wind- and currents-augmented waves in the Monterey Bay area. Ensembles are developed based on multiple simulations perturbing input data (wave characteristics supplied at the model boundaries and winds) to the model. A learning-aggregation technique uses past observations and past model forecasts to calculate a weight for each model. The aggregated forecasts are compared to observation data to quantify the performance of the model ensemble and aggregation techniques. The appropriately weighted ensemble model outperforms an individual ensemble member with regard to forecasting wave conditions.
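
    The aggregation step described above, in which past observations and past forecasts yield one weight per model, can be sketched as follows. This is a minimal illustration only: the abstract does not state which learning-aggregation rule was used, so the inverse mean-squared-error weighting and the function names `inverse_mse_weights` and `aggregate` are assumptions made for this example.

```python
def inverse_mse_weights(past_forecasts, past_obs):
    """Return one weight per model, proportional to 1 / MSE on past data.

    past_forecasts: list of per-model lists of past forecasts.
    past_obs: list of matching observations.
    Inverse-MSE weighting is an assumed rule, not the one used in the study.
    """
    inverse_errors = []
    for model in past_forecasts:
        mse = sum((f - o) ** 2 for f, o in zip(model, past_obs)) / len(past_obs)
        inverse_errors.append(1.0 / max(mse, 1e-12))  # guard against a zero MSE
    total = sum(inverse_errors)
    return [w / total for w in inverse_errors]  # normalise so weights sum to 1


def aggregate(weights, new_forecasts):
    """Combine the models' new forecasts into one weighted estimate."""
    return sum(w * f for w, f in zip(weights, new_forecasts))


# Hypothetical wave heights in metres: model A has the smaller past error.
past_obs = [1.0, 2.0, 3.0]
model_a = [1.1, 2.1, 3.1]
model_b = [1.2, 2.2, 3.2]
weights = inverse_mse_weights([model_a, model_b], past_obs)
best_estimate = aggregate(weights, [2.0, 3.0])
```

    With these invented numbers the more accurate model receives roughly four times the weight of the other, because its squared error is about four times smaller.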

  6. Dynamic Security Assessment of Western Danish Power System Based on Ensemble Decision Trees

    DEFF Research Database (Denmark)

    Liu, Leo; Bak, Claus Leth; Chen, Zhe

    2014-01-01

    With the increasing penetration of renewable energy resources and other forms of dispersed generation, more and more uncertainties will be brought to the dynamic security assessment (DSA) of power systems. This paper proposes an approach that uses ensemble decision trees (EDT) for online DSA. Fed...... with online wide-area measurement data, it is capable of not only predicting the security states of current operating conditions (OC) with high accuracy, but also indicating the confidence of the security states 1 minute ahead of the real time by an outlier identification method. The results of EDT together...

  7. Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses

    Science.gov (United States)

    Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong

    2017-04-01

    Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive, and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial post-processing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Due to the need of fitting only a single regression model for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km-horizontal resolution and 1 h-temporal resolution. The precipitation forecast used in this study is obtained from a limited area model ensemble prediction system also operated by ZAMG. The so called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The performed SAMOS approach statistically combines the in-house developed high resolution analysis and ensemble prediction system. 
The station-based validation of 6 hour precipitation sums
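
    The standardized-anomaly transformation mentioned above (subtract a site-specific climatological mean, divide by the climatological standard deviation) can be sketched in a few lines. In the study the climatology comes from the INCA analyses; here, as an assumption for illustration, it is estimated from a short invented history at one site, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def site_climatology(history):
    """Estimate the climatological mean and standard deviation for one site."""
    return mean(history), pstdev(history)


def to_anomalies(values, clim_mean, clim_sd):
    """Transform raw values into standardized anomalies."""
    return [(v - clim_mean) / clim_sd for v in values]


def from_anomalies(anomalies, clim_mean, clim_sd):
    """Map standardized anomalies back to the original units."""
    return [a * clim_sd + clim_mean for a in anomalies]


# Invented precipitation history for one site (mean 5.0, standard deviation 2.0).
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
clim_mean, clim_sd = site_climatology(history)
anomalies = to_anomalies([7.0, 3.0], clim_mean, clim_sd)
restored = from_anomalies(anomalies, clim_mean, clim_sd)
```

    Because every site is expressed in the same dimensionless units after the transformation, a single regression model can be fitted for the whole domain, which is the property the abstract relies on.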

  8. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

    Science.gov (United States)

    S K, Somasundaram; P, Alli

    2017-11-09

    of DR screening system using Bagging Ensemble Classifier (BEC) is investigated. With the help of voting the process in ML-BEC, bagging minimizes the error due to variance of the base classifier. With the publicly available retinal image databases, our classifier is trained with 25% of RI. Results show that the ensemble classifier can achieve better classification accuracy (CA) than single classification models. Empirical experiments suggest that the machine learning-based ensemble classifier is efficient for further reducing DR classification time (CT).

  9. An Ensemble Nonlinear Model Predictive Control Algorithm in an Artificial Pancreas for People with Type 1 Diabetes

    DEFF Research Database (Denmark)

    Boiroux, Dimitri; Hagdrup, Morten; Mahmoudi, Zeinab

    2016-01-01

    This paper presents a novel ensemble nonlinear model predictive control (NMPC) algorithm for glucose regulation in type 1 diabetes. In this approach, we consider a number of scenarios describing different uncertainties, for instance meals or metabolic variations. We simulate a population of 9...... patients with different physiological parameters and a time-varying insulin sensitivity using the Medtronic Virtual Patient (MVP) model. We augment the MVP model with stochastic diffusion terms, time-varying insulin sensitivity and noise-corrupted CGM measurements. We consider meal challenges where...... the uncertainty in meal size is ±50%. Numerical results show that the ensemble NMPC reduces the risk of hypoglycemia compared to standard NMPC in the case where the meal size is overestimated or correctly estimated at the expense of a slightly increased number of hyperglycemic events. Therefore, ensemble MPC...

  10. Ensemble models of neutrophil trafficking in severe sepsis.

    Directory of Open Access Journals (Sweden)

    Sang Ok Song

    A hallmark of severe sepsis is systemic inflammation which activates leukocytes and can result in their misdirection. This leads to both impaired migration to the locus of infection and increased infiltration into healthy tissues. In order to better understand the pathophysiologic mechanisms involved, we developed a coarse-grained phenomenological model of the acute inflammatory response in CLP (cecal ligation and puncture)-induced sepsis in rats. This model incorporates distinct neutrophil kinetic responses to the inflammatory stimulus and the dynamic interactions between components of a compartmentalized inflammatory response. Ensembles of model parameter sets consistent with experimental observations were statistically generated using Markov-Chain Monte Carlo sampling. Prediction uncertainty in the model states was quantified over the resulting ensemble parameter sets. Forward simulation of the parameter ensembles successfully captured experimental features and predicted that systemically activated circulating neutrophils display impaired migration to the tissue and neutrophil sequestration in the lung, consequently contributing to tissue damage and mortality. Principal component and multiple regression analyses of the parameter ensembles estimated from survivor and non-survivor cohorts provide insight into pathologic mechanisms dictating outcome in sepsis. Furthermore, the model was extended to incorporate hypothetical mechanisms by which immune modulation using extracorporeal blood purification results in improved outcome in septic rats. Simulations identified a sub-population (about 18% of the treated population) that benefited from blood purification. Survivors displayed enhanced neutrophil migration to tissue and reduced sequestration of lung neutrophils, contributing to improved outcome. The model ensemble presented herein provides a platform for generating and testing hypotheses in silico, as well as motivating further experimental

  11. Light localization in cold and dense atomic ensemble

    International Nuclear Information System (INIS)

    Sokolov, Igor

    2017-01-01

    We report the results of a theoretical analysis of the possibility of strong (Anderson) localization of light in a cold atomic ensemble. We predict the appearance of localization in dense atomic systems in a strong magnetic field. We prove that in the absence of the field, light localization is impossible. (paper)

  12. A hydro-meteorological ensemble prediction system for real-time flood forecasting purposes in the Milano area

    Science.gov (United States)

    Ravazzani, Giovanni; Amengual, Arnau; Ceppi, Alessandro; Romero, Romualdo; Homar, Victor; Mancini, Marco

    2015-04-01

    Analysis of forecasting strategies that can provide a tangible basis for flood early warning procedures and mitigation measures over the Western Mediterranean region is one of the fundamental motivations of the European HyMeX programme. Here, we examine a set of hydro-meteorological episodes that affected the Milano urban area, for which the city's complex flood protection system did not completely succeed against the flash floods that occurred. Indeed, flood damage has increased exponentially in the area during the last 60 years, due to industrial and urban developments. Thus, the improvement of the Milano flood control system needs a synergy between structural and non-structural approaches. The flood forecasting system tested in this work comprises the Flash-flood Event-based Spatially distributed rainfall-runoff Transformation, including Water Balance (FEST-WB) and the Weather Research and Forecasting (WRF) models, in order to provide a hydrological ensemble prediction system (HEPS). Deterministic and probabilistic quantitative precipitation forecasts (QPFs) have been provided by the WRF model in a set of 48-hour experiments. The HEPS has been generated by combining different physical parameterizations (i.e. cloud microphysics, moist convection and boundary-layer schemes) of the WRF model in order to better encompass the atmospheric processes leading to high precipitation amounts. We have been able to test the value of a probabilistic versus a deterministic framework when driving Quantitative Discharge Forecasts (QDFs). Results highlight (i) the benefits of using a high-resolution HEPS in conveying uncertainties for this complex orographic area and (ii) a better simulation of most of the extreme precipitation events, potentially enabling valuable probabilistic QDFs. Hence, the HEPS copes with the significant deficiencies found in the deterministic QPFs. These shortcomings would prevent correct forecasting of the location and timing of high precipitation rates and

  13. On evaluation of ensemble precipitation forecasts with observation-based ensembles

    Directory of Open Access Journals (Sweden)

    S. Jaun

    2007-04-01

    Spatial interpolation of precipitation data is uncertain. How important is this uncertainty and how can it be considered in evaluation of high-resolution probabilistic precipitation forecasts? These questions are discussed by experimental evaluation of the COSMO consortium's limited-area ensemble prediction system COSMO-LEPS. The applied performance measure is the often used Brier skill score (BSS). The observational references in the evaluation are (a) rain gauge data analyzed by ordinary Kriging and (b) ensembles of interpolated rain gauge data generated by stochastic simulation. This permits the consideration of either a deterministic reference (the event is observed or not with 100% certainty) or a probabilistic reference that makes allowance for uncertainties in spatial averaging. The evaluation experiments show that the evaluation uncertainties are substantial even for the large area (41 300 km²) of Switzerland with a mean rain gauge distance as good as 7 km: the one- to three-day precipitation forecasts have skill decreasing with forecast lead time, but the one- and two-day forecast performances do not differ significantly.
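
    For readers unfamiliar with the Brier skill score, the sketch below uses the textbook definition, BSS = 1 - BS / BS_ref, where BS is the mean squared difference between forecast probabilities and outcomes. Using the sample base rate as the reference forecast is an assumption for this example, not a detail taken from the study. Outcomes are normally 0 or 1; passing fractional values is shown only as one simple way to mimic a probabilistic observational reference.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, ref_probs=None):
    """BSS relative to a reference forecast (default: the sample base rate)."""
    if ref_probs is None:
        base_rate = sum(outcomes) / len(outcomes)
        ref_probs = [base_rate] * len(outcomes)
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0.0:
        raise ValueError("reference forecast is perfect; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Invented example: the event occurs on days 1 and 3 only.
outcomes = [1, 0, 1, 0]
perfect = brier_skill_score([1.0, 0.0, 1.0, 0.0], outcomes)
no_skill = brier_skill_score([0.5, 0.5, 0.5, 0.5], outcomes)
```

    A perfect forecast scores 1, a forecast no better than the reference scores 0, and negative values indicate a forecast worse than the reference.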

  14. Very short-term rainfall forecasting by effectively using the ensemble outputs of numerical weather prediction models

    Science.gov (United States)

    Wu, Ming-Chang; Lin, Gwo-Fong; Feng, Lei; Hwang, Gong-Do

    2017-04-01

    In Taiwan, heavy rainfall brought by typhoons often causes serious disasters and leads to loss of life and property. In order to reduce the impact of these disasters, accurate rainfall forecasts are always important for civil protection authorities to prepare proper measures in advance. In this study, a methodology is proposed for providing very short-term (1- to 6-h ahead) rainfall forecasts in a basin-scale area. The proposed methodology is developed based on the use of analogy reasoning approach to effectively integrate the ensemble precipitation forecasts from a numerical weather prediction system in Taiwan. To demonstrate the potential of the proposed methodology, an application to a basin-scale area (the Choshui River basin located in west-central Taiwan) during five typhoons is conducted. The results indicate that the proposed methodology yields more accurate hourly rainfall forecasts, especially the forecasts with a lead time of 1 to 3 hours. On average, improvement of the Nash-Sutcliffe efficiency coefficient is about 14% due to the effective use of the ensemble forecasts through the proposed methodology. The proposed methodology is expected to be useful for providing accurate very short-term rainfall forecasts during typhoons.
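
    The Nash-Sutcliffe efficiency coefficient used as the skill measure above has a standard definition: one minus the ratio of the squared forecast errors to the squared deviations of the observations from their mean. A minimal sketch with invented rainfall values follows; the function name is hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs_mean = sum(observed) / len(observed)
    error_sum = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread_sum = sum((o - obs_mean) ** 2 for o in observed)
    if spread_sum == 0.0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - error_sum / spread_sum


# Invented hourly rainfall (mm): a perfect forecast and a mean-only forecast.
observed = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe(observed, [1.0, 2.0, 3.0, 4.0])
nse_mean_only = nash_sutcliffe(observed, [2.5, 2.5, 2.5, 2.5])
```

    A forecast that always issues the observed mean scores exactly 0, so positive values indicate skill beyond that baseline.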

  15. Long-range hydrometeorological ensemble predictions of drought parameters

    Science.gov (United States)

    Fundel, F.; Jörg-Hess, S.; Zappa, M.

    2012-06-01

    Low streamflow as a consequence of a drought event affects numerous aspects of life. Economic sectors that may be impacted by drought include power production, agriculture, tourism and water quality management. Numerical models have increasingly been used to forecast low flow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the low-flow indices duration, severity and magnitude, with a forecast lead-time of one month, to assess their potential usefulness for predictions. The ECMWF VarEPS 5-member reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification shows that, compared to peak flow, probabilistic low-flow forecasts are skillful for longer lead-times; low-flow index forecasts could also be beneficially included in a decision-making process. The results suggest monthly runoff forecasts are useful for assessing the risk of hydrological droughts.

  16. Ensemble Data Mining Methods

    Science.gov (United States)

    Oza, Nikunj C.

    2004-01-01

    Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve better prediction accuracy than any of the individual models could on their own. The basic goal when designing an ensemble is the same as when establishing a committee of people: each member of the committee should be as competent as possible, but the members should be complementary to one another. If the members are not complementary, i.e., if they always agree, then the committee is unnecessary: any one member is sufficient. If the members are complementary, then when one or a few members make an error, the probability is high that the remaining members can correct this error. Research in ensemble methods has largely revolved around designing ensembles consisting of competent yet complementary models.
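
    The committee argument above can be made concrete with a small majority-vote sketch. The three members and their labels are invented for illustration: each member errs on a different sample, so the other two outvote it every time. Majority voting is only one of several possible combination rules.

```python
from collections import Counter


def majority_vote(member_predictions):
    """Combine per-member label lists (all the same length) by majority vote."""
    combined = []
    for votes in zip(*member_predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical binary labels; each member is wrong on exactly one sample.
truth = [1, 0, 1, 1, 0]
members = [
    [0, 0, 1, 1, 0],  # wrong on sample 0
    [1, 1, 1, 1, 0],  # wrong on sample 1
    [1, 0, 0, 1, 0],  # wrong on sample 2
]
committee = majority_vote(members)
```

    Each member is 80% accurate on its own, yet the committee recovers every label because the errors do not overlap. If all three members made the same mistake, the vote would repeat it, which is the "not complementary" case described in the abstract.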

  17. Impact of Representing Model Error in a Hybrid Ensemble-Variational Data Assimilation System for Track Forecast of Tropical Cyclones over the Bay of Bengal

    Science.gov (United States)

    Kutty, Govindan; Muraleedharan, Rohit; Kesarkar, Amit P.

    2018-03-01

    Uncertainties in the numerical weather prediction models are generally not well-represented in ensemble-based data assimilation (DA) systems. The performance of an ensemble-based DA system becomes suboptimal, if the sources of error are undersampled in the forecast system. The present study examines the effect of accounting for model error treatments in the hybrid ensemble transform Kalman filter—three-dimensional variational (3DVAR) DA system (hybrid) in the track forecast of two tropical cyclones viz. Hudhud and Thane, formed over the Bay of Bengal, using Advanced Research Weather Research and Forecasting (ARW-WRF) model. We investigated the effect of two types of model error treatment schemes and their combination on the hybrid DA system; (i) multiphysics approach, which uses different combination of cumulus, microphysics and planetary boundary layer schemes, (ii) stochastic kinetic energy backscatter (SKEB) scheme, which perturbs the horizontal wind and potential temperature tendencies, (iii) a combination of both multiphysics and SKEB scheme. Substantial improvements are noticed in the track positions of both the cyclones, when flow-dependent ensemble covariance is used in 3DVAR framework. Explicit model error representation is found to be beneficial in treating the underdispersive ensembles. Among the model error schemes used in this study, a combination of multiphysics and SKEB schemes has outperformed the other two schemes with improved track forecast for both the tropical cyclones.

  18. A Hybrid Computer-aided-diagnosis System for Prediction of Breast Cancer Recurrence (HPBCR) Using Optimized Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Mohammad R. Mohebian

    Cancer is a collection of diseases that involves growing abnormal cells with the potential to invade or spread to other parts of the body. Breast cancer is the second leading cause of cancer death among women. A method for 5-year breast cancer recurrence prediction is presented in this manuscript. Clinicopathologic characteristics of 579 breast cancer patients (recurrence prevalence of 19.3%) were analyzed and discriminative features were selected using statistical feature selection methods. They were further refined by Particle Swarm Optimization (PSO) as the inputs of the classification system with ensemble learning (Bagged Decision Tree: BDT). The proper combination of selected categorical features and also the weight (importance) of the selected interval-measurement-scale features were identified by the PSO algorithm. The performance of HPBCR (hybrid predictor of breast cancer recurrence) was assessed using the holdout and 4-fold cross-validation. Three other classifiers, namely support vector machines, DT, and a multilayer perceptron neural network, were used for comparison. The selected features were diagnosis age, tumor size, lymph node involvement ratio, number of involved axillary lymph nodes, progesterone receptor expression, having hormone therapy and type of surgery. The minimum sensitivity, specificity, precision and accuracy of HPBCR were 77%, 93%, 95% and 85%, respectively, across all cross-validation folds and the hold-out test fold. HPBCR outperformed the other tested classifiers. It showed excellent agreement with the gold standard (i.e. the oncologist's opinion after blood tumor marker and imaging tests, and tissue biopsy). This algorithm is thus a promising online tool for the prediction of breast cancer recurrence. Keywords: Breast cancer, Cancer recurrence, Computer-assisted diagnosis, Machine learning, Prognosis

  19. Examining dynamic interactions among experimental factors influencing hydrologic data assimilation with the ensemble Kalman filter

    Science.gov (United States)

    Wang, S.; Huang, G. H.; Baetz, B. W.; Cai, X. M.; Ancell, B. C.; Fan, Y. R.

    2017-11-01

    The ensemble Kalman filter (EnKF) is recognized as a powerful data assimilation technique that generates an ensemble of model variables through stochastic perturbations of forcing data and observations. However, relatively little guidance exists with regard to the proper specification of the magnitude of the perturbation and the ensemble size, posing a significant challenge in optimally implementing the EnKF. This paper presents a robust data assimilation system (RDAS), in which a multi-factorial design of the EnKF experiments is first proposed for hydrologic ensemble predictions. A multi-way analysis of variance is then used to examine potential interactions among factors affecting the EnKF experiments, achieving optimality of the RDAS with maximized performance of hydrologic predictions. The RDAS is applied to the Xiangxi River watershed which is the most representative watershed in China's Three Gorges Reservoir region to demonstrate its validity and applicability. Results reveal that the pairwise interaction between perturbed precipitation and streamflow observations has the most significant impact on the performance of the EnKF system, and their interactions vary dynamically across different settings of the ensemble size and the evapotranspiration perturbation. In addition, the interactions among experimental factors vary greatly in magnitude and direction depending on different statistical metrics for model evaluation including the Nash-Sutcliffe efficiency and the Box-Cox transformed root-mean-square error. It is thus necessary to test various evaluation metrics in order to enhance the robustness of hydrologic prediction systems.
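
    The role of the observation perturbation magnitude and the ensemble size can be seen in a stripped-down EnKF analysis step. The sketch below assumes a scalar state that is observed directly (observation operator H = 1), which is far simpler than the hydrologic setup in the study; the function name and all numbers are hypothetical.

```python
import random
from statistics import variance


def enkf_update(forecast, obs, obs_sd, rng=None):
    """One perturbed-observation EnKF analysis step for a scalar state (H = 1)."""
    if rng is None:
        rng = random.Random(0)      # fixed seed so the sketch is reproducible
    p_f = variance(forecast)        # forecast error variance from ensemble spread
    r = obs_sd ** 2                 # observation error variance
    if p_f + r == 0.0:
        return list(forecast)       # degenerate case: gain undefined, no update
    gain = p_f / (p_f + r)          # scalar Kalman gain
    analysis = []
    for x in forecast:
        # Each member assimilates its own randomly perturbed copy of the observation.
        y_perturbed = obs + rng.gauss(0.0, obs_sd)
        analysis.append(x + gain * (y_perturbed - x))
    return analysis


# Invented streamflow ensemble (m^3/s) and a single observation.
forecast = [1.0, 2.0, 3.0]
updated = enkf_update(forecast, obs=2.5, obs_sd=0.5)
exact = enkf_update(forecast, obs=5.0, obs_sd=0.0)
```

    The gain depends on the ratio of ensemble spread to observation error variance, so both the perturbation magnitude (`obs_sd`) and the ensemble size (which controls how well `variance(forecast)` is estimated) affect the result. With `obs_sd=0.0` the gain is 1 and every member collapses onto the observation.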

  20. IASI Radiance Data Assimilation in Local Ensemble Transform Kalman Filter

    Science.gov (United States)

    Cho, K.; Hyoung-Wook, C.; Jo, Y.

    2016-12-01

    The Korea Institute of Atmospheric Prediction Systems (KIAPS) is developing an NWP model with data assimilation systems. The Local Ensemble Transform Kalman Filter (LETKF) system, one of these data assimilation systems, has been developed for the KIAPS Integrated Model (KIM), which is based on a cubed-sphere grid, and has successfully assimilated real data. The LETKF data assimilation system has been extended to 4D-LETKF, which considers time-evolving error covariance within the assimilation window, and to IASI radiance data assimilation using KPOP (KIAPS Package for Observation Processing) with RTTOV (Radiative Transfer for TOVS). The LETKF system has been running semi-operational prediction with conventional (sonde, aircraft) observations and AMSU-A (Advanced Microwave Sounding Unit-A) radiance data since April. In July, the semi-operational prediction system was updated to include GPS-RO, AMV and IASI (Infrared Atmospheric Sounding Interferometer) data. A set of KIM simulations at ne30np4 resolution with 50 vertical levels (model top at 0.3 hPa) was carried out for short-range forecasts (10 days) within the semi-operational LETKF prediction system with a 50-member ensemble forecast. To isolate the IASI impact, our experiments used only conventional and IASI radiance data in the same semi-operational prediction setup. We carried out sensitivity tests of the IASI thinning method (3D and 4D). The number of IASI observations was increased by temporal (4D) thinning, and an improvement in the impact of IASI radiance data on the forecast skill of the model is expected.

  1. Wave ensemble forecast system for tropical cyclones in the Australian region

    Science.gov (United States)

    Zieger, Stefan; Greenslade, Diana; Kepert, Jeffrey D.

    2018-05-01

    Forecasting of waves under extreme conditions such as tropical cyclones is vitally important for many offshore industries, but there remain many challenges. For Northwest Western Australia (NW WA), wave forecasts issued by the Australian Bureau of Meteorology have previously been limited to products from deterministic operational wave models forced by deterministic atmospheric models. The wave models are run over global (resolution 1/4°) and regional (resolution 1/10°) domains with forecast ranges of +7 and +3 days, respectively. Because of this relatively coarse resolution (both in the wave models and in the forcing fields), the accuracy of these products is limited under tropical cyclone conditions. Given this limited accuracy, a new ensemble-based wave forecasting system for the NW WA region has been developed. To achieve this, a new dedicated 8-km resolution grid was nested in the global wave model. Over this grid, the wave model is forced with winds from a bias-corrected European Centre for Medium-Range Weather Forecasts atmospheric ensemble that comprises 51 ensemble members to take into account the uncertainties in location, intensity and structure of a tropical cyclone system. A unique technique is used to select restart files for each wave ensemble member. The system is designed to operate in real time during the cyclone season, providing +10-day forecasts. This paper will describe the wave forecast components of this system and present the verification metrics and skill for specific events.

  2. A hybrid nudging-ensemble Kalman filter approach to data assimilation. Part I: application in the Lorenz system

    Directory of Open Access Journals (Sweden)

    Lili Lei

    2012-05-01

    A hybrid data assimilation approach combining nudging and the ensemble Kalman filter (EnKF) for dynamic analysis and numerical weather prediction is explored here using the non-linear Lorenz three-variable model system with the goal of a smooth, continuous and accurate data assimilation. The hybrid nudging-EnKF (HNEnKF) computes the hybrid nudging coefficients from the flow-dependent, time-varying error covariance matrix from the EnKF's ensemble forecasts. It extends the standard diagonal nudging terms to additional off-diagonal statistical correlation terms for greater inter-variable influence of the innovations in the model's predictive equations to assist in the data assimilation process. The HNEnKF promotes a better fit of an analysis to data compared to that achieved by either nudging or incremental analysis update (IAU). When model error is introduced, it produces similar or better root mean square errors compared to the EnKF while minimising the error spikes/discontinuities created by the intermittent EnKF. It provides a continuous data assimilation with better inter-variable consistency and improved temporal smoothness than that of the EnKF. Data assimilation experiments are also compared to the ensemble Kalman smoother (EnKS). The HNEnKF has similar or better temporal smoothness than that of the EnKS, and with much smaller central processing unit (CPU) time and data storage requirements.

  3. A study of fuzzy logic ensemble system performance on face recognition problem

    Science.gov (United States)

    Polyakova, A.; Lipinskiy, L.

    2017-02-01

    Some problems are difficult to solve using a single intelligent information technology (IIT). An ensemble of data mining (DM) techniques is a set of models, each of which can solve the problem on its own, but whose combination increases the efficiency of the system as a whole. Using IIT ensembles can improve the reliability and efficiency of the final decision, since the approach emphasizes the diversity of its components. A new method for designing ensembles of intelligent information technologies is considered in this paper. It is based on fuzzy logic and is designed to solve classification and regression problems. The ensemble consists of several data mining algorithms: an artificial neural network, a support vector machine and decision trees. These algorithms and their ensemble have been tested on face recognition problems. Principal component analysis (PCA) is used for feature selection.

  4. Evaluation of bias-correction methods for ensemble streamflow volume forecasts

    Directory of Open Access Journals (Sweden)

    T. Hashino

    2007-01-01

    Ensemble prediction systems are used operationally to make probabilistic streamflow forecasts for seasonal time scales. However, hydrological models used for ensemble streamflow prediction often have simulation biases that degrade forecast quality and limit the operational usefulness of the forecasts. This study evaluates three bias-correction methods for ensemble streamflow volume forecasts. All three adjust the ensemble traces using a transformation derived with simulated and observed flows from a historical simulation. The quality of probabilistic forecasts issued when using the three bias-correction methods is evaluated using a distributions-oriented verification approach. Comparisons are made of retrospective forecasts of monthly flow volumes for a north-central United States basin (Des Moines River, Iowa), issued sequentially for each month over a 48-year record. The results show that all three bias-correction methods significantly improve forecast quality by eliminating unconditional biases and enhancing the potential skill. Still, subtle differences in the attributes of the bias-corrected forecasts have important implications for their use in operational decision-making. Diagnostic verification distinguishes these attributes in a context meaningful for decision-making, providing criteria to choose among bias-correction methods with comparable skill.
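
    The abstract does not name the three bias-correction methods. As one possible illustration of "a transformation derived with simulated and observed flows from a historical simulation", the sketch below uses empirical quantile mapping, a technique swapped in here as an assumption and not confirmed to be one of the three evaluated. The function names and the flow values are hypothetical.

```python
import math
from bisect import bisect_right


def empirical_cdf(sorted_sample, x):
    """Fraction of the sorted sample that is <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def empirical_quantile(sorted_sample, p):
    """Smallest sample value whose empirical CDF is >= p."""
    n = len(sorted_sample)
    index = min(n - 1, max(0, math.ceil(p * n) - 1))
    return sorted_sample[index]


def quantile_map(trace, hist_sim, hist_obs):
    """Map each ensemble trace value through the simulated CDF
    onto the observed flow with the same non-exceedance probability."""
    sim, obs = sorted(hist_sim), sorted(hist_obs)
    return [empirical_quantile(obs, empirical_cdf(sim, v)) for v in trace]


# Hypothetical historical simulation and matching observations (flow volumes).
hist_sim = [10, 20, 30, 40]
hist_obs = [12, 25, 33, 50]
corrected = quantile_map([20, 35], hist_sim, hist_obs)
```

    With real data the historical record would be far longer, and some interpolation between sample points would normally be preferred to the step-function lookup used here.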

  5. The canonical ensemble redefined - 1: Formalism

    International Nuclear Information System (INIS)

    Venkataraman, R.

    1984-12-01

    For studying the thermodynamic properties of systems we propose an ensemble that lies in between the familiar canonical and microcanonical ensembles. We point out the transition from the canonical to microcanonical ensemble and prove from a comparative study that all these ensembles do not yield the same results even in the thermodynamic limit. An investigation of the coupling between two or more systems with these ensembles suggests that the state of thermodynamical equilibrium is a special case of statistical equilibrium. (author)

  6. Performance of the FV3-powered Next Generation Global Prediction System for Harvey and Irma, and a vision for a "beyond weather timescale" prediction system for long-range hurricane track and intensity predictions

    Science.gov (United States)

    Lin, S. J.; Bender, M.; Harris, L.; Hazelton, A.

    2017-12-01

    The performance of a GFDL-developed FV3-based Next Generation Global Prediction System (NGGPS) for Harvey and Irma will be reported. We will report on aspects of track and intensity errors (vs operational models), heavy precipitation (Harvey), rapid intensification, and simulated structure (in comparison with ground-based radar), and point to the need for a future long-range (from day 5 up to 30 days) physically based ensemble hurricane prediction system for providing useful information to the forecasters, beyond the usual weather timescale.

  7. Ensemble of different approaches for a reliable person re-identification system

    Directory of Open Access Journals (Sweden)

    Loris Nanni

    2016-07-01

    An ensemble of approaches for reliable person re-identification is proposed in this paper. The proposed ensemble is built combining widely used person re-identification systems using different color spaces and some variants of state-of-the-art approaches that are proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; then the different descriptors are compared using different distance measures (e.g., the Euclidean distance, angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when the depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the same parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code used for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/.

  8. Estimating predictive hydrological uncertainty by dressing deterministic and ensemble forecasts; a comparison, with application to Meuse and Rhine

    Science.gov (United States)

    Verkade, J. S.; Brown, J. D.; Davids, F.; Reggiani, P.; Weerts, A. H.

    2017-12-01

    Two statistical post-processing approaches for estimation of predictive hydrological uncertainty are compared: (i) 'dressing' of a deterministic forecast by adding a single, combined estimate of both hydrological and meteorological uncertainty and (ii) 'dressing' of an ensemble streamflow forecast by adding an estimate of hydrological uncertainty to each individual streamflow ensemble member. Both approaches aim to produce an estimate of the 'total uncertainty' that captures both the meteorological and hydrological uncertainties. They differ in the degree to which they make use of statistical post-processing techniques. In the 'lumped' approach, both sources of uncertainty are lumped by post-processing deterministic forecasts using their verifying observations. In the 'source-specific' approach, the meteorological uncertainties are estimated by an ensemble of weather forecasts. These ensemble members are routed through a hydrological model and a realization of the probability distribution of hydrological uncertainties (only) is then added to each ensemble member to arrive at an estimate of the total uncertainty. The techniques are applied to one location in the Meuse basin and three locations in the Rhine basin. Resulting forecasts are assessed for their reliability and sharpness, as well as compared in terms of multiple verification scores including the relative mean error, Brier Skill Score, Mean Continuous Ranked Probability Skill Score, Relative Operating Characteristic Score and Relative Economic Value. The dressed deterministic forecasts are generally more reliable than the dressed ensemble forecasts, but the latter are sharper. On balance, however, they show similar quality across a range of verification metrics, with the dressed ensembles coming out slightly better. Some additional analyses are suggested. Notably, these include statistical post-processing of the meteorological forecasts in order to increase their reliability, thus increasing the reliability

  9. Probability weighted ensemble transfer learning for predicting interactions between HIV-1 and human proteins.

    Directory of Open Access Journals (Sweden)

    Suyu Mei

    Reconstruction of host-pathogen protein interaction networks is of great significance to reveal the underlying microbial pathogenesis. However, the current experimentally-derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for reconstruction of host-pathogen protein interaction networks. In this work, we are motivated to address the three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where a support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that the homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability with less demanding data constraints. As regards negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Last, we analyse the overlapping predictions between our model and the existing models, and apply the model to the recognition of novel host-pathogen PPIs for further biological research.
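
    The weighting idea in this abstract (classifier outputs weighted by each classifier's ROC-AUC) can be sketched as follows. This is a minimal illustration under assumptions, not the PWEN-TLM implementation: the abstract does not give the exact weighting formula, so weights proportional to AUC are assumed, and all names and numbers are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative
    (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_weighted_probability(probs, aucs):
    """Combine per-classifier probabilities with weights proportional to AUC
    (assumed weighting; the paper's formula is not stated in the abstract)."""
    total = sum(aucs)
    return sum((a / total) * p for a, p in zip(aucs, probs))


# Hypothetical validation scores of one classifier and the ensemble decision.
auc_example = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
combined = auc_weighted_probability(probs=[0.9, 0.4], aucs=[0.8, 0.6])
```

    A classifier with a higher validation AUC pulls the combined probability toward its own output; equal AUCs reduce the scheme to a plain average.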

  10. Modelling machine ensembles with discrete event dynamical system theory

    Science.gov (United States)

    Hunter, Dan

    1990-01-01

    Discrete Event Dynamical System (DEDS) theory can be utilized as a control strategy for future complex machine ensembles that will be required for in-space construction. The control strategy involves orchestrating a set of interactive submachines to perform a set of tasks for a given set of constraints such as minimum time, minimum energy, or maximum machine utilization. Machine ensembles can be hierarchically modeled as a global model that combines the operations of the individual submachines. These submachines are represented in the global model as local models. Local models, from the perspective of DEDS theory, are described by the following: a set of system and transition states, an event alphabet that portrays actions that take a submachine from one state to another, an initial system state, a partial function that maps the current state and event alphabet to the next state, and the time required for the event to occur. Each submachine in the machine ensemble is represented by a unique local model. The global model combines the local models such that the local models can operate in parallel under the additional logistic and physical constraints due to submachine interactions. The global model is constructed from the states, events, event functions, and timing requirements of the local models. Supervisory control can be implemented in the global model by various methods such as task scheduling (open-loop control) or implementing a feedback DEDS controller (closed-loop control).
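
    The ingredients of a local model listed in the abstract (states, event alphabet, initial state, partial transition function, event time) map naturally onto a small timed state machine. The sketch below is an illustration only; the class name, the robot-arm submachine and its durations are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class LocalModel:
    states: set
    events: set
    initial: str
    # Partial function: (state, event) -> (next_state, duration).
    # A missing key means the event is not enabled in that state.
    transitions: dict

    def run(self, event_sequence):
        """Apply events in order; return the final state and total elapsed time."""
        state, elapsed = self.initial, 0.0
        for event in event_sequence:
            key = (state, event)
            if key not in self.transitions:
                raise ValueError(f"event {event!r} is not enabled in state {state!r}")
            state, duration = self.transitions[key]
            elapsed += duration
        return state, elapsed


# Hypothetical submachine: a manipulator arm that grasps and releases a strut.
arm = LocalModel(
    states={"idle", "holding"},
    events={"grasp", "release"},
    initial="idle",
    transitions={
        ("idle", "grasp"): ("holding", 2.0),
        ("holding", "release"): ("idle", 1.0),
    },
)
final_state, total_time = arm.run(["grasp", "release", "grasp"])
```

    A global model in the sense of the abstract would compose several such local models and restrict which joint events are allowed; that composition is not attempted here.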

  11. Modified ensemble Kalman filter for nuclear accident atmospheric dispersion: prediction improved and source estimated.

    Science.gov (United States)

    Zhang, X L; Su, G F; Yuan, H Y; Chen, J G; Huang, Q Y

    2014-09-15

    Atmospheric dispersion models play an important role in nuclear power plant accident management. A reliable estimate of the short-range (about 50 km) distribution of radioactive material is urgently needed for population sheltering and evacuation planning. However, the meteorological data and the source term, which greatly influence the accuracy of the atmospheric dispersion models, are usually poorly known in the early phase of the emergency. In this study, a modified ensemble Kalman filter data assimilation method in conjunction with a Lagrangian puff-model is proposed to simultaneously improve the model prediction and reconstruct the source terms for short-range atmospheric dispersion using the off-site environmental monitoring data. Four main uncertainty parameters are considered: source release rate, plume rise height, wind speed and wind direction. Twin experiments show that the method effectively improves the predicted concentration distribution, and the temporal profiles of source release rate and plume rise height are also successfully reconstructed. Moreover, the time lag in the response of the ensemble Kalman filter is shortened. The method proposed here can be a useful tool not only in nuclear power plant accident emergency management but also in other similar situations where hazardous material is released into the atmosphere. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. The role of model dynamics in ensemble Kalman filter performance for chaotic systems

    Science.gov (United States)

    Ng, G.-H.C.; McLaughlin, D.; Entekhabi, D.; Ahanin, A.

    2011-01-01

    The ensemble Kalman filter (EnKF) is susceptible to losing track of observations, or 'diverging', when applied to large chaotic systems such as atmospheric and ocean models. Past studies have demonstrated the adverse impact of sampling error during the filter's update step. We examine how system dynamics affect EnKF performance, and whether the absence of certain dynamic features in the ensemble may lead to divergence. The EnKF is applied to a simple chaotic model, and ensembles are checked against singular vectors of the tangent linear model, corresponding to short-term growth, and Lyapunov vectors, corresponding to long-term growth. Results show that the ensemble strongly aligns itself with the subspace spanned by unstable Lyapunov vectors. Furthermore, the filter avoids divergence only if the full linearized long-term unstable subspace is spanned. However, short-term dynamics also become important as non-linearity in the system increases. Non-linear movement prevents errors in the long-term stable subspace from decaying indefinitely. If these errors then undergo linear intermittent growth, a small ensemble may fail to properly represent all important modes, causing filter divergence. A combination of long and short-term growth dynamics is thus critical to EnKF performance. These findings can help in developing practical robust filters based on model dynamics. © 2011 The Authors. Tellus A © 2011 John Wiley & Sons A/S.
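
    For readers unfamiliar with the update step mentioned above, here is a scalar toy version of the stochastic (perturbed-observation) EnKF analysis. It is a sketch under strong assumptions (one directly observed variable, hypothetical numbers) and does not reproduce the singular-vector or Lyapunov-vector analysis of the paper. It does show one simple way a filter can stop tracking observations: when the ensemble spread collapses, the gain goes to zero.

```python
import random
from statistics import variance


def enkf_update(ensemble, obs, obs_var, rng):
    """Perturbed-observation EnKF analysis for a directly observed scalar (H = 1)."""
    prior_var = variance(ensemble)             # sample forecast variance
    gain = prior_var / (prior_var + obs_var)   # scalar Kalman gain
    return [
        x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x)
        for x in ensemble
    ]


rng = random.Random(0)

# A perfect observation (zero error variance) pulls every member onto it.
pulled = enkf_update([1.0, 2.0, 3.0], obs=5.0, obs_var=0.0, rng=rng)

# A collapsed ensemble has zero variance, so the gain is zero and the
# observation is ignored, however far away it is.
ignored = enkf_update([2.0, 2.0, 2.0], obs=5.0, obs_var=1.0, rng=rng)
```

    In multivariate systems the analogous failure occurs along any direction of error growth that the ensemble does not span, which is the situation the abstract analyses.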

  13. Ensemble methods for seasonal limited area forecasts

    DEFF Research Database (Denmark)

    Arritt, Raymond W.; Anderson, Christopher J.; Takle, Eugene S.

    2004-01-01

    The ensemble prediction methods used for seasonal limited area forecasts were examined by comparing methods for generating ensemble simulations of seasonal precipitation. The summer 1993 model over the north-central US was used as a test case. The four methods examined included the lagged-average...

  14. Wind Power Prediction using Ensembles

    DEFF Research Database (Denmark)

    Giebel, Gregor; Badger, Jake; Landberg, Lars

    2005-01-01

    offshore wind farm and the whole Jutland/Funen area. The utilities used these forecasts for maintenance planning, fuel consumption estimates and over-the-weekend trading on the Leipzig power exchange. Other notable scientific results include the better accuracy of forecasts made up from a simple...... superposition of two NWP providers (in our case, DMI and DWD), an investigation of the merits of a parameterisation of the turbulent kinetic energy within the delivered wind speed forecasts, and the finding that a “naïve” downscaling of each of the coarse ECMWF ensemble members with higher resolution HIRLAM did...

  15. Drug-target interaction prediction via class imbalance-aware ensemble learning.

    Science.gov (United States)

    Ezzat, Ali; Wu, Min; Li, Xiao-Li; Kwoh, Chee-Keong

    2016-12-22

    Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance ratio between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, there are multiple types of drug-target interactions in the data with some types having relatively fewer members (or are less represented) than others. This variation in representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types. We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets to see how our method would perform in predicting their interactions. New drugs and targets are those for which no prior interactions are known. Our method displayed satisfactory prediction performance and was

  16. An ensemble approach to the evolution of complex systems

    Indian Academy of Sciences (India)

    2014-03-15

    Mar 15, 2014 ... [Arpağ G and Erzan A 2014 An ensemble approach to the evolution of complex systems. J. Biosci. ... almost nothing about all the different ways in which your ...... energy cost to the organism of the maintenance, replication,.

  17. MSEBAG: a dynamic classifier ensemble generation based on 'minimum-sufficient ensemble' and bagging

    Science.gov (United States)

    Chen, Lei; Kamel, Mohamed S.

    2016-01-01

    In this paper, we propose a dynamic classifier system, MSEBAG, which is characterised by searching for the 'minimum-sufficient ensemble' and bagging at the ensemble level. It adopts an 'over-generation and selection' strategy and aims to achieve a good bias-variance trade-off. In the training phase, MSEBAG first searches for the 'minimum-sufficient ensemble', which maximises the in-sample fitness with the minimal number of base classifiers. Then, starting from the 'minimum-sufficient ensemble', a backward stepwise algorithm is employed to generate a collection of ensembles. The objective is to create a collection of ensembles with a descending fitness on the data, as well as a descending complexity in the structure. MSEBAG dynamically selects the ensembles from the collection for the decision aggregation. The extended adaptive aggregation (EAA) approach, a bagging-style algorithm performed at the ensemble level, is employed for this task. EAA searches for the competent ensembles using a score function, which takes into consideration both the in-sample fitness and the confidence of the statistical inference, and averages the decisions of the selected ensembles to label the test pattern. The experimental results show that the proposed MSEBAG outperforms the benchmarks on average.

  18. Flow-dependent empirical singular vector with an ensemble Kalman filter data assimilation for El Nino prediction

    Energy Technology Data Exchange (ETDEWEB)

    Ham, Yoo-Geun [NASA/GSFC Code 610.1, Global Modeling and Assimilation Office, Greenbelt, MD (United States); Universities Space Research Association, Goddard Earth Sciences Technology and Research Studies and Investigations, Baltimore, MD (United States); Rienecker, Michele M. [NASA/GSFC Code 610.1, Global Modeling and Assimilation Office, Greenbelt, MD (United States)

    2012-10-15

    In this study, a new approach for extracting flow-dependent empirical singular vectors (FESVs) for seasonal prediction using ensemble perturbations obtained from an ensemble Kalman filter (EnKF) assimilation is presented. Due to the short interval between analyses, EnKF perturbations primarily contain instabilities related to fast weather variability. To isolate slower, coupled instabilities that would be more suitable for seasonal prediction, an empirical linear operator for seasonal time-scales (i.e. several months) is formulated using a causality hypothesis; then, the most unstable mode from the linear operator is extracted for seasonal time-scales. It is shown that the flow-dependent operator represents nonlinear integration results better than a conventional empirical linear operator static in time. Through 20 years of retrospective seasonal predictions, it is shown that the skill of forecasting equatorial SST anomalies using the FESV is systematically improved over that using Conventional ESV (CESV). For example, the correlation skill of the NINO3 SST index using FESV is higher, by about 0.1, than that of CESV at 8-month leads. In addition, the forecast skill improvement is significant over the locations where the correlation skill of conventional methods is relatively low, indicating that the FESV is effective where the initial uncertainty is large. (orig.)

  19. Evaluation of quantitative precipitation forecasts by TIGGE ensembles for south China during the presummer rainy season

    Science.gov (United States)

    Huang, Ling; Luo, Yali

    2017-08-01

    Based on The Observing System Research and Predictability Experiment Interactive Grand Global Ensemble (TIGGE) data set, this study evaluates the ability of global ensemble prediction systems (EPSs) from the European Centre for Medium-Range Weather Forecasts (ECMWF), U.S. National Centers for Environmental Prediction, Japan Meteorological Agency (JMA), Korean Meteorological Administration, and China Meteorological Administration (CMA) to predict presummer rainy season (April-June) precipitation in south China. Evaluation of 5 day forecasts in three seasons (2013-2015) demonstrates the higher skill of probability matching forecasts compared to simple ensemble mean forecasts and shows that the deterministic forecast is a close second. The EPSs overestimate light-to-heavy rainfall (0.1 to 30 mm/12 h) and underestimate heavier rainfall (>30 mm/12 h), with JMA being the worst. By analyzing the synoptic situations predicted by the identified more skillful (ECMWF) and less skillful (JMA and CMA) EPSs and the ensemble sensitivity for four representative cases of torrential rainfall, the transport of warm-moist air into south China by the low-level southwesterly flow, upstream of the torrential rainfall regions, is found to be a key synoptic factor that controls the quantitative precipitation forecast. The results also suggest that prediction of locally produced torrential rainfall is more challenging than prediction of more extensively distributed torrential rainfall. A slight improvement in the performance is obtained by shortening the forecast lead time from 30-36 h to 18-24 h to 6-12 h for the cases with large-scale forcing, but not for the locally produced cases.

  20. Using ensemble forecasting for wind power

    Energy Technology Data Exchange (ETDEWEB)

    Giebel, G.; Landberg, L.; Badger, J. [Risoe National Lab., Roskilde (Denmark); Sattler, K.

    2003-07-01

    Short-term prediction of wind power has a long tradition in Denmark. It is an essential tool for the operators to keep the grid from becoming unstable in a region like Jutland, where more than 27% of the electricity consumption comes from wind power. This means that the minimum load is already lower than the maximum production from wind energy alone. Danish utilities have therefore used short-term prediction of wind energy since the mid-1990s. However, the accuracy is still far from sufficient in the eyes of the utilities (who are used to load forecasts accurate to within 5% on a one-week horizon). The Ensemble project tries to alleviate the dependency of the forecast quality on one model by using multiple models, and will also investigate the possibilities of using the model spread of multiple models or of dedicated ensemble runs for a prediction of the uncertainty of the forecast. Usually, short-term forecasting works (especially for the horizon beyond 6 hours) by gathering input from a Numerical Weather Prediction (NWP) model. This input data is used together with online data in statistical models (this is the case e.g. in Zephyr/WPPT) to yield the output of the wind farms or of a whole region for the next 48 hours (only limited by the NWP model horizon). For the accuracy of the final production forecast, the accuracy of the NWP prediction is paramount. While many efforts are underway to increase the accuracy of the NWP forecasts themselves (which ultimately are limited by the amount of computing power available, the lack of a tight observational network on the Atlantic and limited physics modelling), another approach is to use ensembles of different models or different model runs. This can be either an ensemble of the output of different models for the same area, using different data assimilation schemes and different model physics, or a dedicated ensemble run by a large institution, where the same model is run with slight variations in initial conditions and

  1. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    Directory of Open Access Journals (Sweden)

    Joeri Ruyssinck

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
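
    The "rankwise averaged predictions" step can be illustrated with a few lines of code. This is a sketch of the general idea only, not the NIMEFI implementation: the method names, the candidate links and the importance scores are hypothetical, and tie handling is left at its simplest.

```python
def rank_positions(scores):
    """Map each candidate link to its 1-based rank (1 = highest importance)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {link: position + 1 for position, link in enumerate(ordered)}


def rankwise_average(score_tables):
    """Average each link's rank over all methods; return links, best first."""
    ranks = [rank_positions(table) for table in score_tables]
    average = {
        link: sum(r[link] for r in ranks) / len(ranks)
        for link in ranks[0]
    }
    return sorted(average, key=average.get)


# Hypothetical importance scores from three feature-importance methods.
method_a = {"g1->g2": 0.9, "g1->g3": 0.5, "g2->g3": 0.1}
method_b = {"g1->g2": 0.3, "g1->g3": 0.8, "g2->g3": 0.2}
method_c = {"g1->g2": 0.7, "g1->g3": 0.6, "g2->g3": 0.65}
consensus = rankwise_average([method_a, method_b, method_c])
```

    Averaging ranks instead of raw scores sidesteps the fact that importance values from different algorithms are not on a common scale.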

  2. Multi-model analysis in hydrological prediction

    Science.gov (United States)

    Lanthier, M.; Arsenault, R.; Brissette, F.

    2017-12-01

    Hydrologic modelling, by nature, is a simplification of the real-world hydrologic system. Ensemble hydrological predictions obtained in this way therefore do not present the full range of possible streamflow outcomes, producing ensembles which demonstrate errors in variance such as under-dispersion. Past studies show that lumped models used in prediction mode can return satisfactory results, especially when there is not enough information available on the watershed to run a distributed model. But all lumped models greatly simplify the complex processes of the hydrologic cycle. To generate more spread in the hydrologic ensemble predictions, multi-model ensembles have been considered. In this study, the aim is to propose and analyse a method that gives an ensemble streamflow prediction that properly represents the forecast probabilities and reduces ensemble bias. To achieve this, three simple lumped models are used to generate an ensemble. These will also be combined using multi-model averaging techniques, which generally generate a more accurate hydrograph than the best of the individual models in simulation mode. This new predictive combined hydrograph is added to the ensemble, thus creating a large ensemble which may improve the variability while also improving the ensemble mean bias. The quality of the predictions is then assessed on different periods: 2 weeks, 1 month, 3 months and 6 months using a PIT Histogram of the percentiles of the real observation volumes with respect to the volumes of the ensemble members. Initially, the models were run using historical weather data to generate synthetic flows. This worked for individual models, but not for the multi-model and for the large ensemble. Consequently, by performing data assimilation at each prediction period and thus adjusting the initial states of the models, the PIT Histogram could be constructed using the observed flows while allowing the use of the multi-model predictions. The under-dispersion has been
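
    The PIT histogram used for verification in this abstract can be sketched as follows: for each forecast, compute the fraction of ensemble members at or below the observed volume, then count how those fractions fall into bins. The function names, bin count and numbers below are hypothetical; the abstract does not state the study's exact binning.

```python
def pit_value(members, observation):
    """Fraction of ensemble members at or below the observation."""
    return sum(m <= observation for m in members) / len(members)


def pit_histogram(forecasts, observations, bins=5):
    """Count PIT values per equal-width bin on [0, 1]."""
    counts = [0] * bins
    for members, obs in zip(forecasts, observations):
        p = pit_value(members, obs)
        counts[min(bins - 1, int(p * bins))] += 1
    return counts


# Hypothetical under-dispersed case: a narrow ensemble, observations outside it.
forecasts = [[4, 5, 6]] * 4
observations = [1, 9, 2, 8]
histogram = pit_histogram(forecasts, observations)
```

    A flat histogram indicates a well-calibrated ensemble, while counts piled into the two outer bins, as in this toy case, are the usual signature of under-dispersion.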

  3. WE-E-BRE-05: Ensemble of Graphical Models for Predicting Radiation Pneumonitis Risk

    Energy Technology Data Exchange (ETDEWEB)

    Lee, S; Ybarra, N; Jeyaseelan, K; El Naqa, I [McGill University, Montreal, Quebec (Canada); Faria, S; Kopek, N [Montreal General Hospital, Montreal, Quebec (Canada)

    2014-06-15

    Purpose: We propose a prior knowledge-based approach to construct an interaction graph of biological and dosimetric radiation pneumonitis (RP) covariates for the purpose of developing an RP risk classifier. Methods: We recruited 59 NSCLC patients who received curative radiotherapy with a minimum 6-month follow-up. 16 RP events were observed (CTCAE grade ≥2). Blood serum was collected from every patient before (pre-RT) and during RT (mid-RT). From each sample the concentrations of the following five candidate biomarkers were taken as covariates: alpha-2-macroglobulin (α2M), angiotensin converting enzyme (ACE), transforming growth factor β (TGF-β), interleukin-6 (IL-6), and osteopontin (OPN). Dose-volumetric parameters were also included as covariates. The number of biological and dosimetric covariates was reduced by a variable selection scheme implemented by L1-regularized logistic regression (LASSO). The posterior probability distribution of interaction graphs between the selected variables was estimated from the data under the literature-based prior knowledge to weight more heavily the graphs that contain the expected associations. A graph ensemble was formed by averaging the most probable graphs weighted by their posterior, creating a Bayesian Network (BN)-based RP risk classifier. Results: The LASSO selected the following 7 RP covariates: (1) pre-RT concentration level of α2M, (2) α2M level mid-RT/pre-RT, (3) pre-RT IL6 level, (4) IL6 level mid-RT/pre-RT, (5) ACE mid-RT/pre-RT, (6) PTV volume, and (7) mean lung dose (MLD). The ensemble BN model achieved the maximum sensitivity/specificity of 81%/84% and outperformed univariate dosimetric predictors as shown by larger AUC values (0.78∼0.81) compared with MLD (0.61), V20 (0.65) and V30 (0.70). The ensembles obtained by incorporating the prior knowledge improved classification performance for the ensemble size 5∼50. Conclusion: We demonstrated a probabilistic ensemble method to detect robust associations between

  4. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units

    International Nuclear Information System (INIS)

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios, Martha; Baechler, Sébastien

    2015-01-01

    Purpose: According to estimations around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). Results: The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional difference of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the influence of a variable evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables

  5. Visualization and classification of physiological failure modes in ensemble hemorrhage simulation

    Science.gov (United States)

    Zhang, Song; Pruett, William Andrew; Hester, Robert

    2015-01-01

    In an emergency situation such as hemorrhage, doctors need to predict which patients need immediate treatment and care. This task is difficult because of the diverse response to hemorrhage in the human population. Ensemble physiological simulations provide a means to sample a diverse range of subjects and may have a better chance of containing the correct solution. However, revealing the patterns and trends in the ensemble simulation is a challenging task. We have developed a visualization framework for ensemble physiological simulations. The visualization helps users identify trends among ensemble members, classify ensemble members into subpopulations for analysis, and predict future events by matching a new patient's data to existing ensembles. We demonstrated the effectiveness of the visualization on simulated physiological data. The lessons learned here can be applied to clinically collected physiological data in the future.

  6. Improved ensemble-mean forecast skills of ENSO events by a zero-mean stochastic model-error model of an intermediate coupled model

    Science.gov (United States)

    Zheng, F.; Zhu, J.

    2015-12-01

    To perform an ensemble-based ENSO probabilistic forecast, the crucial issue is to design a reliable ensemble prediction strategy that should include the major uncertainties of a forecast system. In this study, we developed a new general ensemble perturbation technique to improve the ensemble-mean predictive skill of forecasting ENSO using an intermediate coupled model (ICM). The model uncertainties are first estimated and analyzed from EnKF analysis results through assimilating observed SST. Then, based on the pre-analyzed properties of the model errors, a zero-mean stochastic model-error model is developed to mainly represent the model uncertainties induced by some important physical processes missed in the coupled model (i.e., stochastic atmospheric forcing/MJO, extra-tropical cooling and warming, Indian Ocean Dipole mode, etc.). Each member of an ensemble forecast is perturbed by the stochastic model-error model at each step during the 12-month forecast process, and the stochastic perturbations are added into the modeled physical fields to mimic the presence of these high-frequency stochastic noises and model biases and their effect on the predictability of the coupled system. The impacts of stochastic model-error perturbations on ENSO deterministic predictions are examined by performing two sets of 21-yr retrospective forecast experiments. The two forecast schemes are differentiated by whether they considered the model stochastic perturbations, with both initialized by the ensemble-mean analysis states from EnKF. The comparison results suggest that the stochastic model-error perturbations have significant and positive impacts on improving the ensemble-mean prediction skills during the entire 12-month forecast process. Because the nonlinear feature of the coupled model can induce the nonlinear growth of the added stochastic model errors with model integration, especially through the nonlinear heating mechanism with the vertical advection term of the model, the

  7. Optimized expanded ensembles for simulations involving molecular insertions and deletions. II. Open systems

    Science.gov (United States)

    Escobedo, Fernando A.

    2007-11-01

    In the Grand Canonical, osmotic, and Gibbs ensembles, chemical potential equilibrium is attained via transfers of molecules between the system and either a reservoir or another subsystem. In this work, the expanded ensemble (EXE) methods described in part I [F. A. Escobedo and F. J. Martínez-Veracoechea, J. Chem. Phys. 127, 174103 (2007)] of this series are extended to these ensembles to overcome the difficulties associated with implementing such whole-molecule transfers. In EXE, such moves occur via a target molecule that undergoes transitions through a number of intermediate coupling states. To minimize the tunneling time between the fully coupled and fully decoupled states, the intermediate states could be either: (i) sampled with an optimal frequency distribution (the sampling problem) or (ii) selected with an optimal spacing distribution (staging problem). The sampling issue is addressed by determining the biasing weights that would allow generating an optimal ensemble; discretized versions of this algorithm (well suited for small number of coupling stages) are also presented. The staging problem is addressed by selecting the intermediate stages in such a way that a flat histogram is the optimized ensemble. The validity of the advocated methods is demonstrated by their application to two model problems, the solvation of large hard spheres into a fluid of small and large spheres, and the vapor-liquid equilibrium of a chain system.

  8. A real-time evaluation and demonstration of strategies for 'Over-The-Loop' ensemble streamflow forecasting in US watersheds

    Science.gov (United States)

    Wood, Andy; Clark, Elizabeth; Mendoza, Pablo; Nijssen, Bart; Newman, Andy; Clark, Martyn; Nowak, Kenneth; Arnold, Jeffrey

    2017-04-01

    Many if not most national operational streamflow prediction systems rely on a forecaster-in-the-loop approach that requires the hands-on effort of an experienced human forecaster. This approach evolved from the need to correct for long-standing deficiencies in the models and datasets used in forecasting, and the practice often leads to skillful flow predictions despite the use of relatively simple, conceptual models. Yet the 'in-the-loop' forecast process is not reproducible, which limits opportunities to assess and incorporate new techniques systematically, and the effort required to make forecasts in this way is an obstacle to expanding forecast services - e.g., through adding new forecast locations or more frequent forecast updates, running more complex models, or producing forecasts and hindcasts that can support verification. In the last decade, the hydrologic forecasting community has begun to develop more centralized, 'over-the-loop' systems. The quality of these new forecast products will depend on their ability to leverage research in areas including earth system modeling, parameter estimation, data assimilation, statistical post-processing, weather and climate prediction, verification, and uncertainty estimation through the use of ensembles. Currently, many national operational streamflow forecasting and water management communities have little experience with the strengths and weaknesses of over-the-loop approaches, even as such systems are beginning to be deployed operationally in centers such as ECMWF. There is thus a need both to evaluate these forecasting advances and to demonstrate their potential in a public arena, raising awareness in forecast user communities and development programs alike. To address this need, the US National Center for Atmospheric Research is collaborating with the University of Washington, the Bureau of Reclamation and the US Army Corps of Engineers, using the NCAR 'System for Hydromet Analysis Research and Prediction Applications

  9. Ensemble-based forecasting at Horns Rev: Ensemble conversion and kernel dressing

    DEFF Research Database (Denmark)

    Pinson, Pierre; Madsen, Henrik

    . The obtained ensemble forecasts of wind power are then converted into predictive distributions with an original adaptive kernel dressing method. The shape of the kernels is driven by a mean-variance model, the parameters of which are recursively estimated in order to maximize the overall skill of obtained...

  10. A Novel Multiscale Ensemble Carbon Price Prediction Model Integrating Empirical Mode Decomposition, Genetic Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Bangzhu Zhu

    2012-02-01

    Due to the movement and complexity of the carbon market, traditional monoscale forecasting approaches often fail to capture its nonstationary and nonlinear properties and accurately describe its moving tendencies. In this study, a multiscale ensemble forecasting model integrating empirical mode decomposition (EMD), genetic algorithm (GA) and artificial neural network (ANN) is proposed to forecast carbon price. Firstly, the proposed model uses EMD to decompose carbon price data into several intrinsic mode functions (IMFs) and one residue. Then, the IMFs and residue are composed into a high frequency component, a low frequency component and a trend component which have similar frequency characteristics, simple components and strong regularity using the fine-to-coarse reconstruction algorithm. Finally, those three components are predicted using an ANN trained by GA, i.e., a GAANN model, and the final forecasting results can be obtained by the sum of these three forecasting results. For verification and testing, two main carbon future prices with different maturity in the European Climate Exchange (ECX) are used to test the effectiveness of the proposed multiscale ensemble forecasting model. Empirical results obtained demonstrate that the proposed multiscale ensemble forecasting model can outperform the single random walk (RW), ARIMA, ANN and GAANN models without EMD preprocessing and the ensemble ARIMA model with EMD preprocessing.

  11. Pre- and post-processing of hydro-meteorological ensembles for the Norwegian flood forecasting system in 145 basins.

    Science.gov (United States)

    Jahr Hegdahl, Trine; Steinsland, Ingelin; Merete Tallaksen, Lena; Engeland, Kolbjørn

    2016-04-01

    Probabilistic flood forecasting has an added value for decision making. The Norwegian flood forecasting service is based on a flood forecasting model that runs for 145 basins. Covering all of Norway, the basins differ in both size and hydrological regime. Currently the flood forecasting is based on deterministic meteorological forecasts, and an auto-regressive procedure is used to achieve probabilistic forecasts. An alternative approach is to use meteorological and hydrological ensemble forecasts to quantify the uncertainty in forecasted streamflow. The hydrological ensembles are based on forcing a hydrological model with meteorological ensemble forecasts of precipitation and temperature. However, the ensembles of precipitation are often biased and the spread is too small, especially for the shortest lead times, i.e. they are not calibrated. These properties will, to some extent, propagate to hydrological ensembles, which most likely will be uncalibrated as well. Pre- and post-processing methods are commonly used to obtain calibrated meteorological and hydrological ensembles, respectively. Quantitative studies showing the effect of the combined processing of the meteorological (pre-processing) and the hydrological (post-processing) ensembles are, however, few. The aim of this study is to evaluate the influence of pre- and post-processing on the skill of streamflow predictions, and we will especially investigate whether the forecasting skill depends on lead time, basin size and hydrological regime. This aim is achieved by applying the 51 medium-range ensemble forecasts of precipitation and temperature provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). These ensembles are used as input to the operational Norwegian flood forecasting model, both raw and pre-processed. Precipitation ensembles are calibrated using a zero-adjusted gamma distribution. Temperature ensembles are calibrated using a Gaussian distribution and altitude corrected by a constant gradient

  12. Fire spread estimation on forest wildfire using ensemble kalman filter

    Science.gov (United States)

    Syarifah, Wardatus; Apriliani, Erna

    2018-04-01

    Wildfire is one of the most frequent disasters in the world; forest wildfire, for example, reduces forest cover. Forest wildfires, whether naturally occurring or prescribed, are potential risks for ecosystems and human settlements. These risks can be managed by monitoring the weather, prescribing fires to limit available fuel, and creating firebreaks. With computer simulations we can predict and explore how fires may spread. A model of fire spread in forest wildfire was established to determine the fire properties. The fire spread model is based on the equation of the reaction-diffusion model. There are many methods to estimate the spread of fire. The Ensemble Kalman Filter (EnKF) is a modification of the Kalman Filter algorithm that can be used to estimate both linear and non-linear system models. In this research we apply the EnKF method to estimate the spread of fire in a forest wildfire. Before applying the EnKF method, the fire spread model is discretized using the finite difference method. Finally, the analysis is illustrated by numerical simulation. The simulation results show that the EnKF estimate is closer to the system model when the ensemble size is larger and when the covariance values of the system model and the measurement are smaller.
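
    The EnKF analysis step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' fire spread implementation: it assumes a scalar state that is observed directly, uses the perturbed-observation form of the filter, and all names and numbers (`enkf_update`, the prior ensemble, the observation) are hypothetical.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_var, rng):
    """Stochastic (perturbed-observation) EnKF analysis step for a scalar state.

    Assumes the state is observed directly (observation operator H = 1).
    """
    prior_var = statistics.variance(ensemble)   # sample forecast variance
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member assimilates its own randomly perturbed copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - x) for x in ensemble]


rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(2000)]  # hypothetical forecast ensemble
posterior = enkf_update(prior, observation=3.0, obs_var=1.0, rng=rng)
```

    In this toy setting the analysis ensemble is pulled toward the observation and its spread shrinks. A larger ensemble gives a less noisy estimate of the forecast variance and hence of the gain, which is consistent with the dependence on ensemble size reported above.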

  13. Entropy of network ensembles

    Science.gov (United States)

    Bianconi, Ginestra

    2009-03-01

    In this paper we generalize the concept of random networks to describe network ensembles with nontrivial features by a statistical mechanics approach. This framework is able to describe undirected and directed network ensembles as well as weighted network ensembles. These networks might have nontrivial community structure or, in the case of networks embedded in a given space, they might have a link probability with a nontrivial dependence on the distance between the nodes. These ensembles are characterized by their entropy, which evaluates the cardinality of networks in the ensemble. In particular, in this paper we define and evaluate the structural entropy, i.e., the entropy of the ensembles of undirected uncorrelated simple networks with given degree sequence. We stress the apparent paradox that scale-free degree distributions are characterized by having small structural entropy while they are so widely encountered in natural, social, and technological complex systems. We propose a solution to the paradox by proving that scale-free degree distributions are the most likely degree distribution with the corresponding value of the structural entropy. Finally, the general framework we present in this paper is able to describe microcanonical ensembles of networks as well as canonical or hidden-variable network ensembles with significant implications for the formulation of network-constructing algorithms.

  14. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models.

    Directory of Open Access Journals (Sweden)

    David Henriques

    2017-02-01

    Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equations) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). For this task, SELDOM's ensemble prediction is not only consistently better

  15. Operational hydrological forecasting in Bavaria. Part II: Ensemble forecasting

    Science.gov (United States)

    Ehret, U.; Vogelbacher, A.; Moritz, K.; Laurent, S.; Meyer, I.; Haag, I.

    2009-04-01

    In part I of this study, the operational flood forecasting system in Bavaria and an approach to identify and quantify forecast uncertainty were introduced. The approach is split into the calculation of an empirical 'overall error' from archived forecasts and the calculation of an empirical 'model error' based on hydrometeorological forecast tests, where rainfall observations were used instead of forecasts. The 'model error' can, especially in upstream catchments where forecast uncertainty is strongly dependent on the current predictability of the atmosphere, be superimposed on the spread of a hydrometeorological ensemble forecast. In Bavaria, two meteorological ensemble prediction systems are currently tested for operational use: the 16-member COSMO-LEPS forecast and a poor man's ensemble composed of DWD GME, DWD Cosmo-EU, NCEP GFS, Aladin-Austria, MeteoSwiss Cosmo-7. The determination of the overall forecast uncertainty is dependent on the catchment characteristics: 1. Upstream catchment with high influence of weather forecast: a) A hydrological ensemble forecast is calculated using each of the meteorological forecast members as forcing. b) Corresponding to the characteristics of the meteorological ensemble forecast, each resulting forecast hydrograph can be regarded as equally likely. c) The 'model error' distribution, with parameters dependent on hydrological case and lead time, is added to each forecast timestep of each ensemble member. d) For each forecast timestep, the overall (i.e. over all 'model error' distributions of each ensemble member) error distribution is calculated. e) From this distribution, the uncertainty range on a desired level (here: the 10% and 90% percentile) is extracted and drawn as forecast envelope. f) As the mean or median of an ensemble forecast does not necessarily exhibit meteorologically sound temporal evolution, a single hydrological forecast termed 'lead forecast' is chosen and shown in addition to the uncertainty bounds. This can be
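
    Steps a) to e) for the upstream catchment can be sketched as follows. This is only an illustration under stated assumptions: the 'model error' is taken to be zero-mean Gaussian with a standard deviation per timestep, the pooled distribution is approximated by random draws, and the function name, the member hydrographs and all numbers are hypothetical rather than taken from the Bavarian system.

```python
import random
import statistics


def forecast_envelope(member_hydrographs, error_std, n_draws=200, seed=0):
    """Return a (10th percentile, 90th percentile) pair for every timestep."""
    rng = random.Random(seed)
    n_steps = len(member_hydrographs[0])
    envelope = []
    for t in range(n_steps):
        pooled = []
        for hydrograph in member_hydrographs:   # members are treated as equally likely
            for _ in range(n_draws):
                # Add a draw from the assumed model-error distribution to the member.
                pooled.append(hydrograph[t] + rng.gauss(0.0, error_std[t]))
        deciles = statistics.quantiles(pooled, n=10)  # nine cut points: 10%, ..., 90%
        envelope.append((deciles[0], deciles[-1]))
    return envelope


# Three hypothetical member hydrographs (discharge at three lead times).
members = [[10.0, 12.0, 15.0], [11.0, 14.0, 20.0], [9.0, 11.0, 13.0]]
envelope = forecast_envelope(members, error_std=[0.5, 1.0, 2.0])
```

    Because both the member spread and the assumed error standard deviation grow with lead time in this example, the envelope widens toward the end of the forecast.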

  16. Ensemble Forecasts with Useful Skill-Spread Relationships for African meningitis and Asia Streamflow Forecasting

    Science.gov (United States)

    Hopson, T. M.

    2014-12-01

    One potential benefit of an ensemble prediction system (EPS) is its capacity to forecast its own forecast error through the ensemble spread-error relationship. In practice, an EPS is often quite limited in its ability to represent the variable expectation of forecast error through the variable dispersion of the ensemble, and perhaps more fundamentally, in its ability to provide enough variability in the ensemble's dispersion to make the skill-spread relationship even potentially useful (irrespective of whether the EPS is well-calibrated or not). In this paper we examine the ensemble skill-spread relationship of an ensemble constructed from the TIGGE (THORPEX Interactive Grand Global Ensemble) dataset of global forecasts and a combination of multi-model and post-processing approaches. Both the multi-model and post-processing techniques are based on quantile regression (QR) under a step-wise forward selection framework leading to ensemble forecasts with both good reliability and sharpness. The methodology utilizes the ensemble's ability to self-diagnose forecast instability to produce calibrated forecasts with informative skill-spread relationships. A context for these concepts is provided by assessing the constructed ensemble in forecasting district-level humidity impacting the incidence of meningitis in the meningitis belt of Africa, and in forecasting flooding events in the Brahmaputra and Ganges basins of South Asia.

  17. Ensemble-Based Data Assimilation in Reservoir Characterization: A Review

    Directory of Open Access Journals (Sweden)

    Seungpil Jung

    2018-02-01

    This paper presents a review of ensemble-based data assimilation for strongly nonlinear problems on the characterization of heterogeneous reservoirs with different production histories. It concentrates on ensemble Kalman filter (EnKF) and ensemble smoother (ES) as representative frameworks, discusses their pros and cons, and investigates recent progress to overcome their drawbacks. The typical weaknesses of ensemble-based methods are non-Gaussian parameters, improper prior ensembles and finite population size. Three categorized approaches to mitigate these limitations are reviewed with recent accomplishments: improvement of Kalman gains, add-on of transformation functions, and independent evaluation of observed data. The data assimilation in heterogeneous reservoirs, applying the improved ensemble methods, is discussed on predicting unknown dynamic data in reservoir characterization.

  18. Data assimilation for groundwater flow modelling using Unbiased Ensemble Square Root Filter: Case study in Guantao, North China Plain

    Science.gov (United States)

    Li, N.; Kinzelbach, W.; Li, H.; Li, W.; Chen, F.; Wang, L.

    2017-12-01

    Data assimilation techniques are widely used in hydrology to improve the reliability of hydrological models and to reduce model predictive uncertainties. This provides critical information for decision makers in water resources management. This study aims to evaluate a data assimilation system for the Guantao groundwater flow model coupled with a one-dimensional soil column simulation (Hydrus 1D) using an Unbiased Ensemble Square Root Filter (UnEnSRF) originating from the Ensemble Kalman Filter (EnKF) to update parameters and states, separately or simultaneously. To simplify the coupling between the unsaturated and saturated zones, a linear relationship obtained from analyzing inputs to and outputs from Hydrus 1D is applied in the data assimilation process. Unlike EnKF, the UnEnSRF updates the parameter ensemble mean and the ensemble perturbations separately. In order to keep the ensemble filter working well during the data assimilation, two factors are introduced in the study. One is a damping factor, which dampens the update amplitude of the posterior ensemble mean to avoid unrealistic values. The other is an inflation factor, which relaxes the posterior ensemble perturbations toward the prior to avoid filter inbreeding problems. The sensitivities of the two factors are studied and their favorable values for the Guantao model are determined. The appropriate observation error and ensemble size were also determined to facilitate the further analysis. This study demonstrated that the data assimilation of both model parameters and states gives a smaller model prediction error but with larger uncertainty, while the data assimilation of only model states provides a smaller predictive uncertainty but with a larger model prediction error. Data assimilation in a groundwater flow model will improve model prediction and at the same time make the model converge to the true parameters, which provides a successful base for applications in real time modelling or real time controlling strategies
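
    The roles of the damping and inflation factors can be illustrated with a small sketch. The abstract does not give the formulas, so the linear blending used here is an assumption, as are the function name `damp_and_relax`, the parameter names and the example ensembles. The sketch treats the mean and the perturbations separately, in the spirit of the description above.

```python
def damp_and_relax(prior, posterior, damping, relaxation):
    """Blend a filter update with the prior ensemble (assumed linear forms).

    damping:    fraction of the mean update that is kept (1.0 = full update).
    relaxation: weight given to the prior perturbations (0.0 = no relaxation).
    """
    n = len(prior)
    prior_mean = sum(prior) / n
    post_mean = sum(posterior) / n
    # Damped mean: move only part of the way from the prior to the posterior mean.
    mean = prior_mean + damping * (post_mean - prior_mean)
    # Relaxed perturbations: mix prior and posterior deviations from their means.
    perturbations = [
        relaxation * (xf - prior_mean) + (1.0 - relaxation) * (xa - post_mean)
        for xf, xa in zip(prior, posterior)
    ]
    return [mean + p for p in perturbations]


prior = [1.0, 2.0, 3.0, 4.0]       # hypothetical prior parameter ensemble
posterior = [2.4, 2.5, 2.6, 2.7]   # hypothetical filter output
updated = damp_and_relax(prior, posterior, damping=0.5, relaxation=0.5)
```

    With both factors set to 0.5 the updated ensemble keeps more spread than the raw posterior, which is the kind of safeguard against filter inbreeding that the abstract describes.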

  19. A new strategy for snow-cover mapping using remote sensing data and ensemble based systems techniques

    Science.gov (United States)

    Roberge, S.; Chokmani, K.; De Sève, D.

    2012-04-01

    The snow cover plays an important role in the hydrological cycle of Quebec (Eastern Canada). Consequently, evaluating its spatial extent interests the authorities responsible for the management of water resources, especially hydropower companies. The main objective of this study is the development of a snow-cover mapping strategy using remote sensing data and ensemble based systems techniques. Planned to be tested in a near real-time operational mode, this snow-cover mapping strategy has the advantage of providing the probability of a pixel being snow covered and its uncertainty. Ensemble systems are made of two key components. First, a method is needed to build an ensemble of classifiers that is as diverse as possible. Second, an approach is required to combine the outputs of individual classifiers that make up the ensemble in such a way that correct decisions are amplified, and incorrect ones are cancelled out. In this study, we demonstrate the potential of ensemble systems for snow-cover mapping using remote sensing data. The chosen classifier is a sequential thresholds algorithm using NOAA-AVHRR data adapted to conditions over Eastern Canada. Its special feature is the use of a combination of six sequential thresholds varying according to the day in the winter season. Two versions of the snow-cover mapping algorithm have been developed: one is specific for autumn (from October 1st to December 31st) and the other for spring (from March 16th to May 31st). In order to build the ensemble based system, different versions of the algorithm are created by randomly varying its parameters. One hundred of these versions are included in the ensemble. The probability of a pixel being snow, no-snow or cloud covered corresponds to the number of votes by which the pixel has been classified as such across all classifiers. The overall performance of ensemble based mapping is compared to the overall performance of the chosen classifier, and also with ground observations at meteorological
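
    The vote-based class probability described above can be sketched in a few lines. The function name and the vote counts are hypothetical, and the sketch assumes that every ensemble member casts exactly one vote per pixel.

```python
from collections import Counter


def class_probabilities(votes):
    """Fraction of ensemble members voting for each class at one pixel."""
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in ("snow", "no-snow", "cloud")}


# Hypothetical votes from 100 perturbed versions of the classifier for one pixel.
votes = ["snow"] * 72 + ["no-snow"] * 20 + ["cloud"] * 8
probabilities = class_probabilities(votes)
```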

  20. Optimal Initial Perturbations for Ensemble Prediction of the Madden-Julian Oscillation during Boreal Winter

    Science.gov (United States)

    Ham, Yoo-Geun; Schubert, Siegfried; Chang, Yehui

    2012-01-01

    An initialization strategy, tailored to the prediction of the Madden-Julian oscillation (MJO), is evaluated using the Goddard Earth Observing System Model, version 5 (GEOS-5), coupled general circulation model (CGCM). The approach is based on the empirical singular vectors (ESVs) of a reduced-space statistically determined linear approximation of the full nonlinear CGCM. The initial ESV, extracted using 10 years (1990-99) of boreal winter hindcast data, has zonal wind anomalies over the western Indian Ocean, while the final ESV (at a forecast lead time of 10 days) reflects a propagation of the zonal wind anomalies to the east over the Maritime Continent, an evolution that is characteristic of the MJO. A new set of ensemble hindcasts is produced for the boreal winter season from 1990 to 1999 in which the leading ESV provides the initial perturbations. The results are compared with those from a set of control hindcasts generated using random perturbations. It is shown that the ESV-based predictions have a systematically higher bivariate correlation skill in predicting the MJO compared to those using the random perturbations. Furthermore, the improvement in the skill depends on the phase of the MJO. The ESV is particularly effective in increasing the forecast skill during those phases of the MJO in which the control has low skill (with correlations increasing by as much as 0.2 at 20-25-day lead times), as well as during those times in which the MJO is weak.

  1. Ensemble prediction system model for monthly rainfall totals with weighting values (case of the Indramayu district)

    Directory of Open Access Journals (Sweden)

    Yunus Subagyo Swarinoto

    2014-08-01

    Water management is very important, especially for regions that are vulnerable with respect to water availability. Above-normal rainfall causes floods, while below-normal rainfall triggers droughts. To cope with this situation, rainfall prediction is needed. An ensemble prediction system model (EPSM) based on several single prediction system models (SPSMs), namely ANFIS, Wavelet-ANFIS, Wavelet-ARIMA, and ARIMA, for monthly rainfall totals has been simulated for the Indramayu district. The EPSM was developed using a weighting technique. The weights are based on the Pearson correlation coefficient (r) obtained during the training period, using the 1991-2000 data series. Results of the 2001-2009 model runs show r values of 0.45-0.83 for ANFIS; 0.20-0.53 for Wavelet-ANFIS; 0.50-0.95 for Wavelet-ARIMA; 0.14-0.66 for ARIMA; and 0.58-0.94 for the ensemble. Spatially, the output of the ensemble prediction system model for monthly rainfall totals in the Indramayu district is consistently better than the output of the single prediction system models from which it is built.
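
    A correlation-based weighting of single-model forecasts can be sketched as below. The abstract states only that the weights are based on the training-period Pearson r, so the normalization used here (weights proportional to r, negative values clipped to zero) is an assumption, and the model names and rainfall values are hypothetical.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_weights(training_forecasts, observed):
    """Weights proportional to each model's training-period r (negative r clipped to 0)."""
    scores = {
        name: max(pearson_r(forecast, observed), 0.0)
        for name, forecast in training_forecasts.items()
    }
    total = sum(scores.values())  # assumes at least one model has positive r
    return {name: score / total for name, score in scores.items()}


def weighted_ensemble(forecasts, weights):
    """Weighted sum of the single-model forecasts for one target month."""
    return sum(weights[name] * value for name, value in forecasts.items())


observed = [120.0, 80.0, 30.0, 10.0, 60.0]  # hypothetical monthly totals (mm)
training = {
    "model_a": [110.0, 90.0, 40.0, 20.0, 50.0],
    "model_b": [60.0, 70.0, 65.0, 55.0, 62.0],
}
weights = correlation_weights(training, observed)
prediction = weighted_ensemble({"model_a": 100.0, "model_b": 70.0}, weights)
```

    Here the model that tracked the training observations more closely receives the larger weight, so the ensemble value lies nearer to its forecast.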

  2. An ensemble approach to predicting the impact of vaccination on rotavirus disease in Niger.

    Science.gov (United States)

    Park, Jaewoo; Goldstein, Joshua; Haran, Murali; Ferrari, Matthew

    2017-10-13

    Recently developed vaccines provide a new way of controlling rotavirus in sub-Saharan Africa. Models for the transmission dynamics of rotavirus are critical both for estimating current burden from imperfect surveillance and for assessing potential effects of vaccine intervention strategies. We examine rotavirus infection in the Maradi area in southern Niger using hospital surveillance data provided by Epicentre collected over two years. Additionally, a cluster survey of households in the region allows us to estimate the proportion of children with diarrhea who consulted at a health structure. Model fit and future projections are necessarily particular to a given model; thus, where there are competing models for the underlying epidemiology an ensemble approach can account for that uncertainty. We compare our results across several variants of Susceptible-Infectious-Recovered (SIR) compartmental models to quantify the impact of modeling assumptions on our estimates. Model-specific parameters are estimated by Bayesian inference using Markov chain Monte Carlo. We then use Bayesian model averaging to generate ensemble estimates of the current dynamics, including estimates of R0, the burden of infection in the region, as well as the impact of vaccination on both the short-term dynamics and the long-term reduction of rotavirus incidence under varying levels of coverage. The ensemble of models predicts that the current burden of severe rotavirus disease is 2.6-3.7% of the population each year and that a 2-dose vaccine schedule achieving 70% coverage could reduce burden by 39-42%. Copyright © 2017. Published by Elsevier Ltd.
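
    Bayesian model averaging, as named in this record, combines per-model estimates using posterior model probabilities. The sketch below shows only the generic textbook form; how the authors obtain their model weights is not stated in the abstract, so the use of log marginal likelihoods with equal prior model probabilities, and all function names, are assumptions.

```python
import math


def bma_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Assumption: equal prior model probabilities unless log_priors is given.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    if log_priors is None:
        log_priors = [0.0] * len(log_evidences)
    scores = [le + lp for le, lp in zip(log_evidences, log_priors)]
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def bma_mean(model_estimates, weights):
    """Model-averaged point estimate: weighted sum of per-model estimates."""
    return sum(w * e for w, e in zip(weights, model_estimates))
```

    For example, two models whose marginal likelihoods differ by a factor of three receive weights of 0.75 and 0.25, and the averaged estimate is pulled toward the better-supported model accordingly.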

  3. Predictability over the North Atlantic ocean in hindcast ensembles of MPI-ESM initialized by EnKF and three nudging systems

    Science.gov (United States)

    Brune, Sebastian; Pohlmann, Holger; Düsterhus, Andre; Kröger, Jürgen; Müller, Wolfgang; Baehr, Johanna

    2016-04-01

    We investigate hindcast skill for surface air temperature and upper ocean heat content (0-700 m) in the North Atlantic for yearly mean values from 1960 to 2014 in four prediction systems based on the global coupled Max Planck Institute for Meteorology Earth System Model (MPI-ESM). We find that, in the North Atlantic and among the four prediction systems under consideration, only the EnKF-initialized hindcasts reproduce the variability of the reference data well, both in terms of anomaly correlation and in their representation of the probability density function. The systems differ only in how they incorporate surface and sub-surface oceanic temperatures and salinities during assimilation: ensemble Kalman filter (EnKF), anomaly nudging of ORA reanalysis (BS-1), and full-field nudging of ORA and GECCO reanalysis, respectively (PT-ORA, PT-GEC). We assess the hindcast skill of each prediction system with reference to HadCRUT4 near-surface air temperature data (Morice et al. 2012) and NOAA OC5 upper ocean heat content data (Levitus et al. 2012) using anomaly correlation (ACC) and by analysing the interquartile range (IQR) of the probability density function (PDF). First, we calculate hindcast skill in terms of ACC and IQR against reference data over the whole time period. Here, the hindcast skill of EnKF and BS-1 is better for both ACC and IQR in lead years 2 to 5 than that of PT-ORA and PT-GEC, whose hindcast skill drops off after lead year 1. Second, the PDF of the reference data is not uniformly distributed over time. We therefore calculate ACC and IQR for a 20-year moving window. We find hindcast skill in terms of ACC for EnKF and BS-1 in the 1960s and from the 1990s onwards, up to eight lead years in advance, with almost no skill for the time period in between. In contrast, there is no skill for PT-ORA and PT-GEC in any period after lead year one. The IQR of reference data is best captured by the EnKF, in the 1960s and 1990s up to lead year
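
    The two skill measures named in this record, anomaly correlation (ACC) and interquartile range (IQR), and the moving-window evaluation can be sketched as below. This is an illustrative sketch only: anomalies are taken about each series' own mean over the evaluated period (the centered form of the ACC), and the "inclusive" quartile method is an arbitrary choice; the study's exact definitions are not given in the abstract.

```python
import math
import statistics


def anomaly_correlation(forecast, reference):
    """Centered ACC: correlation of anomalies about each series' own mean."""
    f_mean = statistics.fmean(forecast)
    r_mean = statistics.fmean(reference)
    f_anom = [f - f_mean for f in forecast]
    r_anom = [r - r_mean for r in reference]
    numerator = sum(a * b for a, b in zip(f_anom, r_anom))
    denominator = math.sqrt(
        sum(a * a for a in f_anom) * sum(b * b for b in r_anom)
    )
    return numerator / denominator


def interquartile_range(values):
    """IQR (third quartile minus first quartile) of a sample."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def moving_window_acc(forecast, reference, window):
    """ACC evaluated in every contiguous window of the given length."""
    return [
        anomaly_correlation(forecast[i:i + window], reference[i:i + window])
        for i in range(len(forecast) - window + 1)
    ]
```

    With yearly values, calling `moving_window_acc(hindcast, reference, 20)` would give one ACC value per 20-year window, which is the kind of time-resolved skill the abstract describes; `hindcast` and `reference` are placeholder names for equal-length yearly series.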

  4. Prediction of N-Methyl-D-Aspartate Receptor GluN1-Ligand Binding Affinity by a Novel SVM-Pose/SVM-Score Combinatorial Ensemble Docking Scheme.

    Science.gov (United States)

    Leong, Max K; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng

    2017-01-06

    The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r² = 0.928-0.988,  = 0.894-0.954, RMSE = 0.002-0.412, s = 0.001-0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r² = 0.967,  = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q² = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.

  5. An ensemble model of QSAR tools for regulatory risk assessment.

    Science.gov (United States)

    Pradeep, Prachi; Povinelli, Richard J; White, Shannon; Merrill, Stephen J

    2016-01-01

    Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model includes a cut-off parameter that can be varied to select the desired trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa (κ): 0
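
    The evaluation measures quoted in this record (sensitivity, specificity, accuracy, balanced accuracy, Cohen's kappa) and the role of a cut-off on a predicted probability can be sketched as follows. The function names and the convention that a probability at or above the cut-off counts as a positive call are assumptions for illustration; this is not the authors' Bayesian classifier.

```python
def classify(probabilities, cutoff):
    """Call a chemical positive when its predicted probability >= cutoff."""
    return [p >= cutoff for p in probabilities]


def confusion_counts(predicted, actual):
    """Return (tp, fp, tn, fn) for boolean predictions and labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Standard binary-classification summaries from a confusion matrix."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "kappa": kappa,
    }
```

    Lowering the cut-off turns more borderline chemicals into positive calls, which reduces false negatives at the cost of specificity; that is the trade-off the abstract says regulators care about.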

  6. Thermodynamics and kinetics of a molecular motor ensemble.

    Science.gov (United States)

    Baker, J E; Thomas, D D

    2000-10-01

    If, contrary to conventional models of muscle, it is assumed that molecular forces equilibrate among rather than within molecular motors, an equation of state and an expression for energy output can be obtained for a near-equilibrium, coworking ensemble of molecular motors. These equations predict clear, testable relationships between motor structure, motor biochemistry, and ensemble motor function, and we discuss these relationships in the context of various experimental studies. In this model, net work by molecular motors is performed with the relaxation of a near-equilibrium intermediate step in a motor-catalyzed reaction. The free energy available for work is localized to this step, and the rate at which this free energy is transferred to work is accelerated by the free energy of a motor-catalyzed reaction. This thermodynamic model implicitly deals with a motile cell system as a dynamic network (not a rigid lattice) of molecular motors within which the mechanochemistry of one motor influences and is influenced by the mechanochemistry of other motors in the ensemble.

  7. The Ensembl REST API: Ensembl Data for Any Language.

    Science.gov (United States)

    Yates, Andrew; Beal, Kathryn; Keenan, Stephen; McLaren, William; Pignatelli, Miguel; Ritchie, Graham R S; Ruffier, Magali; Taylor, Kieron; Vullo, Alessandro; Flicek, Paul

    2015-01-01

    We present a Web service to access Ensembl data using Representational State Transfer (REST). The Ensembl REST server enables the easy retrieval of a wide range of Ensembl data by most programming languages, using standard formats such as JSON and FASTA while minimizing client work. We also introduce bindings to the popular Ensembl Variant Effect Predictor tool permitting large-scale programmatic variant analysis independent of any specific programming language. The Ensembl REST API can be accessed at http://rest.ensembl.org and source code is freely available under an Apache 2.0 license from http://github.com/Ensembl/ensembl-rest. © The Author 2014. Published by Oxford University Press.
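
    A client for a REST service of this kind needs little more than an HTTP library and a JSON parser. The sketch below uses only the Python standard library and the base URL given in the abstract. The `lookup/id/...` endpoint path, the `expand` parameter, and the choice of request header are assumptions to be checked against the current Ensembl REST documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://rest.ensembl.org"  # base URL as given in the abstract


def build_request(endpoint, **params):
    """Build a GET request asking the server for a JSON response."""
    url = f"{BASE_URL}/{endpoint.lstrip('/')}"
    query = urlencode(params)
    if query:
        url = f"{url}?{query}"
    # Assumption: the server selects the output format from this header.
    return Request(url, headers={"Content-Type": "application/json"})


def fetch_json(endpoint, **params):
    """Send the request and decode the JSON body (needs network access)."""
    with urlopen(build_request(endpoint, **params)) as response:
        return json.load(response)


# Usage (assumed endpoint and identifier, shown as a comment so that
# importing this sketch never touches the network):
#   record = fetch_json("lookup/id/ENSG00000157764", expand=1)
#   print(record.get("display_name"))
```

    Separating request construction from the network call keeps the URL logic testable offline, and the same pattern carries over to any language with an HTTP client, which is the point the abstract makes about language independence.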

  8. AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.

    Science.gov (United States)

    Zhao, X G; Dai, W; Li, Y; Tian, L

    2011-11-01

    The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely because the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm for identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We propose a novel AUC-based statistical ensemble method for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function by a convex surrogate loss function, whose minimizer can be efficiently identified. With the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new method over existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores that differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset have been satisfactory. By aiming to maximize the AUC directly, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful in high-dimensional settings. Contact: lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
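
    The contrast between the empirical AUC (a step function of the combination weights) and a convex surrogate can be made concrete with a small sketch. The abstract does not name the surrogate the authors use, so the pairwise hinge loss below is a stand-in chosen purely for illustration, and all function names are hypothetical.

```python
def linear_score(weights, row):
    """Combined biomarker score: weighted sum of one subject's markers."""
    return sum(w * x for w, x in zip(weights, row))


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def pairwise_hinge_loss(weights, pos_rows, neg_rows):
    """Convex stand-in for 1 - AUC: mean hinge loss over all pairs.

    Each pair contributes max(0, 1 - (score_case - score_control)),
    which is convex in the weights, unlike the 0/1 ranking indicator.
    """
    total = 0.0
    for p in pos_rows:
        for n in neg_rows:
            margin = linear_score(weights, p) - linear_score(weights, n)
            total += max(0.0, 1.0 - margin)
    return total / (len(pos_rows) * len(neg_rows))
```

    Because the surrogate is convex in the weights, it can be minimized with standard solvers, and a penalty such as the lasso can be added to it for feature selection, as the abstract indicates.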

  9. Ensembl variation resources

    Directory of Open Access Journals (Sweden)

    Marin-Garcia Pablo

    2010-05-01

    Full Text Available Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.

  10. Revealing skill of the MiKlip decadal prediction system by three-dimensional probabilistic evaluation

    Directory of Open Access Journals (Sweden)

    Sophie Stolzenberger

    2016-12-01

    Full Text Available Decadal climate predictions and their verification are part of ongoing research. This article studies different methods applied to decadal hindcasts of three-dimensional atmospheric variables to evaluate the MiKlip (Mittelfristige Klimaprognosen) prediction system. Variables such as upper air temperature are tied to the core of the prediction system and hence help to reveal its power and deficiencies. The verification uses both necessary and sufficient probabilistic measures. We analyze annual and multi-year averages of air temperature and geopotential height, as well as the parametrized net water flux at the ocean surface, the so-called freshwater flux, also known as E‑P (evaporation minus precipitation), an important variable for atmosphere-ocean coupling. The model data stem from various versions of the MiKlip prediction system and constitute different sets of ensemble hindcasts covering 1979–2012. The results reveal that the freshwater flux is far more sensitive to model deficiencies than the basic dynamical variables and that its predictability decays much earlier with prediction lead time. Initializing the atmospheric component is more important for the predictability than the difference in resolution between two model versions. The combined initialization of atmosphere and ocean increases the predictability in the inner tropics from 1 to 2 years compared to the ocean-only initialization. For prediction years 7–10, the hindcasts are still closer to each other than to the uninitialized historical runs, indicating that the prediction system is still influenced by the initial conditions. The skill for prediction years 7–10 is, however, only marginally larger than the skill of the uninitialized ensemble. The three-dimensional skill analysis reveals a clear indication of a mid-tropospheric temperature error developing in the tropical Pacific area.

  11. Exploring uncertainty of Amazon dieback in a perturbed parameter Earth system ensemble.

    Science.gov (United States)

    Boulton, Chris A; Booth, Ben B B; Good, Peter

    2017-12-01

    The future of the Amazon rainforest is unknown due to uncertainties in projected climate change and the response of the forest to this change (forest resiliency). Here, we explore the effect of some uncertainties in climate and land surface processes on the future of the forest, using a perturbed physics ensemble of HadCM3C. This is the first time Amazon forest changes are presented using an ensemble exploring both land vegetation processes and physical climate feedbacks in a fully coupled modelling framework. Under three different emissions scenarios, we measure the change in the forest coverage by the end of the 21st century (the transient response) and make a novel adaptation to a previously used method known as "dry-season resilience" to predict the long-term committed response of the forest, should the state of the climate remain constant past 2100. Our analysis of this ensemble suggests that there will be a high chance of greater forest loss on longer timescales than is realized by 2100, especially for mid-range and low emissions scenarios. In both the transient and predicted committed responses, there is an increasing uncertainty in the outcome of the forest as the strength of the emissions scenarios increases. It is important to note however, that very few of the simulations produce future forest loss of the magnitude previously shown under the standard model configuration. We find that low optimum temperatures for photosynthesis and a high minimum leaf area index needed for the forest to compete for space appear to be precursors for dieback. We then decompose the uncertainty into that associated with future climate change and that associated with forest resiliency, finding that it is important to reduce the uncertainty in both of these if we are to better determine the Amazon's outcome. © 2017 John Wiley & Sons Ltd.

  12. System size effects on the mechanical response of cohesive-frictional granular ensembles

    Directory of Open Access Journals (Sweden)

    Singh Saurabh

    2017-01-01

    Full Text Available Shear resistance in granular ensembles is a result of interparticle interaction and friction. However, even the presence of small amounts of cohesion between the particles changes the landscape of the mechanical response considerably. Very often such cohesive-frictional (c-ϕ) granular ensembles are encountered in nature as well as during handling and storage of granular materials in the pharmaceutical, construction and mining industries. Modeling of these c-ϕ materials, especially in engineering applications, has relied on the oft-made assumption of a “continuum” and has utilized the popular tenets of continuum plasticity theory. We present an experimental investigation of the fundamental mechanics of c-ϕ materials; specifically, we investigate whether there exists a system size effect and any additional length scales beyond the continuum length scale in their mechanical response. For this purpose, we conduct a series of 1-D compression (UC) tests on cylindrical specimens reconstituted in the laboratory with a range of model particle–binder combinations such as sand-cement, sand-epoxy, and glass ballotini-epoxy mixtures. Specimens are reconstituted to various diameters ranging from 10 mm to 150 mm (with an aspect ratio of 2) to a predefined packing fraction. In addition to the effect of the type of binder (cement, epoxy) and system size, the mean particle size is also varied from 0.5 to 2.5 mm. The peak strength of these materials is significant as it signals the initiation of cohesive-bond breaking and the onset of mobilization of the interparticle frictional resistance. For these model systems, the peak strength is a strong function of the system size of the ensemble as well as the mean particle size. This intriguing observation runs counter to the traditional notion of a typical granular ensemble as a plastic continuum. Microstructure studies in a computed-tomograph have revealed the existence of a web patterned ‘entangled-chain’ like structure

  13. System size effects on the mechanical response of cohesive-frictional granular ensembles

    Science.gov (United States)

    Singh, Saurabh; Kandasami, Ramesh Kannan; Mahendran, Rupesh Kumar; Murthy, Tejas

    2017-06-01

    Shear resistance in granular ensembles is a result of interparticle interaction and friction. However, even the presence of small amounts of cohesion between the particles changes the landscape of the mechanical response considerably. Very often such cohesive-frictional (c-ϕ) granular ensembles are encountered in nature as well as during handling and storage of granular materials in the pharmaceutical, construction and mining industries. Modeling of these c-ϕ materials, especially in engineering applications, has relied on the oft-made assumption of a "continuum" and has utilized the popular tenets of continuum plasticity theory. We present an experimental investigation of the fundamental mechanics of c-ϕ materials; specifically, we investigate whether there exists a system size effect and any additional length scales beyond the continuum length scale in their mechanical response. For this purpose, we conduct a series of 1-D compression (UC) tests on cylindrical specimens reconstituted in the laboratory with a range of model particle-binder combinations such as sand-cement, sand-epoxy, and glass ballotini-epoxy mixtures. Specimens are reconstituted to various diameters ranging from 10 mm to 150 mm (with an aspect ratio of 2) to a predefined packing fraction. In addition to the effect of the type of binder (cement, epoxy) and system size, the mean particle size is also varied from 0.5 to 2.5 mm. The peak strength of these materials is significant as it signals the initiation of cohesive-bond breaking and the onset of mobilization of the interparticle frictional resistance. For these model systems, the peak strength is a strong function of the system size of the ensemble as well as the mean particle size. This intriguing observation runs counter to the traditional notion of a typical granular ensemble as a plastic continuum. Microstructure studies in a computed-tomograph have revealed the existence of a web patterned 'entangled-chain' like structure; we argue that this ushers

  14. Ensemble-free configurational temperature for spin systems

    Science.gov (United States)

    Palma, G.; Gutiérrez, G.; Davis, S.

    2016-12-01

    An estimator for the dynamical temperature in an arbitrary ensemble is derived in the framework of the conjugate variables theorem. We prove directly that its average indeed gives the inverse temperature and that it is independent of the ensemble. We test this estimator numerically by a simulation of the two-dimensional XY model in the canonical ensemble. As this model is critical in the whole region of temperatures below the Berezinskii-Kosterlitz-Thouless critical temperature T_BKT, we use a generalization of Wolff's unicluster algorithm. The numerical results allow us to confirm the robustness of the analytical expression for the microscopic estimator of the temperature. This microscopic estimator also has the advantage that it gives a direct measure of the thermalization process and can be used to compute absolute errors associated with statistical fluctuations. Consequently, this estimator allows for a direct, absolute, and stringent test of the ergodicity of the underlying Markov process, which encodes the algorithm used in a numerical simulation.

  15. Bayesian energy landscape tilting: towards concordant models of molecular ensembles.

    Science.gov (United States)

    Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju

    2014-03-18

    Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and ³J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  16. Modeling task-specific neuronal ensembles improves decoding of grasp

    Science.gov (United States)

    Smith, Ryan J.; Soares, Alcimar B.; Rouse, Adam G.; Schieber, Marc H.; Thakor, Nitish V.

    2018-06-01

    Objective. Dexterous movement involves the activation and coordination of networks of neuronal populations across multiple cortical regions. Attempts to model firing of individual neurons commonly treat the firing rate as directly modulating with motor behavior. However, motor behavior may additionally be associated with modulations in the activity and functional connectivity of neurons in a broader ensemble. Accounting for variations in neural ensemble connectivity may provide additional information about the behavior being performed. Approach. In this study, we examined neural ensemble activity in primary motor cortex (M1) and premotor cortex (PM) of two male rhesus monkeys during performance of a center-out reach, grasp and manipulate task. We constructed point process encoding models of neuronal firing that incorporated task-specific variations in the baseline firing rate as well as variations in functional connectivity with the neural ensemble. Models were evaluated both in terms of their encoding capabilities and their ability to properly classify the grasp being performed. Main results. Task-specific ensemble models correctly predicted the performed grasp with over 95% accuracy and were shown to outperform models of neuronal activity that assume only a variable baseline firing rate. Task-specific ensemble models exhibited superior decoding performance in 82% of units in both monkeys (p < 0.01). Inclusion of ensemble activity also broadly improved the ability of models to describe observed spiking. Encoding performance of task-specific ensemble models, measured by spike timing predictability, improved upon baseline models in 62% of units. Significance. These results suggest that additional discriminative information about motor behavior, found in the variations in functional connectivity of neuronal ensembles located in motor-related cortical regions, is relevant for decoding complex tasks such as grasping objects, and may serve as the basis for more

  17. Short-range ensemble predictions based on convection perturbations in the Eta Model for the Serra do Mar region in Brazil

    Science.gov (United States)

    Bustamante, J. F. F.; Chou, S. C.; Gomes, J. L.

    2009-04-01

    Southeast Brazil, in the coastal and mountain region called Serra do Mar, between Sao Paulo and Rio de Janeiro, is subject to frequent landslides and floods. The Eta Model has been producing good quality forecasts over South America at about 40-km horizontal resolution. For these types of hazards, however, more detailed and probabilistic information on the risks should be provided with the forecasts. Thus, a short-range ensemble prediction system (SREPS) based on the Eta Model is being constructed. Ensemble members derived from perturbed initial and lateral boundary conditions did not provide enough spread for the forecasts. Members with model physics perturbations are being included and tested. The objective of this work is to construct more members for the Eta SREPS by adding physics-perturbed members. The Eta Model is configured at 10-km resolution with 38 layers in the vertical. The domain covers most of Southeast Brazil, centered over the Serra do Mar region. The constructed members comprise variations of the Betts-Miller-Janjic (BMJ) and Kain-Fritsch (KF) cumulus parameterization schemes. Three members were constructed from the BMJ scheme by varying the deficit of saturation pressure profile over land and sea, and 2 members of the KF scheme were included, using the standard KF scheme and a version of the KF scheme with added momentum flux. One of the runs with the BMJ scheme is the control run, as it was used for the initial condition perturbation SREPS. The forecasts were tested for 6 cases of South Atlantic Convergence Zone (SACZ) events. The SACZ is a common summer season feature of the Southern Hemisphere that causes persistent rain for a few days over Southeast Brazil, and it frequently organizes over the Serra do Mar region. These events are particularly interesting because the persistent rains can accumulate large amounts and cause widespread landslides and deaths. With respect to precipitation, the KF scheme versions have been shown to be able to reach the

  18. Efficient multi-scenario Model Predictive Control for water resources management with ensemble streamflow forecasts

    Science.gov (United States)

    Tian, Xin; Negenborn, Rudy R.; van Overloop, Peter-Jules; María Maestre, José; Sadowska, Anna; van de Giesen, Nick

    2017-11-01

    Model Predictive Control (MPC) is one of the most advanced real-time control techniques and has been widely applied to Water Resources Management (WRM). MPC can manage the water system in a holistic manner and has a flexible structure to incorporate specific elements, such as setpoints and constraints. Therefore, MPC has shown versatile performance in many branches of WRM. Nonetheless, with the deeper understanding of stochastic hydrology gained in recent studies, MPC also faces the challenge of how to cope with hydrological uncertainty in its decision-making process. A possible way to embed the uncertainty is to generate an Ensemble Forecast (EF) of hydrological variables, rather than a deterministic one. The combination of MPC and EF results in a more comprehensive approach: Multi-scenario MPC (MS-MPC). In this study, we first assess the model performance of MS-MPC, considering an ensemble streamflow forecast. Notably, computational inefficiency may be a critical obstacle that hinders the applicability of MS-MPC. In fact, as more scenarios are taken into account, the computational burden of solving the optimization problem in MS-MPC increases accordingly. To deal with this challenge, we propose the Adaptive Control Resolution (ACR) approach as a computationally efficient scheme to reduce the number of control variables in MS-MPC. In brief, the ACR approach uses a mixed-resolution control time step from the near future to the distant future. The ACR-MPC approach is tested on a real-world case study: an integrated flood control and navigation problem in the North Sea Canal of the Netherlands. The approach reduces the computation time by 18% or more in our case study. At the same time, the model performance of ACR-MPC remains close to that of conventional MPC.
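
    The idea of a mixed-resolution control time step can be illustrated by grouping the prediction horizon into blocks that are short in the near future and longer further out, with one control variable per block. The sketch below is hypothetical: the abstract does not specify how ACR chooses its block lengths, so the doubling rule, the parameter names, and both functions are assumptions.

```python
def mixed_resolution_blocks(horizon, fine_steps, growth=2):
    """Split `horizon` time steps into contiguous control blocks.

    Blocks starting before `fine_steps` have length 1 (full resolution);
    each later block is `growth` times longer than the previous one
    (assumed rule), truncated at the end of the horizon.
    Returns a list of (start, end) index pairs with `end` exclusive.
    """
    blocks = []
    start, length = 0, 1
    while start < horizon:
        if start >= fine_steps:
            length *= growth
        end = min(start + length, horizon)
        blocks.append((start, end))
        start = end
    return blocks


def expand_controls(blocks, block_values):
    """Hold each block's single control value over all of its time steps."""
    per_step = []
    for (start, end), value in zip(blocks, block_values):
        per_step.extend([value] * (end - start))
    return per_step
```

    With a 24-step horizon and 4 fine steps, this rule yields 8 control variables per scenario instead of 24, which shows how such a scheme shrinks the optimization problem while keeping full resolution where decisions are implemented first.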

  19. Applying a Multi-Model Ensemble Method for Long-Term Runoff Prediction under Climate Change Scenarios for the Yellow River Basin, China

    Directory of Open Access Journals (Sweden)

    Linus Zhang

    2018-03-01

    Given the substantial impacts that are expected due to climate change, it is crucial that accurate rainfall–runoff results are provided for various decision-making purposes. However, these modeling results often generate uncertainty or bias due to the imperfect character of individual models. In this paper, a genetic algorithm together with a Bayesian model averaging method is employed to provide a multi-model ensemble (MME) and combined runoff prediction under climate change scenarios produced from eight rainfall–runoff models for the Yellow River Basin. The results show that the multi-model ensemble method, especially the genetic algorithm method, can produce more reliable predictions than the other considered rainfall–runoff models. These results show that it is possible to reduce the uncertainty and thus improve the accuracy for future projections using different models because an MME approach evens out the bias involved in the individual model. For the study area, the final combined predictions reveal that less runoff is expected under most climatic scenarios, which will threaten water security of the basin.

  20. JEnsembl: a version-aware Java API to Ensembl data systems.

    Science.gov (United States)

    Paterson, Trevor; Law, Andy

    2012-11-01

    The Ensembl Project provides release-specific Perl APIs for efficient high-level programmatic access to data stored in various Ensembl database schema. Although Perl scripts are perfectly suited for processing large volumes of text-based data, Perl is not ideal for developing large-scale software applications nor embedding in graphical interfaces. The provision of a novel Java API would facilitate type-safe, modular, object-orientated development of new Bioinformatics tools with which to access, analyse and visualize Ensembl data. The JEnsembl API implementation provides basic data retrieval and manipulation functionality from the Core, Compara and Variation databases for all species in Ensembl and EnsemblGenomes and is a platform for the development of a richer API to Ensembl datasources. The JEnsembl architecture uses a text-based configuration module to provide evolving, versioned mappings from database schema to code objects. A single installation of the JEnsembl API can therefore simultaneously and transparently connect to current and previous database instances (such as those in the public archive) thus facilitating better analysis repeatability and allowing 'through time' comparative analyses to be performed. Project development, released code libraries, Maven repository and documentation are hosted at SourceForge (http://jensembl.sourceforge.net).

  1. Advanced Atmospheric Ensemble Modeling Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Buckley, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Chiswell, S. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Kurzeja, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Maze, G. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Viner, B. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Werth, D. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL)

    2017-09-29

    Ensemble modeling (EM), the creation of multiple atmospheric simulations for a given time period, has become an essential tool for characterizing uncertainties in model predictions. We explore two novel ensemble modeling techniques: (1) perturbation of model parameters (Adaptive Programming, AP), and (2) data assimilation (Ensemble Kalman Filter, EnKF). The current research is an extension to work from last year and examines transport on a small spatial scale (<100 km) in complex terrain, for more rigorous testing of the ensemble technique. Two different release cases were studied, a coastal release (SF6) and an inland release (Freon) which consisted of two release times. Observations of tracer concentration and meteorology are used to judge the ensemble results. In addition, adaptive grid techniques have been developed to reduce required computing resources for transport calculations. Using a 20-member ensemble, the standard approach generated downwind transport that was quantitatively good for both releases; however, the EnKF method produced additional improvement for the coastal release where the spatial and temporal differences due to interior valley heating lead to the inland movement of the plume. The AP technique showed improvements for both release cases, with more improvement shown in the inland release. This research demonstrated that transport accuracy can be improved when models are adapted to a particular location/time or when important local data is assimilated into the simulation and enhances SRNL’s capability in atmospheric transport modeling in support of its current customer base and local site missions, as well as our ability to attract new customers within the intelligence community.

  2. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

    Energy Technology Data Exchange (ETDEWEB)

    Singh, Kunwar P., E-mail: kpsingh_52@yahoo.com; Gupta, Shikha

    2014-03-15

    Ensemble learning approach based decision treeboost (DTB) and decision tree forest (DTF) models are introduced in order to establish quantitative structure–toxicity relationship (QSTR) for the prediction of toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using Tanimoto similarity index. Stochastic gradient boosting and bagging algorithms supplemented DTB and DTF models were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was drawn to prediction ability and robustness of the models, investigated both in external and 10-fold cross validation processes. In complete data, optimal DTB and DTF models rendered accuracies of 98.90%, 98.83% in two-category and 98.14%, 98.14% in four-category toxicity classifications. Both the models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R²) of 0.945, 0.944 between the measured and predicted toxicities with mean squared errors (MSEs) of 0.059, and 0.064 in complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting toxicity of new chemicals for regulatory purposes. These approaches provide useful strategy and robust tools in the screening of ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the

  3. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

    International Nuclear Information System (INIS)

    Singh, Kunwar P.; Gupta, Shikha

    2014-01-01

    Ensemble learning approach based decision treeboost (DTB) and decision tree forest (DTF) models are introduced in order to establish quantitative structure–toxicity relationship (QSTR) for the prediction of toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using Tanimoto similarity index. Stochastic gradient boosting and bagging algorithms supplemented DTB and DTF models were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was drawn to prediction ability and robustness of the models, investigated both in external and 10-fold cross validation processes. In complete data, optimal DTB and DTF models rendered accuracies of 98.90%, 98.83% in two-category and 98.14%, 98.14% in four-category toxicity classifications. Both the models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R²) of 0.945, 0.944 between the measured and predicted toxicities with mean squared errors (MSEs) of 0.059, and 0.064 in complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting toxicity of new chemicals for regulatory purposes. These approaches provide useful strategy and robust tools in the screening of ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the

  4. Method of collective variables with reference system for the grand canonical ensemble

    International Nuclear Information System (INIS)

    Yukhnovskii, I.R.

    1989-01-01

    A method of collective variables with special reference system for the grand canonical ensemble is presented. An explicit form is obtained for the basis sixth-degree measure density needed to describe the liquid-gas phase transition. Here the author presents the fundamentals of the method, which are as follows: (1) the functional form for the partition function in the grand canonical ensemble; (2) derivation of thermodynamic relations for the coefficients of the Jacobian; (3) transition to the problem on an adequate lattice; and (4) obtaining of the explicit form for the functional of the partition function

  5. Squeezing of Collective Excitations in Spin Ensembles

    DEFF Research Database (Denmark)

    Kraglund Andersen, Christian; Mølmer, Klaus

    2012-01-01

    We analyse the possibility to create two-mode spin squeezed states of two separate spin ensembles by inverting the spins in one ensemble and allowing spin exchange between the ensembles via a near resonant cavity field. We investigate the dynamics of the system using a combination of numerical an...

  6. 3-D visualization of ensemble weather forecasts - Part 2: Forecasting warm conveyor belt situations for aircraft-based field campaigns

    Science.gov (United States)

    Rautenhaus, M.; Grams, C. M.; Schäfler, A.; Westermann, R.

    2015-02-01

    We present the application of interactive 3-D visualization of ensemble weather predictions to forecasting warm conveyor belt situations during aircraft-based atmospheric research campaigns. Motivated by forecast requirements of the T-NAWDEX-Falcon 2012 campaign, a method to predict 3-D probabilities of the spatial occurrence of warm conveyor belts has been developed. Probabilities are derived from Lagrangian particle trajectories computed on the forecast wind fields of the ECMWF ensemble prediction system. Integration of the method into the 3-D ensemble visualization tool Met.3D, introduced in the first part of this study, facilitates interactive visualization of WCB features and derived probabilities in the context of the ECMWF ensemble forecast. We investigate the sensitivity of the method with respect to trajectory seeding and forecast wind field resolution. Furthermore, we propose a visual analysis method to quantitatively analyse the contribution of ensemble members to a probability region and, thus, to assist the forecaster in interpreting the obtained probabilities. A case study, revisiting a forecast case from T-NAWDEX-Falcon, illustrates the practical application of Met.3D and demonstrates the use of 3-D and uncertainty visualization for weather forecasting and for planning flight routes in the medium forecast range (three to seven days before take-off).

  7. Ensemble Architecture for Prediction of Enzyme-ligand Binding Residues Using Evolutionary Information.

    Science.gov (United States)

    Pai, Priyadarshini P; Dattatreya, Rohit Kadam; Mondal, Sukanta

    2017-11-01

    Enzyme interactions with ligands are crucial for various biochemical reactions governing life. Over many years attempts to identify these residues for biotechnological manipulations have been made using experimental and computational techniques. The computational approaches have gathered impetus with the accruing availability of sequence and structure information, broadly classified into template-based and de novo methods. One of the predominant de novo methods using sequence information involves application of biological properties for supervised machine learning. Here, we propose a support vector machines-based ensemble for prediction of protein-ligand interacting residues using one of the most important discriminative contributing properties in the interacting residue neighbourhood, i.e., evolutionary information in the form of a position-specific scoring matrix (PSSM). The study has been performed on a non-redundant dataset comprising 9269 interacting and 91773 non-interacting residues for prediction model generation and further evaluation. Of the various PSSM-based models explored, the proposed method named ROBBY (pRediction Of Biologically relevant small molecule Binding residues on enzYmes) shows an accuracy of 84.0%, Matthews Correlation Coefficient of 0.343 and F-measure of 39.0% on 78 test enzymes. Further, the scope of adding domain knowledge such as pocket information has also been investigated; results showed significant enhancement in method precision. Findings are hoped to boost the reliability of small-molecule ligand interaction prediction for enzyme applications and drug design. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
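    The abstract reports a high accuracy alongside a much lower Matthews Correlation Coefficient and F-measure, which is typical when one class (here, non-interacting residues) dominates. The sketch below computes the three metrics from confusion-matrix counts using their standard definitions. The counts are made up for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all residues that are classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1), written in count form."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; defined as 0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up counts for an imbalanced problem: 100 positives, 900 negatives.
tp, fp, tn, fn = 50, 100, 800, 50
acc = accuracy(tp, fp, tn, fn)
f1 = f_measure(tp, fp, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
```

    With these illustrative counts the accuracy is high because the majority class is mostly right, while the F-measure and MCC stay modest because half of the positives are missed and most positive predictions are wrong.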

  8. Improvement of disease prediction and modeling through the use of meteorological ensembles: human plague in Uganda.

    Directory of Open Access Journals (Sweden)

    Sean M Moore

    Climate and weather influence the occurrence, distribution, and incidence of infectious diseases, particularly those caused by vector-borne or zoonotic pathogens. Thus, models based on meteorological data have helped predict when and where human cases are most likely to occur. Such knowledge aids in targeting limited prevention and control resources and may ultimately reduce the burden of diseases. Paradoxically, localities where such models could yield the greatest benefits, such as tropical regions where morbidity and mortality caused by vector-borne diseases is greatest, often lack high-quality in situ local meteorological data. Satellite- and model-based gridded climate datasets can be used to approximate local meteorological conditions in data-sparse regions, however their accuracy varies. Here we investigate how the selection of a particular dataset can influence the outcomes of disease forecasting models. Our model system focuses on plague (Yersinia pestis infection) in the West Nile region of Uganda. The majority of recent human cases have been reported from East Africa and Madagascar, where meteorological observations are sparse and topography yields complex weather patterns. Using an ensemble of meteorological datasets and model-averaging techniques we find that the number of suspected cases in the West Nile region was negatively associated with dry season rainfall (December-February) and positively with rainfall prior to the plague season. We demonstrate that ensembles of available meteorological datasets can be used to quantify climatic uncertainty and minimize its impacts on infectious disease models. These methods are particularly valuable in regions with sparse observational networks and high morbidity and mortality from vector-borne diseases.

  9. Improvement of Disease Prediction and Modeling through the Use of Meteorological Ensembles: Human Plague in Uganda

    Science.gov (United States)

    Moore, Sean M.; Monaghan, Andrew; Griffith, Kevin S.; Apangu, Titus; Mead, Paul S.; Eisen, Rebecca J.

    2012-01-01

    Climate and weather influence the occurrence, distribution, and incidence of infectious diseases, particularly those caused by vector-borne or zoonotic pathogens. Thus, models based on meteorological data have helped predict when and where human cases are most likely to occur. Such knowledge aids in targeting limited prevention and control resources and may ultimately reduce the burden of diseases. Paradoxically, localities where such models could yield the greatest benefits, such as tropical regions where morbidity and mortality caused by vector-borne diseases is greatest, often lack high-quality in situ local meteorological data. Satellite- and model-based gridded climate datasets can be used to approximate local meteorological conditions in data-sparse regions, however their accuracy varies. Here we investigate how the selection of a particular dataset can influence the outcomes of disease forecasting models. Our model system focuses on plague (Yersinia pestis infection) in the West Nile region of Uganda. The majority of recent human cases have been reported from East Africa and Madagascar, where meteorological observations are sparse and topography yields complex weather patterns. Using an ensemble of meteorological datasets and model-averaging techniques we find that the number of suspected cases in the West Nile region was negatively associated with dry season rainfall (December-February) and positively with rainfall prior to the plague season. We demonstrate that ensembles of available meteorological datasets can be used to quantify climatic uncertainty and minimize its impacts on infectious disease models. These methods are particularly valuable in regions with sparse observational networks and high morbidity and mortality from vector-borne diseases. PMID:23024750

  10. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2006-06-01

    Background: The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is the need for further research to improve the accuracy of prediction. Results: A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, among them a Gram-negative bacteria dataset, data for discriminating outer membrane proteins and an apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with imbalanced distribution of classes. Conclusion: This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies as empirical results indicate. The code for the software is available upon request.

  11. Ensembl 2002: accommodating comparative genomics.

    Science.gov (United States)

    Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

    2003-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.

  12. Noodles: a tool for visualization of numerical weather model ensemble uncertainty.

    Science.gov (United States)

    Sanyal, Jibonananda; Zhang, Song; Dyer, Jamie; Mercer, Andrew; Amburn, Philip; Moorhead, Robert J

    2010-01-01

    Numerical weather prediction ensembles are routinely used for operational weather forecasting. The members of these ensembles are individual simulations with either slightly perturbed initial conditions or different model parameterizations, or occasionally both. Multi-member ensemble output is usually large, multivariate, and challenging to interpret interactively. Forecast meteorologists are interested in understanding the uncertainties associated with numerical weather prediction; specifically variability between the ensemble members. Currently, visualization of ensemble members is mostly accomplished through spaghetti plots of a single mid-troposphere pressure surface height contour. In order to explore new uncertainty visualization methods, the Weather Research and Forecasting (WRF) model was used to create a 48-hour, 18 member parameterization ensemble of the 13 March 1993 "Superstorm". A tool was designed to interactively explore the ensemble uncertainty of three important weather variables: water-vapor mixing ratio, perturbation potential temperature, and perturbation pressure. Uncertainty was quantified using individual ensemble member standard deviation, inter-quartile range, and the width of the 95% confidence interval. Bootstrapping was employed to overcome the dependence on normality in the uncertainty metrics. A coordinated view of ribbon and glyph-based uncertainty visualization, spaghetti plots, iso-pressure colormaps, and data transect plots was provided to two meteorologists for expert evaluation. They found it useful in assessing uncertainty in the data, especially in finding outliers in the ensemble run and therefore avoiding the WRF parameterizations that lead to these outliers. Additionally, the meteorologists could identify spatial regions where the uncertainty was significantly high, allowing for identification of poorly simulated storm environments and physical interpretation of these model issues.
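    The abstract names three spread metrics (standard deviation, inter-quartile range, and the width of the 95% confidence interval) and says bootstrapping was used to avoid relying on normality. The sketch below shows one common way to compute such quantities at a single grid point, using a percentile bootstrap of the ensemble mean. The 18 member values, the choice of statistic, the resample count, and the function names are all assumptions for illustration; the paper's exact definitions may differ.

```python
import random
import statistics


def iqr(values):
    """Inter-quartile range of the ensemble values at one location."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def bootstrap_ci_width(values, stat=statistics.mean, n_resamples=2000,
                       level=0.95, seed=0):
    """Width of a percentile-bootstrap confidence interval for `stat`.

    Members are resampled with replacement, so no normality is assumed.
    """
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return stats[hi_idx] - stats[lo_idx]


# Hypothetical values of one variable from an 18-member ensemble at one point.
members = [4.1, 3.9, 4.4, 4.0, 5.2, 3.8, 4.3, 4.1, 4.6,
           3.7, 4.2, 4.0, 4.5, 3.9, 4.8, 4.1, 4.3, 6.0]

spread_std = statistics.stdev(members)
spread_iqr = iqr(members)
spread_ci = bootstrap_ci_width(members)
```

    Computing these three numbers per grid point yields the kind of scalar uncertainty fields that ribbon or glyph views can then encode.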

  13. A retrospective streamflow ensemble forecast for an extreme hydrologic event: a case study of Hurricane Irene and on the Hudson River basin

    Science.gov (United States)

    Saleh, Firas; Ramaswamy, Venkatsundar; Georgas, Nickitas; Blumberg, Alan F.; Pullen, Julie

    2016-07-01

    This paper investigates the uncertainties in hourly streamflow ensemble forecasts for an extreme hydrological event using a hydrological model forced with short-range ensemble weather prediction models. A state-of-the-art, automated, short-term hydrologic prediction framework was implemented using GIS and a regional scale hydrological model (HEC-HMS). The hydrologic framework was applied to the Hudson River basin (~36 000 km²) in the United States using gridded precipitation data from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) and was validated against streamflow observations from the United States Geological Survey (USGS). Finally, 21 precipitation ensemble members of the latest Global Ensemble Forecast System (GEFS/R) were forced into HEC-HMS to generate a retrospective streamflow ensemble forecast for an extreme hydrological event, Hurricane Irene. The work shows that ensemble stream discharge forecasts provide improved predictions and useful information about associated uncertainties, thus improving the assessment of risks when compared with deterministic forecasts. The uncertainties in weather inputs may result in false warnings and missed river flooding events, reducing the potential to effectively mitigate flood damage. The findings demonstrate how errors in the ensemble median streamflow forecast and time of peak, as well as the ensemble spread (uncertainty) are reduced 48 h pre-event by utilizing the ensemble framework. The methodology and implications of this work benefit efforts of short-term streamflow forecasts at regional scales, notably regarding the peak timing of an extreme hydrologic event when combined with a flood threshold exceedance diagram. Although the modeling framework was implemented on the Hudson River basin, it is flexible and applicable in other parts of the world where atmospheric reanalysis products and streamflow data are available.

  14. Visualizing uncertainties in a storm surge ensemble data assimilation and forecasting system

    KAUST Repository

    Hollt, Thomas; Altaf, Muhammad; Mandli, Kyle T.; Hadwiger, Markus; Dawson, Clint N.; Hoteit, Ibrahim

    2015-01-01

    allows the user to browse through the simulation ensembles in real time, view specific parameter settings or simulation models and move between different spatial and temporal regions without delay. In addition, our system provides advanced visualizations

  15. Sequential ensemble-based optimal design for parameter estimation

    Energy Technology Data Exchange (ETDEWEB)

    Man, Jun [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Zhang, Jiangjiang [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Li, Weixuan [Pacific Northwest National Laboratory, Richland Washington USA; Zeng, Lingzao [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Wu, Laosheng [Department of Environmental Sciences, University of California, Riverside California USA

    2016-10-01

    The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.
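    The abstract names three information metrics built from first- and second-order ensemble statistics but does not give their formulas. Under a Gaussian assumption, these quantities are commonly written as below; this is a sketch of the textbook forms, not necessarily the exact expressions used in the paper. The symbols are assumptions introduced here: \mu^f, P^f are the prior (forecast) ensemble mean and covariance, \mu^a, P^a the posterior (analysis) ones, and n the number of parameters.

```latex
\begin{align}
  \mathrm{SD}  &= \tfrac{1}{2}\,\ln\frac{\det P^{f}}{\det P^{a}}, \\
  \mathrm{DFS} &= \operatorname{tr}\!\left(I_{n} - P^{a}\,(P^{f})^{-1}\right), \\
  \mathrm{RE}  &= \tfrac{1}{2}\left[\ln\frac{\det P^{f}}{\det P^{a}}
                 + \operatorname{tr}\!\left(P^{a}\,(P^{f})^{-1}\right)
                 + (\mu^{a}-\mu^{f})^{\mathsf T}(P^{f})^{-1}(\mu^{a}-\mu^{f})
                 - n\right].
\end{align}
```

    In these forms SD and DFS depend only on the covariances, while RE also accounts for the shift in the mean; a candidate sampling design would be scored by the expected value of the chosen metric.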

  16. New approach to information fusion for Lipschitz classifiers ensembles: Application in multi-channel C-OTDR-monitoring systems

    Energy Technology Data Exchange (ETDEWEB)

    Timofeev, Andrey V.; Egorov, Dmitry V. [LPP “EqualiZoom”, Astana, 010000 (Kazakhstan)

    2016-06-08

    This paper presents new results concerning selection of an optimal information fusion formula for an ensemble of Lipschitz classifiers. The goal of information fusion is to create an integral classifier which could provide better generalization ability of the ensemble while achieving a practically acceptable level of effectiveness. The problem of information fusion is very relevant for data processing in multi-channel C-OTDR-monitoring systems. In this case we have to effectively classify targeted events which appear in the vicinity of the monitored object. The solution of this problem is based on the use of an ensemble of Lipschitz classifiers, each of which corresponds to a respective channel. We suggest a brand new method for information fusion for ensembles of Lipschitz classifiers. This method is called “The Weighing of Inversely as Lipschitz Constants” (WILC). Results of the practical use of the WILC method in multichannel C-OTDR monitoring systems are presented.
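    The abstract gives only the name of the fusion rule, which suggests that each channel's classifier is weighted inversely to its Lipschitz constant. The sketch below illustrates that reading and nothing more: the normalisation, the sign-based decision rule, the function names, and the example constants and scores are all assumptions, not the published WILC formula.

```python
def inverse_lipschitz_weights(lipschitz_constants):
    """Weights proportional to 1/L_i, normalised to sum to 1 (an assumption)."""
    inverses = [1.0 / c for c in lipschitz_constants]
    total = sum(inverses)
    return [w / total for w in inverses]


def fuse_scores(scores, weights):
    """Weighted sum of per-channel classifier scores."""
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical Lipschitz constants for three channel classifiers.
weights = inverse_lipschitz_weights([1.0, 2.0, 4.0])

# Hypothetical signed scores: channel 1 votes for the event, channels 2-3 against.
fused = fuse_scores([1.0, -1.0, -1.0], weights)
decision = 1 if fused > 0 else -1
```

    In this illustration the smoothest classifier (smallest constant) carries 4/7 of the weight and outvotes the other two channels combined.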

  17. Modality-Driven Classification and Visualization of Ensemble Variance

    Energy Technology Data Exchange (ETDEWEB)

    Bensema, Kevin; Gosink, Luke; Obermaier, Harald; Joy, Kenneth I.

    2016-10-01

    Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that are paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.

  18. Evaluation of the Plant-Craig stochastic convection scheme (v2.0) in the ensemble forecasting system MOGREPS-R (24 km) based on the Unified Model (v7.3)

    Science.gov (United States)

    Keane, Richard J.; Plant, Robert S.; Tennant, Warren J.

    2016-05-01

    The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic scheme only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.

  19. Ensemble Deep Learning for Biomedical Time Series Classification

    Directory of Open Access Journals (Sweden)

    Lin-peng Jin

    2016-01-01

    Ensemble learning has been shown, in both theory and practice, to improve generalization ability effectively. In this paper, we first briefly outline the current status of research on it. Then, a new deep neural network-based ensemble method that integrates filtering views, local views, distorted views, explicit training, implicit training, subview prediction, and Simple Average is proposed for biomedical time series classification. Finally, we validate its effectiveness on the Chinese Cardiovascular Disease Database, which contains a large number of electrocardiogram recordings. The experimental results show that the proposed method has certain advantages compared to some well-known ensemble methods, such as Bagging and AdaBoost.

  20. An automatic counting and recording system (1963); Ensemble de comptage a enregistrement automatique (1963)

    Energy Technology Data Exchange (ETDEWEB)

    Pierre, B [Commissariat a l' Energie Atomique, Saclay (France). Centre d' Etudes Nucleaires

    1961-09-15

    An automatic control, counting and programming system for the collection of single crystal diffractometry data was designed by the author for a neutron diffractometer in 1958 at C.E.N - Grenoble. A part of the whole instrument, 'The Automatic Counting and Recording System', is described in this paper. Its applications are numerous and extensive; e.g., the system was designed for a neutron diffractometer, but it can easily be adapted either for use with X-rays or for the measurement of mean life in β decay analysis. (author)

  1. Identification of Protein Pupylation Sites Using Bi-Profile Bayes Feature Extraction and Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Xiaowei Zhao

    2013-01-01

    Pupylation, one of the most important posttranslational modifications of proteins, typically takes place when prokaryotic ubiquitin-like protein (Pup) is attached to specific lysine residues on a target protein. Identification of pupylation substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of pupylation. Compared with labor-intensive and time-consuming experimental approaches, computational prediction of pupylation sites is highly desirable for its convenience and speed. In this study, a new bioinformatics tool named EnsemblePup was developed that uses an ensemble of support vector machine classifiers to predict pupylation sites. The key feature of EnsemblePup is its use of Bi-profile Bayes feature extraction as the encoding scheme. The performance of EnsemblePup was measured with a sensitivity of 79.49%, a specificity of 82.35%, an accuracy of 85.43%, and a Matthews correlation coefficient of 0.617 using 5-fold cross validation on the training dataset. When compared with other existing methods on a benchmark dataset, EnsemblePup provided better predictive performance, with a sensitivity of 80.00%, a specificity of 83.33%, an accuracy of 82.00%, and a Matthews correlation coefficient of 0.629. The experimental results suggest that EnsemblePup might be useful for identifying and annotating potential pupylation sites in proteins of interest. A web server for predicting pupylation sites was developed.

  2. An analog ensemble for short-term probabilistic solar power forecast

    International Nuclear Information System (INIS)

    Alessandrini, S.; Delle Monache, L.; Sperati, S.; Cervone, G.

    2015-01-01

    Highlights: • A novel method for solar power probabilistic forecasting is proposed. • The forecast accuracy does not depend on the nominal power. • The impact of climatology on forecast accuracy is evaluated. - Abstract: The energy produced by photovoltaic farms has a variable nature depending on astronomical and meteorological factors. The former are the solar elevation and the solar azimuth, which are easily predictable without any uncertainty. The amount of liquid water met by the solar radiation within the troposphere is the main meteorological factor influencing solar power production, as a fraction of short wave solar radiation is reflected by the water particles and cannot reach the earth surface. The total cloud cover is a meteorological variable often used to indicate the presence of liquid water in the troposphere and has a limited predictability, which is also reflected in the global horizontal irradiance and, as a consequence, in solar photovoltaic power prediction. This lack of predictability makes the integration of solar energy into the grid challenging. A cost-effective utilization of solar energy over a grid strongly depends on the accuracy and reliability of the power forecasts available to the Transmission System Operators (TSOs). Furthermore, several countries have legislation in place requiring solar power producers to pay penalties proportional to the errors of day-ahead energy forecasts, which makes the accuracy of such predictions a determining factor for producers to reduce their economic losses. Probabilistic predictions can provide accurate deterministic forecasts along with a quantification of their uncertainty, as well as a reliable estimate of the probability of exceeding a certain production threshold. In this paper we propose the application of an analog ensemble (AnEn) method to generate probabilistic solar power forecasts (SPF). The AnEn is based on an historical set of deterministic numerical weather prediction (NWP) model

  3. Comparison of surface freshwater fluxes from different climate forecasts produced through different ensemble generation schemes.

    Science.gov (United States)

    Romanova, Vanya; Hense, Andreas; Wahl, Sabrina; Brune, Sebastian; Baehr, Johanna

    2016-04-01

    The decadal variability of the surface net freshwater fluxes and its predictability are compared in a set of retrospective predictions, all using the same model setup and differing only in the implemented ocean initialisation method and ensemble generation method. The basic aim is to deduce the differences between the initialisation/ensemble generation methods in view of the uncertainty of the verifying observational data sets. The analysis will give an approximation of the uncertainties of the net freshwater fluxes, which up to now appear to be one of the most uncertain products in observational data and model outputs. All ensemble generation methods are implemented into the MPI-ESM earth system model in the framework of the ongoing MiKlip project (www.fona-miklip.de). Hindcast experiments are initialised annually between 2000-2004, and from each start year 10 ensemble members are initialised for 5 years each. Four different ensemble generation methods are compared: (i) a method based on the Anomaly Transform method (Romanova and Hense, 2015), in which the initial oceanic perturbations represent orthogonal and balanced anomaly structures in space and time and between the variables taken from a control run; (ii) one-day-lagged ocean states from the MPI-ESM-LR baseline system; (iii) one-day-lagged ocean and atmospheric states with preceding full-field nudging to re-analysis in both the atmospheric and the oceanic component of the system - the baseline one MPI-ESM-LR system; (iv) an Ensemble Kalman Filter (EnKF) implemented into the oceanic part of MPI-ESM (Brune et al. 2015), assimilating monthly subsurface oceanic temperature and salinity (EN3) using the Parallel Data Assimilation Framework (PDAF). The hindcasts are evaluated probabilistically using freshwater flux data from four different reanalysis data sets: MERRA, NCEP-R1, GFDL ocean reanalysis and GECCO2. The assessments show no clear differences in the evaluation scores on regional scales. However, on the

  4. Rainfall downscaling of weekly ensemble forecasts using self-organising maps

    Directory of Open Access Journals (Sweden)

    Masamichi Ohba

    2016-03-01

    This study presents an application of self-organising maps (SOMs) to downscaling medium-range ensemble forecasts and probabilistic prediction of local precipitation in Japan. SOM was applied to analyse and connect the relationship between atmospheric patterns over Japan and local high-resolution precipitation data. Multiple SOM was simultaneously employed on four variables derived from the JRA-55 reanalysis over the area of study (south-western Japan), and a two-dimensional lattice of weather patterns (WPs) was obtained. Weekly ensemble forecasts can be downscaled to local precipitation using the obtained multiple SOM. The downscaled precipitation is derived by the five SOM lattices based on the WPs of the global model ensemble forecasts for a particular day in 2009–2011. Because this method effectively handles the stochastic uncertainties from the large number of ensemble members, a probabilistic local precipitation is easily and quickly obtained from the ensemble forecasts. This downscaling of ensemble forecasts provides results better than those from a 20-km global spectral model (i.e. capturing the relatively detailed precipitation distribution over the region). To capture the effect of the detailed pattern differences in each SOM node, a statistical model is additionally constructed for each SOM node. The predictability skill of the ensemble forecasts is significantly improved under the neural network-statistics hybrid-downscaling technique, which then brings a much better skill score than the traditional method. It is expected that the results of this study will provide better guidance to the user community and contribute to the future development of dam-management models.

  5. Potential predictability and forecast skill in ensemble climate forecast: a skill-persistence rule

    Science.gov (United States)

    Jin, Yishuai; Rong, Xinyao; Liu, Zhengyu

    2017-12-01

    This study investigates the relationship between the forecast skills for the real world (actual skill) and perfect model (perfect skill) in ensemble climate model forecasts with a series of fully coupled general circulation model forecast experiments. It is found that the actual skill for sea surface temperature (SST) in seasonal forecasts is substantially higher than the perfect skill over a large part of the tropical oceans, especially the tropical Indian Ocean and the central-eastern Pacific Ocean. The higher actual skill is found to be related to the higher observational SST persistence, suggesting a skill-persistence rule: a higher SST persistence in the real world than in the model could overwhelm the model bias to produce a higher forecast skill for the real world than for the perfect model. The relation between forecast skill and persistence is further proved using a first-order autoregressive model (AR1), analytically for theoretical solutions and numerically for analogue experiments. The AR1 model study shows that the skill-persistence rule is strictly valid in the case of infinite ensemble size, but could be distorted by sampling errors and non-AR1 processes. This study suggests that the so-called "perfect skill" is model dependent and cannot serve as an accurate estimate of the true upper limit of real world prediction skill, unless the model can capture at least the persistence property of the observation.
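
    The link between persistence and skill in an AR1 process can be illustrated with a short numerical sketch. It relies only on the textbook property that a stationary AR1 series with coefficient a has lag-k autocorrelation a**k, which is also the correlation skill of a persistence forecast at lead k. The coefficients 0.5 and 0.9, the series length, and the function names are illustrative assumptions, not values or code from the study.

```python
import numpy as np


def simulate_ar1(coefficient, n_steps, rng):
    """Simulate x[t] = coefficient * x[t-1] + white noise (hypothetical helper)."""
    series = np.zeros(n_steps)
    noise = rng.standard_normal(n_steps)
    for t in range(1, n_steps):
        series[t] = coefficient * series[t - 1] + noise[t]
    return series


def persistence_skill(series, lead):
    """Correlation between the series and itself shifted by `lead` steps.

    This is the correlation skill of forecasting x[t + lead] with x[t].
    """
    return float(np.corrcoef(series[:-lead], series[lead:])[0, 1])


rng = np.random.default_rng(0)
low_persistence = simulate_ar1(0.5, 20000, rng)   # stands in for a less persistent series
high_persistence = simulate_ar1(0.9, 20000, rng)  # stands in for a more persistent series

for lead in (1, 2, 3):
    print(
        lead,
        round(persistence_skill(low_persistence, lead), 3),
        round(persistence_skill(high_persistence, lead), 3),
        "theory:", round(0.5 ** lead, 3), round(0.9 ** lead, 3),
    )
```

    In this toy setting the more persistent series is more predictable at every lead, which is the qualitative behaviour the skill-persistence rule describes. The sketch does not reproduce the ensemble or model-bias aspects of the study.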

  6. MVL spatiotemporal analysis for model intercomparison in EPS: application to the DEMETER multi-model ensemble

    Science.gov (United States)

    Fernández, J.; Primo, C.; Cofiño, A. S.; Gutiérrez, J. M.; Rodríguez, M. A.

    2009-08-01

    In a recent paper, Gutiérrez et al. (Nonlinear Process Geophys 15(1):109-114, 2008) introduced a new characterization of spatiotemporal error growth—the so called mean-variance logarithmic (MVL) diagram—and applied it to study ensemble prediction systems (EPS); in particular, they analyzed single-model ensembles obtained by perturbing the initial conditions. In the present work, the MVL diagram is applied to multi-model ensembles analyzing also the effect of model formulation differences. To this aim, the MVL diagram is systematically applied to the multi-model ensemble produced in the EU-funded DEMETER project. It is shown that the shared building blocks (atmospheric and ocean components) impose similar dynamics among different models and, thus, contribute to poorly sampling the model formulation uncertainty. This dynamical similarity should be taken into account, at least as a pre-screening process, before applying any objective weighting method.

  7. Three-dimensional visualization of ensemble weather forecasts - Part 2: Forecasting warm conveyor belt situations for aircraft-based field campaigns

    Science.gov (United States)

    Rautenhaus, M.; Grams, C. M.; Schäfler, A.; Westermann, R.

    2015-07-01

    We present the application of interactive three-dimensional (3-D) visualization of ensemble weather predictions to forecasting warm conveyor belt situations during aircraft-based atmospheric research campaigns. Motivated by forecast requirements of the T-NAWDEX-Falcon 2012 (THORPEX - North Atlantic Waveguide and Downstream Impact Experiment) campaign, a method to predict 3-D probabilities of the spatial occurrence of warm conveyor belts (WCBs) has been developed. Probabilities are derived from Lagrangian particle trajectories computed on the forecast wind fields of the European Centre for Medium Range Weather Forecasts (ECMWF) ensemble prediction system. Integration of the method into the 3-D ensemble visualization tool Met.3D, introduced in the first part of this study, facilitates interactive visualization of WCB features and derived probabilities in the context of the ECMWF ensemble forecast. We investigate the sensitivity of the method with respect to trajectory seeding and grid spacing of the forecast wind field. Furthermore, we propose a visual analysis method to quantitatively analyse the contribution of ensemble members to a probability region and, thus, to assist the forecaster in interpreting the obtained probabilities. A case study, revisiting a forecast case from T-NAWDEX-Falcon, illustrates the practical application of Met.3D and demonstrates the use of 3-D and uncertainty visualization for weather forecasting and for planning flight routes in the medium forecast range (3 to 7 days before take-off).

  8. Modeling polydispersive ensembles of diamond nanoparticles

    International Nuclear Information System (INIS)

    Barnard, Amanda S

    2013-01-01

    While significant progress has been made toward production of monodispersed samples of a variety of nanoparticles, in cases such as diamond nanoparticles (nanodiamonds) a significant degree of polydispersivity persists, so scaling-up of laboratory applications to industrial levels has its challenges. In many cases, however, monodispersivity is not essential for reliable application, provided that the inevitable uncertainties are just as predictable as the functional properties. As computational methods of materials design are becoming more widespread, there is a growing need for robust methods for modeling ensembles of nanoparticles, that capture the structural complexity characteristic of real specimens. In this paper we present a simple statistical approach to modeling of ensembles of nanoparticles, and apply it to nanodiamond, based on sets of individual simulations that have been carefully selected to describe specific structural sources that are responsible for scattering of fundamental properties, and that are typically difficult to eliminate experimentally. For the purposes of demonstration we show how scattering in the Fermi energy and the electronic band gap are related to different structural variations (sources), and how these results can be combined strategically to yield statistically significant predictions of the properties of an entire ensemble of nanodiamonds, rather than merely one individual ‘model’ particle or a non-representative sub-set. (paper)

  9. A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting

    Science.gov (United States)

    Niu, Mingfei; Wang, Yufang; Sun, Shaolong; Li, Yongwu

    2016-06-01

    To enhance prediction reliability and accuracy, a hybrid model based on the promising principle of "decomposition and ensemble" and a recently proposed meta-heuristic called grey wolf optimizer (GWO) is introduced for daily PM2.5 concentration forecasting. Compared with existing PM2.5 forecasting methods, this proposed model has improved the prediction accuracy and hit rates of directional prediction. The proposed model involves three main steps, i.e., decomposing the original PM2.5 series into several intrinsic mode functions (IMFs) via complementary ensemble empirical mode decomposition (CEEMD) for simplifying the complex data; individually predicting each IMF with support vector regression (SVR) optimized by GWO; integrating all predicted IMFs for the ensemble result as the final prediction by another SVR optimized by GWO. Seven benchmark models, including single artificial intelligence (AI) models, other decomposition-ensemble models with different decomposition methods and models with the same decomposition-ensemble method but optimized by different algorithms, are considered to verify the superiority of the proposed hybrid model. The empirical study indicates that the proposed hybrid decomposition-ensemble model is remarkably superior to all considered benchmark models for its higher prediction accuracy and hit rates of directional prediction.

  10. JuPOETs: a constrained multiobjective optimization approach to estimate biochemical model ensembles in the Julia programming language.

    Science.gov (United States)

    Bassen, David M; Vilkhovoy, Michael; Minot, Mason; Butcher, Jonathan T; Varner, Jeffrey D

    2017-01-25

    Ensemble modeling is a promising approach for obtaining robust predictions and coarse grained population behavior in deterministic mathematical models. Ensemble approaches address model uncertainty by using parameter or model families instead of single best-fit parameters or fixed model structures. Parameter ensembles can be selected based upon simulation error, along with other criteria such as diversity or steady-state performance. Simulations using parameter ensembles can estimate confidence intervals on model variables, and robustly constrain model predictions, despite having many poorly constrained parameters. In this software note, we present a multiobjective based technique to estimate parameter or model ensembles, the Pareto Optimal Ensemble Technique in the Julia programming language (JuPOETs). JuPOETs integrates simulated annealing with Pareto optimality to estimate ensembles on or near the optimal tradeoff surface between competing training objectives. We demonstrate JuPOETs on a suite of multiobjective problems, including test functions with parameter bounds and system constraints as well as for the identification of a proof-of-concept biochemical model with four conflicting training objectives. JuPOETs identified optimal or near optimal solutions approximately six-fold faster than a corresponding implementation in Octave for the suite of test functions. For the proof-of-concept biochemical model, JuPOETs produced an ensemble of parameters that captured the mean of the training data for conflicting data sets, while simultaneously estimating parameter sets that performed well on each of the individual objective functions. JuPOETs is a promising approach for the estimation of parameter and model ensembles using multiobjective optimization. JuPOETs can be adapted to solve many problem types, including mixed binary and continuous variable types, bilevel optimization problems and constrained problems without altering the base algorithm. JuPOETs is open
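
    The notion of Pareto optimality that the abstract refers to can be sketched in a few lines. The example below is written in Python rather than Julia and is not taken from JuPOETs. It uses one common definition of Pareto rank (the number of other candidates that dominate a given candidate, with all objectives minimized); whether JuPOETs uses exactly this definition is an assumption, and the objective values are made up for illustration.

```python
def dominates(first, second):
    """True if `first` is no worse than `second` in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(a <= b for a, b in zip(first, second))
    strictly_better = any(a < b for a, b in zip(first, second))
    return no_worse and strictly_better


def pareto_ranks(objective_vectors):
    """Rank of each candidate = number of candidates that dominate it.

    Rank 0 means the candidate lies on the tradeoff surface of this set.
    """
    return [
        sum(dominates(other, candidate) for other in objective_vectors)
        for candidate in objective_vectors
    ]


# Hypothetical training errors for five parameter sets on two objectives.
errors = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
ranks = pareto_ranks(errors)
front = [e for e, r in zip(errors, ranks) if r == 0]
print(ranks)
print(front)
```

    An ensemble "on or near" the tradeoff surface would then correspond to keeping candidates whose rank is at or below some small cutoff; the choice of cutoff is left open here.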

  11. Nuclear multifragmentation within the framework of different statistical ensembles

    International Nuclear Information System (INIS)

    Aguiar, C.E.; Donangelo, R.; Souza, S.R.

    2006-01-01

    The sensitivity of the statistical multifragmentation model to the underlying statistical assumptions is investigated. We concentrate on its microcanonical, canonical, and isobaric formulations. As far as average values are concerned, our results reveal that all the ensembles make very similar predictions, as long as the relevant macroscopic variables (such as temperature, excitation energy, and breakup volume) are the same in all statistical ensembles. It also turns out that the multiplicity dependence of the breakup volume in the microcanonical version of the model mimics a system at (approximately) constant pressure, at least in the plateau region of the caloric curve. However, in contrast to average values, our results suggest that the distributions of physical observables are quite sensitive to the statistical assumptions. This finding may help in deciding which hypothesis corresponds to the best picture for the freeze-out stage

  12. Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 2: Generalization in time and space

    Directory of Open Access Journals (Sweden)

    D. Brochero

    2011-11-01

    An uncertainty cascade model applied to stream flow forecasting seeks to evaluate the different sources of uncertainty of the complex rainfall-runoff process. The current trend focuses on the combination of Meteorological Ensemble Prediction Systems (MEPS) and hydrological model(s). However, the number of members of such a HEPS may rapidly increase to a level that may not be operationally sustainable. This paper evaluates the generalization ability of a simplification scheme of an 800-member HEPS formed by the combination of 16 lumped rainfall-runoff models with the 50 perturbed members from the European Centre for Medium-range Weather Forecasts (ECMWF) EPS. Tests are made at two levels. At the local level, the transferability of the 9th day hydrological member selection to the other 8 forecast horizons exhibits an 82% success rate. At the regional or cluster level, the transferability from one catchment to another within a cluster of watersheds also leads to a good performance (85% success rate), especially for forecast time horizons above 3 days and when the basins that formed the cluster themselves performed well on an individual basis. Diversity, defined as hydrological model complementarity addressing different aspects of a forecast, was identified as the critical factor for proper selection applications.
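
    The backward greedy selection named in the title can be sketched as follows: start from the full ensemble and repeatedly drop the member whose removal gives the best score for the remaining subset. The sketch below is a minimal illustration under stated assumptions and not the authors' implementation. In particular, the score used here (mean squared error of the ensemble mean) is a simple stand-in for whatever criterion the paper optimizes, and the synthetic data and all function names are hypothetical.

```python
import numpy as np


def ensemble_mean_mse(forecasts, observations, members):
    """Mean squared error of the mean of the selected members (stand-in score)."""
    mean_forecast = forecasts[members].mean(axis=0)
    return float(np.mean((mean_forecast - observations) ** 2))


def backward_greedy_selection(forecasts, observations, n_keep):
    """Remove members one at a time, always dropping the member whose
    removal leaves the subset with the lowest score."""
    members = list(range(forecasts.shape[0]))
    while len(members) > n_keep:
        best_score, member_to_drop = None, None
        for candidate in members:
            remaining = [m for m in members if m != candidate]
            score = ensemble_mean_mse(forecasts, observations, remaining)
            if best_score is None or score < best_score:
                best_score, member_to_drop = score, candidate
        members.remove(member_to_drop)
    return members


# Synthetic example: members 0-2 track the observations, members 3-5 are biased.
rng = np.random.default_rng(42)
observations = np.sin(np.linspace(0.0, 6.0, 50))
good = observations + 0.1 * rng.standard_normal((3, 50))
biased = observations + 5.0 + 0.1 * rng.standard_normal((3, 50))
forecasts = np.vstack([good, biased])

selected = backward_greedy_selection(forecasts, observations, n_keep=3)
print(selected)
```

    Each pass evaluates every remaining member once, so reducing N members to k costs on the order of N squared score evaluations, which is one reason a greedy scheme is attractive for an 800-member system compared with testing all subsets.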

  13. Downscaling Satellite Data for Predicting Catchment-scale Root Zone Soil Moisture with Ground-based Sensors and an Ensemble Kalman Filter

    Science.gov (United States)

    Lin, H.; Baldwin, D. C.; Smithwick, E. A. H.

    2015-12-01

    Predicting root zone (0-100 cm) soil moisture (RZSM) content at a catchment-scale is essential for drought and flood predictions, irrigation planning, weather forecasting, and many other applications. Satellites, such as the NASA Soil Moisture Active Passive (SMAP), can estimate near-surface (0-5 cm) soil moisture content globally at coarse spatial resolutions. We develop a hierarchical Ensemble Kalman Filter (EnKF) data assimilation modeling system to downscale satellite-based near-surface soil moisture and to estimate RZSM content across the Shale Hills Critical Zone Observatory at a 1-m resolution in combination with ground-based soil moisture sensor data. In this example, a simple infiltration model within the EnKF-model has been parameterized for 6 soil-terrain units to forecast daily RZSM content in the catchment from 2009 - 2012 based on AMSRE. LiDAR-derived terrain variables define intra-unit RZSM variability using a novel covariance localization technique. This method also allows the mapping of uncertainty with our RZSM estimates for each time-step. A catchment-wide satellite-to-surface downscaling parameter, which nudges the satellite measurement closer to in situ near-surface data, is also calculated for each time-step. We find significant differences in predicted root zone moisture storage for different terrain units across the experimental time-period. Root mean square error from a cross-validation analysis of RZSM predictions using an independent dataset of catchment-wide in situ Time-Domain Reflectometry (TDR) measurements ranges from 0.060-0.096 cm3 cm-3, and the RZSM predictions are significantly (p < 0.05) correlated with TDR measurements [r = 0.47-0.68]. The predictive skill of this data assimilation system is similar to the Penn State Integrated Hydrologic Modeling (PIHM) system. Uncertainty estimates are significantly (p < 0.05) correlated to cross validation error during wet and dry conditions, but more so in dry summer seasons. 
Developing an
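
    For readers unfamiliar with the Ensemble Kalman Filter, the analysis (update) step can be sketched as below. This is the plain stochastic, perturbed-observation EnKF with a linear observation operator; it does not include the hierarchical structure or the covariance localization described in the abstract. The two-variable state (near-surface and root-zone moisture), the numerical values, and the function name are illustrative assumptions only.

```python
import numpy as np


def enkf_analysis(ensemble, observation, obs_operator, obs_error_var, rng):
    """Stochastic EnKF analysis step.

    ensemble:      (n_members, n_state) prior state ensemble
    observation:   (n_obs,) observed values
    obs_operator:  (n_obs, n_state) linear map from state to observation space
    obs_error_var: scalar observation error variance
    """
    n_members = ensemble.shape[0]
    n_obs = observation.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    state_cov = anomalies.T @ anomalies / (n_members - 1)
    obs_cov = obs_error_var * np.eye(n_obs)
    innovation_cov = obs_operator @ state_cov @ obs_operator.T + obs_cov
    gain = state_cov @ obs_operator.T @ np.linalg.inv(innovation_cov)
    # Each member is updated against its own noisy copy of the observation.
    perturbed_obs = observation + rng.normal(
        0.0, np.sqrt(obs_error_var), size=(n_members, n_obs)
    )
    innovations = perturbed_obs - ensemble @ obs_operator.T
    return ensemble + innovations @ gain.T


rng = np.random.default_rng(1)
# Hypothetical prior: near-surface moisture and a correlated root-zone value.
surface = rng.normal(0.20, 0.05, size=200)
root_zone = 0.25 + 0.8 * (surface - 0.20) + rng.normal(0.0, 0.01, size=200)
prior = np.column_stack([surface, root_zone])

observe_surface_only = np.array([[1.0, 0.0]])
posterior = enkf_analysis(
    prior, np.array([0.35]), observe_surface_only, 0.02 ** 2, rng
)
print(prior.mean(axis=0), posterior.mean(axis=0))
```

    Only the near-surface variable is observed, yet the root-zone estimate also shifts, because the update spreads the observation through the ensemble covariance between the two variables. That is the basic mechanism by which surface observations can inform deeper layers in such a system.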

  14. Random matrix ensembles for PT-symmetric systems

    International Nuclear Information System (INIS)

    Graefe, Eva-Maria; Mudute-Ndumbe, Steve; Taylor, Matthew

    2015-01-01

    Recently much effort has been made towards the introduction of non-Hermitian random matrix models respecting PT-symmetry. Here we show that there is a one-to-one correspondence between complex PT-symmetric matrices and split-complex and split-quaternionic versions of Hermitian matrices. We introduce two new random matrix ensembles of (a) Gaussian split-complex Hermitian; and (b) Gaussian split-quaternionic Hermitian matrices, of arbitrary sizes. We conjecture that these ensembles represent universality classes for PT-symmetric matrices. For the case of 2 × 2 matrices we derive analytic expressions for the joint probability distributions of the eigenvalues, the one-level densities and the level spacings in the case of real eigenvalues. (fast track communication)

  15. Impacto da utilização de previsões "defasadas" no sistema de previsão de tempo por conjunto do CPTEC/INPE The impact of using lagged forecasts on the CPTEC/INPE ensemble prediction system

    Directory of Open Access Journals (Sweden)

    Lúcia Helena Ribas Machado

    2010-03-01

    improves the performance of the operational ensemble, contributing to increase the ensemble spread and, consequently, to reduce the under-dispersion of the system. We also observed that the lagged average forecast (LAF) shows performance similar to that of the operational EPS-CPTEC/INPE and that there is a tendency to higher performance when the forecast spread is low, for 5 and 7 day forecasts. These results provide the basis for the operational implementation of the LAF technique, which has low computational cost, and contribute to a more efficient utilization of the CPTEC/INPE ensemble predictions.

  16. The Advantage of Using International Multimodel Ensemble for Seasonal Precipitation Forecast over Israel

    Directory of Open Access Journals (Sweden)

    Amir Givati

    2017-01-01

    This study analyzes the results of monthly and seasonal precipitation forecasting from seven different global climate forecast models for major basins in Israel within October–April 1982–2010. The six National Multimodel Ensemble (NMME) models and the ECMWF seasonal model were used to calculate an International Multimodel Ensemble (IMME). The study presents the performance of both monthly and seasonal predictions of precipitation accumulated over three months, with respect to different lead times for the ensemble mean values, one per individual model. Additionally, we analyzed the performance of different combinations of models. We present verification of seasonal forecasting using real forecasts, focusing on a small domain characterized by complex terrain, high annual precipitation variability, and a sharp precipitation gradient from west to east as well as from south to north. The results in this study show that, in general, the monthly analysis does not provide very accurate results, even when using the IMME for one-month lead time. We found that the IMME outperformed any single model prediction. Our analysis indicates that the optimal combinations with the high correlation values contain at least three models. Moreover, predictions with a larger number of models in the ensemble are more robust. The results obtained in this study highlight the advantages of using an ensemble of global models over single models for a small domain.

  17. An operational hydrological ensemble prediction system for the city of Zurich (Switzerland): skill, case studies and scenarios

    Directory of Open Access Journals (Sweden)

    N. Addor

    2011-07-01

    The Sihl River flows through Zurich, Switzerland's most populated city, for which it represents the largest flood threat. To anticipate extreme discharge events and provide decision support in case of flood risk, a hydrometeorological ensemble prediction system (HEPS) was launched operationally in 2008. This model chain relies on limited-area atmospheric forecasts provided by the deterministic model COSMO-7 and the probabilistic model COSMO-LEPS. These atmospheric forecasts are used to force a semi-distributed hydrological model (PREVAH), coupled to a hydraulic model (FLORIS). The resulting hydrological forecasts are eventually communicated to the stakeholders involved in the Sihl discharge management. This fully operational setting provides a real framework with which to compare the potential of deterministic and probabilistic discharge forecasts for flood mitigation.

    To study the suitability of HEPS for small-scale basins and to quantify the added value conveyed by the probability information, a reforecast was made for the period June 2007 to December 2009 for the Sihl catchment (336 km²). Several metrics support the conclusion that the performance gain can be of up to 2 days lead time for the catchment considered. Brier skill scores show that overall COSMO-LEPS-based hydrological forecasts outperform their COSMO-7-based counterparts for all the lead times and event intensities considered. The small size of the Sihl catchment does not prevent skillful discharge forecasts, but makes them particularly dependent on correct precipitation forecasts, as shown by comparisons with a reference run driven by observed meteorological parameters. Our evaluation stresses that the capacity of the model to provide confident and reliable mid-term probability forecasts for high discharges is limited. The two most intense events of the study period are investigated utilising a novel graphical representation of probability forecasts, and are used

  18. Visualization of uncertainty and ensemble data: Exploration of climate modeling and weather forecast data with integrated ViSUS-CDAT systems

    International Nuclear Information System (INIS)

    Potter, Kristin; Pascucci, Valerio; Johnson, Chris; Wilson, Andrew; Bremer, Peer-Timo; Williams, Dean; Doutriaux, Charles

    2009-01-01

    Climate scientists and meteorologists are working towards a better understanding of atmospheric conditions and global climate change. To explore the relationships present in numerical predictions of the atmosphere, ensemble datasets are produced that combine time- and spatially-varying simulations generated using multiple numeric models, sampled input conditions, and perturbed parameters. These data sets mitigate as well as describe the uncertainty present in the data by providing insight into the effects of parameter perturbation, sensitivity to initial conditions, and inconsistencies in model outcomes. As such, massive amounts of data are produced, creating challenges both in data analysis and in visualization. This work presents an approach to understanding ensembles by using a collection of statistical descriptors to summarize the data, and displaying these descriptors using a variety of visualization techniques which are familiar to domain experts. The resulting techniques are integrated into the ViSUS/Climate Data and Analysis Tools (CDAT) system designed to provide a directly accessible, complex visualization framework to atmospheric researchers.

  19. Surface drift prediction in the Adriatic Sea using hyper-ensemble statistics on atmospheric, ocean and wave models: Uncertainties and probability distribution areas

    Science.gov (United States)

    Rixen, M.; Ferreira-Coelho, E.; Signell, R.

    2008-01-01

    Despite numerous and regular improvements in underlying models, surface drift prediction in the ocean remains a challenging task because of our yet limited understanding of all processes involved. Hence, deterministic approaches to the problem are often limited by empirical assumptions on underlying physics. Multi-model hyper-ensemble forecasts, which exploit the power of an optimal local combination of available information including ocean, atmospheric and wave models, may show superior forecasting skills when compared to individual models because they allow for local correction and/or bias removal. In this work, we explore in greater detail the potential and limitations of the hyper-ensemble method in the Adriatic Sea, using a comprehensive surface drifter database. The performance of the hyper-ensembles and the individual models are discussed by analyzing associated uncertainties and probability distribution maps. Results suggest that the stochastic method may reduce position errors significantly for 12 to 72 h forecasts and hence compete with pure deterministic approaches. © 2007 NATO Undersea Research Centre (NURC).

  20. Monthly ENSO Forecast Skill and Lagged Ensemble Size

    Science.gov (United States)

    Trenary, L.; DelSole, T.; Tippett, M. K.; Pegion, K.

    2018-04-01

    The mean square error (MSE) of a lagged ensemble of monthly forecasts of the Niño 3.4 index from the Climate Forecast System (CFSv2) is examined with respect to ensemble size and configuration. Although the real-time forecast is initialized 4 times per day, it is possible to infer the MSE for arbitrary initialization frequency and for burst ensembles by fitting error covariances to a parametric model and then extrapolating to arbitrary ensemble size and initialization frequency. Applying this method to real-time forecasts, we find that the MSE consistently reaches a minimum for a lagged ensemble size between one and eight days, when four initializations per day are included. This ensemble size is consistent with the 8-10 day lagged ensemble configuration used operationally. Interestingly, the skill of both ensemble configurations is close to the estimated skill of the infinite ensemble. The skill of the weighted, lagged, and burst ensembles is found to be comparable. Certain unphysical features of the estimated error growth were tracked down to problems with the climatology and data discontinuities.

  1. A meteo-hydrological prediction system based on a multi-model approach for precipitation forecasting

    Directory of Open Access Journals (Sweden)

    S. Davolio

    2008-02-01

    The precipitation forecasted by a numerical weather prediction model, even at high resolution, suffers from errors which can be considerable at the scales of interest for hydrological purposes. In the present study, a fraction of the uncertainty related to meteorological prediction is taken into account by implementing a multi-model forecasting approach, aimed at providing multiple precipitation scenarios driving the same hydrological model. Therefore, the estimation of that uncertainty associated with the quantitative precipitation forecast (QPF), conveyed by the multi-model ensemble, can be exploited by the hydrological model, propagating the error into the hydrological forecast.

    The proposed meteo-hydrological forecasting system is implemented and tested in a real-time configuration for several episodes of intense precipitation affecting the Reno river basin, a medium-sized basin located in northern Italy (Apennines). These episodes are associated with flood events of different intensity and are representative of different meteorological configurations responsible for severe weather affecting the northern Apennines.

    The simulation results show that the coupled system is promising in the prediction of discharge peaks (both in terms of amount and timing) for warning purposes. The ensemble hydrological forecasts provide a range of possible flood scenarios that proved to be useful for the support of civil protection authorities in their decision.

  2. HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source.

    Science.gov (United States)

    Wan, Shixiang; Duan, Yucong; Zou, Quan

    2017-09-01

    Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ a series of machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, multiple locations of particular proteins imply that there are vital and unique biological significances that deserve special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary, but never employed. For solving these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied for multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. An automatic counting and recording system (1963); Ensemble de comptage a enregistrement automatique (1963)

    Energy Technology Data Exchange (ETDEWEB)

    Pierre, B. [Commissariat a l' Energie Atomique, Saclay (France). Centre d' Etudes Nucleaires

    1961-09-15

    An automatic control, counting and programming system for the collection of single crystal diffractometry data was designed by the author for a neutron diffractometer in 1958 at C.E.N - Grenoble. A part of the whole instrument, 'The Automatic Counting and Recording System', is described in this paper. Its applications are numerous and extensive, e.g.: the system has been designed for a neutron diffractometer, but it can easily be adapted either for use with X-rays or measurement of mean life in β decay analysis. (author)

  4. Online sequential condition prediction method of natural circulation systems based on EOS-ELM and phase space reconstruction

    International Nuclear Information System (INIS)

    Chen, Hanying; Gao, Puzhen; Tan, Sichao; Tang, Jiguo; Yuan, Hongsheng

    2017-01-01

    Highlights: • An online condition prediction method for natural circulation systems in NPP was proposed based on EOS-ELM. • The proposed online prediction method was validated using experimental data. • The training speed of the proposed method is very fast. • The proposed method can achieve good accuracy in a wide parameter range. Abstract: Natural circulation design is widely used in the passive safety systems of advanced nuclear power reactors. Irregular and chaotic flow oscillations are often observed in boiling natural circulation systems, so it is difficult for operators to monitor and predict the condition of these systems. An online condition forecasting method for natural circulation systems is proposed in this study as an assisting technique for plant operators. The proposed prediction approach was developed based on Ensemble of Online Sequential Extreme Learning Machine (EOS-ELM) and phase space reconstruction. Online Sequential Extreme Learning Machine (OS-ELM) is an online sequential learning neural network algorithm and EOS-ELM is the ensemble method of it. The proposed condition prediction method can be initiated by a small chunk of monitoring data and it can be updated by newly arrived data at very fast speed during the online prediction. Simulation experiments were conducted on the data of two natural circulation loops to validate the performance of the proposed method. The simulation results show that the proposed prediction model can successfully recognize different types of flow oscillations and accurately forecast the trend of monitored plant variables. The influence of the number of hidden nodes and neural network inputs on prediction performance was studied and the proposed model can achieve good accuracy in a wide parameter range. Moreover, the comparison results show that the proposed condition prediction method has much faster online learning speed and better prediction accuracy than a conventional neural network model.
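
    The phase space reconstruction mentioned in this abstract is commonly carried out by time-delay embedding. The following is a minimal sketch under that assumption; the embedding dimension `dim`, the delay `tau`, the forecast `horizon`, and both function names are hypothetical and are not taken from the paper. It only shows how a monitored time series could be turned into input/target pairs for a sequential learner such as OS-ELM; the learner itself is not implemented here.

```python
def delay_embed(series, dim, tau):
    """Build delay vectors [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)]."""
    start = (dim - 1) * tau  # first index with a full history available
    return [[series[t - k * tau] for k in range(dim)]
            for t in range(start, len(series))]


def make_training_pairs(series, dim, tau, horizon=1):
    """Pair each delay vector with the value `horizon` steps ahead of it."""
    start = (dim - 1) * tau
    pairs = []
    for i, vector in enumerate(delay_embed(series, dim, tau)):
        target_index = start + i + horizon
        if target_index < len(series):  # drop vectors without a future value
            pairs.append((vector, series[target_index]))
    return pairs


# Toy monitored signal; real plant data would replace this list.
signal = [0, 1, 2, 3, 4, 5, 6, 7]
pairs = make_training_pairs(signal, dim=3, tau=2)
```

    In an online setting, pairs like these would be fed to the learner chunk by chunk as new measurements arrive.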

  5. Modelling Laser Milling of Microcavities for the Manufacturing of DES with Ensembles

    Directory of Open Access Journals (Sweden)

    Pedro Santos

    2014-01-01

    A set of designed experiments, involving the use of a pulsed Nd:YAG laser system milling 316L Stainless Steel, serves to study the laser-milling process of microcavities in the manufacture of drug-eluting stents (DES). Diameter, depth, and volume error are considered to be optimized as functions of the process parameters, which include laser intensity, pulse frequency, and scanning speed. Two different DES shapes are studied that combine semispheres and cylinders. Process inputs and outputs are defined by considering the process parameters that can be changed under industrial conditions and the industrial requirements of this manufacturing process. In total, 162 different conditions are tested in a process that is modeled with the following state-of-the-art data-mining regression techniques: Support Vector Regression, Ensembles, Artificial Neural Networks, Linear Regression, and Nearest Neighbor Regression. Ensemble regression emerged as the most suitable technique for studying this industrial problem. Specifically, Iterated Bagging ensembles with unpruned model trees outperformed the other methods in the tests. This method can predict the geometrical dimensions of the machined microcavities with relative errors related to the main average value in the range of 3 to 23%, which are considered very accurate predictions, in view of the characteristics of this innovative industrial task.

  6. Mechanisms of appearance of amplitude and phase chimera states in ensembles of nonlocally coupled chaotic systems

    Science.gov (United States)

    Bogomolov, Sergey A.; Slepnev, Andrei V.; Strelkova, Galina I.; Schöll, Eckehard; Anishchenko, Vadim S.

    2017-02-01

    We explore the bifurcation transition from coherence to incoherence in ensembles of nonlocally coupled chaotic systems. It is first shown that two types of chimera states, namely, amplitude and phase, can be found in a network of coupled logistic maps, while only amplitude chimera states can be observed in a ring of continuous-time chaotic systems. We reveal a bifurcation mechanism by analyzing the evolution of space-time profiles and the coupling function with varying coupling coefficient and formulate the necessary and sufficient conditions for realizing the chimera states in the ensembles.
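
    A network of nonlocally coupled logistic maps of the kind described here can be sketched as follows. The abstract does not give the equations, so this assumes a commonly used form: a ring of N maps f(x) = a x (1 - x), each coupled with strength `sigma` to its `P` nearest neighbours on either side. The parameter values and function names are illustrative assumptions, not the authors' settings.

```python
import random


def logistic(x, a):
    """Logistic map f(x) = a * x * (1 - x)."""
    return a * x * (1.0 - x)


def step(state, a, sigma, P):
    """Advance a ring of maps by one iteration with nonlocal coupling.

    Assumed update rule:
    x_i(t+1) = f(x_i) + sigma / (2P) * sum_{j=i-P}^{i+P} [f(x_j) - f(x_i)]
    """
    n = len(state)
    fx = [logistic(x, a) for x in state]
    new_state = []
    for i in range(n):
        # Periodic boundary: indices wrap around the ring.
        coupling = sum(fx[(i + k) % n] - fx[i] for k in range(-P, P + 1))
        new_state.append(fx[i] + sigma / (2 * P) * coupling)
    return new_state


def simulate(n=100, a=3.8, sigma=0.3, P=32, steps=500, seed=0):
    """Iterate from random initial conditions and return the final profile."""
    rng = random.Random(seed)
    state = [rng.random() for _ in range(n)]
    for _ in range(steps):
        state = step(state, a, sigma, P)
    return state
```

    Plotting the returned profile against the node index for several values of `sigma` is one way to look for coexisting coherent and incoherent regions; whether chimera states appear depends on the parameters and initial conditions.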

  7. Evaluation of ensemble precipitation forecasts generated through post-processing in a Canadian catchment

    Science.gov (United States)

    Jha, Sanjeev K.; Shrestha, Durga L.; Stadnyk, Tricia A.; Coulibaly, Paulin

    2018-03-01

    Flooding in Canada is often caused by heavy rainfall during the snowmelt period. Hydrologic forecast centers rely on precipitation forecasts obtained from numerical weather prediction (NWP) models to force hydrological models for streamflow forecasting. The uncertainties in raw quantitative precipitation forecasts (QPFs) are enhanced by physiography and orography effects over a diverse landscape, particularly in the western catchments of Canada. A Bayesian post-processing approach called rainfall post-processing (RPP), developed in Australia (Robertson et al., 2013; Shrestha et al., 2015), has been applied to assess its forecast performance in a Canadian catchment. Raw QPFs obtained from two sources, Global Ensemble Forecasting System (GEFS) Reforecast 2 project, from the National Centers for Environmental Prediction, and Global Deterministic Prediction System (GDPS), from Environment and Climate Change Canada, are used in this study. The study period from January 2013 to December 2015 covered a major flood event in Calgary, Alberta, Canada. Post-processed results show that the RPP is able to remove the bias and reduce the errors of both GEFS and GDPS forecasts. Ensembles generated from the RPP reliably quantify the forecast uncertainty.

  8. Verification and process oriented validation of the MiKlip decadal prediction system

    Directory of Open Access Journals (Sweden)

    Frank Kaspar

    2016-12-01

    Decadal prediction systems are designed to become a valuable tool for decision making in different sectors of economy, administration or politics. Progress in decadal predictions is also expected to improve our scientific understanding of the climate system. The German Federal Ministry for Education and Research (BMBF) therefore funds the German national research project MiKlip (Mittelfristige Klimaprognosen). A network of German research institutions contributes to the development of the system by conducting individual research projects. This special issue presents a collection of papers with results of the evaluation activities within the first phase of MiKlip. They document the improvements of the MiKlip decadal prediction system which were achieved during the first phase. Key aspects are the role of initialization strategies, model resolution or ensemble size. Additional topics are the evaluation of specific weather parameters in selected regions and the use of specific observational datasets for the evaluation.

  9. Lessons from Climate Modeling on the Design and Use of Ensembles for Crop Modeling

    Science.gov (United States)

    Wallach, Daniel; Mearns, Linda O.; Ruane, Alexander C.; Roetter, Reimund P.; Asseng, Senthold

    2016-01-01

    Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions include defining criteria for acceptance of models in a crop multi-model ensemble (MME), exploring criteria for evaluating the degree of relatedness of models in an MME, studying the effect of number of models in the ensemble, development of a statistical model of model sampling, creation of a repository for MME results, studies of possible differential weighting of models in an ensemble, creation of single model ensembles based on sampling from the uncertainty distribution of parameter values or inputs specifically oriented toward uncertainty estimation, the creation of super ensembles that sample more than one source of uncertainty, the analysis of super ensemble results to obtain information on total uncertainty and the separate contributions of different sources of uncertainty, and finally further investigation of the use of the multi-model mean or median as a predictor.

  10. Creating Weather System Ensembles Through Synergistic Process Modeling and Machine Learning

    Science.gov (United States)

    Chen, B.; Posselt, D. J.; Nguyen, H.; Wu, L.; Su, H.; Braverman, A. J.

    2017-12-01

    Earth's weather and climate are sensitive to a variety of control factors (e.g., initial state, forcing functions, etc.). Characterizing the response of the atmosphere to a change in initial conditions or model forcing is critical for weather forecasting (ensemble prediction) and climate change assessment. Input-response relationships can be quantified by generating an ensemble of multiple (100s to 1000s) realistic realizations of weather and climate states. Atmospheric numerical models generate simulated data through discretized numerical approximation of the partial differential equations (PDEs) governing the underlying physics. However, the computational expense of running high resolution atmospheric state models makes generation of more than a few simulations infeasible. Here, we discuss an experiment wherein we approximate the numerical PDE solver within the Weather Research and Forecasting (WRF) Model using neural networks trained on a subset of model run outputs. Once trained, these neural nets can produce a large number of realizations of weather states from a small number of deterministic simulations with speeds that are orders of magnitude faster than the underlying PDE solver. Our neural network architecture is inspired by the governing partial differential equations. These equations are location-invariant, and consist of first and second derivatives. As such, we use a 3x3 lon-lat grid of atmospheric profiles as the predictor in the neural net to provide the network the information necessary to compute the first and second moments. Results indicate that the neural network algorithm can approximate the PDE outputs with a high degree of accuracy (less than 1% error), and that this error increases as a function of the prediction time lag.

  11. Stochastic Approaches Within a High Resolution Rapid Refresh Ensemble

    Science.gov (United States)

    Jankov, I.

    2017-12-01

    It is well known that global and regional numerical weather prediction (NWP) ensemble systems are under-dispersive, producing unreliable and overconfident ensemble forecasts. Typical approaches to alleviate this problem include the use of multiple dynamic cores, multiple physics suite configurations, or a combination of the two. While these approaches may produce desirable results, they have practical and theoretical deficiencies and are more difficult and costly to maintain. An active area of research that promotes a more unified and sustainable system is the use of stochastic physics. Stochastic approaches include Stochastic Parameter Perturbations (SPP), Stochastic Kinetic Energy Backscatter (SKEB), and Stochastic Perturbation of Physics Tendencies (SPPT). The focus of this study is to assess model performance within a convection-permitting ensemble at 3-km grid spacing across the Contiguous United States (CONUS) using a variety of stochastic approaches. A single physics suite configuration based on the operational High-Resolution Rapid Refresh (HRRR) model was utilized and ensemble members were produced by employing stochastic methods. Parameter perturbations (using SPP) for select fields were employed in the Rapid Update Cycle (RUC) land surface model (LSM) and Mellor-Yamada-Nakanishi-Niino (MYNN) Planetary Boundary Layer (PBL) schemes. Within MYNN, SPP was applied to sub-grid cloud fraction, mixing length, roughness length, mass fluxes and Prandtl number. In the RUC LSM, SPP was applied to hydraulic conductivity and tested perturbing soil moisture at initial time. First, iterative testing was conducted to assess the initial performance of several configuration settings (e.g. variety of spatial and temporal de-correlation lengths). Upon selection of the most promising candidate configurations using SPP, a 10-day time period was run and more robust statistics were gathered. SKEB and SPPT were included in additional retrospective tests to assess the impact of using

  12. Ensemble models on palaeoclimate to predict India's groundwater challenge

    Directory of Open Access Journals (Sweden)

    Partha Sarathi Datta

    2013-09-01

    In many parts of the world, the freshwater crisis is largely due to increasing water consumption and pollution by a rapidly growing population and aspirations for economic development, but it is usually ascribed to the climate. However, limited understanding of the factors controlling climate and uncertainties in climate models make it difficult to assess the probable impacts on water availability in tropical regions. In this context, a review of ensemble models on δ¹⁸O and δD in rainfall and groundwater, ³H and ¹⁴C ages of groundwater and ¹⁴C ages of lake sediments helped to reconstruct palaeoclimate and long-term recharge in North-west India and to predict the future groundwater challenge. The annual mean temperature trend indicates both warming and cooling in different parts of India in the past and during 1901–2010. Neither the GCMs (Global Climate Models) nor the observational record indicates any significant change/increase in temperature and rainfall over the last century, or climate change during the last 1200 yrs BP. In much of the North-West region, deep groundwater renewal occurred from past humid climate, and shallow groundwater renewal from limited modern recharge over the past decades. To make water management more responsive to climate change, the gaps in the science of climate change need to be bridged.

  13. Predictability of Precipitation Over the Conterminous U.S. Based on the CMIP5 Multi-Model Ensemble

    Science.gov (United States)

    Jiang, Mingkai; Felzer, Benjamin S.; Sahagian, Dork

    2016-01-01

    Characterizing precipitation seasonality and variability in the face of future uncertainty is important for a well-informed climate change adaptation strategy. Using the Colwell index of predictability and monthly normalized precipitation data from the Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model ensembles, this study identifies spatial hotspots of changes in precipitation predictability in the United States under various climate scenarios. Over the historic period (1950–2005), the recurrent pattern of precipitation is highly predictable in the East and along the coastal Northwest, and is less so in the arid Southwest. Comparing the future (2040–2095) to the historic period, larger changes in precipitation predictability are observed under Representative Concentration Pathways (RCP) 8.5 than those under RCP 4.5. Finally, there are region-specific hotspots of future changes in precipitation predictability, and these hotspots often coincide with regions of little projected change in total precipitation, with exceptions along the wetter East and parts of the drier central West. Therefore, decision-makers are advised to not rely on future total precipitation as an indicator of water resources. Changes in precipitation predictability and the subsequent changes on seasonality and variability are equally, if not more, important factors to be included in future regional environmental assessment. PMID:27425819
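
    The Colwell index used in this study can be sketched from its standard definition (Colwell, 1974): predictability P is the sum of constancy C and contingency M, all computed from entropies of a contingency table whose rows are states (for example precipitation classes) and whose columns are time periods within the cycle (for example months). How the normalized precipitation is binned into states is not given in the abstract, so the sketch below takes an already-built count table as input; the function names are hypothetical.

```python
import math


def entropy(counts, total):
    """Shannon entropy (natural log) of counts, ignoring empty cells."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def colwell(table):
    """Return (predictability, constancy, contingency) for a count table.

    table[i][j] is the number of cycles in which state i (row) was
    observed in time period j (column). At least two states are required.
    """
    n_states = len(table)
    total = sum(sum(row) for row in table)
    column_totals = [sum(column) for column in zip(*table)]
    row_totals = [sum(row) for row in table]

    h_time = entropy(column_totals, total)    # H(X): time periods
    h_state = entropy(row_totals, total)      # H(Y): states
    h_joint = entropy([c for row in table for c in row], total)  # H(XY)

    log_s = math.log(n_states)
    constancy = 1.0 - h_state / log_s
    contingency = (h_time + h_state - h_joint) / log_s
    return constancy + contingency, constancy, contingency


# Toy table: 2 precipitation states x 4 periods, 5 years of data per period.
example = [[5, 4, 1, 0],
           [0, 1, 4, 5]]
predictability, constancy, contingency = colwell(example)
```

    A table with all counts in one row gives full constancy, while a table in which each period always maps to its own state gives full contingency; both cases yield a predictability of one.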

  14. Forecasting skills of the ensemble hydro-meteorological system for the Po river floods

    Science.gov (United States)

    Ricciardi, Giuseppe; Montani, Andrea; Paccagnella, Tiziana; Pecora, Silvano; Tonelli, Fabrizio

    2013-04-01

    The Po basin is the largest and most economically important river-basin in Italy. Extreme hydrological events, including floods, flash floods and droughts, are expected to become more severe in the near future due to climate change, and related ground effects are linked both with environmental and social resilience. A Warning Operational Center (WOC) for hydrological event management was created in Emilia Romagna region. In the last years, the WOC faced challenges in legislation, organization, technology and economics, achieving improvements in forecasting skill and information dissemination. Since 2005, an operational forecasting and modelling system for flood modelling and forecasting has been implemented, aimed at supporting and coordinating flood control and emergency management on the whole Po basin. This system, referred to as FEWSPo, has also taken care of environmental aspects of flood forecast. The FEWSPo system has reached a very high level of complexity, due to the combination of three different hydrological-hydraulic chains (HEC-HMS/RAS, MIKE11 NAM/HD, Topkapi/Sobek), with several meteorological inputs (forecasted, such as COSMO-I2, COSMO-I7 and COSMO-LEPS among others, and observed). In this hydrological and meteorological ensemble the management of the relative predictive uncertainties, which have to be established and communicated to decision makers, is a debated scientific and social challenge. Real time activities face professional, modelling and technological aspects but are also strongly interrelated with organization and human aspects. The authors will report a case study using the operational flood forecast hydro-meteorological ensemble, provided by the MIKE11 chain fed by COSMO-LEPS EQPF. The basic aim of the proposed approach is to analyse limits and opportunities of the long term forecast (with a lead time ranging from 3 to 5 days), for the implementation of low cost actions, also looking for a well informed decision making and the improvement of

  15. Hybrid vs Adaptive Ensemble Kalman Filtering for Storm Surge Forecasting

    Science.gov (United States)

    Altaf, M. U.; Raboudi, N.; Gharamti, M. E.; Dawson, C.; McCabe, M. F.; Hoteit, I.

    2014-12-01

    Recent storm surge events due to hurricanes in the Gulf of Mexico have motivated efforts to accurately forecast water levels. Toward this goal, a parallel architecture has been implemented based on a high resolution storm surge model, ADCIRC. However, the accuracy of the model notably depends on the quality and the recentness of the input data (mainly winds and bathymetry), model parameters (e.g. wind and bottom drag coefficients), and the resolution of the model grid. Given all these uncertainties in the system, the challenge is to build an efficient prediction system capable of providing accurate forecasts far enough ahead of time for the authorities to evacuate the areas at risk. We have developed an ensemble-based data assimilation system to frequently assimilate available data into the ADCIRC model in order to improve the accuracy of the model. In this contribution we study and analyze the performances of different ensemble Kalman filter methodologies for efficient short-range storm surge forecasting, the aim being to produce the most accurate forecasts at the lowest possible computing time. Using Hurricane Ike meteorological data to force the ADCIRC model over a domain including the Gulf of Mexico coastline, we implement and compare the forecasts of the standard EnKF, the hybrid EnKF and an adaptive EnKF. The last two schemes have been introduced as efficient tools for enhancing the behavior of the EnKF when implemented with small ensembles by exploiting information from a static background covariance matrix. Covariance inflation and localization are implemented in all these filters. Our results suggest that both the hybrid and the adaptive approach provide significantly better forecasts than those resulting from the standard EnKF, even when implemented with much smaller ensembles.
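
    For orientation, the analysis step of a standard stochastic (perturbed-observation) EnKF can be sketched as below for a single scalar observation that is a linear combination of the state. This is a textbook-style illustration, not the authors' ADCIRC implementation: the function name, the list-based state layout and the weight vector `h` are assumptions, and inflation, localization and the hybrid or adaptive use of a static background covariance are all omitted.

```python
import random


def mean(values):
    return sum(values) / len(values)


def enkf_update(ensemble, h, obs, obs_var, rng):
    """Update each member with one scalar observation y = h . x + noise.

    ensemble: list of state vectors (lists of floats), at least two members
    h:        observation weights, same length as a state vector
    obs:      observed value; obs_var: observation error variance
    rng:      random.Random instance used to perturb the observation
    """
    n_members = len(ensemble)
    n_state = len(ensemble[0])

    # Predicted observation for every member.
    predicted = [sum(w * x for w, x in zip(h, member)) for member in ensemble]
    pred_mean = mean(predicted)
    state_mean = [mean([m[i] for m in ensemble]) for i in range(n_state)]

    # Sample variance of predicted observations and state/observation covariance.
    var_hx = sum((p - pred_mean) ** 2 for p in predicted) / (n_members - 1)
    cov_x_hx = [
        sum((m[i] - state_mean[i]) * (p - pred_mean)
            for m, p in zip(ensemble, predicted)) / (n_members - 1)
        for i in range(n_state)
    ]

    # Kalman gain for a scalar observation: no matrix inversion needed.
    gain = [c / (var_hx + obs_var) for c in cov_x_hx]

    analysis = []
    for member, p in zip(ensemble, predicted):
        perturbed_obs = obs + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed_obs - p
        analysis.append([x + k * innovation for x, k in zip(member, gain)])
    return analysis
```

    With a small ensemble, the sample covariances in `cov_x_hx` become noisy, which is the situation the hybrid and adaptive schemes in the abstract are meant to improve by drawing on a static background covariance.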

  16. DART: New Research Using Ensemble Data Assimilation in Geophysical Models

    Science.gov (United States)

    Hoar, T. J.; Raeder, K.

    2015-12-01

    The Data Assimilation Research Testbed (DART) is a community facility for ensemble data assimilation developed and supported by the National Center for Atmospheric Research. DART provides a comprehensive suite of software, documentation, and tutorials that can be used for ensemble data assimilation research, operations, and education. Scientists and software engineers at NCAR are available to support DART users who want to use existing DART products or develop their own applications. Current DART users range from university professors teaching data assimilation, to individual graduate students working with simple models, through national laboratories doing operational prediction with large state-of-the-art models. DART runs efficiently on many computational platforms ranging from laptops through thousands of cores on the newest supercomputers. This poster focuses on several recent research activities using DART with geophysical models. Using CAM/DART to understand whether OCO-2 Total Precipitable Water observations can be useful in numerical weather prediction. Impacts of the synergistic use of Infra-red CO retrievals (MOPITT, IASI) in CAM-CHEM/DART assimilations. Assimilation and Analysis of Observations of Amazonian Biomass Burning Emissions by MOPITT (aerosol optical depth), MODIS (carbon monoxide) and MISR (plume height). Long term evaluation of the chemical response of MOPITT-CO assimilation in CAM-CHEM/DART OSSEs for satellite planning and emission inversion capabilities. Improved forward observation operators for land models that have multiple land use/land cover segments in a single grid cell. Simulating mesoscale convective systems (MCSs) using a variable resolution, unstructured grid in the Model for Prediction Across Scales (MPAS) and DART. The mesoscale WRF+DART system generated an ensemble of year-long, real-time initializations of a convection allowing model over the United States. Constraining WACCM with observations in the tropical band (30S-30N) using DART

  17. Seasonal Climate Predictability in a Coupled OAGCM Using a Different Approach for Ensemble Forecasts.

    Science.gov (United States)

    Luo, Jing-Jia; Masson, Sebastien; Behera, Swadhin; Shingu, Satoru; Yamagata, Toshio

    2005-11-01

    Predictabilities of tropical climate signals are investigated using a relatively high resolution Scale Interaction Experiment Frontier Research Center for Global Change (FRCGC) coupled GCM (SINTEX-F). Five ensemble forecast members are generated by perturbing the model’s coupling physics, which accounts for the uncertainties of both initial conditions and model physics. Because of the model’s good performance in simulating the climatology and ENSO in the tropical Pacific, a simple coupled SST-nudging scheme generates realistic thermocline and surface wind variations in the equatorial Pacific. Several westerly and easterly wind bursts in the western Pacific are also captured. Hindcast results for the period 1982–2001 show a high predictability of ENSO. All past El Niño and La Niña events, including the strongest 1997/98 warm episode, are successfully predicted with the anomaly correlation coefficient (ACC) skill scores above 0.7 at the 12-month lead time. The predicted signals of some particular events, however, become weak with a delay in the phase at mid and long lead times. This is found to be related to the intraseasonal wind bursts that are unpredicted beyond a few months of lead time. The model forecasts also show a “spring prediction barrier” similar to that in observations. Spatial SST anomalies, teleconnection, and global drought/flood during three different phases of ENSO are successfully predicted at 9–12-month lead times. In the tropical North Atlantic and southwestern Indian Ocean, where ENSO has predominant influences, the model shows skillful predictions at the 7–12-month lead times. The distinct signal of the Indian Ocean dipole (IOD) event in 1994 is predicted at the 6-month lead time. SST anomalies near the western coast of Australia are also predicted beyond the 12-month lead time because of pronounced decadal signals there.
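
    The anomaly correlation coefficient quoted in this record can be illustrated with a minimal sketch. It assumes the centered form of the ACC (other conventions exist), and the function name and SST values are hypothetical, not taken from the study.

```python
import math

def anomaly_correlation(forecast, observed, climatology):
    """Centered anomaly correlation between a forecast and an observed series."""
    # Anomalies are departures from the same climatology for both series.
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    f_mean = sum(f_anom) / len(f_anom)
    o_mean = sum(o_anom) / len(o_anom)
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(f_anom, o_anom))
    f_var = sum((f - f_mean) ** 2 for f in f_anom)
    o_var = sum((o - o_mean) ** 2 for o in o_anom)
    return cov / math.sqrt(f_var * o_var)

# Hypothetical SST values (degrees C) at five verification times.
climatology = [27.0, 27.0, 27.0, 27.0, 27.0]
observed = [27.5, 28.2, 26.4, 26.9, 27.8]
forecast = [27.3, 27.9, 26.8, 27.1, 27.6]
acc = anomaly_correlation(forecast, observed, climatology)
print(round(acc, 3))
```

    A forecast that tracks the sign and relative size of the observed anomalies scores close to 1 even if its amplitude is damped, which is why ACC is usually reported alongside an error measure.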

  18. On Ensemble Nonlinear Kalman Filtering with Symmetric Analysis Ensembles

    KAUST Repository

    Luo, Xiaodong; Hoteit, Ibrahim; Moroz, Irene M.

    2010-01-01

    However, by adopting the Monte Carlo method, the EnSRF also incurs certain sampling errors. One way to alleviate this problem is to introduce certain symmetry to the ensembles, which can reduce the sampling errors and spurious modes in evaluation of the means and covariances of the ensembles [7]. In this contribution, we present two methods to produce symmetric ensembles. One is based on the unscented transform [8, 9], which leads to the unscented Kalman filter (UKF) [8, 9] and its variant, the ensemble unscented Kalman filter (EnUKF) [7]. The other is based on Stirling’s interpolation formula (SIF), which results in the divided difference filter (DDF) [10]. Here we propose a simplified divided difference filter (sDDF) in the context of ensemble filtering. The similarity and difference between the sDDF and the EnUKF will be discussed. Numerical experiments will also be conducted to investigate the performance of the sDDF and the EnUKF, and compare them to a well‐established EnSRF, the ensemble transform Kalman filter (ETKF) [2].

  19. Supersymmetry applied to the spectrum edge of random matrix ensembles

    International Nuclear Information System (INIS)

    Andreev, A.V.; Simons, B.D.; Taniguchi, N.

    1994-01-01

    A new matrix ensemble has recently been proposed to describe the transport properties in mesoscopic quantum wires. Both analytical and numerical studies have shown that the ensemble of Laguerre or of chiral random matrices provides a good description of scattering properties in this class of systems. Until now only conventional methods of random matrix theory have been used to study statistical properties within this ensemble. We demonstrate that the supersymmetry method, already employed in the study of Dyson ensembles, can be extended to treat this class of random matrix ensembles. In developing this approach we investigate new statistical measures and verify known ones. Although we focus on ensembles in which T-invariance is violated, our approach lays the foundation for future studies of T-invariant systems. (orig.)

  20. Reconstruction of the 1997/1998 El Nino from TOPEX/POSEIDON and TOGA/TAO Data Using a Massively Parallel Pacific-Ocean Model and Ensemble Kalman Filter

    Science.gov (United States)

    Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.

    1999-01-01

    Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
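
    The idea of recomputing error statistics from the ensemble at each assimilation cycle can be sketched for a single scalar state observed directly. This is an illustrative deterministic (square-root) variant of the ensemble Kalman update, not the NSIPP implementation; the function name and numbers are assumptions.

```python
import math
from statistics import mean, variance

def ensemble_update(members, obs, obs_var):
    """Update a scalar ensemble with one direct observation (H = 1)."""
    prior_mean = mean(members)
    prior_var = variance(members)              # error variance estimated from the ensemble
    gain = prior_var / (prior_var + obs_var)   # scalar Kalman gain
    post_mean = prior_mean + gain * (obs - prior_mean)
    # Scale the anomalies so the analysis variance equals (1 - gain) * prior_var.
    shrink = math.sqrt(1.0 - gain)
    return [post_mean + shrink * (m - prior_mean) for m in members]

# Hypothetical ocean temperatures (degrees C) from five model runs.
prior = [24.0, 25.0, 26.0, 27.0, 28.0]
posterior = ensemble_update(prior, obs=27.0, obs_var=2.5)
print(mean(posterior), variance(posterior))
```

    Because the prior variance here equals the observation variance, the gain is 0.5: the analysis mean moves halfway toward the observation and the ensemble variance is halved.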

  1. Climate Prediction Center (CPC) Ensemble Canonical Correlation Analysis Forecast of Temperature

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) temperature forecast is a 90-day (seasonal) outlook of US surface temperature anomalies. The ECCA uses Canonical...

  2. Prediction skill of rainstorm events over India in the TIGGE weather prediction models

    Science.gov (United States)

    Karuna Sagar, S.; Rajeevan, M.; Vijaya Bhaskara Rao, S.; Mitra, A. K.

    2017-12-01

    Extreme rainfall events can lead to severe floods in many countries worldwide. Advance prediction of their occurrence and spatial distribution is therefore essential. In this paper, the skill of numerical weather prediction models in predicting rainstorms over India is assessed. Using a gridded daily rainfall data set and objective criteria, 15 rainstorms were identified during the monsoon season (June to September). The analysis was made using three TIGGE (The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble) models: the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Centers for Environmental Prediction (NCEP) and the UK Met Office (UKMO). The TIGGE models were verified for 43 observed rainstorm days from 15 rainstorm events during the period 2007-2015. The comparison reveals that rainstorm events are predictable up to 5 days in advance, although with a bias in spatial distribution and intensity. Statistical parameters such as mean error (ME) or bias, root mean square error (RMSE) and correlation coefficient (CC) have been computed over the rainstorm region using the multi-model ensemble (MME) mean. The study reveals that the spread is large in ECMWF and UKMO, followed by the NCEP model. Though the ensemble spread is quite small in NCEP, the ensemble member averages are not well predicted. The rank histograms suggest that the forecasts are underpredicting. The modified Contiguous Rain Area (CRA) technique was used to verify the spatial as well as the quantitative skill of the TIGGE models. Overall, the displacement and pattern errors are found to contribute more to the total RMSE. The volume error increases from the 24 hr forecast to the 48 hr forecast in all three models.
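
    Two of the verification tools named in this record, the rank histogram and the ME/RMSE of the ensemble mean, can be sketched as follows. The rainfall values and function names are hypothetical and only illustrate how underprediction shows up in these diagnostics.

```python
import math

def rank_histogram(ensembles, observations):
    """Count how often the observation takes each rank among the members."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members below the observation
        counts[rank] += 1
    return counts

def mean_error(forecasts, observations):
    """Mean error (bias): negative when forecasts are too low on average."""
    return sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)

def rmse(forecasts, observations):
    """Root mean square error of the forecasts."""
    sq = sum((f - o) ** 2 for f, o in zip(forecasts, observations))
    return math.sqrt(sq / len(forecasts))

# Hypothetical daily rainfall (mm): four cases, three members each.
ensembles = [[10.0, 14.0, 18.0], [30.0, 35.0, 42.0],
             [5.0, 8.0, 9.0], [50.0, 61.0, 70.0]]
observations = [22.0, 48.0, 7.0, 95.0]
ens_means = [sum(m) / len(m) for m in ensembles]

hist = rank_histogram(ensembles, observations)
bias = mean_error(ens_means, observations)
error = rmse(ens_means, observations)
print(hist, round(bias, 2), round(error, 2))
```

    In this toy case the observation exceeds every member in three of four cases, so the counts pile up in the last bin and the mean error is negative, the pattern usually read as underprediction.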

  3. Decadal climate prediction (project GCEP).

    Science.gov (United States)

    Haines, Keith; Hermanson, Leon; Liu, Chunlei; Putt, Debbie; Sutton, Rowan; Iwi, Alan; Smith, Doug

    2009-03-13

    Decadal prediction uses climate models forced by changing greenhouse gases, as in the Intergovernmental Panel on Climate Change, but unlike longer range predictions they also require initialization with observations of the current climate. In particular, the upper-ocean heat content and circulation have a critical influence. Decadal prediction is still in its infancy and there is an urgent need to understand the important processes that determine predictability on these timescales. We have taken the first Hadley Centre Decadal Prediction System (DePreSys) and implemented it on several NERC institute compute clusters in order to study a wider range of initial condition impacts on decadal forecasting, eventually including the state of the land and cryosphere. The eScience methods are used to manage submission and output from the many ensemble model runs required to assess predictive skill. Early results suggest initial condition skill may extend for several years, even over land areas, but this depends sensitively on the definition used to measure skill, and alternatives are presented. The Grid for Coupled Ensemble Prediction (GCEP) system will allow the UK academic community to contribute to international experiments being planned to explore decadal climate predictability.

  4. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria.

    Science.gov (United States)

    Chevrette, Marc G; Aicheler, Fabian; Kohlbacher, Oliver; Currie, Cameron R; Medema, Marnix H

    2017-10-15

    Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3).

  5. A Standardized Evaluation System for Decadal Climate Prediction

    Science.gov (United States)

    Kadow, C.; Cubasch, U.

    2012-12-01

    The evaluation of decadal prediction systems is both a scientific and a technical challenge in climate research. The major project MiKlip (www.fona-miklip.de) for medium-term climate prediction, funded by the Federal Ministry of Education and Research in Germany (BMBF), aims to create a model system that can provide reliable decadal forecasts on climate and weather. The model system to be developed will be novel in several aspects, with great challenges for the methodology development. This concerns especially the determination of the initial conditions, the inclusion into the model of processes relevant to decadal predictions, the increase of the spatial resolution through regionalisation, the improvement or adjustment of statistical post-processing, and finally the synthesis and validation of the entire model system. Therefore, a standardized evaluation system, developed by the project 'Integrated data and evaluation system for decadal scale prediction' (INTEGRATION), will be part of the MiKlip system to validate it. The presentation gives an overview of the different linkages of such a project, shows the different development stages and gives an outlook for users and possible end users in climate service. The technical interface combines all projects inside of MiKlip and invites them to participate in a common evaluation system. The system design and the validation strategy, from a standalone tool in the beginning, to a user friendly web based system using GRID technologies, to an integrated part of the operational MiKlip system for industry and society, will give the opportunity to enhance the MiKlip strategy. First results of different possibilities of such a system will be shown to present the scientific background through Taylor diagrams, ensemble skill scores and, e.g., climatological means, to show the usability and possibilities of MiKlip and the INTEGRATION project.

  6. Monthly hydrometeorological ensemble prediction of streamflow droughts and corresponding drought indices

    Directory of Open Access Journals (Sweden)

    F. Fundel

    2013-01-01

    Streamflow droughts, characterized by low runoff as consequence of a drought event, affect numerous aspects of life. Economic sectors that are impacted by low streamflow are, e.g., power production, agriculture, tourism, water quality management and shipping. Those sectors could potentially benefit from forecasts of streamflow drought events, even of short events on the monthly time scales or below. Numerical hydrometeorological models have increasingly been used to forecast low streamflow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the evaluation of low streamflow and of the derived indices as duration, severity and magnitude, characterizing streamflow droughts up to a lead time of one month.

    The ECMWF VarEPS 5-member ensemble reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification reveals that, compared to probabilistic peak-flow forecasts, which show skill up to a lead time of two weeks, forecasts of streamflow droughts are skilful over the entire forecast range of one month. For forecasts at the lower end of the runoff regime, the quality of the initial state seems to be crucial to achieve a good forecast quality in the longer range. It is shown that the states used in this study to initialize forecasts satisfy this requirement. The produced forecasts of streamflow drought indices, derived from the ensemble forecasts, could be beneficially included in a decision-making process. This is valid for probabilistic forecasts of streamflow drought events falling below a daily varying threshold, based on a quantile derived from a runoff climatology. Although the forecasts have a tendency to overpredict streamflow droughts, it is shown that the relative economic value of the ensemble forecasts reaches up to 60%, in case a forecast user is able to take preventive

  7. Monthly hydrometeorological ensemble prediction of streamflow droughts and corresponding drought indices

    Science.gov (United States)

    Fundel, F.; Jörg-Hess, S.; Zappa, M.

    2013-01-01

    Streamflow droughts, characterized by low runoff as a consequence of a drought event, affect numerous aspects of life. Economic sectors that are impacted by low streamflow are, e.g., power production, agriculture, tourism, water quality management and shipping. Those sectors could potentially benefit from forecasts of streamflow drought events, even of short events on the monthly time scales or below. Numerical hydrometeorological models have increasingly been used to forecast low streamflow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the evaluation of low streamflow and of derived indices such as duration, severity and magnitude, characterizing streamflow droughts up to a lead time of one month. The ECMWF VarEPS 5-member ensemble reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification reveals that, compared to probabilistic peak-flow forecasts, which show skill up to a lead time of two weeks, forecasts of streamflow droughts are skilful over the entire forecast range of one month. For forecasts at the lower end of the runoff regime, the quality of the initial state seems to be crucial to achieve a good forecast quality in the longer range. It is shown that the states used in this study to initialize forecasts satisfy this requirement. The produced forecasts of streamflow drought indices, derived from the ensemble forecasts, could be beneficially included in a decision-making process. This is valid for probabilistic forecasts of streamflow drought events falling below a daily varying threshold, based on a quantile derived from a runoff climatology. Although the forecasts have a tendency to overpredict streamflow droughts, it is shown that the relative economic value of the ensemble forecasts reaches up to 60%, in case a forecast user is able to take preventive action based on the forecast.
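
    The probabilistic drought forecast described in this record (ensemble members falling below a threshold taken as a quantile of a runoff climatology) can be sketched in a few lines. The quantile level, runoff values and function names are assumptions for illustration, not the settings used in the study.

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics, 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

def drought_probability(members, threshold):
    """Fraction of ensemble members with runoff below the threshold."""
    return sum(1 for m in members if m < threshold) / len(members)

# Hypothetical runoff climatology (m^3/s) for one calendar day.
climatology = [12.0, 15.0, 9.0, 20.0, 18.0, 11.0, 14.0, 16.0, 10.0, 13.0, 17.0]
threshold = quantile(climatology, 0.2)      # assumed 20% quantile as drought threshold

# Hypothetical 5-member runoff forecast for the same day.
forecast_members = [9.5, 10.5, 12.0, 8.0, 11.5]
prob = drought_probability(forecast_members, threshold)
print(threshold, prob)
```

    Repeating the threshold computation for each calendar day gives the daily varying threshold mentioned in the abstract; with only five members the probability can take just six distinct values.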

  8. An operational ensemble prediction system for catchment rainfall over eastern Africa spanning multiple temporal and spatial scales

    Science.gov (United States)

    Riddle, E. E.; Hopson, T. M.; Gebremichael, M.; Boehnert, J.; Broman, D.; Sampson, K. M.; Rostkier-Edelstein, D.; Collins, D. C.; Harshadeep, N. R.; Burke, E.; Havens, K.

    2017-12-01

    While it is not yet certain how precipitation patterns will change over Africa in the future, it is clear that effectively managing the available water resources is going to be crucial in order to mitigate the effects of water shortages and floods that are likely to occur in a changing climate. One component of effective water management is the availability of state-of-the-art and easy to use rainfall forecasts across multiple spatial and temporal scales. We present a web-based system for displaying and disseminating ensemble forecast and observed precipitation data over central and eastern Africa. The system provides multi-model rainfall forecasts integrated to relevant hydrological catchments for timescales ranging from one day to three months. A zoom-in feature is available to access high resolution forecasts for small-scale catchments. Time series plots and data downloads with forecasts, recent rainfall observations and climatological data are available by clicking on individual catchments. The forecasts are calibrated using a quantile regression technique and an optimal multi-model forecast is provided at each timescale. The forecast skill at the various spatial and temporal scales will be discussed, as will current applications of this tool for managing water resources in Sudan and optimizing hydropower operations in Ethiopia and Tanzania.

  9. Climatological Observations for Maritime Prediction and Analysis Support Service (COMPASS)

    Science.gov (United States)

    OConnor, A.; Kirtman, B. P.; Harrison, S.; Gorman, J.

    2016-02-01

    Current US Navy forecasting systems cannot easily incorporate extended-range forecasts that can improve mission readiness and effectiveness; ensure safety; and reduce cost, labor, and resource requirements. If Navy operational planners had systems that incorporated these forecasts, they could plan missions using more reliable and longer-term weather and climate predictions. Further, using multi-model forecast ensembles instead of single forecasts would produce higher predictive performance. Extended-range multi-model forecast ensembles, such as those available in the North American Multi-Model Ensemble (NMME), are ideal for system integration because of their high skill predictions; however, even higher skill predictions can be produced if forecast model ensembles are combined correctly. While many methods for weighting models exist, the best method in a given environment requires expert knowledge of the models and combination methods. We present an innovative approach that uses machine learning to combine extended-range predictions from multi-model forecast ensembles and generate a probabilistic forecast for any region of the globe up to 12 months in advance. Our machine-learning approach uses 30 years of hindcast predictions to learn patterns of forecast model successes and failures. Each model is assigned a weight for each environmental condition, 100 km² region, and day given any expected environmental information. These weights are then applied to the respective predictions for the region and time of interest to effectively stitch together a single, coherent probabilistic forecast. Our experimental results demonstrate the benefits of our approach to produce extended-range probabilistic forecasts for regions and time periods of interest that are superior, in terms of skill, to individual NMME forecast models and commonly weighted models. The probabilistic forecast leverages the strengths of three NMME forecast models to predict environmental conditions for an

  10. A consensus approach for estimating the predictive accuracy of dynamic models in biology.

    Science.gov (United States)

    Villaverde, Alejandro F; Bongard, Sophia; Mauch, Klaus; Müller, Dirk; Balsa-Canto, Eva; Schmid, Joachim; Banga, Julio R

    2015-04-01

    Mathematical models that predict the complex dynamic behaviour of cellular networks are fundamental in systems biology, and provide an important basis for biomedical and biotechnological applications. However, obtaining reliable predictions from large-scale dynamic models is commonly a challenging task due to lack of identifiability. The present work addresses this challenge by presenting a methodology for obtaining high-confidence predictions from dynamic models using time-series data. First, to preserve the complex behaviour of the network while reducing the number of estimated parameters, model parameters are combined in sets of meta-parameters, which are obtained from correlations between biochemical reaction rates and between concentrations of the chemical species. Next, an ensemble of models with different parameterizations is constructed and calibrated. Finally, the ensemble is used for assessing the reliability of model predictions by defining a measure of convergence of model outputs (consensus) that is used as an indicator of confidence. We report results of computational tests carried out on a metabolic model of Chinese Hamster Ovary (CHO) cells, which are used for recombinant protein production. Using noisy simulated data, we find that the aggregated ensemble predictions are on average more accurate than the predictions of individual ensemble models. Furthermore, ensemble predictions with high consensus are statistically more accurate than ensemble predictions with large variance. The procedure provides quantitative estimates of the confidence in model predictions and enables the analysis of sufficiently complex networks as required for practical applications.
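
    The use of agreement among ensemble members as a confidence indicator can be sketched as below. The abstract does not give the authors' consensus formula, so the score here (based on relative spread) is a hypothetical stand-in, and the prediction values are invented for illustration.

```python
from statistics import mean, pstdev

def consensus_prediction(predictions):
    """Return the ensemble mean and a simple consensus score in (0, 1]."""
    centre = mean(predictions)
    spread = pstdev(predictions)
    if centre == 0:
        return centre, 0.0   # relative spread is undefined; report no confidence
    # Hypothetical score: 1 when all models agree, lower as relative spread grows.
    score = 1.0 / (1.0 + spread / abs(centre))
    return centre, score

# Hypothetical predicted concentrations from four differently parameterized models.
tight = [2.0, 2.1, 1.9, 2.0]   # models largely agree
loose = [0.5, 3.5, 1.0, 3.0]   # models disagree strongly

mean_tight, score_tight = consensus_prediction(tight)
mean_loose, score_loose = consensus_prediction(loose)
print(round(score_tight, 3), round(score_loose, 3))
```

    Both ensembles have the same mean, but only the first would be flagged as a high-consensus prediction, which matches the abstract's point that agreement rather than the mean alone signals reliability.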

  11. Ensemble of ground subsidence hazard maps using fuzzy logic

    Science.gov (United States)

    Park, Inhye; Lee, Jiyeong; Saro, Lee

    2014-06-01

    Hazard maps of ground subsidence around abandoned underground coal mines (AUCMs) in Samcheok, Korea, were constructed using fuzzy ensemble techniques and a geographical information system (GIS). To evaluate the factors related to ground subsidence, a spatial database was constructed from topographic, geologic, mine tunnel, land use, groundwater, and ground subsidence maps. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 70/30 for training and validation of the models. The relationships between the detected ground-subsidence area and the factors were identified and quantified by frequency ratio (FR), logistic regression (LR) and artificial neural network (ANN) models. The relationships were used as factor ratings in the overlay analysis to create ground-subsidence hazard indexes and maps. The three GSH maps were then used as new input factors and integrated using fuzzy-ensemble methods to make better hazard maps. All of the hazard maps were validated by comparison with known subsidence areas that were not used directly in the analysis. As a result, the ensemble model was found to be more effective in terms of prediction accuracy than the individual models.

  12. EnsembleGASVR: A novel ensemble method for classifying missense single nucleotide polymorphisms

    KAUST Repository

    Rapakoulia, Trisevgeni

    2014-04-26

    Motivation: Single nucleotide polymorphisms (SNPs) are considered the most frequently occurring DNA sequence variations. Several computational methods have been proposed for the classification of missense SNPs as neutral or disease associated. However, existing computational approaches fail to select relevant features, choosing them arbitrarily without sufficient documentation. Moreover, they are limited by the problem of missing values and by imbalance between the learning datasets, and most of them do not support their predictions with confidence scores. Results: To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR facilitates a two-step algorithm, which in its first step applies a novel evolutionary embedded algorithm to locate close to optimal Support Vector Regression models. In its second step, these models are combined to extract a universal predictor, which is less prone to overfitting issues, systematizes the rebalancing of the learning sets and uses an internal approach for solving the missing values problem without loss of information. Confidence scores support all the predictions and the model becomes tunable by modifying the classification thresholds. An extensive study was performed for collecting the most relevant features for the problem of classifying SNPs, and a superset of 88 features was constructed. Experimental results show that the proposed framework outperforms well-known algorithms in terms of classification performance in the examined datasets. Finally, the proposed algorithmic framework was able to uncover the significant role of certain features such as the solvent accessibility feature, and the top-scored predictions were further validated by linking them with disease phenotypes.

  13. Three-dimensional visualization of ensemble weather forecasts – Part 2: Forecasting warm conveyor belt situations for aircraft-based field campaigns

    Directory of Open Access Journals (Sweden)

    M. Rautenhaus

    2015-07-01

    We present the application of interactive three-dimensional (3-D) visualization of ensemble weather predictions to forecasting warm conveyor belt situations during aircraft-based atmospheric research campaigns. Motivated by forecast requirements of the T-NAWDEX-Falcon 2012 (THORPEX – North Atlantic Waveguide and Downstream Impact Experiment) campaign, a method to predict 3-D probabilities of the spatial occurrence of warm conveyor belts (WCBs) has been developed. Probabilities are derived from Lagrangian particle trajectories computed on the forecast wind fields of the European Centre for Medium Range Weather Forecasts (ECMWF) ensemble prediction system. Integration of the method into the 3-D ensemble visualization tool Met.3D, introduced in the first part of this study, facilitates interactive visualization of WCB features and derived probabilities in the context of the ECMWF ensemble forecast. We investigate the sensitivity of the method with respect to trajectory seeding and grid spacing of the forecast wind field. Furthermore, we propose a visual analysis method to quantitatively analyse the contribution of ensemble members to a probability region and, thus, to assist the forecaster in interpreting the obtained probabilities. A case study, revisiting a forecast case from T-NAWDEX-Falcon, illustrates the practical application of Met.3D and demonstrates the use of 3-D and uncertainty visualization for weather forecasting and for planning flight routes in the medium forecast range (3 to 7 days before take-off).

  14. Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing

    KAUST Repository

    Toye, Habib; Zhan, Peng; Gopalakrishnan, Ganesh; Kartadikaria, Aditya R.; Huang, Huang; Knio, Omar; Hoteit, Ibrahim

    2017-01-01

    We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation

  15. Realization of Deutsch-like algorithm using ensemble computing

    International Nuclear Information System (INIS)

    Wei Daxiu; Luo Jun; Sun Xianping; Zeng Xizhi

    2003-01-01

    The Deutsch-like algorithm [Phys. Rev. A. 63 (2001) 034101] distinguishes between even and odd query functions using fewer function calls than its possible classical counterpart in a two-qubit system. But a similar method cannot be applied to a multi-qubit system. We propose a new approach for solving the Deutsch-like problem using ensemble computing. The proposed algorithm needs an ancillary qubit and can be easily extended to a multi-qubit system with one query. Our ensemble algorithm, beginning with an easily prepared initial state, has three main steps. The classifications of the functions can be obtained directly from the spectra of the ancilla qubit. We also demonstrate the new algorithm in a four-qubit molecular system using nuclear magnetic resonance (NMR). One hydrogen and three carbons are selected as the four qubits, and one of the carbons is the ancilla qubit. We choose two unitary transformations, corresponding to two functions (one odd function and one even function), to validate the ensemble algorithm. The results show that our experiment is successful and our ensemble algorithm for solving the Deutsch-like problem is virtual

  16. Pauci ex tanto numero: reduce redundancy in multi-model ensembles

    Science.gov (United States)

    Solazzo, E.; Riccio, A.; Kioutsioukis, I.; Galmarini, S.

    2013-08-01

    We explicitly address the fundamental issue of member diversity in multi-model ensembles. To date, no attempts in this direction have been documented within the air quality (AQ) community despite the extensive use of ensembles in this field. Common biases and redundancy are the two issues directly deriving from lack of independence, undermining the significance of a multi-model ensemble, and are the subject of this study. Shared, dependent biases among models do not cancel out but will instead determine a biased ensemble. Redundancy derives from having too large a portion of common variance among the members of the ensemble, producing overconfidence in the predictions and underestimation of the uncertainty. The two issues of common biases and redundancy are analysed in detail using the AQMEII ensemble of AQ model results for four air pollutants in two European regions. We show that models share large portions of bias and variance, extending well beyond those induced by common inputs. We make use of several techniques to further show that subsets of models can explain the same amount of variance as the full ensemble with the advantage of being poorly correlated. Selecting the members for generating skilful, non-redundant ensembles from such subsets proved, however, non-trivial. We propose and discuss various methods of member selection and rate the ensemble performance they produce. In most cases, the full ensemble is outscored by the reduced ones. We conclude that, although independence of outputs may not always guarantee enhancement of scores (but this depends upon the skill being investigated), we discourage selecting the members of the ensemble simply on the basis of scores; that is, independence and skills need to be considered disjointly.

  17. Evaluating the applicability of using daily forecasts from seasonal prediction systems (SPSs) for agriculture: a case study of Nepal's Terai with the NCEP CFSv2

    Science.gov (United States)

    Jha, Prakash K.; Athanasiadis, Panos; Gualdi, Silvio; Trabucco, Antonio; Mereu, Valentina; Shelia, Vakhtang; Hoogenboom, Gerrit

    2018-03-01

    Ensemble forecasts from dynamic seasonal prediction systems (SPSs) have the potential to improve decision-making for crop management to help cope with interannual weather variability. Because the reliability of crop yield predictions based on seasonal weather forecasts depends on the quality of the forecasts, it is essential to evaluate forecasts prior to agricultural applications. This study analyses the potential of Climate Forecast System version 2 (CFSv2) in predicting the Indian summer monsoon (ISM) for producing meteorological variables relevant to crop modeling. The focus area was Nepal's Terai region, and the local hindcasts were compared with weather station and reanalysis data. The results showed that the CFSv2 model accurately predicts monthly anomalies of daily maximum and minimum air temperature (Tmax and Tmin) as well as incoming total surface solar radiation (Srad). However, the daily climatologies of the respective CFSv2 hindcasts exhibit significant systematic biases compared to weather station data. The CFSv2 is less capable of predicting monthly precipitation anomalies and simulating the respective intra-seasonal variability over the growing season. Nevertheless, the observed daily climatologies of precipitation fall within the ensemble spread of the respective daily climatologies of CFSv2 hindcasts. These limitations in the CFSv2 seasonal forecasts, primarily in precipitation, restrict the potential application for predicting the interannual variability of crop yield associated with weather variability. Despite these limitations, ensemble averaging of the simulated yield using all CFSv2 members after applying bias correction may lead to satisfactory yield predictions.

  18. The canonical ensemble redefined - 3. Ideal Bose gas

    International Nuclear Information System (INIS)

    Venkataraman, R.

    1984-12-01

    The ideal Bose gas solved in the redefined ensemble formalism exhibits a discontinuity in the specific heat suggesting that Bose-Einstein condensation is a second order phase transition. The deviations from the classical ideal gas behaviour are larger than those predicted by Gibbs ensemble. Below T_c the pressure is not independent of the volume. For a certain range of values of VT^3, the peak in black body radiation shows a shift in the frequency scale and this could be detected, at least in principle, experimentally. (author)

  19. Pauci ex tanto numero: reducing redundancy in multi-model ensembles

    Science.gov (United States)

    Solazzo, E.; Riccio, A.; Kioutsioukis, I.; Galmarini, S.

    2013-02-01

    We explicitly address the fundamental issue of member diversity in multi-model ensembles. To date, no attempts in this direction have been documented within the air quality (AQ) community, despite the extensive use of ensembles in this field. Common biases and redundancy are the two issues directly deriving from lack of independence, undermining the significance of a multi-model ensemble, and are the subject of this study. Shared biases among models will determine a biased ensemble; it is therefore essential that the errors of the ensemble members be independent so that bias can cancel out. Redundancy derives from having too large a portion of common variance among the members of the ensemble, producing overconfidence in the predictions and underestimation of the uncertainty. The two issues of common biases and redundancy are analysed in detail using the AQMEII ensemble of AQ model results for four air pollutants in two European regions. We show that models share large portions of bias and variance, extending well beyond those induced by common inputs. We make use of several techniques to further show that subsets of models can explain the same amount of variance as the full ensemble with the advantage of being poorly correlated. Selecting the members for generating skilful, non-redundant ensembles from such subsets proved, however, non-trivial. We propose and discuss various methods of member selection and rate the ensemble performance they produce. In most cases, the full ensemble is outscored by the reduced ones. We conclude that, although independence of outputs may not always guarantee enhancement of scores (but this depends upon the skill being investigated), we discourage selecting the members of the ensemble simply on the basis of scores; that is, independence and skills need to be considered disjointly.

  20. Uncertainty in solid precipitation and snow depth prediction for Siberia using the Noah and Noah-MP land surface models

    Science.gov (United States)

    Suzuki, Kazuyoshi; Zupanski, Milija

    2018-01-01

    In this study, we investigate the uncertainties associated with land surface processes in an ensemble prediction context. Specifically, we compare the uncertainties produced by a coupled atmosphere-land modeling system with two different land surface models, the Noah-MP land surface model (LSM) and the Noah LSM, by using the Maximum Likelihood Ensemble Filter (MLEF) data assimilation system as a platform for ensemble prediction. We carried out 24-hour prediction simulations in Siberia with 32 ensemble members beginning at 00:00 UTC on 5 March 2013. We then compared the model prediction uncertainty of snow depth and solid precipitation with observation-based research products and evaluated the standard deviation of the ensemble spread. The prediction skill and ensemble spread exhibited high positive correlation for both LSMs, indicating a realistic uncertainty estimation. The inclusion of a multiple snow-layer model in the Noah-MP LSM was beneficial for reducing the uncertainties of snow depth and snow depth change compared to the Noah LSM, but the uncertainty in daily solid precipitation showed minimal difference between the two LSMs. The impact of LSM choice in reducing temperature uncertainty was limited to surface layers of the atmosphere. In summary, we found that the more sophisticated Noah-MP LSM reduces uncertainties associated with land surface processes compared to the Noah LSM. Thus, using prediction models with improved skill implies improved predictability and greater certainty of prediction.

  1. Predicting Hepatotoxicity of Drug Metabolites Via an Ensemble Approach Based on Support Vector Machine

    Science.gov (United States)

    Lu, Yin; Liu, Lili; Lu, Dong; Cai, Yudong; Zheng, Mingyue; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2017-11-20

    Drug-induced liver injury (DILI) is a major cause of drug withdrawal. The chemical properties of the drug, especially drug metabolites, play key roles in DILI. Our goal is to construct a QSAR model to predict drug hepatotoxicity based on drug metabolites. A total of 64 hepatotoxic drug metabolites and 3,339 non-hepatotoxic drug metabolites were gathered from the MDL Metabolite Database. Considering the imbalance of the dataset, we randomly split the negative samples and combined each portion with all the positive samples to construct individually balanced datasets for constructing independent classifiers. Then, we adopted an ensemble approach to make predictions based on the results of all individual classifiers and applied the minimum Redundancy Maximum Relevance (mRMR) feature selection method to select the molecular descriptors. Eventually, for the drugs in the external test set, a Bayesian inference method was used to predict the hepatotoxicity of a drug based on its metabolites. The model showed an average balanced accuracy of 78.47%, a sensitivity of 74.17%, and a specificity of 82.77%. Five molecular descriptors characterizing molecular polarity, intramolecular bonding strength, and molecular frontier orbital energy were obtained. When predicting the hepatotoxicity of a drug based on all its metabolites, the sensitivity, specificity and balanced accuracy were 60.38%, 70.00%, and 65.19%, respectively, indicating that this method is useful for identifying the hepatotoxicity of drugs. We developed an in silico model to predict hepatotoxicity of drug metabolites. Moreover, Bayesian inference was applied to predict the hepatotoxicity of a drug based on its metabolites, which yielded high sensitivity and specificity.
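
    The balancing step described above (split the negatives, pair each portion with all positives, then combine the individual classifiers) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the rule of sizing each portion to roughly the number of positives, and the majority vote with ties counted as positive are all assumptions, since the abstract does not state how many portions were used or how the votes were combined.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair all positives with each of several random portions of the negatives.

    Assumption: the number of portions is chosen so that each portion is
    roughly as large as the positive set.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    n_chunks = max(1, len(shuffled) // max(1, len(positives)))
    # Striding by n_chunks spreads the leftover samples over the first portions.
    chunks = [shuffled[i::n_chunks] for i in range(n_chunks)]
    return [(list(positives), chunk) for chunk in chunks]


def majority_vote(votes):
    """Combine 0/1 predictions from the individual classifiers.

    Assumption: ties are resolved as positive (1).
    """
    return 1 if 2 * sum(votes) >= len(votes) else 0
```

    With the 64 positives and 3,339 negatives reported above, this particular chunking rule gives 52 balanced training sets, each holding 64 or 65 negatives; one classifier would be trained per set and their outputs passed to `majority_vote`.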

  2. Can the confidence in long range atmospheric transport models be increased? The pan European experience of ensemble

    International Nuclear Information System (INIS)

    Galmarini, S.; Bianconi, R.; Mikkelsen, T.

    2003-01-01

    Full text: In the unfortunate event of an accidental release of radioactive material to the environment, the first concern for early-phase emergency response is atmospheric dispersion. For this purpose, several countries worldwide use operational Long Range Atmospheric Transport (LRAT) models to produce predictions of the event evolution over the continental scale to determine whether, when and how the radioactive cloud is going to hit their country. While presenting the multi-model ensemble dispersion forecast system (ENSEMBLE), the paper seeks to answer the following questions: is atmospheric dispersion forecasting an important asset of the early-phase emergency response management?; is there a 'Perfect Atmospheric Dispersion Model'?; is there a way to make the results of dispersion models more reliable and trustworthy? Several activities conducted during the 1990s sought to estimate quantitatively the capability of LRAT models to forecast the atmospheric dispersion of radionuclides in the atmosphere. The results obtained clearly demonstrated that: the predictions of the various operational LRAT models used worldwide do not systematically agree (mainly due to conceptual differences in model structure and differences in the meteorological forecasts used to simulate the dispersion); none of the models used in the various countries is better than others under all circumstances and therefore there is no objective indication that shows one or few models to be the 'perfect model/s'. Given the realistic scenario that an accident can take place any time, any national authority is however faced with the practical need of managing the emergency and therefore with the dilemma: 'shall one rely on an LRAT model or only on the nowcast provided by a monitoring network?' and 'to what extent are a model's predictions going to be deceptive in the decision making process?' 
Since it goes without saying that even a vague idea of the future evolution of a dispersion process is better

  3. Assessing uncertainties in crop and pasture ensemble model simulations of productivity and N2O emissions.

    Science.gov (United States)

    Ehrhardt, Fiona; Soussana, Jean-François; Bellocchi, Gianni; Grace, Peter; McAuliffe, Russel; Recous, Sylvie; Sándor, Renáta; Smith, Pete; Snow, Val; de Antoni Migliorati, Massimiliano; Basso, Bruno; Bhatia, Arti; Brilli, Lorenzo; Doltra, Jordi; Dorich, Christopher D; Doro, Luca; Fitton, Nuala; Giacomini, Sandro J; Grant, Brian; Harrison, Matthew T; Jones, Stephanie K; Kirschbaum, Miko U F; Klumpp, Katja; Laville, Patricia; Léonard, Joël; Liebig, Mark; Lieffering, Mark; Martin, Raphaël; Massad, Raia S; Meier, Elizabeth; Merbold, Lutz; Moore, Andrew D; Myrgiotis, Vasileios; Newton, Paul; Pattey, Elizabeth; Rolinski, Susanne; Sharp, Joanna; Smith, Ward N; Wu, Lianhai; Zhang, Qing

    2018-02-01

    Simulation models are extensively used to predict agricultural productivity and greenhouse gas emissions. However, the uncertainties of (reduced) model ensemble simulations have not been assessed systematically for variables affecting food security and climate change mitigation, within multi-species agricultural contexts. We report an international model comparison and benchmarking exercise, showing the potential of multi-model ensembles to predict productivity and nitrous oxide (N2O) emissions for wheat, maize, rice and temperate grasslands. Using a multi-stage modelling protocol, from blind simulations (stage 1) to partial (stages 2-4) and full calibration (stage 5), 24 process-based biogeochemical models were assessed individually or as an ensemble against long-term experimental data from four temperate grassland and five arable crop rotation sites spanning four continents. Comparisons were performed by reference to the experimental uncertainties of observed yields and N2O emissions. Results showed that across sites and crop/grassland types, 23%-40% of the uncalibrated individual models were within two standard deviations (SD) of observed yields, while 42% (rice) to 96% (grasslands) of the models were within 1 SD of observed N2O emissions. At stage 1, ensembles formed by the three lowest prediction model errors predicted both yields and N2O emissions within experimental uncertainties for 44% and 33% of the crop and grassland growth cycles, respectively. Partial model calibration (stages 2-4) markedly reduced prediction errors of the full model ensemble E-median for crop grain yields (from 36% at stage 1 down to 4% on average) and grassland productivity (from 44% to 27%) and to a lesser and more variable extent for N2O emissions. Yield-scaled N2O emissions (N2O emissions divided by crop yields) were ranked accurately by three-model ensembles across crop species and field sites. 
The potential of using process-based model ensembles to predict jointly

  4. Wave Extremes in the Northeast Atlantic from Ensemble Forecasts

    Science.gov (United States)

    Breivik, Øyvind; Aarnes, Ole Johan; Bidlot, Jean-Raymond; Carrasco, Ana; Saetra, Øyvind

    2013-10-01

    A method for estimating return values from ensembles of forecasts at advanced lead times is presented. Return values of significant wave height in the North-East Atlantic, the Norwegian Sea and the North Sea are computed from archived +240-h forecasts of the ECMWF ensemble prediction system (EPS) from 1999 to 2009. We make three assumptions: First, each forecast is representative of a six-hour interval and collectively the data set is then comparable to a time period of 226 years. Second, the model climate matches the observed distribution, which we confirm by comparing with buoy data. Third, the ensemble members are sufficiently uncorrelated to be considered independent realizations of the model climate. We find anomaly correlations of 0.20, but peak events (>P97) are entirely uncorrelated. By comparing return values from individual members with return values of subsamples of the data set we also find that the estimates follow the same distribution and appear unaffected by correlations in the ensemble. The annual mean and variance over the 11-year archived period exhibit no significant departures from stationarity compared with a recent reforecast, i.e., there is no spurious trend due to model upgrades. EPS yields significantly higher return values than ERA-40 and ERA-Interim and is in good agreement with the high-resolution hindcast NORA10, except in the lee of unresolved islands where EPS overestimates and in enclosed seas where it is biased low. Confidence intervals are half the width of those found for ERA-Interim due to the magnitude of the data set.
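
    The first assumption above converts a count of archived forecasts into an equivalent observation period. A minimal sketch of that arithmetic follows; the function names and the 365.25-day year are assumptions made for illustration, and the abstract does not state the actual number of forecasts in the archive.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: mean year length of 365.25 days


def equivalent_years(n_forecasts, hours_per_forecast=6.0):
    """Length of the time period represented by n_forecasts independent
    forecasts, each taken as representative of hours_per_forecast hours."""
    return n_forecasts * hours_per_forecast / HOURS_PER_YEAR


def forecasts_needed(years, hours_per_forecast=6.0):
    """Number of such forecasts needed to represent a period of `years`."""
    return years * HOURS_PER_YEAR / hours_per_forecast
```

    Under these assumptions, the 226-year equivalent period quoted above corresponds to about 330,000 six-hour forecast samples, which is what allows return values to be read from the data set with comparatively narrow confidence intervals.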

  5. Protein folding simulations by generalized-ensemble algorithms.

    Science.gov (United States)

    Yoda, Takao; Sugita, Yuji; Okamoto, Yuko

    2014-01-01

    In the protein folding problem, conventional simulations in physical statistical mechanical ensembles, such as the canonical ensemble with fixed temperature, face a great difficulty. This is because there exist a huge number of local-minimum-energy states in the system and the conventional simulations tend to get trapped in these states, giving wrong results. Generalized-ensemble algorithms are based on artificial unphysical ensembles and overcome the above difficulty by performing random walks in potential energy, volume, and other physical quantities or their corresponding conjugate parameters such as temperature, pressure, etc. The advantage of generalized-ensemble simulations lies in the fact that they not only avoid getting trapped in states of energy local minima but also allow the calculation of physical quantities as functions of temperature or other parameters from a single simulation run. In this article we review the generalized-ensemble algorithms. Four examples, multicanonical algorithm, replica-exchange method, replica-exchange multicanonical algorithm, and multicanonical replica-exchange method, are described in detail. Examples of their applications to the protein folding problem are presented.
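
    As an illustration of one of the four methods named above, the replica-exchange method periodically proposes swapping the configurations of two replicas held at different temperatures. The sketch below shows the standard Metropolis acceptance rule for such a swap; it is a generic textbook form, not code from the reviewed article, and the function names and the use of inverse temperatures in units where k_B = 1 are assumptions.

```python
import math
import random


def swap_probability(beta_i, energy_i, beta_j, energy_j):
    """Metropolis probability of exchanging two replicas.

    beta_* are inverse temperatures 1/(k_B T) and energy_* the current
    potential energies. The ratio of Boltzmann weights after and before
    the swap is exp((beta_i - beta_j) * (energy_i - energy_j)).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(beta_i, energy_i, beta_j, energy_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(beta_i, energy_i, beta_j, energy_j)
```

    A swap that hands the lower-energy configuration to the colder replica is always accepted, while the reverse is accepted only with an exponentially small probability; this is what lets configurations random-walk in temperature and escape local energy minima.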

  6. Dispersion Modeling Using Ensemble Forecasts Compared to ETEX Measurements.

    Science.gov (United States)

    Straume, Anne Grete; N'dri Koffi, Ernest; Nodop, Katrin

    1998-11-01

    Numerous numerical models are developed to predict long-range transport of hazardous air pollution in connection with accidental releases. When evaluating and improving such a model, it is important to detect uncertainties connected to the meteorological input data. A Lagrangian dispersion model, the Severe Nuclear Accident Program, is used here to investigate the effect of errors in the meteorological input data due to analysis error. An ensemble forecast, produced at the European Centre for Medium-Range Weather Forecasts, is then used as model input. The ensemble forecast members are generated by perturbing the initial meteorological fields of the weather forecast. The perturbations are calculated from singular vectors meant to represent possible forecast developments generated by instabilities in the atmospheric flow during the early part of the forecast. The instabilities are generated by errors in the analyzed fields. Puff predictions from the dispersion model, using ensemble forecast input, are compared, and a large spread in the predicted puff evolutions is found. This shows that the quality of the meteorological input data is important for the success of the dispersion model. In order to evaluate the dispersion model, the calculations are compared with measurements from the European Tracer Experiment. The model manages to predict the measured puff evolution concerning shape and time of arrival to a fairly high extent, up to 60 h after the start of the release. The modeled puff is still too narrow in the advection direction.

  7. Constructing Better Classifier Ensemble Based on Weighted Accuracy and Diversity Measure

    Directory of Open Access Journals (Sweden)

    Xiaodong Zeng

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented, a novel measure used to evaluate the quality of the classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis; that is, a robust classifier ensemble should not only be accurate but its members should also differ from one another. In fact, accuracy and diversity are mutually restraining factors; that is, an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is achieved by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and two threshold measures that consider only accuracy or diversity, with two heuristic search algorithms, genetic algorithm, and forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to others in most cases.
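
    The scoring idea described above, a harmonic mean of accuracy and diversity balanced by two weights, can be sketched as a weighted harmonic mean. This is an assumed form for illustration only: the function name `wad_score`, the default weights, and the requirement that both inputs lie in (0, 1] are not taken from the paper, whose exact definition of the measure and of diversity may differ.

```python
def wad_score(accuracy, diversity, w_acc=0.5, w_div=0.5):
    """Weighted harmonic mean of ensemble accuracy and diversity.

    Assumptions: both quantities are scaled to (0, 1] and both weights
    are positive. A zero in either quantity gives a score of zero.
    """
    if accuracy <= 0 or diversity <= 0:
        return 0.0
    return (w_acc + w_div) / (w_acc / accuracy + w_div / diversity)
```

    Because a harmonic mean is dominated by its smaller argument, an ensemble with accuracy 0.9 but diversity 0.1 scores only 0.18 with equal weights, so neither property can compensate fully for the absence of the other.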

  8. Ensemble forecasting for renewable energy applications - status and current challenges for their generation and verification

    Science.gov (United States)

    Pinson, Pierre

    2016-04-01

    The operational management of renewable energy generation in power systems and electricity markets requires forecasts in various forms, e.g., deterministic or probabilistic, continuous or categorical, depending upon the decision process at hand. Besides, such forecasts may also be necessary at various spatial and temporal scales, from high temporal resolutions (in the order of minutes) and very localized for an offshore wind farm, to coarser temporal resolutions (hours) and covering a whole country for day-ahead power scheduling problems. As of today, weather predictions are a common input to forecasting methodologies for renewable energy generation. Since for most decision processes, optimal decisions can only be made if accounting for forecast uncertainties, ensemble predictions and density forecasts are increasingly seen as the product of choice. After discussing some of the basic approaches to obtaining ensemble forecasts of renewable power generation, it will be argued that space-time trajectories of renewable power production may or may not necessitate post-processing ensemble forecasts for relevant weather variables. Example approaches and test case applications will be covered, e.g., looking at the Horns Rev offshore wind farm in Denmark, or gridded forecasts for the whole continental Europe. Eventually, we will illustrate some of the limitations of current frameworks to forecast verification, which actually make it difficult to fully assess the quality of post-processing approaches to obtain renewable energy predictions.

  9. Evaluation of ensemble precipitation forecasts generated through post-processing in a Canadian catchment

    Directory of Open Access Journals (Sweden)

    S. K. Jha

    2018-03-01

    Flooding in Canada is often caused by heavy rainfall during the snowmelt period. Hydrologic forecast centers rely on precipitation forecasts obtained from numerical weather prediction (NWP) models to force hydrological models for streamflow forecasting. The uncertainties in raw quantitative precipitation forecasts (QPFs) are enhanced by physiography and orography effects over a diverse landscape, particularly in the western catchments of Canada. A Bayesian post-processing approach called rainfall post-processing (RPP), developed in Australia (Robertson et al., 2013; Shrestha et al., 2015), has been applied to assess its forecast performance in a Canadian catchment. Raw QPFs obtained from two sources, the Global Ensemble Forecasting System (GEFS) Reforecast 2 project, from the National Centers for Environmental Prediction, and the Global Deterministic Forecast System (GDPS), from Environment and Climate Change Canada, are used in this study. The study period from January 2013 to December 2015 covered a major flood event in Calgary, Alberta, Canada. Post-processed results show that the RPP is able to remove the bias and reduce the errors of both GEFS and GDPS forecasts. Ensembles generated from the RPP reliably quantify the forecast uncertainty.

  10. Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm

    International Nuclear Information System (INIS)

    Yu, Lean; Wang, Shouyang; Lai, Kin Keung

    2008-01-01

    In this study, an empirical mode decomposition (EMD) based neural network ensemble learning paradigm is proposed for world crude oil spot price forecasting. For this purpose, the original crude oil spot price series were first decomposed into a finite, and often small, number of intrinsic mode functions (IMFs). Then a three-layer feed-forward neural network (FNN) model was used to model each of the extracted IMFs, so that the tendencies of these IMFs could be accurately predicted. Finally, the prediction results of all IMFs are combined with an adaptive linear neural network (ALNN), to formulate an ensemble output for the original crude oil price series. For verification and testing, two main crude oil price series, West Texas Intermediate (WTI) crude oil spot price and Brent crude oil spot price, are used to test the effectiveness of the proposed EMD-based neural network ensemble learning methodology. Empirical results obtained demonstrate attractiveness of the proposed EMD-based neural network ensemble learning paradigm. (author)

  11. Dispersion of aerosol particles in the free atmosphere using ensemble forecasts

    Directory of Open Access Journals (Sweden)

    T. Haszpra

    2013-10-01

    The dispersion of aerosol particle pollutants is studied using 50 members of an ensemble forecast in the example of a hypothetical free atmospheric emission above Fukushima over a period of 2.5 days. Considerable differences are found among the dispersion predictions of the different ensemble members, as well as between the ensemble mean and the deterministic result at the end of the observation period. The variance is found to decrease with the particle size. The geographical area where a threshold concentration is exceeded in at least one ensemble member expands to a 5–10 times larger region than the area from the deterministic forecast, both for air column "concentration" and in the "deposition" field. We demonstrate that the root-mean-square distance of any particle from its own clones in the ensemble members can reach values on the order of one thousand kilometers. Even the centers of mass of the particle cloud of the ensemble members deviate considerably from that obtained by the deterministic forecast. All these indicate that an investigation of the dispersion of aerosol particles in the spirit of ensemble forecast contains useful hints for the improvement of risk assessment.
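
    The "exceeded in at least one ensemble member" criterion mentioned above amounts to taking the union of the per-member exceedance regions. A minimal sketch follows, assuming each member's field is given as a flat list of values on a common grid; the function names and data layout are hypothetical and not taken from the paper.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members exceeding `threshold` in each grid cell.

    `members` is a list of equally long lists, one concentration (or
    deposition) value per grid cell and member.
    """
    n_members = len(members)
    return [sum(value > threshold for value in cell) / n_members
            for cell in zip(*members)]


def exceeded_in_any_member(members, threshold):
    """Boolean mask of cells where at least one member exceeds the threshold."""
    return [p > 0 for p in exceedance_probability(members, threshold)]
```

    Counting the True cells of this mask (weighted by cell area on a real grid) and comparing with the same count for a single deterministic field gives the kind of area ratio reported above.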

  12. Spectral statistics in semiclassical random-matrix ensembles

    International Nuclear Information System (INIS)

    Feingold, M.; Leitner, D.M.; Wilkinson, M.

    1991-01-01

    A novel random-matrix ensemble is introduced which mimics the global structure inherent in the Hamiltonian matrices of autonomous, ergodic systems. Changes in its parameters induce a transition between a Poisson and a Wigner distribution for the level spacings, P(s). The intermediate distributions are uniquely determined by a single scaling variable. Semiclassical constraints force the ensemble to be in a regime with Wigner P(s) for systems with more than two degrees of freedom.

  13. Quantifying predictability through information theory: small sample estimation in a non-Gaussian framework

    International Nuclear Information System (INIS)

    Haven, Kyle; Majda, Andrew; Abramov, Rafail

    2005-01-01

    Many situations in complex systems require quantitative estimates of the lack of information in one probability distribution relative to another. In short term climate and weather prediction, examples of these issues might involve the lack of information in the historical climate record compared with an ensemble prediction, or the lack of information in a particular Gaussian ensemble prediction strategy involving the first and second moments compared with the non-Gaussian ensemble itself. The relative entropy is a natural way to quantify the predictive utility in this information, and recently a systematic computationally feasible hierarchical framework has been developed. In practical systems with many degrees of freedom, computational overhead limits ensemble predictions to relatively small sample sizes. Here the notion of predictive utility, in a relative entropy framework, is extended to small random samples by the definition of a sample utility, a measure of the unlikeliness that a random sample was produced by a given prediction strategy. The sample utility is the minimum predictability, with a statistical level of confidence, which is implied by the data. Two practical algorithms for measuring such a sample utility are developed here. The first technique is based on the statistical method of null-hypothesis testing, while the second is based upon a central limit theorem for the relative entropy of moment-based probability densities. These techniques are tested on known probability densities with parameterized bimodality and skewness, and then applied to the Lorenz '96 model, a recently developed 'toy' climate model with chaotic dynamics mimicking the atmosphere. The results show a detection of non-Gaussian tendencies of prediction densities at small ensemble sizes with between 50 and 100 members, with a 95% confidence level
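
    For the Gaussian case mentioned above, where a prediction is summarized by its first and second moments, the relative entropy has a closed form. The sketch below gives the one-dimensional version; it is the standard formula for two normal densities rather than the paper's small-sample estimator, and the function name and argument names are assumptions.

```python
import math


def gaussian_relative_entropy(mean_p, var_p, mean_q, var_q):
    """Relative entropy (in nats) of N(mean_p, var_p) with respect to
    N(mean_q, var_q), e.g. an ensemble prediction p against a
    climatological density q."""
    # Contribution from the shift of the prediction mean.
    mean_term = 0.5 * (mean_p - mean_q) ** 2 / var_q
    # Contribution from the change in variance.
    variance_term = 0.5 * (math.log(var_q / var_p) + var_p / var_q - 1.0)
    return mean_term + variance_term
```

    The result is zero when the two densities coincide and grows as the prediction mean moves away from the reference or as its spread narrows, which is why it serves as a measure of predictive utility; the non-Gaussian corrections studied in the paper are not captured by this two-moment form.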

  14. Extension of the GHJW theorem for operator ensembles

    International Nuclear Information System (INIS)

    Choi, Jeong Woon; Hong, Dowon; Chang, Ku-Young; Chi, Dong Pyo; Lee, Soojoon

    2011-01-01

    The Gisin-Hughston-Jozsa-Wootters theorem plays an important role in analyzing various theories about quantum information, quantum communication, and quantum cryptography. It means that any purifications on the extended system which yield indistinguishable state ensembles on their subsystem should have a specific local unitary relation. In this Letter, we show that the local relation is also established even when the indistinguishability of state ensembles is extended to that of operator ensembles.

  15. The seasonal predictability of blocking frequency in two seasonal prediction systems (CMCC, Met-Office) and the associated representation of low-frequency variability.

    Science.gov (United States)

    Athanasiadis, Panos; Gualdi, Silvio; Scaife, Adam A.; Bellucci, Alessio; Hermanson, Leon; MacLachlan, Craig; Arribas, Alberto; Materia, Stefano; Borelli, Andrea

    2014-05-01

    Low-frequency variability is a fundamental component of the atmospheric circulation. Extratropical teleconnections, the occurrence of blocking and the slow modulation of the jet streams and storm tracks are all different aspects of low-frequency variability. Part of the latter is attributed to the chaotic nature of the atmosphere and is inherently unpredictable. On the other hand, primarily as a response to boundary forcings, tropospheric low-frequency variability includes components that are potentially predictable. Seasonal forecasting faces the difficult task of predicting these components. Particularly referring to the extratropics, the current generation of seasonal forecasting systems seems to be approaching this target by realistically initializing most components of the climate system, using higher resolution and utilizing large ensemble sizes. Two seasonal prediction systems (Met-Office GloSea and CMCC-SPS-v1.5) are analyzed in terms of their representation of different aspects of extratropical low-frequency variability. The current operational Met-Office system achieves unprecedented high scores in predicting the winter-mean phase of the North Atlantic Oscillation (NAO, corr. 0.74 at 500 hPa) and the Pacific-N. American pattern (PNA, corr. 0.82). The CMCC system, considering its small ensemble size and coarse resolution, also achieves good scores (0.42 for NAO, 0.51 for PNA). Despite these positive features, both models suffer from biases in low-frequency variance, particularly in the N. Atlantic. Consequently, it is found that their intrinsic variability patterns (sectoral EOFs) differ significantly from the observed, and the known teleconnections are underrepresented. Regarding the representation of N. hemisphere blocking, after bias correction both systems exhibit a realistic climatology of blocking frequency. In this assessment, instantaneous blocking and large-scale persistent blocking events are identified using daily geopotential height fields at

  16. Exploiting conformational ensembles in modeling protein-protein interactions on the proteome scale

    Science.gov (United States)

    Kuzu, Guray; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2013-01-01

    Cellular functions are performed through protein-protein interactions; therefore, identification of these interactions is crucial for understanding biological processes. Recent studies suggest that knowledge-based approaches are more useful than ‘blind’ docking for modeling at large scales. However, a caveat of knowledge-based approaches is that they treat molecules as rigid structures. The Protein Data Bank (PDB) offers a wealth of conformations. Here, we exploited ensembles of these conformations in predictions by a knowledge-based method, PRISM. We tested ‘difficult’ cases in a docking-benchmark dataset, where the unbound and bound protein forms are structurally different. Considering alternative conformations for each protein, the percentage of successfully predicted interactions increased from ~26% to 66%, and 57% of the interactions were successfully predicted in an ‘unbiased’ scenario, in which data related to the bound forms were not utilized. If the appropriate conformation, or relevant template interface, is unavailable in the PDB, PRISM cannot predict the interaction successfully. The pace of the growth of the PDB promises a rapid increase of ensemble conformations, emphasizing the merit of such knowledge-based ensemble strategies for higher success rates in protein-protein interaction predictions on an interactome scale. We constructed the structural network of ERK interacting proteins as a case study. PMID:23590674

  17. Ensemble modeling for aromatic production in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Matthew L Rizk

    2009-09-01

    Ensemble Modeling (EM) is a recently developed method for metabolic modeling, particularly for utilizing the effect of enzyme tuning data on the production of a specific compound to refine the model. This approach is used here to investigate the production of aromatic products in Escherichia coli. Instead of using dynamic metabolite data to fit a model, the EM approach uses phenotypic data (effects of enzyme overexpression or knockouts on the steady state production rate) to screen possible models. These data are routinely generated during strain design. An ensemble of models is constructed that all reach the same steady state and are based on the same mechanistic framework at the elementary reaction level. The behavior of the models spans the kinetics allowable by thermodynamics. Then by using existing data from the literature for the overexpression of genes coding for transketolase (Tkt), transaldolase (Tal), and phosphoenolpyruvate synthase (Pps) to screen the ensemble, we arrive at a set of models that properly describes the known enzyme overexpression phenotypes. This subset of models becomes more predictive as additional data are used to refine the models. The final ensemble of models demonstrates the characteristic of the cell that Tkt is the first rate controlling step, and correctly predicts that only after Tkt is overexpressed does an increase in Pps increase the production rate of aromatics. This work demonstrates that EM is able to capture the result of enzyme overexpression on aromatic producing bacteria by successfully utilizing routinely generated enzyme tuning data to guide model learning.

  18. Bioactive focus in conformational ensembles: a pluralistic approach

    Science.gov (United States)

    Habgood, Matthew

    2017-12-01

    Computational generation of conformational ensembles is key to contemporary drug design. Selecting the members of the ensemble that will approximate the conformation most likely to bind to a desired target (the bioactive conformation) is difficult, given that the potential energy usually used to generate and rank the ensemble is a notoriously poor discriminator between bioactive and non-bioactive conformations. In this study an approach to generating a focused ensemble is proposed in which each conformation is assigned multiple rankings based not just on potential energy but also on solvation energy, hydrophobic or hydrophilic interaction energy, radius of gyration, and on a statistical potential derived from Cambridge Structural Database data. The best ranked structures derived from each system are then assembled into a new ensemble that is shown to be better focused on bioactive conformations. This pluralistic approach is tested on ensembles generated by the Molecular Operating Environment's Low Mode Molecular Dynamics module, and by the Cambridge Crystallographic Data Centre's conformation generator software.
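
    The selection step described in this abstract (rank every conformation under several criteria, then pool the best of each ranking) can be sketched as follows. This is only an illustration: the function name `focused_ensemble`, the `top_k` parameter, and the convention that a lower score is better are assumptions, not details taken from the paper.

```python
def focused_ensemble(scores, top_k):
    """Return indices of conformations that are in the top_k of any ranking.

    scores: dict mapping a criterion name to a list of scores, one per
            conformation; a lower score is assumed to mean a better rank.
    """
    selected = set()
    for values in scores.values():
        # Rank conformation indices from best (lowest) to worst score.
        ranked = sorted(range(len(values)), key=lambda i: values[i])
        selected.update(ranked[:top_k])
    return sorted(selected)


# Hypothetical scores for four conformations under two criteria.
example_scores = {
    "potential_energy": [3.0, 1.0, 2.0, 5.0],
    "solvation_energy": [0.5, 4.0, 3.0, 0.1],
}
pooled = focused_ensemble(example_scores, top_k=1)
```

    Taking the union rather than an averaged rank keeps a conformation that is outstanding under a single criterion, which seems to be the point of a pluralistic selection.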

  19. Universal LD50 predictions using deep learning

    Science.gov (United States)

    NICEATM Predictive Models for Acute Oral Systemic Toxicity LD50 entry Risa R. Sayre (sayre.risa@epa.gov) & Christopher M. Grulke Our approach uses an ensemble of multilayer perceptron regressions to predict rat acute oral LD50 values from chemical features. Features were genera...

  20. Using Analog Ensemble to generate spatially downscaled probabilistic wind power forecasts

    Science.gov (United States)

    Delle Monache, L.; Shahriari, M.; Cervone, G.

    2017-12-01

    We use the Analog Ensemble (AnEn) method to generate probabilistic 80-m wind power forecasts. We use data from the NCEP GFS (28 km resolution) and NCEP NAM (12 km resolution). We use forecast data from NAM and GFS, and analysis data from NAM, which enables us to: 1) use a lower-resolution model to create higher-resolution forecasts, and 2) use a higher-resolution model to create higher-resolution forecasts. The former essentially increases computing speed and the latter increases forecast accuracy. An aggregated model of the former can be compared against the latter to measure the accuracy of the AnEn spatial downscaling. The AnEn works by taking a deterministic future forecast and comparing it with past forecasts. The model searches for the best matching estimates within the past forecasts and selects the predictand value corresponding to these past forecasts as the ensemble prediction for the future forecast. Our study is based on predicting wind speed and air density at more than 13,000 grid points in the continental US. We run the AnEn model twice: 1) estimating 80-m wind speed by using predictor variables such as temperature, pressure, geopotential height, U-component and V-component of wind, 2) estimating air density by using predictors such as temperature, pressure, and relative humidity. We use the air density values to correct the standard wind power curves for different values of air density. The standard deviation of the ensemble members (i.e. ensemble spread) will be used as the degree of difficulty to predict wind power at different locations. The value of the correlation coefficient between the ensemble spread and the forecast error determines the appropriateness of this measure. This measure is important for wind farm developers, as building wind farms in regions with higher predictability will reduce the real-time risks of operating in the electricity markets.
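
    The analog search described above can be illustrated with a minimal sketch. It assumes a standardized Euclidean distance between predictor vectors and a fixed number of analogs; the abstract does not state the metric or ensemble size used in the study, and the function and argument names are hypothetical.

```python
import numpy as np


def analog_ensemble(past_forecasts, past_observations, current_forecast, n_members=3):
    """Return the predictand values paired with the closest past forecasts.

    past_forecasts:    (n_times, n_predictors) historical deterministic forecasts
    past_observations: (n_times,) predictand value verified at each of those times
    current_forecast:  (n_predictors,) forecast for which an ensemble is wanted
    """
    past_forecasts = np.asarray(past_forecasts, dtype=float)
    past_observations = np.asarray(past_observations, dtype=float)
    current_forecast = np.asarray(current_forecast, dtype=float)

    # Standardize each predictor so no single variable dominates the distance
    # (an assumed metric, chosen for illustration).
    scale = past_forecasts.std(axis=0)
    scale[scale == 0] = 1.0
    diff = (past_forecasts - current_forecast) / scale
    distance = np.sqrt((diff ** 2).sum(axis=1))

    # The n_members best-matching past forecasts are the analogs; their
    # verifying observations form the ensemble.
    analog_idx = np.argsort(distance)[:n_members]
    return past_observations[analog_idx]


# Toy data: predictors are (wind speed, pressure); predictand is observed wind speed.
past_f = np.array([[5.0, 1000.0], [7.0, 1005.0], [5.2, 1001.0], [12.0, 990.0]])
past_o = np.array([4.8, 6.5, 5.1, 11.0])
members = analog_ensemble(past_f, past_o, np.array([5.1, 1000.0]), n_members=2)
spread = members.std()  # ensemble spread, used in the abstract as a difficulty measure
```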

  1. Real-time prediction of hand trajectory by ensembles of cortical neurons in primates

    Science.gov (United States)

    Wessberg, Johan; Stambaugh, Christopher R.; Kralik, Jerald D.; Beck, Pamela D.; Laubach, Mark; Chapin, John K.; Kim, Jung; Biggs, S. James; Srinivasan, Mandayam A.; Nicolelis, Miguel A. L.

    2000-11-01

    Signals derived from the rat motor cortex can be used for controlling one-dimensional movements of a robot arm. It remains unknown, however, whether real-time processing of cortical signals can be employed to reproduce, in a robotic device, the kind of complex arm movements used by primates to reach objects in space. Here we recorded the simultaneous activity of large populations of neurons, distributed in the premotor, primary motor and posterior parietal cortical areas, as non-human primates performed two distinct motor tasks. Accurate real-time predictions of one- and three-dimensional arm movement trajectories were obtained by applying both linear and nonlinear algorithms to cortical neuronal ensemble activity recorded from each animal. In addition, cortically derived signals were successfully used for real-time control of robotic devices, both locally and through the Internet. These results suggest that long-term control of complex prosthetic robot arm movements can be achieved by simple real-time transformations of neuronal population signals derived from multiple cortical areas in primates.

  2. An artificial neural network ensemble method for fault diagnosis of proton exchange membrane fuel cell system

    International Nuclear Information System (INIS)

    Shao, Meng; Zhu, Xin-Jian; Cao, Hong-Fei; Shen, Hai-Feng

    2014-01-01

    The commercial viability of proton exchange membrane fuel cell (PEMFC) systems depends on effective fault diagnosis technologies. However, many researchers have experimentally studied PEMFC systems without considering certain fault conditions. In this paper, an artificial neural network (ANN) ensemble method is presented that improves the stability and reliability of PEMFC systems. First, a transient model is built that is flexible enough to be applied to some exceptional conditions; this PEMFC dynamic model is built and simulated using MATLAB. Second, using this model and experiments, the mechanisms of four different faults in PEMFC systems are analyzed in detail. Third, the ANN ensemble for the fault diagnosis is built and modeled. This model is trained and tested with the data. The test results show that, compared with the previous method for fault diagnosis of PEMFC systems, the proposed method has a higher diagnostic rate and generalization ability. Moreover, the partial structure of this method can be altered easily as the PEMFC system changes. In general, this method for diagnosis of PEMFC has value for certain applications. - Highlights: • We analyze the principles and mechanisms of the four faults in PEMFC systems. • We design and model an ANN ensemble method for the fault diagnosis of PEMFC systems. • This method has a high diagnostic rate and strong generalization ability.

  3. Ensemble streamflow assimilation with the National Water Model.

    Science.gov (United States)

    Rafieeinasab, A.; McCreight, J. L.; Noh, S.; Seo, D. J.; Gochis, D.

    2017-12-01

    Through case studies of flooding across the US, we compare the performance of the National Water Model (NWM) data assimilation (DA) scheme to that of a newly implemented ensemble Kalman filter approach. The NOAA National Water Model (NWM) is an operational implementation of the community WRF-Hydro modeling system. As of August 2016, the NWM forecasts of distributed hydrologic states and fluxes (including soil moisture, snowpack, ET, and ponded water) over the contiguous United States have been publicly disseminated by the National Centers for Environmental Prediction (NCEP). It also provides streamflow forecasts at more than 2.7 million river reaches up to 30 days in advance. The NWM employs a nudging scheme to assimilate more than 6,000 USGS streamflow observations and provide initial conditions for its forecasts. A drawback of nudging is that the forecasts quickly relax to the open-loop bias. This has been partially addressed by an experimental bias correction approach, which was found to have issues with phase errors during flooding events. In this work, we present an ensemble streamflow data assimilation approach combining new channel-only capabilities of the NWM and HydroDART (a coupling of the offline WRF-Hydro model and NCAR's Data Assimilation Research Testbed; DART). Our approach focuses on the single model state of discharge and incorporates error distributions on channel-influxes (overland and groundwater) in the assimilation via an ensemble Kalman filter (EnKF). In order to avoid filter degeneracy associated with a limited number of ensemble members at large scale, DART's covariance inflation (Anderson, 2009) and localization capabilities are implemented and evaluated. The current NWM data assimilation scheme is compared to preliminary results from the EnKF application for several flooding case studies across the US.
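
    For readers unfamiliar with the EnKF, the analysis step can be sketched generically as below. This is a textbook stochastic (perturbed-observation) update, not the NWM or HydroDART implementation; it assumes a linear observation operator and uncorrelated observation errors, and it leaves out the covariance inflation and localization mentioned in the abstract. The function name `enkf_update` is hypothetical.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_var, H, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    ensemble: (n_members, n_state) forecast states (e.g. discharge per reach)
    obs:      (n_obs,) observations (e.g. gauge streamflow)
    obs_var:  (n_obs,) observation error variances (diagonal R is an assumption)
    H:        (n_obs, n_state) linear observation operator
    rng:      numpy random Generator used to perturb the observations
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)        # sample forecast covariance
    R = np.diag(obs_var)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)         # Kalman gain
    # Each member assimilates its own noisy copy of the observations so that
    # the analysis spread stays statistically consistent.
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=(n_members, obs.size))
    innovations = perturbed - ensemble @ H.T
    return ensemble + innovations @ K.T
```

    With an accurate observation and a wide prior, the analysis ensemble moves most of the way to the observation and its spread shrinks; unlike nudging, the correction is weighted by the ensemble-estimated forecast uncertainty.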

  4. Ensemble forecasting of potential habitat for three invasive fishes

    Science.gov (United States)

    Poulos, Helen M.; Chernoff, Barry; Fuller, Pam L.; Butman, David

    2012-01-01

    Aquatic invasive species pose major ecological and economic threats to aquatic ecosystems worldwide via displacement, predation, or hybridization with native species and the alteration of aquatic habitats and hydrologic cycles. Modeling the habitat suitability of alien aquatic species through spatially explicit mapping is an increasingly important risk assessment tool. Habitat modeling also facilitates identification of key environmental variables influencing invasive species distributions. We compared four modeling methods to predict the potential continental United States distributions of northern snakehead Channa argus (Cantor, 1842), round goby Neogobius melanostomus (Pallas, 1814), and silver carp Hypophthalmichthys molitrix (Valenciennes, 1844) using maximum entropy (Maxent), the genetic algorithm for rule set production (GARP), DOMAIN, and support vector machines (SVM). We used inventory records from the USGS Nonindigenous Aquatic Species Database and a geographic information system of 20 climatic and environmental variables to generate individual and ensemble distribution maps for each species. The ensemble maps from our study performed as well as or better than all of the individual models except Maxent. The ensemble and Maxent models produced significantly higher accuracy individual maps than GARP, one-class SVMs, or DOMAIN. The key environmental predictor variables in the individual models were consistent with the tolerances of each species. Results from this study provide insights into which locations and environmental conditions may promote the future spread of invasive fish in the US.

  5. Path planning in uncertain flow fields using ensemble method

    KAUST Repository

    Wang, Tong

    2016-08-20

    An ensemble-based approach is developed to conduct optimal path planning in unsteady ocean currents under uncertainty. We focus our attention on two-dimensional steady and unsteady uncertain flows, and adopt a sampling methodology that is well suited to operational forecasts, where an ensemble of deterministic predictions is used to model and quantify uncertainty. In an operational setting, much about dynamics, topography, and forcing of the ocean environment is uncertain. To address this uncertainty, the flow field is parametrized using a finite number of independent canonical random variables with known densities, and the ensemble is generated by sampling these variables. For each of the resulting realizations of the uncertain current field, we predict the path that minimizes the travel time by solving a boundary value problem (BVP), based on the Pontryagin maximum principle. A family of backward-in-time trajectories starting at the end position is used to generate suitable initial values for the BVP solver. This allows us to examine and analyze the performance of the sampling strategy and to develop insight into extensions dealing with general circulation ocean models. In particular, the ensemble method enables us to perform a statistical analysis of travel times and consequently develop a path planning approach that accounts for these statistics. The proposed methodology is tested for a number of scenarios. We first validate our algorithms by reproducing simple canonical solutions, and then demonstrate our approach in more complex flow fields, including idealized, steady and unsteady double-gyre flows.

  6. Creating ensembles of decision trees through sampling

    Science.gov (United States)

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.
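
    The idea of scoring a candidate split on a random sample of the data, rather than on all of it, can be sketched as follows. The abstract only says "some criterion"; Gini impurity is used here as an assumed example, and the function names and the `sample_fraction` parameter are hypothetical.

```python
import numpy as np


def gini(labels):
    """Gini impurity of an array of non-negative integer class labels."""
    if labels.size == 0:
        return 0.0
    p = np.bincount(labels) / labels.size
    return 1.0 - np.sum(p ** 2)


def best_split_on_sample(x, y, sample_fraction, rng):
    """Choose a threshold on feature x by scoring candidates on a random sample."""
    n_sample = min(x.size, max(2, int(sample_fraction * x.size)))
    idx = rng.choice(x.size, size=n_sample, replace=False)
    order = np.argsort(x[idx])
    xs, ys = x[idx][order], y[idx][order]

    best_threshold, best_score = None, np.inf
    for i in range(1, n_sample):
        if xs[i] == xs[i - 1]:
            continue  # equal values cannot be separated by a threshold
        threshold = 0.5 * (xs[i] + xs[i - 1])
        left, right = ys[:i], ys[i:]
        # Size-weighted impurity of the two children, computed on the sample only.
        score = (left.size * gini(left) + right.size * gini(right)) / n_sample
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score
```

    Because each tree sees a different random sample at each node, repeated calls tend to yield different thresholds, which is one way to obtain the diversity an ensemble of trees relies on.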

  7. Bidirectional Modulation of Intrinsic Excitability in Rat Prelimbic Cortex Neuronal Ensembles and Non-Ensembles after Operant Learning.

    Science.gov (United States)

    Whitaker, Leslie R; Warren, Brandon L; Venniro, Marco; Harte, Tyler C; McPherson, Kylie B; Beidel, Jennifer; Bossert, Jennifer M; Shaham, Yavin; Bonci, Antonello; Hope, Bruce T

    2017-09-06

    Learned associations between environmental stimuli and rewards drive goal-directed learning and motivated behavior. These memories are thought to be encoded by alterations within specific patterns of sparsely distributed neurons called neuronal ensembles that are activated selectively by reward-predictive stimuli. Here, we use the Fos promoter to identify strongly activated neuronal ensembles in rat prelimbic cortex (PLC) and assess altered intrinsic excitability after 10 d of operant food self-administration training (1 h/d). First, we used the Daun02 inactivation procedure in male FosLacZ-transgenic rats to ablate selectively Fos-expressing PLC neurons that were active during operant food self-administration. Selective ablation of these neurons decreased food seeking. We then used male FosGFP-transgenic rats to assess selective alterations of intrinsic excitability in Fos-expressing neuronal ensembles (FosGFP+) that were activated during food self-administration and compared these with alterations in less activated non-ensemble neurons (FosGFP-). Using whole-cell recordings of layer V pyramidal neurons in an ex vivo brain slice preparation, we found that operant self-administration increased excitability of FosGFP+ neurons and decreased excitability of FosGFP- neurons. Increased excitability of FosGFP+ neurons was driven by increased steady-state input resistance. Decreased excitability of FosGFP- neurons was driven by increased contribution of small-conductance calcium-activated potassium (SK) channels. Injections of the specific SK channel antagonist apamin into PLC increased Fos expression but had no effect on food seeking. Overall, operant learning increased intrinsic excitability of PLC Fos-expressing neuronal ensembles that play a role in food seeking but decreased intrinsic excitability of Fos- non-ensembles. SIGNIFICANCE STATEMENT Prefrontal cortex activity plays a critical role in operant learning, but the underlying cellular mechanisms are

  8. Summer drought predictability over Europe: empirical versus dynamical forecasts

    Science.gov (United States)

    Turco, Marco; Ceglar, Andrej; Prodhomme, Chloé; Soret, Albert; Toreti, Andrea; Doblas-Reyes, Francisco J.

    2017-08-01

    Seasonal climate forecasts could be an important planning tool for farmers, government and insurance companies that can lead to better and timely management of seasonal climate risks. However, climate seasonal forecasts are often under-used, because potential users are not well aware of the capabilities and limitations of these products. This study aims at assessing the merits and caveats of a statistical empirical method, the ensemble streamflow prediction system (ESP, an ensemble based on reordering historical data) and an operational dynamical forecast system, the European Centre for Medium-Range Weather Forecasts—System 4 (S4) in predicting summer drought in Europe. Droughts are defined using the Standardized Precipitation Evapotranspiration Index for the month of August integrated over 6 months. Both systems show useful and mostly comparable deterministic skill. We argue that this source of predictability is mostly attributable to the observed initial conditions. S4 shows only higher skill in terms of ability to probabilistically identify drought occurrence. Thus, currently, both approaches provide useful information and ESP represents a computationally fast alternative to dynamical prediction applications for drought prediction.

  9. Deviations from Wick's theorem in the canonical ensemble

    Science.gov (United States)

    Schönhammer, K.

    2017-07-01

    Wick's theorem for the expectation values of products of field operators for a system of noninteracting fermions or bosons plays an important role in the perturbative approach to the quantum many-body problem. A finite-temperature version holds in the framework of the grand canonical ensemble, but not for the canonical ensemble appropriate for systems with fixed particle number such as ultracold quantum gases in optical lattices. Here we present formulas for expectation values of products of field operators in the canonical ensemble using a method in the spirit of Gaudin's proof of Wick's theorem for the grand canonical case. The deviations from Wick's theorem are examined quantitatively for two simple models of noninteracting fermions.

  10. Dissipation induced asymmetric steering of distant atomic ensembles

    Science.gov (United States)

    Cheng, Guangling; Tan, Huatang; Chen, Aixi

    2018-04-01

    The asymmetric steering effects of separated atomic ensembles denoted by the effective bosonic modes have been explored by the means of quantum reservoir engineering in the setting of the cascaded cavities, in each of which an atomic ensemble is involved. It is shown that the steady-state asymmetric steering of the mesoscopic objects is unconditionally achieved via the dissipation of the cavities, by which the nonlocal interaction occurs between two atomic ensembles, and the direction of steering could be easily controlled through variation of certain tunable system parameters. One advantage of the present scheme is that it could be rather robust against parameter fluctuations, and does not require the accurate control of evolution time and the original state of the system. Furthermore, the double-channel Raman transitions between the long-lived atomic ground states are used and the atomic ensembles act as the quantum network nodes, which makes our scheme insensitive to the collective spontaneous emission of atoms.

  11. Changes in Appetitive Associative Strength Modulates Nucleus Accumbens, But Not Orbitofrontal Cortex Neuronal Ensemble Excitability.

    Science.gov (United States)

    Ziminski, Joseph J; Hessler, Sabine; Margetts-Smith, Gabriella; Sieburg, Meike C; Crombag, Hans S; Koya, Eisuke

    2017-03-22

    Cues that predict the availability of food rewards influence motivational states and elicit food-seeking behaviors. If a cue no longer predicts food availability, then animals may adapt accordingly by inhibiting food-seeking responses. Sparsely activated sets of neurons, coined "neuronal ensembles," have been shown to encode the strength of reward-cue associations. Although alterations in intrinsic excitability have been shown to underlie many learning and memory processes, little is known about these properties specifically on cue-activated neuronal ensembles. We examined the activation patterns of cue-activated orbitofrontal cortex (OFC) and nucleus accumbens (NAc) shell ensembles using wild-type and Fos-GFP mice, which express green fluorescent protein (GFP) in activated neurons, after appetitive conditioning with sucrose and extinction learning. We also investigated the neuronal excitability of recently activated, GFP+ neurons in these brain areas using whole-cell electrophysiology in brain slices. Exposure to a sucrose cue elicited activation of neurons in both the NAc shell and OFC. In the NAc shell, but not the OFC, these activated GFP+ neurons were more excitable than surrounding GFP- neurons. After extinction, the number of neurons activated in both areas was reduced and activated ensembles in neither area exhibited altered excitability. These data suggest that learning-induced alterations in the intrinsic excitability of neuronal ensembles are regulated dynamically across different brain areas. Furthermore, we show that changes in associative strength modulate the excitability profile of activated ensembles in the NAc shell. SIGNIFICANCE STATEMENT Sparsely distributed sets of neurons called "neuronal ensembles" encode learned associations about food and cues predictive of its availability. Widespread changes in neuronal excitability have been observed in limbic brain areas after associative learning, but little is known about the excitability changes that

  12. Hybrid neural intelligent system to predict business failure in small-to-medium-size enterprises.

    Science.gov (United States)

    Borrajo, M Lourdes; Baruque, Bruno; Corchado, Emilio; Bajo, Javier; Corchado, Juan M

    2011-08-01

    In recent years there has been a growing need for innovative tools that help small-to-medium-sized enterprises predict business failure and financial crises. In this study we present a novel hybrid intelligent system aimed at monitoring the modus operandi of the companies and predicting possible failures. This system is implemented by means of a neural-based multi-agent system that models the different actors of the companies as agents. The core of the multi-agent system is a type of agent that incorporates a case-based reasoning system and automates the business control process and failure prediction. The stages of the case-based reasoning system are implemented by means of web services: the retrieval stage uses an innovative method based on the weighted voting summarization of ensembles of self-organizing maps, and the reuse stage is implemented by means of a radial basis function neural network. An initial prototype was developed, and the results obtained for small and medium enterprises in a real scenario are presented.

  13. Kohn-Sham Theory for Ground-State Ensembles

    International Nuclear Information System (INIS)

    Ullrich, C. A.; Kohn, W.

    2001-01-01

    An electron density distribution n(r) which can be represented by that of a single-determinant ground state of noninteracting electrons in an external potential v(r) is called pure-state v-representable (P-VR). Most physical electronic systems are P-VR. Systems which require a weighted sum of several such determinants to represent their density are called ensemble v-representable (E-VR). This paper develops formal Kohn-Sham equations for E-VR physical systems, using the appropriate coupling constant integration. It also derives local-density and generalized-gradient approximations, and conditions and corrections specific to ensembles.

  14. Regional interdependency of precipitation indices across Denmark in two ensembles of high-resolution RCMs

    DEFF Research Database (Denmark)

    Sunyer Pinya, Maria Antonia; Madsen, Henrik; Rosbjerg, Dan

    2013-01-01

    all these methods is that the climate models are independent. This study addresses the validity of this assumption for two ensembles of regional climate models (RCMs) from the Ensemble-Based Predictions of Climate Changes and their Impacts (ENSEMBLES) project based on the land cells covering Denmark....... Daily precipitation indices from an ensemble of RCMs driven by the 40-yr ECMWF Re-Analysis (ERA-40) and an ensemble of the same RCMs driven by different general circulation models (GCMs) are analyzed. Two different methods are used to estimate the amount of independent information in the ensembles....... These are based on different statistical properties of a measure of climate model error. Additionally, a hierarchical cluster analysis is carried out. Regardless of the method used, the effective number of RCMs is smaller than the total number of RCMs. The estimated effective number of RCMs varies depending...

  15. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    International Nuclear Information System (INIS)

    Lehtivarjo, Juuso; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino; Peräkylä, Mikael

    2012-01-01

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6–17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.

  16. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    Energy Technology Data Exchange (ETDEWEB)

    Lehtivarjo, Juuso, E-mail: juuso.lehtivarjo@uef.fi; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino [University of Eastern Finland, School of Pharmacy (Finland); Peräkylä, Mikael [University of Eastern Finland, Institute of Biomedicine (Finland)

    2012-03-15

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6-17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.

  17. Generation of scenarios from calibrated ensemble forecasts with a dual ensemble copula coupling approach

    DEFF Research Database (Denmark)

    Ben Bouallègue, Zied; Heppelmann, Tobias; Theis, Susanne E.

    2016-01-01

    the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error. The new...... approach, called d-ECC, is applied to wind forecasts from the high resolution ensemble system COSMO-DE-EPS run operationally at the German weather service. Scenarios generated by ECC and d-ECC are compared and assessed in the form of time series by means of multivariate verification tools and in a product...

  18. Non-Boltzmann Ensembles and Monte Carlo Simulations

    International Nuclear Information System (INIS)

    Murthy, K. P. N.

    2016-01-01

    Boltzmann sampling based on the Metropolis algorithm has been extensively used for simulating a canonical ensemble and for calculating macroscopic properties of a closed system at desired temperatures. An estimate of a mechanical property, like energy, of an equilibrium system is made by averaging over a large number of microstates generated by Boltzmann Monte Carlo methods. This is possible because we can assign a numerical value for energy to each microstate. However, a thermal property like entropy is not easily accessible to these methods. The reason is simple: we cannot assign a numerical value for entropy to a microstate. Entropy is not a property associated with any single microstate; it is a collective property of all the microstates. Toward calculating entropy and other thermal properties, a non-Boltzmann Monte Carlo technique called umbrella sampling was proposed some forty years ago. Umbrella sampling has since undergone several metamorphoses, and we now have multicanonical Monte Carlo, entropic sampling, flat-histogram methods, the Wang-Landau algorithm, etc. This class of methods generates non-Boltzmann ensembles, which are unphysical. However, physical quantities can be calculated as follows. First un-weight a microstate of the entropic ensemble; then re-weight it to the desired physical ensemble. Carry out a weighted average over the entropic ensemble to estimate physical quantities. In this talk I shall tell you of the most recent non-Boltzmann Monte Carlo method and show how to calculate free energy for a few systems. We first consider estimation of free energy as a function of energy at different temperatures to characterize the phase transition in a hairpin DNA in the presence of an unzipping force. Next we consider free energy as a function of order parameter, and to this end we estimate the density of states g(E, M) as a function of both energy E and order parameter M. This is carried out in two stages. We estimate g(E) in the first stage
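
    The re-weighting step described in this abstract can be illustrated with a minimal sketch. It assumes the density of states g(E) has already been estimated (for instance by a flat-histogram run), so that a canonical average reduces to a weighted sum over energy levels with weights g(E) exp(-beta E). The toy system of independent two-level units, the function name and the parameter values are assumptions chosen because the exact answer is known; they do not come from the record.

```python
import math

def canonical_mean_energy(energies, log_g, beta):
    """Canonical <E> from a density of states, using a log-sum-exp shift."""
    # Log of the unnormalized canonical weight of each level: ln g(E) - beta * E
    log_w = [lg - beta * e for e, lg in zip(energies, log_g)]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(lw - shift) for lw in log_w]
    z = sum(weights)
    return sum(e * w for e, w in zip(energies, weights)) / z

# Toy system (assumption): N independent two-level units with energies 0 or eps,
# so the density of states is g(E = k * eps) = C(N, k).
N, eps, beta = 20, 1.0, 0.7
energies = [k * eps for k in range(N + 1)]
log_g = [math.log(math.comb(N, k)) for k in range(N + 1)]

mean_e = canonical_mean_energy(energies, log_g, beta)
exact = N * eps / (1.0 + math.exp(beta * eps))  # closed form for this toy system
```

    Working with ln g(E) rather than g(E) keeps the sum stable when the density of states spans many orders of magnitude, which is the usual situation for flat-histogram estimates.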

  19. ENSEMBLE methods to reconcile disparate national long range dispersion forecasting

    Energy Technology Data Exchange (ETDEWEB)

    Mikkelsen, T; Galmarini, S; Bianconi, R; French, S [eds.

    2003-11-01

    ENSEMBLE is a web-based decision support system for real-time exchange and evaluation of national long-range dispersion forecasts of nuclear releases with cross-boundary consequences. The system was developed to reconcile disparate national forecasts for long-range dispersion. ENSEMBLE addresses the problem of achieving a common coherent strategy across European national emergency management when national long-range dispersion forecasts differ from one another during an accidental atmospheric release of radioactive material. A series of new decision-making 'ENSEMBLE' procedures and Web-based software evaluation and exchange tools have been created for real-time reconciliation and harmonisation of real-time dispersion forecasts from meteorological and emergency centres across Europe during an accident. The new ENSEMBLE software tools are available to participating national emergency and meteorological forecasting centres, which may choose to integrate them directly into operational emergency information systems, or possibly use them as a basis for future system development. (au)

  1. DYNAMIC STABILITY OF THE SOLAR SYSTEM: STATISTICALLY INCONCLUSIVE RESULTS FROM ENSEMBLE INTEGRATIONS

    Energy Technology Data Exchange (ETDEWEB)

    Zeebe, Richard E., E-mail: zeebe@soest.hawaii.edu [School of Ocean and Earth Science and Technology, University of Hawaii at Manoa, 1000 Pope Road, MSB 629, Honolulu, HI 96822 (United States)

    2015-01-01

    Due to the chaotic nature of the solar system, the question of its long-term stability can only be answered in a statistical sense, for instance, based on numerical ensemble integrations of nearby orbits. Destabilization of the inner planets, leading to close encounters and/or collisions can be initiated through a large increase in Mercury's eccentricity, with a currently assumed likelihood of ∼1%. However, little is known at present about the robustness of this number. Here I report ensemble integrations of the full equations of motion of the eight planets and Pluto over 5 Gyr, including contributions from general relativity. The results show that different numerical algorithms lead to statistically different results for the evolution of Mercury's eccentricity (e_M). For instance, starting at present initial conditions (e_M ≃ 0.21), Mercury's maximum eccentricity achieved over 5 Gyr is, on average, significantly higher in symplectic ensemble integrations using heliocentric rather than Jacobi coordinates and stricter error control. In contrast, starting at a possible future configuration (e_M ≃ 0.53), Mercury's maximum eccentricity achieved over the subsequent 500 Myr is, on average, significantly lower using heliocentric rather than Jacobi coordinates. For example, the probability for e_M to increase beyond 0.53 over 500 Myr is >90% (Jacobi) versus only 40%-55% (heliocentric). This poses a dilemma because the physical evolution of the real system, and its probabilistic behavior, cannot depend on the coordinate system or the numerical algorithm chosen to describe it. Some tests of the numerical algorithms suggest that symplectic integrators using heliocentric coordinates underestimate the odds for destabilization of Mercury's orbit at high initial e_M.

  2. A user credit assessment model based on clustering ensemble for broadband network new media service supervision

    Science.gov (United States)

    Liu, Fang; Cao, San-xing; Lu, Rui

    2012-04-01

    This paper proposes a user credit assessment model based on a clustering ensemble, aiming to solve the problem of users illegally spreading pirated and pornographic media content on self-service-oriented broadband network new media platforms. The idea is to assess new media user credit by establishing an index system based on user credit behaviors, so that illegal users can be found from the credit assessment results and the transmission of bad video and audio on the network can be curbed. The proposed clustering ensemble model combines two advantages: swarm intelligence clustering is suitable for analyzing user credit behavior, and K-means clustering can eliminate the scattered users left in the result of swarm intelligence clustering, so that all users' credit is classified automatically. Verification experiments on the model's effectiveness are based on the standard credit application dataset in the UCI machine learning repository. The statistical results of a comparative experiment with a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in finding the user clusters with the best and worst credit, which will help operators apply incentive or punitive measures accurately. In addition, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is robust and has better prediction accuracy.

  3. Ensemble inequivalence: Landau theory and the ABC model

    International Nuclear Information System (INIS)

    Cohen, O; Mukamel, D

    2012-01-01

    It is well known that systems with long-range interactions may exhibit different phase diagrams when studied within two different ensembles. In many of the previously studied examples of ensemble inequivalence, the phase diagrams differ only when the transition in one of the ensembles is first order. By contrast, in a recent study of a generalized ABC model, the canonical and grand-canonical ensembles of the model were shown to differ even when they both exhibit a continuous transition. Here we show that the order of the transition where ensemble inequivalence may occur is related to the symmetry properties of the order parameter associated with the transition. This is done by analyzing the Landau expansion of a generic model with long-range interactions. The conclusions drawn from the generic analysis are demonstrated for the ABC model by explicit calculation of its Landau expansion. (paper)

  4. DroidEnsemble: Detecting Android Malicious Applications with Ensemble of String and Structural Static Features

    KAUST Repository

    Wang, Wei

    2018-05-11

    The Android platform dominates the operating systems of mobile devices. However, the dramatic increase of Android malicious applications (malapps) has caused serious software failures in Android systems and posed a great threat to users. The effective detection of Android malapps has thus become an emerging yet crucial issue. Characterizing the behaviors of Android applications (apps) is essential to detecting malapps. Most existing work on detecting Android malapps has been based on string static features such as permissions and API usage extracted from apps. There is also work on the detection of Android malapps with structural features, such as the Control Flow Graph (CFG) and Data Flow Graph (DFG). As Android malapps have become increasingly polymorphic and sophisticated, using only one type of static feature may result in false negatives. In this work, we propose DroidEnsemble, which takes advantage of both string features and structural features to systematically and comprehensively characterize the static behaviors of Android apps and thus build a more accurate model for the detection of Android malapps. We extract each app's string features, including permissions, hardware features, filter intents, restricted API calls, used permissions and code patterns, as well as structural features such as the function call graph. We then use three machine learning algorithms, namely Support Vector Machine (SVM), k-Nearest Neighbor (kNN) and Random Forest (RF), to evaluate the performance of these two types of features and of their ensemble. In the experiments, we evaluate our methods and models with 1386 benign apps and 1296 malapps. Extensive experimental results demonstrate the effectiveness of DroidEnsemble. It achieves a detection accuracy of 95.8% with only string features and of 90.68% with only structural features. DroidEnsemble reaches a detection accuracy of 98.4% with the ensemble of both types of features, reducing 9 false positives and 12 false

  5. An Assessment of the Subseasonal Forecast Performance in the Extended Global Ensemble Forecast System (GEFS)

    Science.gov (United States)

    Sinsky, E.; Zhu, Y.; Li, W.; Guan, H.; Melhauser, C.

    2017-12-01

    Optimal forecast quality is crucial for the preservation of life and property. Improving monthly forecast performance over both the tropics and extra-tropics requires attention to various physical aspects such as the representation of the underlying SST, model physics and the representation of the model physics uncertainty for an ensemble forecast system. This work focuses on the impact of stochastic physics, SST and the convection scheme on forecast performance at the sub-seasonal scale over the tropics and extra-tropics, with emphasis on the Madden-Julian Oscillation (MJO). A 2-year period is evaluated using the National Centers for Environmental Prediction (NCEP) Global Ensemble Forecast System (GEFS). Three experiments with configurations different from the operational GEFS were performed to illustrate the impact of the stochastic physics, SST and convection scheme. These experiments are compared against a control experiment (CTL), which consists of the operational GEFS with its integration extended from 16 to 35 days. The three configurations are: 1) SPs, which uses Stochastically Perturbed Physics Tendencies (SPPT), Stochastic Perturbed Humidity (SHUM) and Stochastic Kinetic Energy Backscatter (SKEB); 2) SPs+SST_bc, which uses a combination of SPs and a bias-corrected forecast SST from the NCEP Climate Forecast System Version 2 (CFSv2); and 3) SPs+SST_bc+SA_CV, which combines SPs, a bias-corrected forecast SST and a scale-aware convection scheme. Compared with the CTL experiment, SPs shows substantial improvement. The MJO skill has improved by about 4 lead days during the 2-year period. Improvement is also seen over the extra-tropics due to the updated stochastic physics, where there is a 3.1% and a 4.2% improvement during weeks 3 and 4 over the northern hemisphere and southern hemisphere, respectively. Improvement is also seen when the bias-corrected CFSv2 SST is combined with SPs. Additionally, forecast performance improves when the scale aware

  6. Combining multi-objective optimization and bayesian model averaging to calibrate forecast ensembles of soil hydraulic models

    Energy Technology Data Exchange (ETDEWEB)

    Vrugt, Jasper A [Los Alamos National Laboratory; Wohling, Thomas [NON LANL

    2008-01-01

    Most studies in vadose zone hydrology use a single conceptual model for predictive inference and analysis. Focusing on the outcome of a single model is prone to statistical bias and underestimation of uncertainty. In this study, we combine multi-objective optimization and Bayesian Model Averaging (BMA) to generate forecast ensembles of soil hydraulic models. To illustrate our method, we use observed tensiometric pressure head data at three different depths in a layered vadose zone of volcanic origin in New Zealand. A set of seven different soil hydraulic models is calibrated using a multi-objective formulation with three different objective functions that each measure the mismatch between observed and predicted soil water pressure head at one specific depth. The Pareto solution space corresponding to these three objectives is estimated with AMALGAM, and used to generate four different model ensembles. These ensembles are post-processed with BMA and used for predictive analysis and uncertainty estimation. Our most important conclusions for the vadose zone under consideration are: (1) the mean BMA forecast exhibits predictive capabilities similar to those of the best-performing individual soil hydraulic model, (2) the size of the BMA uncertainty ranges increases with increasing depth and dryness in the soil profile, (3) the best-performing ensemble corresponds to the compromise (or balanced) solution of the three-objective Pareto surface, and (4) the combined multi-objective optimization and BMA framework proposed in this paper is very useful for generating forecast ensembles of soil hydraulic models.

  7. NYYD Ensemble

    Index Scriptorium Estoniae

    2002-01-01

    On the NYYD Ensemble duo Traksmann - Lukk and their performance of E.-S. Tüür's work "Symbiosis", which is also recorded on the recently released NYYD Ensemble CD. Concerts on 2 March in the small hall of Rakvere Teater and on 3 March at Rotermanni Soolaladu; the programme includes Tüür, Kaumann, Berio, Reich, Yun, Hauta-aho, Buckinx

  8. Verification of Ensemble Forecasts for the New York City Operations Support Tool

    Science.gov (United States)

    Day, G.; Schaake, J. C.; Thiemann, M.; Draijer, S.; Wang, L.

    2012-12-01

    The New York City water supply system operated by the Department of Environmental Protection (DEP) serves nine million people. It covers 2,000 square miles of portions of the Catskill, Delaware, and Croton watersheds, and it includes nineteen reservoirs and three controlled lakes. DEP is developing an Operations Support Tool (OST) to support its water supply operations and planning activities. OST includes historical and real-time data, a model of the water supply system complete with operating rules, and lake water quality models developed to evaluate alternatives for managing turbidity in the New York City Catskill reservoirs. OST will enable DEP to manage turbidity in its unfiltered system while satisfying its primary objective of meeting the City's water supply needs, in addition to considering secondary objectives of maintaining ecological flows, supporting fishery and recreation releases, and mitigating downstream flood peaks. The current version of OST relies on statistical forecasts of flows in the system based on recent observed flows. To improve short-term decision making, plans are being made to transition to National Weather Service (NWS) ensemble forecasts based on hydrologic models that account for short-term weather forecast skill, longer-term climate information, as well as the hydrologic state of the watersheds and recent observed flows. To ensure that the ensemble forecasts are unbiased and that the ensemble spread reflects the actual uncertainty of the forecasts, a statistical model has been developed to post-process the NWS ensemble forecasts to account for hydrologic model error as well as any inherent bias and uncertainty in initial model states, meteorological data and forecasts. The post-processor is designed to produce adjusted ensemble forecasts that are consistent with the DEP historical flow sequences that were used to develop the system operating rules. A set of historical hindcasts that is representative of the real-time ensemble

  9. A study on the predictability of the transition day from the dry to the rainy season over South Korea

    Science.gov (United States)

    Lee, Sang-Min; Nam, Ji-Eun; Choi, Hee-Wook; Ha, Jong-Chul; Lee, Yong Hee; Kim, Yeon-Hee; Kang, Hyun-Suk; Cho, ChunHo

    2016-08-01

    This study evaluated the prediction accuracies of THe Observing system Research and Predictability EXperiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) data at six operational forecast centers using the root-mean-square difference (RMSD) and Brier score (BS) from April to July 2012. It also tested the precipitation predictability of ensemble prediction systems (EPS) for the onset of the summer rainy season, the day on which the spring drought over South Korea ended (29 June 2012), using the ensemble mean precipitation, ensemble probability precipitation, 10-day lag ensemble forecasts (ensemble mean and probability precipitation), and effective drought index (EDI). The RMSD analysis of atmospheric variables (geopotential height at 500 hPa, temperature at 850 hPa, sea-level pressure and specific humidity at 850 hPa) showed that the prediction accuracies of the EPS at the Meteorological Service of Canada (CMC) and China Meteorological Administration (CMA) were poor and those at the European Centre for Medium-Range Weather Forecasts (ECMWF) and Korea Meteorological Administration (KMA) were good. ECMWF and KMA also showed better results than the other EPSs for predicting precipitation in the BS distributions. The onset of the summer rainy season could be predicted using ensemble-mean precipitation from a 4-day lead time at all forecast centers. In addition, the spatial distributions of predicted precipitation of the EPS at KMA and the Met Office of the United Kingdom (UKMO) were similar to those of observed precipitation; thus, the predictability showed good performance. In the reliability diagram, the precipitation probability forecasts of the EPS at CMA, the National Centers for Environmental Prediction (NCEP), and UKMO at 1-day lead time showed over-forecasting, whereas those at ECMWF and KMA showed under-forecasting. All of the forecasts at 2-4-day lead time showed under-forecasting. Also, the precipitation on onset day of
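
    The two verification measures named in this abstract, the Brier score and the root-mean-square difference, can be sketched in a few lines. The example below is illustrative only: the function names, the 1 mm threshold and all numbers are made up, and the ensemble probability is taken as the fraction of members at or above the threshold, which is a common convention but an assumption here.

```python
import math

def ensemble_probability(members, threshold):
    """Fraction of ensemble members at or above the threshold."""
    return sum(1 for m in members if m >= threshold) / len(members)

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

def rmsd(forecast, analysis):
    """Root-mean-square difference between a forecast field and an analysis field."""
    pairs = list(zip(forecast, analysis))
    return math.sqrt(sum((f - a) ** 2 for f, a in pairs) / len(pairs))

# Made-up daily precipitation (mm) from a 4-member ensemble on three days
ensemble_days = [[0.0, 2.0, 5.0, 8.0], [0.0, 0.0, 0.0, 1.0], [3.0, 4.0, 6.0, 9.0]]
observed_rain = [1, 0, 1]  # 1 if observed precipitation reached the threshold

probs = [ensemble_probability(day, threshold=1.0) for day in ensemble_days]
bs = brier_score(probs, observed_rain)
field_error = rmsd([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

    A Brier score of 0 is a perfect probability forecast and lower is better; the RMSD is in the units of the variable being compared.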

  10. Representing Color Ensembles.

    Science.gov (United States)

    Chetverikov, Andrey; Campana, Gianluca; Kristjánsson, Árni

    2017-10-01

    Colors are rarely uniform, yet little is known about how people represent color distributions. We introduce a new method for studying color ensembles based on intertrial learning in visual search. Participants looked for an oddly colored diamond among diamonds with colors taken from either uniform or Gaussian color distributions. On test trials, the targets had various distances in feature space from the mean of the preceding distractor color distribution. Targets on test trials therefore served as probes into probabilistic representations of distractor colors. Test-trial response times revealed a striking similarity between the physical distribution of colors and their internal representations. The results demonstrate that the visual system represents color ensembles in a more detailed way than previously thought, coding not only mean and variance but, most surprisingly, the actual shape (uniform or Gaussian) of the distribution of colors in the environment.

  11. Improving quantitative precipitation nowcasting with a local ensemble transform Kalman filter radar data assimilation system: observing system simulation experiments

    Directory of Open Access Journals (Sweden)

    Chih-Chien Tsai

    2014-03-01

    This study develops a Doppler radar data assimilation system, which couples the local ensemble transform Kalman filter with the Weather Research and Forecasting model. The benefits of this system to quantitative precipitation nowcasting (QPN) are evaluated with observing system simulation experiments on Typhoon Morakot (2009), which brought record-breaking rainfall and extensive damage to central and southern Taiwan. The results indicate that the assimilation of radial velocity and reflectivity observations improves the three-dimensional winds and rain-mixing ratio most significantly because of the direct relations in the observation operator. The patterns of spiral rainbands become more consistent between different ensemble members after radar data assimilation. The rainfall intensity and distribution during the 6-hour deterministic nowcast are also improved, especially for the first 3 hours. The nowcasts with and without radar data assimilation have similar evolution trends driven by synoptic-scale conditions. Furthermore, we carry out a series of sensitivity experiments to develop proper assimilation strategies, in which a mixed localisation method is proposed for the first time and found to give further QPN improvement in this typhoon case.

  12. Reconstruction of ensembles of coupled time-delay systems from time series.

    Science.gov (United States)

    Sysoev, I V; Prokhorov, M D; Ponomarenko, V I; Bezruchko, B P

    2014-06-01

    We propose a method to recover from time series the parameters of coupled time-delay systems and the architecture of couplings between them. The method is based on a reconstruction of model delay-differential equations and estimation of statistical significance of couplings. It can be applied to networks composed of nonidentical nodes with an arbitrary number of unidirectional and bidirectional couplings. We test our method on chaotic and periodic time series produced by model equations of ensembles of diffusively coupled time-delay systems in the presence of noise, and apply it to experimental time series obtained from electronic oscillators with delayed feedback coupled by resistors.

  13. Post-processing of multi-model ensemble river discharge forecasts using censored EMOS

    Science.gov (United States)

    Hemri, Stephan; Lisniak, Dmytro; Klein, Bastian

    2014-05-01

    When forecasting water levels and river discharge, ensemble weather forecasts are used as meteorological input to hydrologic process models. As hydrologic models are imperfect and the input ensembles tend to be biased and underdispersed, the output ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, statistical post-processing is required in order to achieve calibrated and sharp predictions. Standard post-processing methods such as Ensemble Model Output Statistics (EMOS) that have their origins in meteorological forecasting are now increasingly being used in hydrologic applications. Here we consider two sub-catchments of River Rhine, for which the forecasting system of the Federal Institute of Hydrology (BfG) uses runoff data that are censored below predefined thresholds. To address this methodological challenge, we develop a censored EMOS method that is tailored to such data. The censored EMOS forecast distribution can be understood as a mixture of a point mass at the censoring threshold and a continuous part based on a truncated normal distribution. Parameter estimates of the censored EMOS model are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over the training dataset. Model fitting on Box-Cox transformed data allows us to take account of the positive skewness of river discharge distributions. In order to achieve realistic forecast scenarios over an entire range of lead-times, there is a need for multivariate extensions. To this end, we smooth the marginal parameter estimates over lead-times. In order to obtain realistic scenarios of discharge evolution over time, the marginal distributions have to be linked with each other. To this end, the multivariate dependence structure can either be adopted from the raw ensemble like in Ensemble Copula Coupling (ECC), or be estimated from observations in a training period. The censored EMOS model has been applied to multi-model ensemble forecasts issued on a
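
    The Ensemble Copula Coupling step mentioned in this abstract, in which the dependence structure is adopted from the raw ensemble, can be sketched as a rank reordering. This is a minimal illustration under stated assumptions and not the authors' implementation: the function name and toy numbers are hypothetical, the calibrated values stand in for samples or quantiles drawn from the post-processed marginal at each lead time, and ties in the raw ensemble are broken by member index.

```python
def ecc_reorder(raw, calibrated):
    """Give calibrated samples the rank order of the raw ensemble, per lead time.

    raw[t][i]        : raw forecast of member i at lead time t
    calibrated[t][j] : calibrated samples at lead time t, in any order
    Returns scenarios[t][i], the calibrated value assigned to member i.
    """
    scenarios = []
    for raw_t, cal_t in zip(raw, calibrated):
        cal_sorted = sorted(cal_t)
        # order[r] is the index of the member holding the r-th smallest raw value
        order = sorted(range(len(raw_t)), key=lambda i: raw_t[i])
        out = [0.0] * len(raw_t)
        for rank, member in enumerate(order):
            out[member] = cal_sorted[rank]
        scenarios.append(out)
    return scenarios

# Toy data: 2 lead times, 3 members (all values are made up)
raw = [[3.0, 1.0, 2.0], [10.0, 30.0, 20.0]]
calibrated = [[0.2, 0.1, 0.3], [7.0, 5.0, 6.0]]
scenarios = ecc_reorder(raw, calibrated)
```

    Each member keeps its position in the ranking at every lead time, so a member that was the wettest in the raw ensemble remains the wettest after calibration, and the temporal trajectories of the raw members carry over to the calibrated scenarios.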

  14. Ensemble hydro-meteorological forecasting for early warning of floods and scheduling of hydropower production

    Science.gov (United States)

    Solvang Johansen, Stian; Steinsland, Ingelin; Engeland, Kolbjørn

    2016-04-01

    Running hydrological models with precipitation and temperature ensemble forcing to generate ensembles of streamflow is a commonly used method in operational hydrology. Evaluations of streamflow ensembles have, however, revealed that the ensembles are biased with respect to both mean and spread. Thus postprocessing of the ensembles is needed in order to improve the forecast skill. The aims of this study are (i) to evaluate how postprocessing of streamflow ensembles works for Norwegian catchments within different hydrological regimes and (ii) to demonstrate how postprocessed streamflow ensembles are used operationally by a hydropower producer. These aims were achieved by postprocessing forecasted daily discharge for 10 lead times for 20 catchments in Norway, using EPS forcing from ECMWF applied to the semi-distributed HBV model, which divides each catchment into 10 elevation zones. Statkraft Energi uses forecasts from these catchments for scheduling hydropower production. The catchments represent different hydrological regimes. Some catchments have stable winter conditions with winter low flow and a major flood event during spring or early summer caused by snow melt. Others have a more mixed snow-rain regime, often with a secondary flood season during autumn, and in the coastal areas the streamflow is dominated by rain and the main flood season is autumn and winter. For postprocessing, a Bayesian model averaging (BMA) model close to that of Kleiber et al. (2011) is used. The model creates a predictive PDF that is a weighted average of PDFs centered on the individual bias-corrected forecasts. The weights are here equal since all ensemble members come from the same model and thus have the same probability. For modeling streamflow, the gamma distribution is chosen as the predictive PDF. The bias correction parameters and the PDF parameters are estimated using a 30-day sliding window training period. Preliminary results show that the improvement varies between catchments depending
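
    The equal-weight BMA predictive density described in this abstract can be sketched as a mixture of gamma densities. The abstract does not give the bias-correction form or the variance model, so the sketch assumes a linear bias correction a + b * forecast for each component mean and one common standard deviation; the function names and all numbers are hypothetical.

```python
import math

def gamma_pdf(x, mean, sd):
    """Gamma density parameterized by its mean and standard deviation."""
    if x <= 0.0:
        return 0.0
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def bma_pdf(x, forecasts, a, b, sd):
    """Equal-weight mixture of gamma densities with means a + b * forecast."""
    return sum(gamma_pdf(x, a + b * f, sd) for f in forecasts) / len(forecasts)

# Made-up streamflow forecasts (m^3/s) from three ensemble members
forecasts = [40.0, 50.0, 65.0]
a, b, sd = 2.0, 0.9, 8.0  # assumed bias-correction and spread parameters

density_at_50 = bma_pdf(50.0, forecasts, a, b, sd)
```

    In an operational setting a, b and the spread parameter would be re-estimated from a sliding training window, as the abstract describes; the fixed values above only keep the example self-contained.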

  15. Predicting human splicing branchpoints by combining sequence-derived features and multi-label learning methods.

    Science.gov (United States)

    Zhang, Wen; Zhu, Xiaopeng; Fu, Yu; Tsuji, Junko; Weng, Zhiping

    2017-12-01

    Alternative splicing is a critical process in the expression of a single gene, which removes introns and joins exons, and splicing branchpoints are indicators of alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order to guide wet experiments, we develop computational methods to predict human splicing branchpoints. Considering the fact that an intron may have multiple branchpoints, we formulate branchpoint prediction as a multi-label learning problem and attempt to predict branchpoint sites from intron sequences. First, we investigate a variety of intron sequence-derived features, such as sparse profile, dinucleotide profile, position weight matrix profile, Markov motif profile and polypyrimidine tract profile. Second, we consider several multi-label learning methods: partial least squares regression, canonical correlation analysis and regularized canonical correlation analysis, and use them as the basic classification engines. Third, we propose two ensemble learning schemes which integrate different features and different classifiers to build ensemble learning systems for branchpoint prediction. One is the genetic algorithm-based weighted average ensemble method; the other is the logistic regression-based ensemble method. In the computational experiments, the two ensemble learning methods outperform benchmark branchpoint prediction methods and produce high-accuracy results on the benchmark dataset.

  16. Reproducing multi-model ensemble average with Ensemble-averaged Reconstructed Forcings (ERF) in regional climate modeling

    Science.gov (United States)

    Erfanian, A.; Fomenko, L.; Wang, G.

    2016-12-01

    The multi-model ensemble (MME) average is considered the most reliable for simulating both present-day and future climates. It has been a primary reference for drawing conclusions in major coordinated studies such as the IPCC Assessment Reports and CORDEX. The biases of individual models cancel each other out in the MME average, enabling the ensemble mean to outperform individual members in simulating the mean climate. This enhancement, however, comes with tremendous computational cost, which is especially inhibiting for regional climate modeling as model uncertainties can originate from both RCMs and the driving GCMs. Here we propose the Ensemble-based Reconstructed Forcings (ERF) approach to regional climate modeling, which achieves a similar level of bias reduction at a fraction of the cost of the conventional MME approach. The new method constructs a single set of initial and boundary conditions (IBCs) by averaging the IBCs of multiple GCMs, and drives the RCM with this ensemble average of IBCs to conduct a single run. Using a regional climate model (RegCM4.3.4-CLM4.5), we tested the method over West Africa for multiple combinations of (up to six) GCMs. Our results indicate that the performance of the ERF method is comparable to that of the MME average in simulating the mean climate. The bias reduction seen in ERF simulations is achieved by using more realistic IBCs in solving the system of equations underlying the RCM physics and dynamics. This endows the new method with a theoretical advantage in addition to reducing computational cost. The ERF output is an unaltered solution of the RCM, as opposed to a climate state that might not be physically plausible due to the averaging of multiple solutions with the conventional MME approach. The ERF approach should be considered for use in major international efforts such as CORDEX. Key words: Multi-model ensemble, ensemble analysis, ERF, regional climate modeling

  17. Multimodel hydrological ensemble forecasts for the Baskatong catchment in Canada using the TIGGE database.

    Science.gov (United States)

    Tito Arandia Martinez, Fabian

    2014-05-01

    combined to form a grand ensemble. Results show that the hydrological forecasts derived from the grand ensemble perform better than the pseudo ensemble forecasts actually used operationally at Hydro-Québec. References: [1] M. Verbunt, A. Walser, J. Gurtz et al., "Probabilistic flood forecasting with a limited-area ensemble prediction system: Selected case studies," Journal of Hydrometeorology, vol. 8, no. 4, pp. 897-909, Aug. 2007. [2] N. Evora, Valorisation des prévisions météorologiques d'ensemble, Institut de recherche d'Hydro-Québec, 2005. [3] V. Fortin, Le modèle météo-apport HSAMI: historique, théorie et application, Institut de recherche d'Hydro-Québec, 2000.

  18. Microcanonical ensemble extensive thermodynamics of Tsallis statistics

    International Nuclear Information System (INIS)

    Parvan, A.S.

    2005-01-01

    The microscopic foundation of the generalized equilibrium statistical mechanics based on the Tsallis entropy is given by using the Gibbs idea of statistical ensembles of the classical and quantum mechanics. The equilibrium distribution functions are derived by the thermodynamic method based upon the use of the fundamental equation of thermodynamics and the statistical definition of the functions of the state of the system. It is shown that if the entropic index ξ = 1/(q - 1) in the microcanonical ensemble is an extensive variable of the state of the system, then in the thermodynamic limit z̄ = 1/((q - 1)N) = const the principle of additivity and the zero law of thermodynamics are satisfied. In particular, the Tsallis entropy of the system is extensive and the temperature is intensive. Thus, the Tsallis statistics completely satisfies all the postulates of the equilibrium thermodynamics. Moreover, evaluation of the thermodynamic identities in the microcanonical ensemble is provided by the Euler theorem. The principle of additivity and the Euler theorem are explicitly proved by using the illustration of the classical microcanonical ideal gas in the thermodynamic limit.

  19. Microcanonical ensemble extensive thermodynamics of Tsallis statistics

    International Nuclear Information System (INIS)

    Parvan, A.S.

    2006-01-01

    The microscopic foundation of the generalized equilibrium statistical mechanics based on the Tsallis entropy is given by using the Gibbs idea of statistical ensembles of the classical and quantum mechanics. The equilibrium distribution functions are derived by the thermodynamic method based upon the use of the fundamental equation of thermodynamics and the statistical definition of the functions of the state of the system. It is shown that if the entropic index ξ = 1/(q - 1) in the microcanonical ensemble is an extensive variable of the state of the system, then in the thermodynamic limit z̄ = 1/((q - 1)N) = const the principle of additivity and the zero law of thermodynamics are satisfied. In particular, the Tsallis entropy of the system is extensive and the temperature is intensive. Thus, the Tsallis statistics completely satisfies all the postulates of the equilibrium thermodynamics. Moreover, evaluation of the thermodynamic identities in the microcanonical ensemble is provided by the Euler theorem. The principle of additivity and the Euler theorem are explicitly proved by using the illustration of the classical microcanonical ideal gas in the thermodynamic limit.

  20. Simulating ensembles of nonlinear continuous time dynamical systems via active ultra wideband wireless network

    Energy Technology Data Exchange (ETDEWEB)

    Dmitriev, Alexander S.; Yemelyanov, Ruslan Yu. [V.A. Kotelnikov Institute of Radio Engineering and Electronics of the RAS Mokhovaya 11-7, Moscow, 125009 (Russian Federation); Moscow Institute of Physics and Technology (State University) 9 Institutskiy per., Dolgoprudny, Moscow, 141700 (Russian Federation)]; Gerasimov, Mark Yu. [V.A. Kotelnikov Institute of Radio Engineering and Electronics of the RAS Mokhovaya 11-7, Moscow, 125009 (Russian Federation)]; Itskov, Vadim V. [Moscow Institute of Physics and Technology (State University) 9 Institutskiy per., Dolgoprudny, Moscow, 141700 (Russian Federation)]

    2016-06-08

    The paper deals with a new multi-element processor platform designed for modelling the behaviour of interacting dynamical systems, i.e., an active wireless network. Experimentally, this ensemble is implemented in an active network, the active nodes of which include direct chaotic transceivers and special actuator boards containing microcontrollers for modelling the dynamical systems and an information display unit (colored LEDs). The modelling technique and experimental results are described and analyzed.

  1. Orbital magnetism in ensembles of ballistic billiards

    International Nuclear Information System (INIS)

    Ullmo, D.; Richter, K.; Jalabert, R.A.

    1993-01-01

    The magnetic response of ensembles of small two-dimensional structures at finite temperatures is calculated. Using semiclassical methods and numerical calculation it is demonstrated that only short classical trajectories are relevant. The magnetic susceptibility is enhanced in regular systems, where these trajectories appear in families. For ensembles of squares large paramagnetic susceptibility is obtained, in good agreement with recent measurements in the ballistic regime. (authors). 20 refs., 2 figs

  2. Robustness of the far-field response of nonlocal plasmonic ensembles

    DEFF Research Database (Denmark)

    Tserkezis, Christos; Maack, Johan Rosenkrantz; Liu, Zhaowei

    2016-01-01

    Contrary to classical predictions, the optical response of few-nm plasmonic particles depends on particle size due to effects such as nonlocality and electron spill-out. Ensembles of such nanoparticles are therefore expected to exhibit a nonclassical inhomogeneous spectral broadening due to size distribution. For a normal distribution of free-electron nanoparticles, and within the simple nonlocal hydrodynamic Drude model, both the nonlocal blueshift and the plasmon linewidth are shown to be considerably affected by ensemble averaging. Size-variance effects tend however to conceal nonlocality to a lesser extent when the homogeneous size-dependent broadening of individual nanoparticles is taken into account, either through a local size-dependent damping model or through the Generalized Nonlocal Optical Response theory. The role of ensemble averaging is further explored in realistic distributions...

  3. Reducing Risk of Noise-Induced Hearing Loss in Collegiate Music Ensembles Using Ambient Technology.

    Science.gov (United States)

    Powell, Jason; Chesky, Kris

    2017-09-01

    Student musicians are at risk for noise-induced hearing loss (NIHL) as they develop skills and perform during instructional activities. Studies using longitudinal dosimeter data show that pedagogical procedures and instructor behaviors are highly predictive of NIHL risk, thus implying the need for innovative approaches to increase instructor competency in managing instructional activities without interfering with artistic and academic freedom. Ambient information systems, an emerging trend in human-computer interaction that infuses psychological behavioral theories into technologies, can help construct informative risk-regulating systems. The purpose of this study was to determine the effects of introducing an ambient information system into the ensemble setting. The system used two ambient displays and a counterbalanced within-subjects treatment study design with six jazz ensemble instructors to determine if the system could induce a behavior change that alters trends in measures resulting from dosimeter data. This study assessed efficacy using time series analysis to determine changes in eight statistical measures of behavior over a 9-week period. Analysis showed that the system was effective, as all instructors showed changes in a combination of measures. This study is an important step in developing non-interfering technology to reduce NIHL among academic musicians.

  4. Mass Conservation and Positivity Preservation with Ensemble-type Kalman Filter Algorithms

    Science.gov (United States)

    Janjic, Tijana; McLaughlin, Dennis B.; Cohn, Stephen E.; Verlaan, Martin

    2013-01-01

    Maintaining conservative physical laws numerically has long been recognized as being important in the development of numerical weather prediction (NWP) models. In the broader context of data assimilation, concerted efforts to maintain conservation laws numerically and to understand the significance of doing so have begun only recently. In order to enforce physically based conservation laws of total mass and positivity in the ensemble Kalman filter, we incorporate constraints to ensure that the filter ensemble members and the ensemble mean conserve mass and remain nonnegative through measurement updates. We show that the analysis steps of the ensemble transform Kalman filter (ETKF) algorithm and the ensemble Kalman filter (EnKF) algorithm can conserve the mass integral, but do not preserve positivity. Further, if localization is applied or if negative values are simply set to zero, then the total mass is not conserved either. In order to ensure mass conservation, a projection matrix that corrects for localization effects is constructed. In order to maintain both mass conservation and positivity preservation through the analysis step, we construct a data assimilation algorithm based on quadratic programming and ensemble Kalman filtering. Mass and positivity are both preserved by formulating the filter update as a set of quadratic programming problems that incorporate constraints. Some simple numerical experiments indicate that this approach can have a significant positive impact on the posterior ensemble distribution, giving results that are more physically plausible both for individual ensemble members and for the ensemble mean. The results show clear improvements in both analyses and forecasts, particularly in the presence of localized features. Behavior of the algorithm is also tested in the presence of model error.
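
    The statement that an unlocalized EnKF analysis conserves the mass integral, while setting negative values to zero does not, can be checked with a small numerical sketch. The code below is an assumption-laden illustration, not the authors' algorithm: it uses a textbook perturbed-observation EnKF update, a synthetic 20-cell state, three invented point observations, and no localization. It does not implement the quadratic-programming update described in the abstract.

```python
import numpy as np

def enkf_analysis(X, y, H, R, rng):
    """Perturbed-observation EnKF analysis step (no localization).

    X: (n, m) forecast ensemble, y: (p,) observations,
    H: (p, n) observation operator, R: (p, p) observation error covariance.
    """
    m = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)           # ensemble anomalies
    HA = H @ A
    S = HA @ HA.T / (m - 1) + R                      # innovation covariance
    K = (A @ HA.T / (m - 1)) @ np.linalg.inv(S)      # Kalman gain
    noise = rng.multivariate_normal(np.zeros(y.size), R, size=m).T
    return X + K @ (y[:, None] + noise - H @ X)

rng = np.random.default_rng(1)
n, m = 20, 10
X = rng.gamma(shape=0.5, scale=1.0, size=(n, m))     # nonnegative synthetic fields
X /= X.sum(axis=0)                                   # every member has total mass 1

H = np.zeros((3, n))
H[0, 2] = H[1, 9] = H[2, 15] = 1.0                   # three point observations
R = 0.01 * np.eye(3)
y = np.array([0.0, 0.2, 0.0])

Xa = enkf_analysis(X, y, H, R, rng)
mass_after = Xa.sum(axis=0)                          # total mass of each analysis member
mass_clipped = np.clip(Xa, 0.0, None).sum(axis=0)    # mass after zeroing negative values
```

    The increment is a linear combination of ensemble anomalies, and each anomaly sums to zero when all members carry the same total mass, so `mass_after` stays at 1 up to rounding. Clipping removes only negative values, so `mass_clipped` can only equal or exceed `mass_after`, which is why naive clipping breaks conservation whenever the analysis produces negative cells.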

  5. 'Lazy' quantum ensembles

    International Nuclear Information System (INIS)

    Parfionov, George; Zapatrin, Roman

    2006-01-01

    We compare different strategies aimed at preparing an ensemble with a given density matrix ρ. Preparing the ensemble of eigenstates of ρ with appropriate probabilities can be treated as a 'generous' strategy: it provides maximal accessible information about the state. The other extreme is the so-called 'Scrooge' ensemble, which is mostly stingy in sharing the information. We introduce 'lazy' ensembles which require minimal effort to prepare the density matrix by selecting pure states with respect to completely random choice. We consider two parties, Alice and Bob, playing a kind of game. Bob wishes to guess which pure state is prepared by Alice. His null hypothesis, based on the lack of any information about Alice's intention, is that Alice prepares any pure state with equal probability. Then, the average quantum state measured by Bob turns out to be ρ, and he has to make a new hypothesis about Alice's intention solely based on the information that the observed density matrix is ρ. The arising 'lazy' ensemble is shown to be the alternative hypothesis which minimizes type I error.

  6. Combining super-ensembles and statistical emulation to improve a regional climate and vegetation model

    Science.gov (United States)

    Hawkins, L. R.; Rupp, D. E.; Li, S.; Sarah, S.; McNeall, D. J.; Mote, P.; Betts, R. A.; Wallom, D.

    2017-12-01

    Changing regional patterns of surface temperature, precipitation, and humidity may cause ecosystem-scale changes in vegetation, altering the distribution of trees, shrubs, and grasses. A changing vegetation distribution, in turn, alters the albedo, latent heat flux, and carbon exchanged with the atmosphere with resulting feedbacks onto the regional climate. However, a wide range of earth-system processes that affect the carbon, energy, and hydrologic cycles occur at subgrid scales in climate models and must be parameterized. The appropriate parameter values in such parameterizations are often poorly constrained, leading to uncertainty in predictions of how the ecosystem will respond to changes in forcing. To better understand the sensitivity of regional climate to parameter selection and to improve regional climate and vegetation simulations, we used a large perturbed physics ensemble and a suite of statistical emulators. We dynamically downscaled a super-ensemble (multiple parameter sets and multiple initial conditions) of global climate simulations using a 25-km resolution regional climate model HadRM3p with the land-surface scheme MOSES2 and dynamic vegetation module TRIFFID. We simultaneously perturbed land surface parameters relating to the exchange of carbon, water, and energy between the land surface and atmosphere in a large super-ensemble of regional climate simulations over the western US. Statistical emulation was used as a computationally cost-effective tool to explore uncertainties in interactions. Regions of parameter space that did not satisfy observational constraints were eliminated and an ensemble of parameter sets that reduce regional biases and span a range of plausible interactions among earth system processes was selected. This study demonstrated that by combining super-ensemble simulations with statistical emulation, simulations of regional climate could be improved while simultaneously accounting for a range of plausible land

  7. Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.

    Science.gov (United States)

    Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

    2018-01-01

    In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.

  8. Deep Predictive Models in Interactive Music

    OpenAIRE

    Martin, Charles P.; Ellefsen, Kai Olav; Torresen, Jim

    2018-01-01

    Automatic music generation is a compelling task where much recent progress has been made with deep learning models. In this paper, we ask how these models can be integrated into interactive music systems; how can they encourage or enhance the music making of human users? Musical performance requires prediction to operate instruments, and perform in groups. We argue that predictive models could help interactive systems to understand their temporal context, and ensemble behaviour. Deep learning...

  9. One-Step Dynamic Classifier Ensemble Model for Customer Value Segmentation with Missing Values

    Directory of Open Access Journals (Sweden)

    Jin Xiao

    2014-01-01

    Scientific customer value segmentation (CVS) is the base of efficient customer relationship management, and customer credit scoring, fraud detection, and churn prediction all belong to CVS. In real CVS, the customer data usually include many missing values, which may greatly affect the performance of the CVS model. This study proposes a one-step dynamic classifier ensemble model for missing values (ODCEM model). On the one hand, ODCEM integrates the preprocessing of missing values and the classification modeling into one step; on the other hand, it utilizes multiple-classifier ensemble technology in constructing the classification models. The empirical results on the credit scoring dataset “German” from UCI and the real customer churn prediction dataset “China churn” show that the ODCEM outperforms four commonly used “two-step” models and the ensemble-based model LMF and can provide better decision support for market managers.

  10. Assessment of Surface Air Temperature over China Using Multi-criterion Model Ensemble Framework

    Science.gov (United States)

    Li, J.; Zhu, Q.; Su, L.; He, X.; Zhang, X.

    2017-12-01

    The General Circulation Models (GCMs) are designed to simulate the present climate and project future trends. It has been noticed that the performances of GCMs are not always in agreement with each other over different regions. Model ensemble techniques have been developed to post-process the GCMs' outputs and improve their prediction reliabilities. To evaluate the performances of GCMs, root-mean-square error, correlation coefficient, and uncertainty are commonly used statistical measures. However, satisfactory values of all these statistics cannot be guaranteed simultaneously when using many model ensemble techniques. Meanwhile, uncertainties and future scenarios are critical for Water-Energy management and operation. In this study, a new multi-model ensemble framework was proposed. It uses a state-of-the-art evolutionary multi-objective optimization algorithm, termed Multi-Objective Complex Evolution Global Optimization with Principal Component Analysis and Crowding Distance (MOSPD), to derive optimal GCM ensembles and demonstrate the trade-offs among various solutions. Such trade-off information was further analyzed with a robust Pareto front with respect to different statistical measures. A case study was conducted to optimize the surface air temperature (SAT) ensemble solutions over seven geographical regions of China for the historical period (1900-2005) and future projection (2006-2100). The results showed that the ensemble solutions derived with the MOSPD algorithm are superior to the simple model average and any single model output during the historical simulation period. For the future prediction, the proposed ensemble framework identified that the largest SAT change would occur in the South Central China under RCP 2.6 scenario, North Eastern China under RCP 4.5 scenario, and North Western China under RCP 8.5 scenario, while the smallest SAT change would occur in the Inner Mongolia under RCP 2.6 scenario, South Central China under RCP 4.5 scenario, and

  11. Improving the accuracy of flood forecasting with transpositions of ensemble NWP rainfall fields considering orographic effects

    Science.gov (United States)

    Yu, Wansik; Nakakita, Eiichi; Kim, Sunmin; Yamaguchi, Kosei

    2016-08-01

    The use of meteorological ensembles to produce sets of hydrological predictions has increased the capability to issue flood warnings. However, the spatial scale of the hydrological domain is still much finer than that of the meteorological model, and NWP models have challenges with displacement. The main objective of this study is to enhance the transposition method proposed in Yu et al. (2014) and to suggest a post-processing ensemble flood forecasting method for the real-time updating and accuracy improvement of flood forecasts. The method considers the separation of the orographic rainfall and the correction of misplaced rain distributions, using additional ensemble information obtained through the transposition of rain distributions. In the first step of the proposed method, ensemble forecast rainfalls from a numerical weather prediction (NWP) model are separated into orographic and non-orographic rainfall fields using atmospheric variables and the extraction of topographic effect. Then the non-orographic rainfall fields are examined by the transposition scheme to produce additional ensemble information, and new ensemble NWP rainfall fields are calculated by recombining the transposition results of non-orographic rain fields with the separated orographic rainfall fields to generate place-corrected ensemble information. Then, the additional ensemble information is applied to a hydrologic model for post-flood forecasting with a 6-h interval. The newly proposed method has a clear advantage in improving the accuracy of the mean value of ensemble flood forecasting. Our study is carried out and verified using the largest flood event, caused by typhoon 'Talas' of 2011, over two catchments, the Futatsuno (356.1 km²) and Nanairo (182.1 km²) dam catchments of the Shingu river basin (2360 km²), which is located in the Kii peninsula, Japan.

  12. Ensemble forecasts of road surface temperatures

    Czech Academy of Sciences Publication Activity Database

    Sokol, Zbyněk; Bližňák, Vojtěch; Sedlák, Pavel; Zacharov, Petr, jr.; Pešice, Petr; Škuthan, M.

    2017-01-01

    Vol. 187, 1 May (2017), pp. 33-41 ISSN 0169-8095 R&D Projects: GA ČR GA13-34856S; GA TA ČR(CZ) TA01031509 Institutional support: RVO:68378289 Keywords: ensemble prediction * road surface temperature * road weather forecast Subject RIV: DG - Atmosphere Sciences, Meteorology OECD field: Meteorology and atmospheric sciences Impact factor: 3.778, year: 2016 http://www.sciencedirect.com/science/article/pii/S0169809516307311

  13. Phthalocyanine-nanocarbon ensembles: from discrete molecular and supramolecular systems to hybrid nanomaterials.

    Science.gov (United States)

    Bottari, Giovanni; de la Torre, Gema; Torres, Tomas

    2015-04-21

    Phthalocyanines (Pcs) are macrocyclic and aromatic compounds that present unique electronic features such as high molar absorption coefficients, rich redox chemistry, and photoinduced energy/electron transfer abilities that can be modulated as a function of the electronic character of their counterparts in donor-acceptor (D-A) ensembles. In this context, carbon nanostructures such as fullerenes, carbon nanotubes (CNTs), and, more recently, graphene are among the most suitable Pc "companions". Pc-C60 ensembles have for a long time been the main actors in this field, due to the commercial availability of C60 and the well-established synthetic methods for its functionalization. As a result, many Pc-C60 architectures have been prepared, featuring different connectivities (covalent or supramolecular), intermolecular interactions (self-organized or molecularly dispersed species), and Pc HOMO/LUMO levels. All these elements provide a versatile toolbox for tuning the photophysical properties in terms of the type of process (photoinduced energy/electron transfer), the nature of the interactions between the electroactive units (through bond or space), and the kinetics of the formation/decay of the photogenerated species. Some recent trends in this field include the preparation of stimuli-responsive multicomponent systems with tunable photophysical properties and highly ordered nanoarchitectures and surface-supported systems showing high charge mobilities. A breakthrough in the Pc-nanocarbon field was the appearance of CNTs and graphene, which opened a new avenue for the preparation of intriguing photoresponsive hybrid ensembles showing light-stimulated charge separation. The scarce solubility of these 1-D and 2-D nanocarbons, together with their lower reactivity with respect to C60 stemming from their less strained sp² carbon networks, has not posed an insurmountable limitation for the preparation of a variety of Pc-based hybrids. These systems, which show improved

  14. Advance and prospectus of seasonal prediction: assessment of the APCC/CliPAS 14-model ensemble retrospective seasonal prediction (1980-2004)

    Science.gov (United States)

    Wang, Bin; Lee, June-Yi; Kang, In-Sik; Shukla, J.; Park, C.-K.; Kumar, A.; Schemm, J.; Cocke, S.; Kug, J.-S.; Luo, J.-J.; Zhou, T.; Wang, B.; Fu, X.; Yun, W.-T.; Alves, O.; Jin, E. K.; Kinter, J.; Kirtman, B.; Krishnamurti, T.; Lau, N. C.; Lau, W.; Liu, P.; Pegion, P.; Rosati, T.; Schubert, S.; Stern, W.; Suarez, M.; Yamagata, T.

    2009-07-01

    We assessed the current status of multi-model ensemble (MME) deterministic and probabilistic seasonal prediction based on 25-year (1980-2004) retrospective forecasts performed by 14 climate model systems (7 one-tier and 7 two-tier systems) that participate in the Climate Prediction and its Application to Society (CliPAS) project sponsored by the Asian-Pacific Economic Cooperation Climate Center (APCC). We also evaluated seven DEMETER models’ MME for the period of 1981-2001 for comparison. Based on the assessment, future direction for improvement of seasonal prediction is discussed. We found that two measures of probabilistic forecast skill, the Brier Skill Score (BSS) and Area under the Relative Operating Characteristic curve (AROC), display spatial patterns similar to those represented by the temporal correlation coefficient (TCC) score of the deterministic MME forecast. A TCC score of 0.6 corresponds approximately to a BSS of 0.1 and an AROC of 0.7, and beyond these critical threshold values they are almost linearly correlated. The MME method is demonstrated to be a valuable approach for reducing errors and quantifying forecast uncertainty due to model formulation. The MME prediction skill is substantially better than the averaged skill of all individual models. For instance, the TCC score of the CliPAS one-tier MME forecast of the Niño 3.4 index at a 6-month lead initiated from 1 May is 0.77, which is significantly higher than the corresponding averaged skill of seven individual coupled models (0.63). The MME made by using 14 coupled models from both DEMETER and CliPAS shows an even higher TCC score of 0.87. Effectiveness of MME depends on the averaged skill of individual models and their mutual independency. For probabilistic forecast the CliPAS MME gains considerable skill from increased forecast reliability as the number of models being used increases; the forecast resolution also increases for 2 m temperature but slightly decreases for precipitation. Equatorial Sea Surface
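
    For readers unfamiliar with the skill measures named in this abstract, the following sketch shows the standard textbook definitions of the Brier Skill Score (relative to a climatological reference forecast) and the temporal correlation coefficient. It assumes nothing about the CliPAS or DEMETER processing chain; the function names, the five-year event series, and the three-member toy ensemble are invented for illustration.

```python
import numpy as np

def brier_score(prob, occurred):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    return float(np.mean((prob - occurred) ** 2))

def brier_skill_score(prob, occurred):
    """BSS = 1 - BS / BS_ref, with the observed event frequency as reference forecast."""
    climatology = np.full(occurred.shape, occurred.mean())
    return 1.0 - brier_score(prob, occurred) / brier_score(climatology, occurred)

def tcc(forecast, observed):
    """Temporal correlation coefficient: Pearson correlation along the time axis."""
    return float(np.corrcoef(forecast, observed)[0, 1])

# Invented example: did the event (e.g. an above-threshold anomaly) occur each year?
occurred = np.array([1.0, 0.0, 0.0, 1.0, 0.0])

# Toy ensemble of shape (members, years); the forecast probability is the
# fraction of members exceeding the (hypothetical) threshold of 0.5.
members = np.array([[0.9, 0.1, 0.6, 0.8, 0.2],
                    [0.7, 0.3, 0.2, 0.9, 0.1],
                    [0.4, 0.2, 0.1, 0.6, 0.7]])
prob = (members > 0.5).mean(axis=0)

bss = brier_skill_score(prob, occurred)
corr = tcc(members.mean(axis=0), occurred)
```

    With these definitions a perfect probabilistic forecast scores a BSS of 1 and a forecast that always issues the climatological frequency scores 0, which is the scale on which the abstract's threshold of 0.1 should be read.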

  15. Advance and prospectus of seasonal prediction: assessment of the APCC/CliPAS 14-model ensemble retrospective seasonal prediction (1980-2004)

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Bin; Lee, June-Yi; Fu, X.; Liu, P. [University of Hawaii, Department of Meteorology and International Pacific Research Center, IPRC, School of Ocean and Earth Science and Technology, Honolulu, HI (United States)]; Kang, In-Sik; Kug, J.S. [Seoul National University, School of Earth and Environmental Sciences, Seoul (Korea)]; Shukla, J.; Jin, E.K.; Kinter, J.; Kirtman, B. [George Mason University and COLA, Climate Dynamics Program, Calverton, MD (United States)]; Park, C.K. [APEC Climate Center, Busan (Korea)]; Kumar, A.; Schemm, J. [Climate Prediction Center/NCEP, Camp Springs, MD (United States)]; Cocke, S.; Krishnamurti, T. [Florida State University, Tallahassee, FL (United States)]; Luo, J.J. [Frontier Research Center for Global Change, Yokohama (Japan)]; Zhou, T.; Wang, B. [Chinese Academy of Sciences, LASG/Institute of Atmospheric Physics, Beijing (China)]; Yun, W.T. [Korean Meteorological Administration, Seoul (Korea)]; Alves, O. [Bureau of Meteorology Research Center, Melbourne (Australia)]; Lau, N.C.; Rosati, T.; Stern, W. [Princeton University, Geophysical Fluid Dynamics Laboratory/NOAA, Princeton, NJ (United States)]; Lau, W.; Pegion, P.; Schubert, S.; Suarez, M. [Goddard Space Flight Center/NASA, Greenbelt, MD (United States)]

    2009-07-15

    We assessed the current status of multi-model ensemble (MME) deterministic and probabilistic seasonal prediction based on 25-year (1980-2004) retrospective forecasts performed by 14 climate model systems (7 one-tier and 7 two-tier systems) that participate in the Climate Prediction and its Application to Society (CliPAS) project sponsored by the Asian-Pacific Economic Cooperation Climate Center (APCC). We also evaluated seven DEMETER models' MME for the period of 1981-2001 for comparison. Based on the assessment, future direction for improvement of seasonal prediction is discussed. We found that two measures of probabilistic forecast skill, the Brier Skill Score (BSS) and Area under the Relative Operating Characteristic curve (AROC), display spatial patterns similar to those represented by the temporal correlation coefficient (TCC) score of the deterministic MME forecast. A TCC score of 0.6 corresponds approximately to a BSS of 0.1 and an AROC of 0.7, and beyond these critical threshold values they are almost linearly correlated. The MME method is demonstrated to be a valuable approach for reducing errors and quantifying forecast uncertainty due to model formulation. The MME prediction skill is substantially better than the averaged skill of all individual models. For instance, the TCC score of the CliPAS one-tier MME forecast of the Niño 3.4 index at a 6-month lead initiated from 1 May is 0.77, which is significantly higher than the corresponding averaged skill of seven individual coupled models (0.63). The MME made by using 14 coupled models from both DEMETER and CliPAS shows an even higher TCC score of 0.87. Effectiveness of MME depends on the averaged skill of individual models and their mutual independency. For probabilistic forecast the CliPAS MME gains considerable skill from increased forecast reliability as the number of models being used increases; the forecast resolution also increases for 2 m temperature but slightly decreases for precipitation. Equatorial Sea Surface

  16. Time-optimal path planning in uncertain flow fields using ensemble method

    KAUST Repository

    Wang, Tong

    2016-01-06

    An ensemble-based approach is developed to conduct time-optimal path planning in unsteady ocean currents under uncertainty. We focus our attention on two-dimensional steady and unsteady uncertain flows, and adopt a sampling methodology that is well suited to operational forecasts, where a set of deterministic predictions is used to model and quantify uncertainty in the predictions. In the operational setting, much about dynamics, topography and forcing of the ocean environment is uncertain, and as a result a single path produced by a model simulation has limited utility. To overcome this limitation, we rely on a finite-size ensemble of deterministic forecasts to quantify the impact of variability in the dynamics. The uncertainty of the flow field is parametrized using a finite number of independent canonical random variables with known densities, and the ensemble is generated by sampling these variables. For each of the resulting realizations of the uncertain current field, we predict the optimal path by solving a boundary value problem (BVP), based on the Pontryagin maximum principle. A family of backward-in-time trajectories starting at the end position is used to generate suitable initial values for the BVP solver. This allows us to examine and analyze the performance of the sampling strategy, and develop insight into extensions dealing with regional or general circulation models. In particular, the ensemble method enables us to perform a statistical analysis of travel times, and consequently develop a path planning approach that accounts for these statistics. The proposed methodology is tested for a number of scenarios. We first validate our algorithms by reproducing simple canonical solutions, and then demonstrate our approach in more complex flow fields, including idealized, steady and unsteady double-gyre flows.

  17. Embedded random matrix ensembles in quantum physics

    CERN Document Server

    Kota, V K B

    2014-01-01

    Although used with increasing frequency in many branches of physics, random matrix ensembles are not always sufficiently specific to account for important features of the physical system at hand. One refinement which retains the basic stochastic approach but allows for such features consists in the use of embedded ensembles.  The present text is an exhaustive introduction to and survey of this important field. Starting with an easy-to-read introduction to general random matrix theory, the text then develops the necessary concepts from the beginning, accompanying the reader to the frontiers of present-day research. With some notable exceptions, to date these ensembles have primarily been applied in nuclear spectroscopy. A characteristic example is the use of a random two-body interaction in the framework of the nuclear shell model. Yet, topics in atomic physics, mesoscopic physics, quantum information science and statistical mechanics of isolated finite quantum systems can also be addressed using these ensemb...

  18. A gain-loss framework based on ensemble flow forecasts to switch the urban drainage-wastewater system management towards energy optimization during dry periods

    Science.gov (United States)

    Courdent, Vianney; Grum, Morten; Munk-Nielsen, Thomas; Mikkelsen, Peter S.

    2017-05-01

    Precipitation is the cause of major perturbation to the flow in urban drainage and wastewater systems. Flow forecasts, generated by coupling rainfall predictions with a hydrologic runoff model, can potentially be used to optimize the operation of integrated urban drainage-wastewater systems (IUDWSs) during both wet and dry weather periods. Numerical weather prediction (NWP) models have significantly improved in recent years, having increased their spatial and temporal resolution. Finer resolution NWP are suitable for urban-catchment-scale applications, providing longer lead time than radar extrapolation. However, forecasts are inevitably uncertain, and fine resolution is especially challenging for NWP. This uncertainty is commonly addressed in meteorology with ensemble prediction systems (EPSs). Handling uncertainty is challenging for decision makers and hence tools are necessary to provide insight on ensemble forecast usage and to support the rationality of decisions (i.e. forecasts are uncertain and therefore errors will be made; decision makers need tools to justify their choices, demonstrating that these choices are beneficial in the long run). This study presents an economic framework to support the decision-making process by providing information on when acting on the forecast is beneficial and how to handle the EPS. The relative economic value (REV) approach associates economic values with the potential outcomes and determines the preferential use of the EPS forecast. The envelope curve of the REV diagram combines the results from each probability forecast to provide the highest relative economic value for a given gain-loss ratio. This approach is traditionally used at larger scales to assess mitigation measures for adverse events (i.e. the actions are taken when events are forecast). The specificity of this study is to optimize the energy consumption in IUDWS during low-flow periods by exploiting the electrical smart grid market (i.e. the actions are taken

  19. Imprinting and recalling cortical ensembles.

    Science.gov (United States)

    Carrillo-Reid, Luis; Yang, Weijian; Bando, Yuki; Peterka, Darcy S; Yuste, Rafael

    2016-08-12

    Neuronal ensembles are coactive groups of neurons that may represent building blocks of cortical circuits. These ensembles could be formed by Hebbian plasticity, whereby synapses between coactive neurons are strengthened. Here we report that repetitive activation with two-photon optogenetics of neuronal populations from ensembles in the visual cortex of awake mice builds neuronal ensembles that recur spontaneously after being imprinted and do not disrupt preexisting ones. Moreover, imprinted ensembles can be recalled by single-cell stimulation and remain coactive on consecutive days. Our results demonstrate the persistent reconfiguration of cortical circuits by two-photon optogenetics into neuronal ensembles that can perform pattern completion. Copyright © 2016, American Association for the Advancement of Science.

  20. Geometric integrator for simulations in the canonical ensemble

    Energy Technology Data Exchange (ETDEWEB)

    Tapias, Diego, E-mail: diego.tapias@nucleares.unam.mx [Departamento de Física, Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México 04510 (Mexico); Sanders, David P., E-mail: dpsanders@ciencias.unam.mx [Departamento de Física, Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México 04510 (Mexico); Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139 (United States); Bravetti, Alessandro, E-mail: alessandro.bravetti@iimas.unam.mx [Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México 04510 (Mexico)

    2016-08-28

    We introduce a geometric integrator for molecular dynamics simulations of physical systems in the canonical ensemble that preserves the invariant distribution in equations arising from the density dynamics algorithm, with any possible type of thermostat. Our integrator thus constitutes a unified framework that allows the study and comparison of different thermostats and of their influence on the equilibrium and non-equilibrium (thermo-)dynamic properties of a system. To show the validity and the generality of the integrator, we implement it with a second-order, time-reversible method and apply it to the simulation of a Lennard-Jones system with three different thermostats, obtaining good conservation of the geometrical properties and recovering the expected thermodynamic results. Moreover, to show the advantage of our geometric integrator over a non-geometric one, we compare the results with those obtained by using the non-geometric Gear integrator, which is frequently used to perform simulations in the canonical ensemble. The non-geometric integrator induces a drift in the invariant quantity, while our integrator has no such drift, thus ensuring that the system is effectively sampling the correct ensemble.

  1. Geometric integrator for simulations in the canonical ensemble

    International Nuclear Information System (INIS)

    Tapias, Diego; Sanders, David P.; Bravetti, Alessandro

    2016-01-01

    We introduce a geometric integrator for molecular dynamics simulations of physical systems in the canonical ensemble that preserves the invariant distribution in equations arising from the density dynamics algorithm, with any possible type of thermostat. Our integrator thus constitutes a unified framework that allows the study and comparison of different thermostats and of their influence on the equilibrium and non-equilibrium (thermo-)dynamic properties of a system. To show the validity and the generality of the integrator, we implement it with a second-order, time-reversible method and apply it to the simulation of a Lennard-Jones system with three different thermostats, obtaining good conservation of the geometrical properties and recovering the expected thermodynamic results. Moreover, to show the advantage of our geometric integrator over a non-geometric one, we compare the results with those obtained by using the non-geometric Gear integrator, which is frequently used to perform simulations in the canonical ensemble. The non-geometric integrator induces a drift in the invariant quantity, while our integrator has no such drift, thus ensuring that the system is effectively sampling the correct ensemble.

  2. Extended Range Prediction of Indian Summer Monsoon: Current status

    Science.gov (United States)

    Sahai, A. K.; Abhilash, S.; Borah, N.; Joseph, S.; Chattopadhyay, R.; S, S.; Rajeevan, M.; Mandal, R.; Dey, A.

    2014-12-01

    The main focus of this study is to develop forecast consensus in the extended range prediction (ERP) of monsoon intraseasonal oscillations using a suite of different variants of the Climate Forecast System (CFS) model. In this CFS-based Grand MME prediction system (CGMME), the ensemble members are generated by perturbing the initial condition and using different configurations of CFSv2. This is to address the role of different physical mechanisms known to have control on the error growth in the ERP in the 15-20 day time scale. The final formulation of CGMME is based on 21 ensembles of the standalone Global Forecast System (GFS) forced with bias-corrected forecasted SST from CFS, 11 low resolution CFST126 and 11 high resolution CFST382. Thus, we develop the multi-model consensus forecast for the ERP of Indian summer monsoon (ISM) using a suite of different variants of the CFS model. This coordinated international effort led to the development of specific tailor-made regional forecast products over the Indian region. The skill of deterministic and probabilistic categorical rainfall forecasts, as well as the verification of large-scale low-frequency monsoon intraseasonal oscillations, has been assessed using hindcasts from 2001-2012 during the monsoon season, in which all models are initialized every five days from 16 May to 28 September. The skill of the deterministic forecast from CGMME is better than that of the best participating single model ensemble configuration (SME). The CGMME approach is believed to quantify the uncertainty in both initial conditions and model formulation. The main improvement is attained in the probabilistic forecast, because of an increase in the ensemble spread, which reduces the error due to over-confident ensembles in a single model configuration. For the probabilistic forecast, three tercile ranges are determined by a ranking method based on the percentage of ensemble members from all the participating models that fall in those three categories. CGMME further

  3. World Music Ensemble: Kulintang

    Science.gov (United States)

    Beegle, Amy C.

    2012-01-01

    As instrumental world music ensembles such as steel pan, mariachi, gamelan and West African drums are becoming more the norm than the exception in North American school music programs, there are other world music ensembles just starting to gain popularity in particular parts of the United States. The kulintang ensemble, a drum and gong ensemble…

  4. Determining the bounds of skilful forecast range for probabilistic prediction of system-wide wind power generation

    Directory of Open Access Journals (Sweden)

    Dirk Cannon

    2017-06-01

    State-of-the-art wind power forecasts beyond a few hours ahead rely on global numerical weather prediction models to forecast the future large-scale atmospheric state. Often they provide initial and boundary conditions for nested high resolution simulations. In this paper, both upper and lower bounds on forecast range are identified within which global ensemble forecasts provide skilful information for system-wide wind power applications. An upper bound on forecast range is associated with the limit of predictability, beyond which forecasts have no more skill than predictions based on climatological statistics. A lower bound is defined at the lead time beyond which the resolved uncertainty associated with estimating the future large-scale atmospheric state is larger than the unresolved uncertainty associated with estimating the system-wide wind power response to a given large-scale state. The bounds of skilful ensemble forecast range are quantified for three leading global forecast systems. The power system of Great Britain (GB) is used as an example because independent verifying data is available from National Grid. The upper bound defined by forecasts of GB-total wind power generation at a specific point in time is found to be 6–8 days. The lower bound is found to be 1.4–2.4 days. Both bounds depend on the global forecast system and vary seasonally. In addition, forecasts of the probability of an extreme power ramp event were found to possess a shorter limit of predictability (4.5–5.5 days). The upper bound on this forecast range can only be extended by improving the global forecast system (outside the control of most users) or by changing the metric used in the probability forecast. Improved downscaling and microscale modelling of the wind farm response may act to decrease the lower bound. The potential gain from such improvements has diminishing returns beyond the short range (out to around 2 days).
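
    The notion of "no more skill than predictions based on climatological statistics" in the abstract above can be illustrated with a skill score. The following is a minimal sketch, not the paper's method: it assumes the Brier skill score as the metric and a sample base rate as the climatological reference, and all function names are hypothetical.

```python
from typing import Sequence


def exceedance_probability(members: Sequence[float], threshold: float) -> float:
    """Forecast probability as the fraction of ensemble members at or above a threshold."""
    return sum(1 for m in members if m >= threshold) / len(members)


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """1 - BS / BS_ref, where the reference always forecasts the sample base rate.

    Assumption: the outcomes contain both events and non-events; otherwise
    the reference score is zero and the ratio is undefined.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / reference


if __name__ == "__main__":
    outcomes = [1, 0, 0, 1]
    print(brier_skill_score([0.9, 0.1, 0.2, 0.8], outcomes))  # positive: skilful
    print(brier_skill_score([0.5, 0.5, 0.5, 0.5], outcomes))  # zero: no better than climatology
```

    Under this convention, a lead time at which the score drops to zero or below would mark the limit of predictability for the chosen metric.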

  5. Impact of ensemble learning in the assessment of skeletal maturity.

    Science.gov (United States)

    Cunha, Pedro; Moura, Daniel C; Guevara López, Miguel Angel; Guerra, Conceição; Pinto, Daniela; Ramos, Isabel

    2014-09-01

    The assessment of bone age, or skeletal maturity, is an important task in pediatrics that measures the degree of maturation of children's bones. Nowadays, there is no standard clinical procedure for assessing bone age and the most widely used approaches are the Greulich and Pyle and the Tanner and Whitehouse methods. Computer methods have been proposed to automate the process; however, there is a lack of exploration about how to combine the features of the different parts of the hand, and how to take advantage of ensemble techniques for this purpose. This paper presents a study where the use of ensemble techniques for improving bone age assessment is evaluated. A new computer method was developed that extracts descriptors for each joint of each finger, which are then combined using different ensemble schemes for obtaining a final bone age value. Three popular ensemble schemes are explored in this study: bagging, stacking and voting. Best results were achieved by bagging with a rule-based regression (M5P), scoring a mean absolute error of 10.16 months. Results show that ensemble techniques improve the prediction performance of most of the evaluated regression algorithms, always achieving best or comparable to best results. Therefore, the success of the ensemble methods allows us to conclude that their use may improve computer-based bone age assessment, offering a scalable option for utilizing multiple regions of interest and combining their output.
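
    Bagging, one of the three ensemble schemes named in the abstract above, can be sketched in a few lines. This is an illustration only: the base learner here is a one-variable least-squares line rather than the M5P rule-based regressor used in the study, and the function names and defaults are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Predictor = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Least-squares line; falls back to the mean of y if all x values coincide."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def bagging(xs: Sequence[float], ys: Sequence[float],
            n_models: int = 25, seed: int = 0) -> Predictor:
    """Fit one base learner per bootstrap resample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models: List[Predictor] = []
    for _ in range(n_models):
        # Bootstrap: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    model = bagging(xs, ys)
    print(model(5.0))
```

    Voting and stacking differ mainly in the final step: voting combines differently trained learners directly, while stacking trains a second-level model on their outputs.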

  6. Correlation of chemical shifts predicted by molecular dynamics simulations for partially disordered proteins

    Energy Technology Data Exchange (ETDEWEB)

    Karp, Jerome M.; Erylimaz, Ertan; Cowburn, David, E-mail: cowburn@cowburnlab.org, E-mail: David.cowburn@einstein.yu.edu [Albert Einstein College of Medicine of Yeshiva University, Department of Biochemistry (United States)

    2015-01-15

    There has been a longstanding interest in being able to accurately predict NMR chemical shifts from structural data. Recent studies have focused on using molecular dynamics (MD) simulation data as input for improved prediction. Here we examine the accuracy of chemical shift prediction for intein systems, which have regions of intrinsic disorder. We find that using MD simulation data as input for chemical shift prediction does not consistently improve prediction accuracy over use of a static X-ray crystal structure. This appears to result from the complex conformational ensemble of the disordered protein segments. We show that using accelerated molecular dynamics (aMD) simulations improves chemical shift prediction, suggesting that methods which better sample the conformational ensemble like aMD are more appropriate tools for use in chemical shift prediction for proteins with disordered regions. Moreover, our study suggests that data accurately reflecting protein dynamics must be used as input for chemical shift prediction in order to correctly predict chemical shifts in systems with disorder.

  7. The Earth System Prediction Suite: Toward a Coordinated U.S. Modeling Capability

    Science.gov (United States)

    Theurich, Gerhard; DeLuca, C.; Campbell, T.; Liu, F.; Saint, K.; Vertenstein, M.; Chen, J.; Oehmke, R.; Doyle, J.; Whitcomb, T.

    2016-01-01

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open source terms or to credentialed users. The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the U.S. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multi-agency development of coupled modeling systems, controlled experimentation and testing, and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NavGEM), HYbrid Coordinate Ocean Model (HYCOM), and Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and GEOS-5 atmospheric general circulation model.

  8. Analogies between random matrix ensembles and the one-component plasma in two-dimensions

    Directory of Open Access Journals (Sweden)

    Peter J. Forrester

    2016-03-01

    The eigenvalue PDF for some well known classes of non-Hermitian random matrices, the complex Ginibre ensemble for example, can be interpreted as the Boltzmann factor for one-component plasma systems in two-dimensional domains. We address this theme in a systematic fashion, identifying the plasma system for the Ginibre ensemble of non-Hermitian Gaussian random matrices G, the spherical ensemble of the product of an inverse Ginibre matrix and a Ginibre matrix G_1^{-1} G_2, and the ensemble formed by truncating unitary matrices, as well as for products of such matrices. We do this when each has either real, complex or real quaternion elements. One consequence of this analogy is that the leading form of the eigenvalue density follows as a corollary. Another is that the eigenvalue correlations must obey sum rules known to characterise the plasma system, and this leads us to exhibit an integral identity satisfied by the two-particle correlation for real quaternion matrices in the neighbourhood of the real axis. Further random matrix ensembles investigated from this viewpoint are self dual non-Hermitian matrices, which a previous study has related to the one-component plasma system in a disk at inverse temperature β=4, and the ensemble formed by the single row and column of quaternion elements from a member of the circular symplectic ensemble.

  9. Lattice gauge theory in the microcanonical ensemble

    International Nuclear Information System (INIS)

    Callaway, D.J.E.; Rahman, A.

    1983-01-01

    The microcanonical-ensemble formulation of lattice gauge theory proposed recently is examined in detail. Expectation values in this new ensemble are determined by solving a large set of coupled ordinary differential equations, after the fashion of a molecular dynamics simulation. Following a brief review of the microcanonical ensemble, calculations are performed for the gauge groups U(1), SU(2), and SU(3). The results are compared and contrasted with standard methods of computation. Several advantages of the new formalism are noted. For example, no random numbers are required to update the system. Also, this update is performed in a simultaneous fashion. Thus the microcanonical method presumably adapts well to parallel processing techniques, especially when the action is highly nonlocal (such as when fermions are included).

  10. Transferability of hydrological models and ensemble averaging methods between contrasting climatic periods

    Science.gov (United States)

    Broderick, Ciaran; Matthews, Tom; Wilby, Robert L.; Bastola, Satish; Murphy, Conor

    2016-10-01

    Understanding hydrological model predictive capabilities under contrasting climate conditions enables more robust decision making. Using Differential Split Sample Testing (DSST), we analyze the performance of six hydrological models for 37 Irish catchments under climate conditions unlike those used for model training. Additionally, we consider four ensemble averaging techniques when examining interperiod transferability. DSST is conducted using 2/3 year noncontinuous blocks of (i) the wettest/driest years on record based on precipitation totals and (ii) years with a more/less pronounced seasonal precipitation regime. Model transferability between contrasting regimes was found to vary depending on the testing scenario, catchment, and evaluation criteria considered. As expected, the ensemble average outperformed most individual ensemble members. However, averaging techniques differed considerably in the number of times they surpassed the best individual model member. Bayesian Model Averaging (BMA) and the Granger-Ramanathan Averaging (GRA) method were found to outperform the simple arithmetic mean (SAM) and Akaike Information Criteria Averaging (AICA). Here GRA performed better than the best individual model in 51%-86% of cases (according to the Nash-Sutcliffe criterion). When assessing model predictive skill under climate change conditions we recommend (i) setting up DSST to select the best available analogues of expected annual mean and seasonal climate conditions; (ii) applying multiple performance criteria; (iii) testing transferability using a diverse set of catchments; and (iv) using a multimodel ensemble in conjunction with an appropriate averaging technique. Given the computational efficiency and performance of GRA relative to BMA, the former is recommended as the preferred ensemble averaging technique for climate assessment.
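
    The difference between the simple arithmetic mean (SAM) and Granger-Ramanathan averaging (GRA) discussed in the abstract above can be sketched as follows. This is a hedged illustration, assuming the unconstrained least-squares form of GRA with no intercept; the study may use a different variant, and the helper names and toy data are hypothetical.

```python
import math
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gra_weights(sims: Sequence[Sequence[float]], obs: Sequence[float]) -> List[float]:
    """Least-squares weights w minimising sum_t (obs[t] - sum_k w[k] * sims[k][t])^2."""
    k = len(sims)
    xtx = [[sum(p * q for p, q in zip(sims[i], sims[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * o for p, o in zip(sims[i], obs)) for i in range(k)]
    return solve_linear(xtx, xty)


def combine(sims: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted sum of the model simulations at every time step."""
    return [sum(w * s[t] for w, s in zip(weights, sims)) for t in range(len(sims[0]))]


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


if __name__ == "__main__":
    sims = [[1.0, 2.0, 3.0, 4.0], [4.0, 0.0, 2.0, 8.0]]  # two toy model runs
    obs = [1.5, 1.0, 2.0, 4.0]                            # toy observations
    w = gra_weights(sims, obs)
    print("GRA weights:", w)
    print("GRA RMSE:", rmse(combine(sims, w), obs))
    print("SAM RMSE:", rmse(combine(sims, [0.5, 0.5]), obs))
```

    In a split-sample setting such as DSST, the weights would be fitted on one period and applied unchanged to the contrasting period.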

  11. High resolution ensemble forecasting for the Gulf of Mexico eddies and fronts

    Science.gov (United States)

    Counillon, F.; Bertino, L.

    2007-05-01

    As oil production moves further into deeper waters, the costs related to strong current hazards are increasing accordingly, and accurate three-dimensional forecasts of currents are urgently needed. To be useful, models have to locate eddies and fronts to an accuracy of 30 km at a nowcast stage, which is almost impossible to accomplish with the use of satellite data of the same accuracy. The use of stochastic forecasts allows us to give confidence estimates for our predictions. We are using a nested configuration of the Hybrid Coordinate Ocean Model (HYCOM), where the TOPAZ system, which covers the Atlantic and the Arctic, gives lateral boundary conditions to a high-resolution (5 km) model of the Gulf of Mexico (GOM). TOPAZ is a real-time forecasting coupled ocean-ice model, which assimilates sea level anomaly (SLA), sea surface temperature, and sea ice concentration, with the ensemble Kalman filter. The high-resolution model assimilates SLA using the ensemble optimal interpolation, which updates accordingly the currents, salinity, temperature, and layer interface at all depths. Here, we evaluate the ensemble forecast capabilities of our high-resolution model for Eddy Extreme, which was observed by altimeters around the 15th of July. We run 6 successive ensemble runs composed of 10 members of equal likelihood. Members differ by perturbations of the initial state, of the lateral boundary conditions, and of the atmospheric boundary conditions. We started the experiment 1 month prior to the shedding event, because it was the time necessary for perturbations of the boundary conditions to spread uniformly and reach a significant level across the GOM. The ensemble reproduces well the dynamics of the eddy shedding and produces a significant spread at the boundary of the eddy, but underestimates the RMS error of the SLA. Prior to the shedding time, the error growth increases, induced by the highly non-linear growth of cyclonic eddies at the boundary of the Loop Current. Additionally

  12. Multi-model ensembles for assessment of flood losses and associated uncertainty

    Science.gov (United States)

    Figueiredo, Rui; Schröter, Kai; Weiss-Motz, Alexander; Martina, Mario L. V.; Kreibich, Heidi

    2018-05-01

    Flood loss modelling is a crucial part of risk assessments. However, it is subject to large uncertainty that is often neglected. Most models available in the literature are deterministic, providing only single point estimates of flood loss, and large disparities tend to exist among them. Adopting any one such model in a risk assessment context is likely to lead to inaccurate loss estimates and sub-optimal decision-making. In this paper, we propose the use of multi-model ensembles to address these issues. This approach, which has been applied successfully in other scientific fields, is based on the combination of different model outputs with the aim of improving the skill and usefulness of predictions. We first propose a model rating framework to support ensemble construction, based on a probability tree of model properties, which establishes relative degrees of belief between candidate models. Using 20 flood loss models in two test cases, we then construct numerous multi-model ensembles, based both on the rating framework and on a stochastic method, differing in terms of participating members, ensemble size and model weights. We evaluate the performance of ensemble means, as well as their probabilistic skill and reliability. Our results demonstrate that well-designed multi-model ensembles represent a pragmatic approach to consistently obtain more accurate flood loss estimates and reliable probability distributions of model uncertainty.

  13. Combining structural modeling with ensemble machine learning to accurately predict protein fold stability and binding affinity effects upon mutation.

    Directory of Open Access Journals (Sweden)

    Niklas Berliner

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases.

  14. Multidimensional generalized-ensemble algorithms for complex systems.

    Science.gov (United States)

    Mitsutake, Ayori; Okamoto, Yuko

    2009-06-07

    We give general formulations of the multidimensional multicanonical algorithm, simulated tempering, and replica-exchange method. We generalize the original potential energy function E(0) by adding any physical quantity V of interest as a new energy term. These multidimensional generalized-ensemble algorithms then perform a random walk not only in E(0) space but also in V space. Among the three algorithms, the replica-exchange method is the easiest to perform because the weight factor is just a product of regular Boltzmann-like factors, while the weight factors for the multicanonical algorithm and simulated tempering are not a priori known. We give a simple procedure for obtaining the weight factors for these two latter algorithms, which uses a short replica-exchange simulation and the multiple-histogram reweighting techniques. As an example of applications of these algorithms, we have performed a two-dimensional replica-exchange simulation and a two-dimensional simulated-tempering simulation using an alpha-helical peptide system. From these simulations, we study the helix-coil transitions of the peptide in gas phase and in aqueous solution.
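
    The statement above that the replica-exchange weight factor is "just a product of regular Boltzmann-like factors" leads directly to a Metropolis criterion for swapping two replicas. The sketch below assumes the generalized energy takes the form E_lambda(x) = E0(x) + lambda * V(x) with a single extra term; the function and parameter names are hypothetical and not taken from the paper.

```python
import math
import random


def swap_acceptance(beta_m: float, lam_m: float, beta_n: float, lam_n: float,
                    e0_i: float, v_i: float, e0_j: float, v_j: float) -> float:
    """Metropolis probability of exchanging configuration i (held at parameters m)
    with configuration j (held at parameters n).

    The joint weight before the swap is exp(-beta_m * E_m(x_i) - beta_n * E_n(x_j));
    the ratio of after/before weights is exp(-delta).
    """
    def energy(lam: float, e0: float, v: float) -> float:
        # Assumed generalized energy: original term plus lambda times V.
        return e0 + lam * v

    delta = (beta_m * (energy(lam_m, e0_j, v_j) - energy(lam_m, e0_i, v_i))
             - beta_n * (energy(lam_n, e0_j, v_j) - energy(lam_n, e0_i, v_i)))
    # min(1, exp(-delta)), written to avoid overflow for large negative delta.
    return math.exp(min(0.0, -delta))


def attempt_swap(rng: random.Random, *args: float) -> bool:
    """Draw a uniform random number and accept the swap with the Metropolis probability."""
    return rng.random() < swap_acceptance(*args)


if __name__ == "__main__":
    # Plain temperature exchange (lambda = 0): the cold replica holds the
    # low-energy configuration, so the swap is accepted only occasionally.
    print(swap_acceptance(1.0, 0.0, 0.5, 0.0, 2.0, 0.0, 5.0, 0.0))
```

    With both lambda values set to zero this reduces to the familiar one-dimensional temperature replica-exchange criterion.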

  15. Level-statistics in Disordered Systems: A single parametric scaling and Connection to Brownian Ensembles

    OpenAIRE

    Shukla, Pragya

    2004-01-01

    We find that the statistics of levels undergoing metal-insulator transition in systems with multi-parametric Gaussian disorders and non-interacting electrons behaves in a way similar to that of the single parametric Brownian ensembles \cite{dy}. The latter appear during a Poisson $\to$ Wigner-Dyson transition, driven by a random perturbation. The analogy provides the analytical evidence for the single parameter scaling of the level-correlations in disordered systems as well as a tool to obtai...

  16. Predictability of weather and climate

    National Research Council Canada - National Science Library

    Palmer, Tim; Hagedorn, Renate

    2006-01-01

    ... and anthropogenic climate change are among those included. Ensemble systems for forecasting predictability are discussed extensively. Ed Lorenz, father of chaos theory, makes a contribution to theoretical analysis with a previously unpublished paper. This well-balanced volume will be a valuable resource for many years. High-quality chapter autho...

  17. ENSEMBLE methods to reconcile disparate national long range dispersion forecasts

    DEFF Research Database (Denmark)

    Mikkelsen, Torben; Galmarini, S.; Bianconi, R.

    2003-01-01

    ENSEMBLE is a web-based decision support system for real-time exchange and evaluation of national long-range dispersion forecasts of nuclear releases with cross-boundary consequences. The system is developed with the purpose to reconcile among disparate national forecasts for long-range dispersion. ENSEMBLE addresses the problem of achieving a common coherent strategy across European national emergency management when national long-range dispersion forecasts differ from one another during an accidental atmospheric release of radioactive material. A series of new decision-making “ENSEMBLE” procedures... emergency and meteorological forecasting centres, which may choose to integrate them directly into operational emergency information systems, or possibly use them as a basis for future system development.

  18. A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines

    Energy Technology Data Exchange (ETDEWEB)

    Meza, Juan C.; Woods, Mark

    2009-12-18

    Machine or statistical learning is a growing field that encompasses many scientific problems including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data', with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three Type Ia supernova data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.

  19. Improvement of Surface Temperature Prediction Using SVR with MOGREPS Data for Short and Medium range over South Korea

    Science.gov (United States)

    Lim, S. J.; Choi, R. K.; Ahn, K. D.; Ha, J. C.; Cho, C. H.

    2014-12-01

    As the Korea Meteorological Administration (KMA) has operated the Met Office Global and Regional Ensemble Prediction System (MOGREPS) since the introduction of the Unified Model (UM), many attempts have been made in recent years to improve the predictability of temperature forecasts. In this study, a post-processing method for MOGREPS surface temperature prediction is developed with machine learning over 52 locations in South Korea. The preceding 60 days were used as the training period for a Support Vector Regression (SVR) surface temperature forecast model. The selected inputs for SVR are the following: date and surface temperatures from numerical weather prediction (NWP), such as GDAPS, the 24 individual ensemble members, and the mean and median of the ensemble members, every 3 hours for 12 days. To verify the reliability of the SVR-based ensemble prediction (SVR-EP), 93 days are used (from March 1 to May 31, 2014). SVR-EP improved RMSE by 16% over the entire prediction period relative to conventional ensemble prediction (EP). In particular, at short range SVR-EP gave an 18.7% better RMSE for 1- to 3-day forecasts. The mean temperature bias over all test locations was around 0.36°C for SVR-EP and 1.36°C for EP. SVR-EP is currently being extended with more rigorous sensitivity tests, such as lengthening the training period and optimizing the machine learning model.

  20. Ensemble Nonlinear Autoregressive Exogenous Artificial Neural Networks for Short-Term Wind Speed and Power Forecasting.

    Science.gov (United States)

    Men, Zhongxian; Yee, Eugene; Lien, Fue-Sang; Yang, Zhiling; Liu, Yongqian

    2014-01-01

    Short-term wind speed and wind power forecasts (for a 72 h period) are obtained using a nonlinear autoregressive exogenous artificial neural network (ANN) methodology which incorporates either numerical weather prediction or high-resolution computational fluid dynamics wind field information as an exogenous input. An ensemble approach is used to combine the predictions from many candidate ANNs in order to provide improved forecasts for wind speed and power, along with the associated uncertainties in these forecasts. More specifically, the ensemble ANN is used to quantify the uncertainties arising from the network weight initialization and from the unknown structure of the ANN. All members forming the ensemble of neural networks were trained using an efficient particle swarm optimization algorithm. The results of the proposed methodology are validated using wind speed and wind power data obtained from an operational wind farm located in Northern China. The assessment demonstrates that this methodology for wind speed and power forecasting generally provides an improvement in predictive skills when compared to the practice of using an "optimal" weight vector from a single ANN while providing additional information in the form of prediction uncertainty bounds.
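
    The idea of combining many candidate networks into one forecast with uncertainty bounds can be illustrated with a minimal sketch. It assumes the bounds are taken as the ensemble mean plus or minus k sample standard deviations; the abstract does not say how the bounds are computed, and the function name and the value of k are hypothetical.

```python
from statistics import mean, stdev


def ensemble_forecast(member_predictions, k=2.0):
    """Combine per-member forecasts into (mean, lower, upper) per time step.

    member_predictions: one list of forecasts per ensemble member, all of
    the same length. The bounds mean +/- k * stdev are an assumption made
    for illustration, not the method of the cited paper.
    """
    combined = []
    for values in zip(*member_predictions):  # iterate over time steps
        m = mean(values)
        s = stdev(values)  # sample standard deviation across members
        combined.append((m, m - k * s, m + k * s))
    return combined


# Three hypothetical members forecasting wind speed at two time steps.
forecast = ensemble_forecast([[5.0, 6.0], [7.0, 6.0], [6.0, 6.0]])
```

    When all members agree, as in the second time step above, the band collapses onto the mean.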

  1. Combining Combined Sampling and Ensemble of Support Vector Machines (EnSVM) to Handle Churn Prediction for a Telecommunications Company

    Directory of Open Access Journals (Sweden)

    Fernandy Marbun

    2010-07-01

    Churn prediction is a way of predicting which customers are likely to churn. Data mining, and classification in particular, appears to offer an alternative solution for building an accurate churn prediction model. However, classification results become inaccurate because churn data are imbalanced. The classes become unstable because the data lean toward the portion with the larger share of samples. One way to handle this problem is to modify the dataset used, an approach better known as resampling. Resampling techniques include over-sampling, under-sampling, and combined sampling. The Ensemble of SVM (EnSVM) method is expected to minimize the misclassification of the majority and minority classes produced by a single SVM classifier. This study attempts to combine combined sampling and EnSVM for churn prediction. Testing was carried out by comparing the classification results of CombinedSampling-EnSVM with those of SMOTE-SVM (a combination of over-sampling and SVM) and pure SVM. The test results show that the CombinedSampling-EnSVM method is, in general, only able to produce better Gini Index performance than the SMOTE-SVM method and the method without resampling (pure SVM).

  2. Polarimetric SAR Image Classification Using Multiple-feature Fusion and Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Sun Xun

    2016-12-01

    In this paper, we propose a supervised classification algorithm for Polarimetric Synthetic Aperture Radar (PolSAR) images using multiple-feature fusion and ensemble learning. First, we extract different polarimetric features, including extended polarimetric feature space, Hoekman, Huynen, H/alpha/A, and four-component scattering features of PolSAR images. Next, we randomly select two types of features each time from all feature sets to guarantee the reliability and diversity of later ensembles and use a support vector machine as the basic classifier for predicting classification results. Finally, we concatenate all prediction probabilities of basic classifiers as the final feature representation and employ the random forest method to obtain final classification results. Experimental results at the pixel and region levels show the effectiveness of the proposed algorithm.

  3. Ensemble Methods in Data Mining Improving Accuracy Through Combining Predictions

    CERN Document Server

    Seni, Giovanni

    2010-01-01

    This book is aimed at novice and advanced analytic researchers and practitioners -- especially in Engineering, Statistics, and Computer Science. Those with little exposure to ensembles will learn why and how to employ this breakthrough method, and advanced practitioners will gain insight into building even more powerful models. Throughout, snippets of code in R are provided to illustrate the algorithms described and to encourage the reader to try the techniques. The authors are industry experts in data mining and machine learning who are also adjunct professors and popular speakers. Although e

  4. Localization of atomic ensembles via superfluorescence

    International Nuclear Information System (INIS)

    Macovei, Mihai; Evers, Joerg; Keitel, Christoph H.; Zubairy, M. Suhail

    2007-01-01

    The subwavelength localization of an ensemble of atoms concentrated to a small volume in space is investigated. The localization relies on the interaction of the ensemble with a standing wave laser field. The light scattered in the interaction of the standing wave field and the atom ensemble depends on the position of the ensemble relative to the standing wave nodes. This relation can be described by a fluorescence intensity profile, which depends on the standing wave field parameters and the ensemble properties and which is modified due to collective effects in the ensemble of nearby particles. We demonstrate that the intensity profile can be tailored to suit different localization setups. Finally, we apply these results to two localization schemes. First, we show how to localize an ensemble fixed at a certain position in the standing wave field. Second, we discuss localization of an ensemble passing through the standing wave field

  5. Ocean-Atmosphere Coupling Processes Affecting Predictability in the Climate System

    Science.gov (United States)

    Miller, A. J.; Subramanian, A. C.; Seo, H.; Eliashiv, J. D.

    2017-12-01

    Predictions of the ocean and atmosphere are often sensitive to coupling at the air-sea interface in ways that depend on the temporal and spatial scales of the target fields. We will discuss several aspects of these types of coupled interactions including oceanic and atmospheric forecast applications. For oceanic mesoscale eddies, the coupling can influence the energetics of the oceanic flow itself. For Madden-Julian Oscillation onset, the coupling timestep should resolve the diurnal cycle to properly raise time-mean SST and latent heat flux prior to deep convection. For Atmospheric River events, the evolving SST field can alter the trajectory and intensity of precipitation anomalies along the California coast. Improvements in predictions will also rely on identifying and alleviating sources of biases in the climate states of the coupled system. Surprisingly, forecast skill can also be improved by enhancing stochastic variability in the atmospheric component of coupled models as found in a multiscale ensemble modeling approach.

  6. Ensembl 2017

    OpenAIRE

    Aken, Bronwen L.; Achuthan, Premanand; Akanni, Wasiu; Amode, M. Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E.

    2016-01-01

    Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access ...

  7. Ensemble Sampling

    OpenAIRE

    Lu, Xiuyuan; Van Roy, Benjamin

    2017-01-01

    Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands on the range of applica...

  8. Regionalization of post-processed ensemble runoff forecasts

    Directory of Open Access Journals (Sweden)

    J. O. Skøien

    2016-05-01

    For many years, meteorological models have been run with perturbed initial conditions or parameters to produce ensemble forecasts that are used as a proxy of the uncertainty of the forecasts. However, the ensembles are usually both biased (the mean is systematically too high or too low compared with the observed weather) and affected by dispersion errors (the ensemble variance indicates a too low or too high confidence in the forecast compared with the observed weather). The ensembles are therefore commonly post-processed to correct for these shortcomings. Here we look at one of these techniques, referred to as Ensemble Model Output Statistics (EMOS) (Gneiting et al., 2005). Originally, the post-processing parameters were identified as a fixed set of parameters for a region. The application of our work is the European Flood Awareness System (http://www.efas.eu), where a distributed model is run with meteorological ensembles as input. We are therefore dealing with a considerably larger data set than previous analyses. We also want to regionalize the parameters themselves for other locations than the calibration gauges. The post-processing parameters are therefore estimated for each calibration station, but with a spatial penalty for deviations from neighbouring stations, depending on the expected semivariance between the calibration catchment and these stations. The estimated post-processing parameters can then be used for regionalization of the post-processing parameters also for uncalibrated locations using top-kriging in the rtop package (Skøien et al., 2006, 2014). We will show results from cross-validation of the methodology and, although our interest is mainly in identifying exceedance probabilities for certain return levels, we will also show how the rtop package can be used for creating a set of post-processed ensembles through simulations.
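
    As a rough illustration of the EMOS idea, the sketch below assumes the Gaussian form in which the predictive mean is a + b times the ensemble mean and the predictive variance is c + d times the ensemble variance. The parameter values, function names, and sample ensemble are hypothetical; in practice the parameters would be estimated from past forecasts and observations, for example by minimizing the mean CRPS, and the spatial penalty described in the abstract is not shown.

```python
import math
from statistics import mean, pvariance


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gaussian_crps(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0)
                    + 2.0 * normal_pdf(z) - 1.0 / math.sqrt(math.pi))


def emos_predict(members, a, b, c, d):
    """Return (mu, sigma) of the post-processed Gaussian forecast.

    Assumes c + d * variance > 0 for the given parameters.
    """
    mu = a + b * mean(members)
    sigma = math.sqrt(c + d * pvariance(members))
    return mu, sigma


def exceedance_probability(mu, sigma, threshold):
    """Probability that the predicted quantity exceeds the threshold."""
    return 1.0 - normal_cdf((threshold - mu) / sigma)


# Hypothetical three-member runoff ensemble and illustrative parameters.
mu, sigma = emos_predict([10.0, 12.0, 14.0], a=1.0, b=1.0, c=0.5, d=1.0)
p_exceed = exceedance_probability(mu, sigma, threshold=15.0)
```

    Here a and b correct the bias of the ensemble mean, while c and d correct the dispersion error.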

  9. Efficient Kernel-Based Ensemble Gaussian Mixture Filtering

    KAUST Repository

    Liu, Bo

    2015-11-11

    We consider the Bayesian filtering problem for data assimilation following the kernel-based ensemble Gaussian-mixture filtering (EnGMF) approach introduced by Anderson and Anderson (1999). In this approach, the posterior distribution of the system state is propagated with the model using the ensemble Monte Carlo method, providing a forecast ensemble that is then used to construct a prior Gaussian-mixture (GM) based on the kernel density estimator. This results in two update steps: a Kalman filter (KF)-like update of the ensemble members and a particle filter (PF)-like update of the weights, followed by a resampling step to start a new forecast cycle. After formulating EnGMF for any observational operator, we analyze the influence of the bandwidth parameter of the kernel function on the covariance of the posterior distribution. We then focus on two aspects: i) the efficient implementation of EnGMF with (relatively) small ensembles, where we propose a new deterministic resampling strategy preserving the first two moments of the posterior GM to limit the sampling error; and ii) the analysis of the effect of the bandwidth parameter on contributions of KF and PF updates and on the weights variance. Numerical results using the Lorenz-96 model are presented to assess the behavior of EnGMF with deterministic resampling, study its sensitivity to different parameters and settings, and evaluate its performance against ensemble KFs. The proposed EnGMF approach with deterministic resampling suggests improved estimates in all tested scenarios, and is shown to require less localization and to be less sensitive to the choice of filtering parameters.
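
    The two update steps can be sketched for the simplest possible case. This is a minimal illustration under strong assumptions: a scalar state, an identity observation operator, and a Gaussian kernel whose variance is the squared bandwidth times the ensemble sample variance. The function name is hypothetical, and the resampling step, including the deterministic strategy proposed in the work, is not shown.

```python
import math
from statistics import variance


def engmf_update_1d(ensemble, weights, y, obs_var, bandwidth):
    """One analysis step of a scalar Gaussian-mixture filter.

    Each member is the centre of a Gaussian kernel. The centres receive a
    Kalman-filter-like update and the mixture weights receive a
    particle-filter-like update.
    """
    kernel_var = bandwidth ** 2 * variance(ensemble)  # prior kernel variance
    innov_var = kernel_var + obs_var                  # innovation variance
    gain = kernel_var / innov_var                     # scalar Kalman gain

    # KF-like update of the kernel centres and the shared kernel variance.
    new_means = [x + gain * (y - x) for x in ensemble]
    new_kernel_var = (1.0 - gain) * kernel_var

    # PF-like update: reweight by each component's marginal likelihood.
    # The Gaussian normalising constant is the same for all components
    # and cancels in the normalisation.
    unnormalised = [w * math.exp(-0.5 * (y - x) ** 2 / innov_var)
                    for w, x in zip(weights, ensemble)]
    total = sum(unnormalised)
    new_weights = [u / total for u in unnormalised]
    return new_means, new_kernel_var, new_weights


ensemble = [0.0, 1.0, 2.0, 3.0]
means, kernel_var, weights = engmf_update_1d(
    ensemble, [0.25] * 4, y=2.0, obs_var=0.5, bandwidth=0.5)
```

    A small bandwidth makes the gain small, so the weight update dominates and the filter behaves more like a particle filter; a large bandwidth shifts the balance toward the Kalman-like update.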

  10. ENSEMBLE methods to reconcile disparate national long range dispersion forecasts

    OpenAIRE

    Mikkelsen, Torben; Galmarini, S.; Bianconi, R.; French, S.

    2003-01-01

    ENSEMBLE is a web-based decision support system for real-time exchange and evaluation of national long-range dispersion forecasts of nuclear releases with cross-boundary consequences. The system is developed with the purpose to reconcile among disparate national forecasts for long-range dispersion. ENSEMBLE addresses the problem of achieving a common coherent strategy across European national emergency management when national long-range dispersion forecasts differ from one another during an a...

  11. The Use of Artificial-Intelligence-Based Ensembles for Intrusion Detection: A Review

    Directory of Open Access Journals (Sweden)

    Gulshan Kumar

    2012-01-01

    In supervised learning-based classification, ensembles have been successfully employed in different application domains. In the literature, many researchers have proposed different ensembles by considering different combination methods, training datasets, base classifiers, and many other factors. Artificial intelligence (AI)-based techniques play a prominent role in the development of ensembles for intrusion detection (ID) and have many benefits over other techniques. However, there is no comprehensive review of ensembles in general, and of AI-based ensembles for ID in particular, that examines their current research status in solving the ID problem. Here, an updated review of ensembles and their taxonomies is presented in general. The paper also presents an updated review of various AI-based ensembles for ID (in particular during the last decade). The related studies of AI-based ensembles are compared by a set of evaluation metrics derived from (1) the architecture and approach followed; (2) the different methods utilized in different phases of ensemble learning; and (3) other measures used to evaluate the classification performance of the ensembles. The paper also provides future directions for research in this area. The paper will help readers better understand the different directions in which research on ensembles has been done, both in general and specifically in the field of intrusion detection systems (IDSs).

  12. EnsembleGraph: Interactive Visual Analysis of Spatial-Temporal Behavior for Ensemble Simulation Data

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Qingya; Guo, Hanqi; Che, Limei; Yuan, Xiaoru; Liu, Junfeng; Liang, Jie

    2016-04-19

    We present a novel visualization framework, EnsembleGraph, for analyzing ensemble simulation data, in order to help scientists understand behavior similarities between ensemble members over space and time. A graph-based representation is used to visualize individual spatiotemporal regions with similar behaviors, which are extracted by hierarchical clustering algorithms. A user interface with multiple linked views is provided, which enables users to explore, locate, and compare regions that have similar behaviors between ensemble members; users can then investigate and analyze the selected regions in detail. The driving application of this paper is the study of regional emission influences on tropospheric ozone, based on ensemble simulations conducted with different anthropogenic emission absences using the MOZART-4 (model of ozone and related tracers, version 4) model. We demonstrate the effectiveness of our method by visualizing the MOZART-4 ensemble simulation data and evaluating the relative regional emission influences on tropospheric ozone concentrations. Positive feedback from domain experts and two case studies demonstrate the efficiency of our method.

  13. Decadal Prediction Skill in the GEOS-5 Forecast System

    Science.gov (United States)

    Ham, Yoo-Geun; Rienecker, Michele M.; Suarez, Max J.; Vikhliaev, Yury; Zhao, Bin; Marshak, Jelena; Vernieres, Guillaume; Schubert, Siegfried D.

    2013-01-01

    A suite of decadal predictions has been conducted with the NASA Global Modeling and Assimilation Office's (GMAO's) GEOS-5 Atmosphere-Ocean general circulation model. The hindcasts are initialized every December 1st from 1959 to 2010, following the CMIP5 experimental protocol for decadal predictions. The initial conditions are from a multivariate ensemble optimal interpolation ocean and sea-ice reanalysis, and from GMAO's atmospheric reanalysis, the modern-era retrospective analysis for research and applications. The mean forecast skill of a three-member ensemble is compared to that of an experiment without initialization but also forced with observed greenhouse gases. The results show that initialization increases the forecast skill of North Atlantic sea surface temperature compared to the uninitialized runs, with the increase in skill maintained for almost a decade over the subtropical and mid-latitude Atlantic. On the other hand, the initialization reduces the skill in predicting the warming trend over some regions outside the Atlantic. The annual-mean Atlantic meridional overturning circulation index, which is defined here as the maximum of the zonally-integrated overturning stream function at mid-latitude, is predictable up to a 4-year lead time, consistent with the predictable signal in upper ocean heat content over the North Atlantic. While the 6- to 9-year forecast skill measured by mean squared skill score shows 50 percent improvement in the upper ocean heat content over the subtropical and mid-latitude Atlantic, prediction skill is relatively low in the sub-polar gyre. This low skill is due in part to features in the spatial pattern of the dominant simulated decadal mode in upper ocean heat content over this region that differ from observations. An analysis of the large-scale temperature budget shows that this is the result of a model bias, implying that realistic simulation of the climatological fields is crucial for skillful decadal forecasts.
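
    The mean squared skill score mentioned above is commonly defined as one minus the ratio of the forecast mean squared error to that of a reference forecast. The sketch below uses that common definition; treating the uninitialized run as the reference, and all of the numbers, are assumptions made only for illustration.

```python
def mean_squared_error(predicted, observed):
    """Mean squared error between two equally long sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mean_squared_skill_score(forecast, reference, observed):
    """MSSS = 1 - MSE(forecast) / MSE(reference).

    1 means a perfect forecast, 0 means no improvement over the
    reference, and negative values mean the forecast is worse.
    """
    return 1.0 - (mean_squared_error(forecast, observed)
                  / mean_squared_error(reference, observed))


# Hypothetical heat-content anomalies (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
uninitialized = [2.0, 3.0, 4.0, 5.0]   # reference run, MSE = 1.0
initialized = [1.5, 2.5, 3.5, 4.5]     # forecast run, MSE = 0.25
skill = mean_squared_skill_score(initialized, uninitialized, observed)
```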

  14. A convection-allowing ensemble forecast based on the breeding growth mode and associated optimization of precipitation forecast

    Science.gov (United States)

    Li, Xiang; He, Hongrang; Chen, Chaohui; Miao, Ziqing; Bai, Shigang

    2017-10-01

    A convection-allowing ensemble forecast experiment on a squall line was conducted based on the breeding growth mode (BGM). Meanwhile, the probability matched mean (PMM) and neighborhood ensemble probability (NEP) methods were used to optimize the associated precipitation forecast. The ensemble forecast predicted the precipitation tendency accurately, which was closer to the observation than in the control forecast. For heavy rainfall, the precipitation center produced by the ensemble forecast was also better. The Fractions Skill Score (FSS) results indicated that the ensemble mean was skillful in light rainfall, while the PMM produced better probability distribution of precipitation for heavy rainfall. Preliminary results demonstrated that convection-allowing ensemble forecast could improve precipitation forecast skill through providing valuable probability forecasts. It is necessary to employ new methods, such as the PMM and NEP, to generate precipitation probability forecasts. Nonetheless, the lack of spread and the overprediction of precipitation by the ensemble members are still problems that need to be solved.
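
    One common formulation of the probability matched mean can be sketched as follows: the spatial pattern comes from the ensemble mean, while the amplitudes come from the pooled distribution of all member values. This is an illustration rather than the procedure of the cited study; the function name, the flattened grid, and the choice of taking the middle value of each group of n pooled values are assumptions.

```python
def probability_matched_mean(members):
    """Probability matched mean over a flattened grid.

    members: list of n fields, each a list of g grid-point values.
    """
    n = len(members)
    g = len(members[0])

    # Ensemble mean at each grid point; it supplies the spatial pattern.
    ens_mean = [sum(m[i] for m in members) / n for i in range(g)]

    # Pool all n * g values, sort from largest to smallest, and keep one
    # value from each consecutive group of n, which leaves g values.
    pooled = sorted((v for m in members for v in m), reverse=True)
    thinned = pooled[n // 2::n]

    # Give the k-th largest retained value to the grid point holding the
    # k-th largest ensemble mean.
    order = sorted(range(g), key=lambda i: ens_mean[i], reverse=True)
    pmm = [0.0] * g
    for rank, idx in enumerate(order):
        pmm[idx] = thinned[rank]
    return pmm


# Three hypothetical members that place a 40 mm rain core differently.
members = [[0, 5, 40], [0, 40, 5], [5, 40, 0]]
pmm = probability_matched_mean(members)
```

    In this example the plain ensemble mean smooths the rain core to about 28 mm, whereas the probability matched mean keeps the 40 mm peak at the grid point where the ensemble mean is largest.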

  15. Shallow cumuli ensemble statistics for development of a stochastic parameterization

    Science.gov (United States)

    Sakradzija, Mirjana; Seifert, Axel; Heus, Thijs

    2014-05-01

    According to a conventional deterministic approach to the parameterization of moist convection in numerical atmospheric models, a given large scale forcing produces a unique response from the unresolved convective processes. This representation leaves out the small-scale variability of convection: as is known from empirical studies of deep and shallow convective cloud ensembles, there is a whole distribution of sub-grid states corresponding to a given large scale forcing. Moreover, this distribution gets broader with increasing model resolution. This behavior is also consistent with our theoretical understanding of a coarse-grained nonlinear system. We propose an approach to represent the variability of the unresolved shallow-convective states, including the dependence of the spread and shape of the sub-grid state distribution on the model horizontal resolution. Starting from the Gibbs canonical ensemble theory, Craig and Cohen (2006) developed a theory for the fluctuations in a deep convective ensemble. The micro-states of a deep convective cloud ensemble are characterized by the cloud-base mass flux, which, according to the theory, is exponentially distributed (Boltzmann distribution). Following their work, we study the shallow cumulus ensemble statistics and the distribution of the cloud-base mass flux. We employ a Large-Eddy Simulation (LES) model and a cloud tracking algorithm, followed by a conditional sampling of clouds at the cloud base level, to retrieve information about the individual cloud life cycles and the cloud ensemble as a whole. In the case of a shallow cumulus cloud ensemble, the distribution of micro-states is a generalized exponential distribution. Based on the empirical and theoretical findings, a stochastic model has been developed to simulate the shallow convective cloud ensemble and to test the convective ensemble theory. The stochastic model simulates a compound random process, with the number of convective elements drawn from a

  16. Harnessing Disordered-Ensemble Quantum Dynamics for Machine Learning

    Science.gov (United States)

    Fujii, Keisuke; Nakajima, Kohei

    2017-08-01

    The quantum computer has an amazing potential of fast information processing. However, the realization of a digital quantum computer is still a challenging problem requiring highly accurate controls and key application strategies. Here we propose a platform, quantum reservoir computing, to solve these issues successfully by exploiting the natural quantum dynamics of ensemble systems, which are ubiquitous in laboratories nowadays, for machine learning. This framework enables ensemble quantum systems to universally emulate nonlinear dynamical systems including classical chaos. A number of numerical experiments show that quantum systems consisting of 5-7 qubits possess computational capabilities comparable to conventional recurrent neural networks of 100-500 nodes. This discovery opens up a paradigm for information processing with artificial intelligence powered by quantum physics.

  17. Power flow prediction in vibrating systems via model reduction

    Science.gov (United States)

    Li, Xianhui

    This dissertation focuses on power flow prediction in vibrating systems. Reduced order models (ROMs) are built based on rational Krylov model reduction which preserve power flow information in the original systems over a specified frequency band. Stiffness and mass matrices of the ROMs are obtained by projecting the original system matrices onto the subspaces spanned by forced responses. A matrix-free algorithm is designed to construct ROMs directly from the power quantities at selected interpolation frequencies. Strategies for parallel implementation of the algorithm via message passing interface are proposed. The quality of ROMs is iteratively refined according to the error estimate based on residual norms. Band capacity is proposed to provide a priori estimate of the sizes of good quality ROMs. Frequency averaging is recast as ensemble averaging and Cauchy distribution is used to simplify the computation. Besides model reduction for deterministic systems, details of constructing ROMs for parametric and nonparametric random systems are also presented. Case studies have been conducted on testbeds from Harwell-Boeing collections. Input and coupling power flow are computed for the original systems and the ROMs. Good agreement is observed in all cases.

  18. On the contribution of local feedback mechanisms to the range of climate sensitivity in two GCM ensembles

    Energy Technology Data Exchange (ETDEWEB)

    Webb, M.J.; Senior, C.A.; Sexton, D.M.H.; Ingram, W.J.; Williams, K.D.; Ringer, M.A. [Hadley Centre for Climate Prediction and Research, Met Office, Exeter (United Kingdom); McAvaney, B.J.; Colman, R. [Bureau of Meteorology Research Centre (BMRC), Melbourne (Australia); Soden, B.J. [University of Miami, Rosenstiel School for Marine and Atmospheric Science, Miami, FL (United States); Gudgel, R.; Knutson, T. [Geophysical Fluid Dynamics Laboratory (GFDL), Princeton, NJ (United States); Emori, S.; Ogura, T. [National Institute for Environmental Studies (NIES), Tsukuba (Japan); Tsushima, Y. [Japan Agency for Marine-Earth Science and Technology, Frontier Research Center for Global Change (FRCGC), Kanagawa (Japan); Andronova, N. [University of Michigan, Department of Atmospheric, Oceanic and Space Sciences, Ann Arbor, MI (United States); Li, B. [University of Illinois at Urbana-Champaign (UIUC), Department of Atmospheric Sciences, Urbana, IL (United States); Musat, I.; Bony, S. [Institut Pierre Simon Laplace (IPSL), Paris (France); Taylor, K.E. [Program for Climate Model Diagnosis and Intercomparison (PCMDI), Livermore, CA (United States)

    2006-07-15

    Global and local feedback analysis techniques have been applied to two ensembles of mixed layer equilibrium CO2 doubling climate change experiments, from the CFMIP (Cloud Feedback Model Intercomparison Project) and QUMP (Quantifying Uncertainty in Model Predictions) projects. Neither of these new ensembles shows evidence of a statistically significant change in the ensemble mean or variance in global mean climate sensitivity when compared with the results from the mixed layer models quoted in the Third Assessment Report of the IPCC. Global mean feedback analysis of these two ensembles confirms the large contribution made by inter-model differences in cloud feedbacks to those in climate sensitivity in earlier studies; net cloud feedbacks are responsible for 66% of the inter-model variance in the total feedback in the CFMIP ensemble and 85% in the QUMP ensemble. The ensemble mean global feedback components are all statistically indistinguishable between the two ensembles, except for the clear-sky shortwave feedback which is stronger in the CFMIP ensemble. While ensemble variances of the shortwave cloud feedback and both clear-sky feedback terms are larger in CFMIP, there is considerable overlap in the cloud feedback ranges; QUMP spans 80% or more of the CFMIP ranges in longwave and shortwave cloud feedback. We introduce a local cloud feedback classification system which distinguishes different types of cloud feedbacks on the basis of the relative strengths of their longwave and shortwave components, and interpret these in terms of responses of different cloud types diagnosed by the International Satellite Cloud Climatology Project simulator. In the CFMIP ensemble, areas where low-top cloud changes constitute the largest cloud response are responsible for 59% of the contribution from cloud feedback to the variance in the total feedback. A similar figure is found for the QUMP ensemble. Areas of positive low cloud feedback (associated with reductions in low level

  19. Data Pre-Analysis and Ensemble of Various Artificial Neural Networks for Monthly Streamflow Forecasting

    Directory of Open Access Journals (Sweden)

    Jianzhong Zhou

    2018-05-01

    This paper introduces three artificial neural network (ANN) architectures for monthly streamflow forecasting: a radial basis function network, an extreme learning machine, and the Elman network. Three ensemble techniques, a simple average ensemble, a weighted average ensemble, and an ANN-based ensemble, were used to combine the outputs of the individual ANN models. The objective was to highlight the performance of the general regression neural network-based ensemble technique (GNE) through an improvement of monthly streamflow forecasting accuracy. Before the construction of an ANN model, data preanalysis techniques, such as empirical wavelet transform (EWT), were exploited to eliminate the oscillations of the streamflow series. Additionally, a theory of chaos phase space reconstruction was used to select the most relevant and important input variables for forecasting. The proposed GNE ensemble model has been applied to the mean monthly streamflow observation data from the Wudongde hydrological station in the Jinsha River Basin, China. Comparisons and analysis in this study have demonstrated that the denoised streamflow time series was less disordered and unsystematic than was suggested by the original time series according to chaos theory. Thus, EWT can be adopted as an effective data preanalysis technique for the prediction of monthly streamflow. Concurrently, the GNE performed better when compared with other ensemble techniques.
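
    The first two combination schemes can be sketched in a few lines. The abstract does not state how the weights of the weighted average ensemble are chosen, so the inverse-MSE weighting below is an assumption, as are the function names and the sample numbers; the ANN-based (GNE) combiner is not shown.

```python
def simple_average(predictions):
    """Equal-weight combination of the individual model outputs."""
    return sum(predictions) / len(predictions)


def inverse_mse_weights(validation_predictions, observed):
    """Weights proportional to 1 / MSE on a validation period.

    Assumes every model has a nonzero validation error.
    """
    inverse = []
    for preds in validation_predictions:
        mse = sum((p - o) ** 2 for p, o in zip(preds, observed)) / len(observed)
        inverse.append(1.0 / mse)
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_average(predictions, weights):
    """Combination with weights that sum to one."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical validation streamflows for three individual models.
observed = [10.0, 20.0, 30.0]
validation = [[11.0, 21.0, 31.0],   # MSE = 1
              [12.0, 22.0, 32.0],   # MSE = 4
              [8.0, 18.0, 28.0]]    # MSE = 4
weights = inverse_mse_weights(validation, observed)

# Combine the three models' forecasts for a new month.
new_forecasts = [41.0, 42.0, 38.0]
avg = simple_average(new_forecasts)
wavg = weighted_average(new_forecasts, weights)
```

    The weighted result leans toward the first model because it had the smallest validation error.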

  20. Climate Prediction Center (CPC) Ensemble Canonical Correlation Analysis 90-Day Seasonal Forecast of Precipitation

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) precipitation forecast is a 90-day (seasonal) outlook of US surface precipitation anomalies. The ECCA uses...

  1. Ensemble support vector machine classification of dementia using structural MRI and mini-mental state examination.

    Science.gov (United States)

    Sørensen, Lauge; Nielsen, Mads

    2018-05-15

    The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.
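
    The bagging-without-replacement step described in this abstract can be sketched as follows. This is not the authors' pipeline: a nearest-centroid classifier stands in for the SVM so that the example needs no external library, the feature-selection step is omitted, and the ensemble size, subsample fraction and toy data are all hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Stand-in base learner: per-class mean vector (NOT an SVM)."""
    sums, counts = {}, Counter()
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] += 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_centroids(model, x):
    """Return the class whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(model[c], x))
    return min(model, key=dist2)


def fit_subsampled_ensemble(X, y, n_members=15, fraction=0.7, seed=0):
    """Train each member on a random subset drawn WITHOUT replacement."""
    rng = random.Random(seed)
    k = max(1, int(fraction * len(X)))
    members = []
    for _ in range(n_members):
        idx = rng.sample(range(len(X)), k)  # no index is repeated
        members.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return members


def predict_ensemble(members, x):
    """Majority vote over the members' predictions."""
    votes = Counter(predict_centroids(m, x) for m in members)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature toy data with two well-separated classes.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = ["NC"] * 4 + ["AD"] * 4
ensemble = fit_subsampled_ensemble(X, y)
print(predict_ensemble(ensemble, [0.5, 0.5]), predict_ensemble(ensemble, [5.5, 5.5]))
```

    Because each member sees a different subset of the training cases, the members disagree on borderline samples, and the vote averages out some of that variance.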

  2. Nonequilibrium statistical mechanics ensemble method

    CERN Document Server

    Eu, Byung Chan

    1998-01-01

    In this monograph, nonequilibrium statistical mechanics is developed by means of ensemble methods on the basis of the Boltzmann equation, the generic Boltzmann equations for classical and quantum dilute gases, and a generalised Boltzmann equation for dense simple fluids. The theories are developed in forms parallel with the equilibrium Gibbs ensemble theory in a way fully consistent with the laws of thermodynamics. The generalised hydrodynamics equations are the integral part of the theory and describe the evolution of macroscopic processes in accordance with the laws of thermodynamics of systems far removed from equilibrium. Audience: This book will be of interest to researchers in the fields of statistical mechanics, condensed matter physics, gas dynamics, fluid dynamics, rheology, irreversible thermodynamics and nonequilibrium phenomena.

  3. A Single-column Model Ensemble Approach Applied to the TWP-ICE Experiment

    Science.gov (United States)

    Davies, L.; Jakob, C.; Cheung, K.; DelGenio, A.; Hill, A.; Hume, T.; Keane, R. J.; Komori, T.; Larson, V. E.; Lin, Y.

    2013-01-01

    Single-column models (SCM) are useful test beds for investigating the parameterization schemes of numerical weather prediction and climate models. The usefulness of SCM simulations is limited, however, by the accuracy of the best estimate large-scale observations prescribed. Errors in estimating the observations will result in uncertainty in modeled simulations. One method to address the modeled uncertainty is to simulate an ensemble where the ensemble members span observational uncertainty. This study first derives an ensemble of large-scale data for the Tropical Warm Pool International Cloud Experiment (TWP-ICE) based on an estimate of a possible source of error in the best estimate product. These data are then used to carry out simulations with 11 SCM and two cloud-resolving models (CRM). Best estimate simulations are also performed. All models show that moisture-related variables are close to observations and there are limited differences between the best estimate and ensemble mean values. The models, however, show different sensitivities to changes in the forcing, particularly when weakly forced. The ensemble simulations highlight important differences in the surface evaporation term of the moisture budget between the SCM and CRM. Differences are also apparent between the models in the ensemble mean vertical structure of cloud variables, while for each model, cloud properties are relatively insensitive to forcing. The ensemble is further used to investigate cloud variables and precipitation and identifies differences between CRM and SCM, particularly for relationships involving ice. This study highlights the additional analysis that can be performed using ensemble simulations and hence enables a more complete model investigation compared to using the more traditional single best estimate simulation only.

  4. An Organic Computing Approach to Self-organising Robot Ensembles

    Directory of Open Access Journals (Sweden)

    Sebastian Albrecht von Mammen

    2016-11-01

    Similar to the Autonomous Computing initiative, which has mainly been advancing techniques for self-optimisation focussing on computing systems and infrastructures, Organic Computing (OC) has been driving the development of system design concepts and algorithms for self-adaptive systems at large. Examples of application domains include traffic management and control, cloud services, communication protocols, and robotic systems. Such an OC system typically consists of a potentially large set of autonomous and self-managed entities, where each entity acts with a local decision horizon. By means of cooperation of the individual entities, the behaviour of the entire ensemble system is derived. In this article, we present our work on how autonomous, adaptive robot ensembles can benefit from OC technology. Our elaborations are aligned with the different layers of an observer/controller framework which provides the foundation for the individuals' adaptivity at system design-level. Relying on an extended Learning Classifier System (XCS) in combination with adequate simulation techniques, this basic system design empowers robot individuals to improve their individual and collaborative performances, e.g. by means of adapting to changing goals and conditions. Not only for the sake of generalisability, but also because of its enormous transformative potential, we stage our research in the domain of robot ensembles that are typically comprised of several quad-rotors and that organise themselves to fulfil spatial tasks such as maintenance of building facades or the collaborative search for mobile targets. Our elaborations detail the architectural concept, provide examples of individual self-optimisation as well as of the optimisation of collaborative efforts, and we show how the user can control the ensembles at multiple levels of abstraction. We conclude with a summary of our approach and an outlook on possible future steps.

  5. Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul; Javed, Syed Gibran; Sattar, Mohsin

    2016-06-01

    Early prediction of breast cancer is important for effective treatment and survival. We developed an effective Cost-Sensitive Classifier with GentleBoost Ensemble (Can-CSC-GBE) for the classification of breast cancer using protein amino acid features. In this work, first, discriminant information of the protein sequences related to breast tissue is extracted. Then, the physicochemical properties hydrophobicity and hydrophilicity of amino acids are employed to generate molecule descriptors in different feature spaces. For comparison, we obtained results by combining Cost-Sensitive learning with conventional ensemble of AdaBoostM1 and Bagging. The proposed Can-CSC-GBE system has effectively reduced the misclassification costs and thereby improved the overall classification performance. Our novel approach has highlighted promising results as compared to the state-of-the-art ensemble approaches. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Discussion on Regression Methods Based on Ensemble Learning and Applicability Domains of Linear Submodels.

    Science.gov (United States)

    Kaneko, Hiromasa

    2018-02-26

    To develop a new ensemble learning method and construct highly predictive regression models in chemoinformatics and chemometrics, applicability domains (ADs) are introduced into the ensemble learning process of prediction. When estimating values of an objective variable using subregression models, only the submodels with ADs that cover a query sample, i.e., the sample is inside the model's AD, are used. By constructing submodels and changing a list of selected explanatory variables, the union of the submodels' ADs, which defines the overall AD, becomes large, and the prediction performance is enhanced for diverse compounds. By analyzing a quantitative structure-activity relationship data set and a quantitative structure-property relationship data set, it is confirmed that the ADs can be enlarged and the estimation performance of regression models is improved compared with traditional methods.
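
    The selection rule described in this abstract, using only the submodels whose applicability domain (AD) covers the query sample, can be sketched as below. The abstract does not define how an AD is computed; a per-feature range box is an assumption made here for illustration, and the submodels, coefficients and helper names are hypothetical.

```python
def make_submodel(feature_idx, coefs, intercept, lower, upper):
    """Bundle a linear submodel with a box-shaped AD on its own features."""
    return {"idx": feature_idx, "coefs": coefs, "b": intercept,
            "lower": lower, "upper": upper}


def in_domain(model, x):
    """True if every feature used by the submodel lies inside its AD box."""
    return all(lo <= x[j] <= hi
               for j, lo, hi in zip(model["idx"], model["lower"], model["upper"]))


def predict_one(model, x):
    """Linear prediction using only the submodel's selected features."""
    return model["b"] + sum(c * x[j] for c, j in zip(model["coefs"], model["idx"]))


def ad_ensemble_predict(models, x):
    """Average only the submodels whose AD covers x; None if none covers it."""
    used = [m for m in models if in_domain(m, x)]
    if not used:
        return None  # query lies outside the overall AD (union of all ADs)
    return sum(predict_one(m, x) for m in used) / len(used)


# Hypothetical submodels built on different explanatory-variable subsets.
models = [
    make_submodel([0], [2.0], 0.0, [0.0], [1.0]),
    make_submodel([1], [1.0], 1.0, [0.0], [10.0]),
    make_submodel([0, 1], [1.0, 1.0], 0.0, [0.0, 0.0], [5.0, 5.0]),
]
print(ad_ensemble_predict(models, [0.5, 2.0]))
print(ad_ensemble_predict(models, [3.0, 2.0]))
print(ad_ensemble_predict(models, [20.0, 20.0]))
```

    Because each submodel uses a different variable subset, the union of the boxes is larger than any single box, which mirrors the abstract's point that the overall AD grows with the number of submodels.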

  7. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. Importantly, the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  8. Hartree and Exchange in Ensemble Density Functional Theory: Avoiding the Nonuniqueness Disaster.

    Science.gov (United States)

    Gould, Tim; Pittalis, Stefano

    2017-12-15

    Ensemble density functional theory is a promising method for the efficient and accurate calculation of excitations of quantum systems, at least if useful functionals can be developed to broaden its domain of practical applicability. Here, we introduce a guaranteed single-valued "Hartree-exchange" ensemble density functional, E_{Hx}[n], in terms of the right derivative of the universal ensemble density functional with respect to the coupling constant at vanishing interaction. We show that E_{Hx}[n] is straightforwardly expressible using block eigenvalues of a simple matrix [Eq. (14)]. Specialized expressions for E_{Hx}[n] from the literature, including those involving superpositions of Slater determinants, can now be regarded as originating from the unifying picture presented here. We thus establish a clear and practical description for Hartree and exchange in ensemble systems.

  9. Ensemble methods for handwritten digit recognition

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Liisberg, Christian; Salamon, P.

    1992-01-01

    Neural network ensembles are applied to handwritten digit recognition. The individual networks of the ensemble are combinations of sparse look-up tables (LUTs) with random receptive fields. It is shown that the consensus of a group of networks outperforms the best individual of the ensemble....... It is further shown that it is possible to estimate the ensemble performance as well as the learning curve on a medium-size database. In addition the authors present preliminary analysis of experiments on a large database and show that state-of-the-art performance can be obtained using the ensemble approach...... by optimizing the receptive fields. It is concluded that it is possible to improve performance significantly by introducing moderate-size ensembles; in particular, a 20-25% improvement has been found. The ensemble random LUTs, when trained on a medium-size database, reach a performance (without rejects) of 94...

  10. Ensemble based system for whole-slide prostate cancer probability mapping using color texture features.

    LENUS (Irish Health Repository)

    DiFranco, Matthew D

    2011-01-01

    We present a tile-based approach for producing clinically relevant probability maps of prostatic carcinoma in histological sections from radical prostatectomy. Our methodology incorporates ensemble learning for feature selection and classification on expert-annotated images. Random forest feature selection performed over varying training sets provides a subset of generalized CIEL*a*b* co-occurrence texture features, while sample selection strategies with minimal constraints reduce training data requirements to achieve reliable results. Ensembles of classifiers are built using expert-annotated tiles from training images, and scores for the probability of cancer presence are calculated from the responses of each classifier in the ensemble. Spatial filtering of tile-based texture features prior to classification results in increased heat-map coherence as well as AUC values of 95% using ensembles of either random forests or support vector machines. Our approach is designed for adaptation to different imaging modalities, image features, and histological decision domains.

  11. A multi-scale ensemble-based framework for forecasting compound coastal-riverine flooding: The Hackensack-Passaic watershed and Newark Bay

    Science.gov (United States)

    Saleh, F.; Ramaswamy, V.; Wang, Y.; Georgas, N.; Blumberg, A.; Pullen, J.

    2017-12-01

    Estuarine regions can experience compound impacts from coastal storm surge and riverine flooding. The challenges in forecasting flooding in such areas are multi-faceted due to uncertainties associated with meteorological drivers and interactions between hydrological and coastal processes. The objective of this work is to evaluate how uncertainties from meteorological predictions propagate through an ensemble-based flood prediction framework and translate into uncertainties in simulated inundation extents. A multi-scale framework, consisting of hydrologic, coastal and hydrodynamic models, was used to simulate two extreme flood events at the confluence of the Passaic and Hackensack rivers and Newark Bay. The events were Hurricane Irene (2011), a combination of inland flooding and coastal storm surge, and Hurricane Sandy (2012) where coastal storm surge was the dominant component. The hydrodynamic component of the framework was first forced with measured streamflow and ocean water level data to establish baseline inundation extents with the best available forcing data. The coastal and hydrologic models were then forced with meteorological predictions from 21 ensemble members of the Global Ensemble Forecast System (GEFS) to retrospectively represent potential future conditions up to 96 hours prior to the events. Inundation extents produced by the hydrodynamic model, forced with the 95th percentile of the ensemble-based coastal and hydrologic boundary conditions, were in good agreement with baseline conditions for both events. The USGS reanalysis of Hurricane Sandy inundation extents was encapsulated between the 50th and 95th percentile of the forecasted inundation extents, and that of Hurricane Irene was similar but with caveats associated with data availability and reliability. This work highlights the importance of accounting for meteorological uncertainty to represent a range of possible future inundation extents at high resolution (∼m).
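
    The percentile-based boundary conditions mentioned in this abstract can be illustrated with a short sketch that takes the 50th and 95th percentile across ensemble members at each forecast time. The nearest-rank percentile definition is an assumption (the abstract does not state which definition was used), and the water levels below are made-up numbers, not GEFS-driven output.

```python
import math


def nearest_rank_percentile(values, p):
    """Nearest-rank percentile of a list, for 0 < p <= 100."""
    ordered = sorted(values)
    rank = math.ceil(p / 100.0 * len(ordered))
    return ordered[rank - 1]


def ensemble_percentile_series(members, p):
    """Apply the percentile across members separately at each time step."""
    return [nearest_rank_percentile(step, p) for step in zip(*members)]


# Hypothetical water levels (m): 21 members, 3 forecast times.
members = [[1.0 + 0.1 * i, 2.0 + 0.05 * i, 1.5] for i in range(21)]
p50 = ensemble_percentile_series(members, 50)
p95 = ensemble_percentile_series(members, 95)
print(p50)
print(p95)
```

    Note that the percentile series is built time step by time step, so it generally does not correspond to any single ensemble member.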

  12. Tailored Random Graph Ensembles

    International Nuclear Information System (INIS)

    Roberts, E S; Annibale, A; Coolen, A C C

    2013-01-01

    Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.
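
    The edge-swap move that conserves all degrees can be sketched for a simple undirected graph as follows. This shows only the elementary move; the non-trivial acceptance probabilities mentioned in the abstract are not reproduced, so this sketch should not be assumed to sample graphs uniformly. Function names and the starting graph are hypothetical.

```python
import random


def degrees(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def try_edge_swap(edges, rng):
    """Attempt one swap (a-b, c-d) -> (a-d, c-b); return True if applied.

    Every node keeps its degree: each endpoint loses one edge and gains one.
    """
    i, j = rng.sample(range(len(edges)), 2)
    a, b = edges[i]
    c, d = edges[j]
    if rng.random() < 0.5:     # choose one of the two possible rewirings
        c, d = d, c
    if len({a, b, c, d}) < 4:  # would create a self-loop or change nothing
        return False
    existing = {frozenset(e) for e in edges}
    if frozenset((a, d)) in existing or frozenset((c, b)) in existing:
        return False           # would create a multi-edge
    edges[i] = (a, d)
    edges[j] = (c, b)
    return True


# Hypothetical starting graph: a ring of 8 nodes, every node has degree 2.
edges = [(k, (k + 1) % 8) for k in range(8)]
before = degrees(edges)
rng = random.Random(1)
applied = sum(try_edge_swap(edges, rng) for _ in range(200))
print(applied, degrees(edges) == before)
```

    Rejected proposals leave the graph unchanged, which is the usual way to keep such a chain well defined on the set of simple graphs with the given degree sequence.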

  13. Statistical ensembles for money and debt

    Science.gov (United States)

    Viaggiu, Stefano; Lionetto, Andrea; Bargigli, Leonardo; Longo, Michele

    2012-10-01

    We build a statistical ensemble representation of two economic models describing respectively, in simplified terms, a payment system and a credit market. To this purpose we adopt the Boltzmann-Gibbs distribution where the role of the Hamiltonian is taken by the total money supply (i.e. including money created from debt) of a set of interacting economic agents. As a result, we can read the main thermodynamic quantities in terms of monetary ones. In particular, we define for the credit market model a work term which is related to the impact of monetary policy on credit creation. Furthermore, with our formalism we recover and extend some results concerning the temperature of an economic system, previously presented in the literature by considering only the monetary base as a conserved quantity. Finally, we study the statistical ensemble for the Pareto distribution.

  14. Eigenfunction statistics of Wishart Brownian ensembles

    International Nuclear Information System (INIS)

    Shukla, Pragya

    2017-01-01

    We theoretically analyze the eigenfunction fluctuation measures for a Hermitian ensemble which appears as an intermediate state of the perturbation of a stationary ensemble by another stationary ensemble of Wishart (Laguerre) type. Similar to the perturbation by a Gaussian stationary ensemble, the measures undergo a diffusive dynamics in terms of the perturbation parameter but the energy-dependence of the fluctuations is different in the two cases. This may have important consequences for the eigenfunction dynamics as well as phase transition studies in many areas of complexity where Brownian ensembles appear. (paper)

  15. Towards quantum optics and entanglement with electron spin ensembles in semiconductors

    NARCIS (Netherlands)

    van der Wal, Caspar H.; Sladkov, Maksym

    We discuss a technique and a material system that enable the controlled realization of quantum entanglement between spin-wave modes of electron ensembles in two spatially separated pieces of semiconductor material. The approach uses electron ensembles in GaAs quantum wells that are located inside

  16. Measuring social interaction in music ensembles.

    Science.gov (United States)

    Volpe, Gualtiero; D'Ausilio, Alessandro; Badino, Leonardo; Camurri, Antonio; Fadiga, Luciano

    2016-05-05

    Music ensembles are an ideal test-bed for quantitative analysis of social interaction. Music is an inherently social activity, and music ensembles offer a broad variety of scenarios which are particularly suitable for investigation. Small ensembles, such as string quartets, are deemed a significant example of self-managed teams, where all musicians contribute equally to a task. In bigger ensembles, such as orchestras, the relationship between a leader (the conductor) and a group of followers (the musicians) clearly emerges. This paper presents an overview of recent research on social interaction in music ensembles with a particular focus on (i) studies from cognitive neuroscience; and (ii) studies adopting a computational approach for carrying out automatic quantitative analysis of ensemble music performances. © 2016 The Author(s).

  17. Linking 1D coastal ocean modelling to environmental management: an ensemble approach

    Science.gov (United States)

    Mussap, Giulia; Zavatarelli, Marco; Pinardi, Nadia

    2017-12-01

    The use of a one-dimensional interdisciplinary numerical model of the coastal ocean as a tool contributing to the formulation of ecosystem-based management (EBM) is explored. The focus is on the definition of an experimental design based on ensemble simulations, integrating variability linked to scenarios (characterised by changes in the system forcing) and to the concurrent variation of selected, and poorly constrained, model parameters. The modelling system used was previously specifically designed for use in "data-rich" areas, so that horizontal dynamics can be resolved by a diagnostic approach and external inputs can be parameterised by properly calibrated nudging schemes. Ensembles determined by changes in the simulated environmental (physical and biogeochemical) dynamics, under joint forcing and parameterisation variations, highlight the uncertainties associated with the application of specific scenarios that are relevant to EBM, providing an assessment of the reliability of the predicted changes. The work has been carried out by implementing the coupled modelling system BFM-POM1D in an area of the Gulf of Trieste (northern Adriatic Sea), considered homogeneous from the point of view of hydrological properties, and forcing it by changing climatic (warming) and anthropogenic (reduction of the land-based nutrient input) pressure. Model parameters affected by considerable uncertainties (due to the lack of relevant observations) were varied jointly with the scenarios of change. The resulting large set of ensemble simulations provided a general estimation of the model uncertainties related to the joint variation of pressures and model parameters. The information on model result variability is aimed at conveying, efficiently and comprehensibly, the uncertainties and reliability of the model results to non-technical EBM planners and stakeholders, so that the model-based information contributes effectively to EBM.

  18. A target recognition method for maritime surveillance radars based on hybrid ensemble selection

    Science.gov (United States)

    Fan, Xueman; Hu, Shengliang; He, Jingbo

    2017-11-01

    In order to improve the generalisation ability of the maritime surveillance radar, a novel ensemble selection technique, termed Optimisation and Dynamic Selection (ODS), is proposed. During the optimisation phase, the non-dominated sorting genetic algorithm II for multi-objective optimisation is used to find the Pareto front, i.e. a set of ensembles of classifiers representing different tradeoffs between the classification error and diversity. During the dynamic selection phase, the meta-learning method is used to predict whether a candidate ensemble is competent enough to classify a query instance based on three different aspects, namely, feature space, decision space and the extent of consensus. The classification performance and time complexity of ODS are compared against nine other ensemble methods using a self-built full polarimetric high resolution range profile data-set. The experimental results clearly show the effectiveness of ODS. In addition, the influence of the selection of diversity measures is studied concurrently.

  19. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    Science.gov (United States)

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  20. Improving Climate Projections Using "Intelligent" Ensembles

    Science.gov (United States)

    Baker, Noel C.; Taylor, Patrick C.

    2015-01-01

    Recent changes in the climate system have led to growing concern, especially in communities which are highly vulnerable to resource shortages and weather extremes. There is an urgent need for better climate information to develop solutions and strategies for adapting to a changing climate. Climate models provide excellent tools for studying the current state of climate and making future projections. However, these models are subject to biases created by structural uncertainties. Performance metrics, or the systematic determination of model biases, succinctly quantify aspects of climate model behavior. Efforts to standardize climate model experiments and collect simulation data, such as the Coupled Model Intercomparison Project (CMIP), provide the means to directly compare and assess model performance. Performance metrics have been used to show that some models reproduce present-day climate better than others. Simulation data from multiple models are often used to add value to projections by creating a consensus projection from the model ensemble, in which each model is given an equal weight. It has been shown that the ensemble mean generally outperforms any single model. It is possible to use unequal weights to produce ensemble means, in which models are weighted based on performance (called "intelligent" ensembles). Can performance metrics be used to improve climate projections? Previous work introduced a framework for comparing the utility of model performance metrics, showing that the best metrics are related to the variance of top-of-atmosphere outgoing longwave radiation. These metrics improve present-day climate simulations of Earth's energy budget using the "intelligent" ensemble method. The current project identifies several approaches for testing whether performance metrics can be applied to future simulations to create "intelligent" ensemble-mean climate projections. It is shown that certain performance metrics test key climate processes in the models, and

  1. The ensemble variance of pure-tone measurements in reverberation rooms

    DEFF Research Database (Denmark)

    Jacobsen, Finn; Molares, Alfonso Rodriguez

    2010-01-01

    Reverberation rooms are often used for measuring the sound power emitted by sources of sound. At medium and high frequencies, where the modal overlap is high, a fairly simple model based on sums of waves from random directions having random phase relations gives good predictions of the ensemble s...

  2. Joys of Community Ensemble Playing: The Case of the Happy Roll Elastic Ensemble in Taiwan

    Science.gov (United States)

    Hsieh, Yuan-Mei; Kao, Kai-Chi

    2012-01-01

    The Happy Roll Elastic Ensemble (HREE) is a community music ensemble supported by Tainan Culture Centre in Taiwan. With enjoyment and friendship as its primary goals, it aims to facilitate the joys of ensemble playing and the spirit of social networking. This article highlights the key aspects of HREE's development in its first two years…

  3. Prediction of the Arctic Oscillation in Boreal Winter by Dynamical Seasonal Forecasting Systems

    Science.gov (United States)

    Kang, Daehyun; Lee, Myong-In; Im, Jungho; Kim, Daehyun; Kim, Hye-Mi; Kang, Hyun-Suk; Schubert, Siegfried D.; Arribas, Alberto; MacLachlan, Craig

    2014-01-01

    This study assesses the skill of boreal winter Arctic Oscillation (AO) predictions with state-of-the-art dynamical ensemble prediction systems (EPSs): GloSea4, CFSv2, GEOS-5, CanCM3, CanCM4, and CM2.1. Long-term reforecasts with the EPSs are used to evaluate how well they represent the AO and to assess the skill of both deterministic and probabilistic forecasts of the AO. The reforecasts reproduce the observed changes in the large-scale patterns of the Northern Hemispheric surface temperature, upper level wind, and precipitation associated with the different phases of the AO. The results demonstrate that most EPSs improve upon persistence skill scores for lead times up to 2 months in boreal winter, suggesting some potential for skillful prediction of the AO and its associated climate anomalies at seasonal time scales. It is also found that the skill of AO forecasts during the recent period (1997-2010) is higher than that of the earlier period (1983-1996).

  4. The online performance estimation framework: heterogeneous ensemble learning for data streams

    NARCIS (Netherlands)

    van Rijn, J.N.; Holmes, G.; Pfahringer, B.; Vanschoren, J.

    2018-01-01

    Ensembles of classifiers are among the best performing classifiers available in many data mining applications, including the mining of data streams. Rather than training one classifier, multiple classifiers are trained, and their predictions are combined according to a given voting schedule. An

  5. An iterative stochastic ensemble method for parameter estimation of subsurface flow models

    International Nuclear Information System (INIS)

    Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim

    2013-01-01

    Parameter estimation for subsurface flow models is an essential step for maximizing the value of numerical simulations for future prediction and the development of effective control strategies. We propose the iterative stochastic ensemble method (ISEM) as a general method for parameter estimation based on stochastic estimation of gradients using an ensemble of directional derivatives. ISEM eliminates the need for adjoint coding and deals with the numerical simulator as a blackbox. The proposed method employs directional derivatives within a Gauss–Newton iteration. The update equation in ISEM resembles the update step in the ensemble Kalman filter; however, the inverse of the output covariance matrix in ISEM is regularized using standard truncated singular value decomposition or Tikhonov regularization. We also investigate the performance of a set of shrinkage based covariance estimators within ISEM. The proposed method is successfully applied to several nonlinear parameter estimation problems for subsurface flow models. The efficiency of the proposed algorithm is demonstrated by the small size of utilized ensembles and in terms of error convergence rates.

  6. An iterative stochastic ensemble method for parameter estimation of subsurface flow models

    KAUST Repository

    Elsheikh, Ahmed H.

    2013-06-01

    Parameter estimation for subsurface flow models is an essential step for maximizing the value of numerical simulations for future prediction and the development of effective control strategies. We propose the iterative stochastic ensemble method (ISEM) as a general method for parameter estimation based on stochastic estimation of gradients using an ensemble of directional derivatives. ISEM eliminates the need for adjoint coding and deals with the numerical simulator as a blackbox. The proposed method employs directional derivatives within a Gauss-Newton iteration. The update equation in ISEM resembles the update step in ensemble Kalman filter, however the inverse of the output covariance matrix in ISEM is regularized using standard truncated singular value decomposition or Tikhonov regularization. We also investigate the performance of a set of shrinkage based covariance estimators within ISEM. The proposed method is successfully applied on several nonlinear parameter estimation problems for subsurface flow models. The efficiency of the proposed algorithm is demonstrated by the small size of utilized ensembles and in terms of error convergence rates. © 2013 Elsevier Inc.

  7. Inhomogeneous ensembles of radical pairs in chemical compasses

    Science.gov (United States)

    Procopio, Maria; Ritz, Thorsten

    2016-11-01

    The biophysical basis for the ability of animals to detect the geomagnetic field and to use it for finding directions remains a mystery of sensory biology. One much debated hypothesis suggests that an ensemble of specialized light-induced radical pair reactions can provide the primary signal for a magnetic compass sensor. The question arises what features of such a radical pair ensemble could be optimized by evolution so as to improve the detection of the direction of weak magnetic fields. Here, we focus on the overlooked aspect of the noise arising from inhomogeneity of copies of biomolecules in a realistic biological environment. Such inhomogeneity leads to variations of the radical pair parameters, thereby deteriorating the signal arising from an ensemble and providing a source of noise. We investigate the effect of variations in hyperfine interactions between different copies of simple radical pairs on the directional response of a compass system. We find that the choice of radical pair parameters greatly influences how strongly the directional response of an ensemble is affected by inhomogeneity.

  8. Benefits of an ultra large and multiresolution ensemble for estimating available wind power

    Science.gov (United States)

    Berndt, Jonas; Hoppe, Charlotte; Elbern, Hendrik

    2016-04-01

    In this study we investigate the benefits of an ultra large ensemble with up to 1000 members including multiple nesting with a target horizontal resolution of 1 km. The ensemble shall be used as a basis to detect events of extreme errors in wind power forecasting. The forecast variable is the wind vector at wind turbine hub height (~100 m) in the short range (1 to 24 hours). Current wind power forecast systems already rest on NWP ensemble models. However, only calibrated ensembles from meteorological institutions serve as input so far, with limited spatial resolution (~10-80 km) and member number (~50). Perturbations related to the specific merits of wind power production are still missing. Thus, single extreme error events which are not detected by such ensemble power forecasts occur infrequently. The numerical forecast model used in this study is the Weather Research and Forecasting Model (WRF). Model uncertainties are represented by stochastic parametrization of sub-grid processes via stochastically perturbed parametrization tendencies and in conjunction via the complementary stochastic kinetic-energy backscatter scheme already provided by WRF. We perform continuous ensemble updates by comparing each ensemble member with available observations using a sequential importance resampling filter to improve the model accuracy while maintaining ensemble spread. Additionally, we use different ensemble systems from global models (ECMWF and GFS) as input and boundary conditions to capture different synoptic conditions. Critical weather situations which are connected to extreme error events are located and corresponding perturbation techniques are applied. The demanding computational effort is overcome by utilising the supercomputer JUQUEEN at the Forschungszentrum Juelich.

  9. A Ruby API to query the Ensembl database for genomic features.

    Science.gov (United States)

    Strozzi, Francesco; Aerts, Jan

    2011-04-01

    The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.

  10. A variational ensemble scheme for noisy image data assimilation

    Science.gov (United States)

    Yang, Yin; Robinson, Cordelia; Heitz, Dominique; Mémin, Etienne

    2014-05-01

    Data assimilation techniques aim at recovering a system state variables trajectory denoted as X, along time from partially observed noisy measurements of the system denoted as Y. These procedures, which couple dynamics and noisy measurements of the system, fulfill indeed a twofold objective. On one hand, they provide a denoising - or reconstruction - procedure of the data through a given model framework and on the other hand, they provide estimation procedures for unknown parameters of the dynamics. A standard variational data assimilation problem can be formulated as the minimization of the following objective function with respect to the initial discrepancy, η, from the background initial guess: $J(\eta(x)) = \frac{1}{2}\|X_b(x) - X(t_0,x)\|_B^2 + \frac{1}{2}\int_{t_0}^{t_f}\|H(X(t,x)) - Y(t,x)\|_R^2\,dt$ (1), where the observation operator H links the state variable and the measurements. The cost function can be interpreted as the log likelihood function associated to the a posteriori distribution of the state given the past history of measurements and the background. In this work, we aim at studying ensemble based optimal control strategies for data assimilation. Such formulation nicely combines the ingredients of ensemble Kalman filters and variational data assimilation (4DVar). It is also formulated as the minimization of the objective function (1), but similarly to ensemble filter, it introduces in its objective function an empirical ensemble-based background-error covariance defined as: $B \equiv \langle (X_b - \langle X_b \rangle)(X_b - \langle X_b \rangle)^T \rangle$ (2). Thus, it works in an off-line smoothing mode rather than on the fly like sequential filters. Such resulting ensemble variational data assimilation technique corresponds to a relatively new family of methods [1,2,3]. It presents two main advantages: first, it does not require anymore to construct the adjoint of the dynamics tangent linear operator, which is a considerable advantage with respect to the method's implementation, and second, it enables the handling of a flow

  11. A model ensemble for projecting multi‐decadal coastal cliff retreat during the 21st century

    Science.gov (United States)

    Limber, Patrick; Barnard, Patrick; Vitousek, Sean; Erikson, Li

    2018-01-01

    Sea cliff retreat rates are expected to accelerate with rising sea levels during the 21st century. Here we develop an approach for a multi‐model ensemble that efficiently projects time‐averaged sea cliff retreat over multi‐decadal time scales and large (>50 km) spatial scales. The ensemble consists of five simple 1‐D models adapted from the literature that relate sea cliff retreat to wave impacts, sea level rise (SLR), historical cliff behavior, and cross‐shore profile geometry. Ensemble predictions are based on Monte Carlo simulations of each individual model, which account for the uncertainty of model parameters. The consensus of the individual models also weights uncertainty, such that uncertainty is greater when predictions from different models do not agree. A calibrated, but unvalidated, ensemble was applied to the 475 km‐long coastline of Southern California (USA), with 4 SLR scenarios of 0.5, 0.93, 1.5, and 2 m by 2100. Results suggest that future retreat rates could increase relative to mean historical rates by more than two‐fold for the higher SLR scenarios, causing an average total land loss of 19 – 41 m by 2100. However, model uncertainty ranges from +/‐ 5 – 15 m, reflecting the inherent difficulties of projecting cliff retreat over multiple decades. To enhance ensemble performance, future work could include weighting each model by its skill in matching observations in different morphological settings

  12. Ensemble modeling of E. coli in the Charles River, Boston, Massachusetts, USA.

    Science.gov (United States)

    Hellweger, F L

    2007-01-01

    A case study of ensemble modeling of Escherichia coli (E. coli) densities in surface waters in the context of public health risk prediction is presented. The outputs of two different models, mechanistic and empirical, are combined and compared to data. The mechanistic model is a high-resolution, time-variable, three-dimensional coupled hydrodynamic and water quality model. It generally reproduces the mechanisms of E. coli fate and transport in the river, including the presence and absence of a plume in the study area under similar input, but different hydrodynamic conditions caused by the operation of a downstream dam and wind. At the time series station, the model has a root mean square error (RMSE) of 370 CFU/100mL, a total error rate (TER, with respect to the EPA-recommended single sample criteria value of 235 CFU/100mL) of 15% and a negative error rate (NER) of 30%. The empirical model is based on multiple linear regression using the forcing functions of the mechanistic model as independent variables. It has better overall performance (at the time series station), due to a strong correlation of E. coli density with upstream inflow for this time period (RMSE = 200 CFU/100mL, TER = 13%, NER = 1.6%). However, the model is mechanistically incorrect in that it predicts decreasing densities with increasing Combined Sewer Overflow (CSO) input. The two models are fundamentally different and their errors are uncorrelated (R² = 0.02), which motivates their combination in an ensemble. Two combination approaches, a geometric mean ensemble (GME) and an "either exceeds" ensemble (EEE), are explored. The GME model outperforms the mechanistic and empirical models in terms of RMSE (190 CFU/100mL) and TER (11%), but has a higher NER (23%). The EEE has relatively high TER (16%), but low NER (0.8%) and may be the best method for a conservative prediction. The study demonstrates the potential utility of ensemble modeling for pathogen indicators, but significant further research is
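
    The two combination approaches named in this abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the definitions, so everything here is an assumption: the GME is taken as the geometric mean of the two predicted densities, the EEE as flagging an exceedance whenever either model exceeds the 235 CFU/100mL criterion, TER as the fraction of samples classified on the wrong side of the criterion, and NER as the fraction of observed exceedances that were missed. All function names and sample values are hypothetical.

```python
import math

# EPA single-sample criterion quoted in the abstract (CFU/100mL).
CRITERION = 235.0


def geometric_mean_ensemble(mechanistic, empirical):
    """Assumed GME: geometric mean of the two predicted densities."""
    return [math.sqrt(m * e) for m, e in zip(mechanistic, empirical)]


def either_exceeds_ensemble(mechanistic, empirical, criterion=CRITERION):
    """Assumed EEE: flag an exceedance if either model exceeds the criterion."""
    return [m > criterion or e > criterion
            for m, e in zip(mechanistic, empirical)]


def error_rates(predicted_exceeds, observed, criterion=CRITERION):
    """Return (TER, NER) under the assumed definitions.

    TER: misclassified samples divided by all samples.
    NER: missed exceedances divided by observed exceedances.
    """
    observed_exceeds = [o > criterion for o in observed]
    pairs = list(zip(predicted_exceeds, observed_exceeds))
    wrong = sum(p != o for p, o in pairs)
    missed = sum(o and not p for p, o in pairs)
    n_exceed = sum(observed_exceeds)
    ter = wrong / len(observed)
    ner = missed / n_exceed if n_exceed else 0.0
    return ter, ner


# Hypothetical densities in CFU/100mL, for illustration only.
mechanistic = [100.0, 400.0, 300.0, 50.0]
empirical = [150.0, 100.0, 500.0, 60.0]
observed = [120.0, 260.0, 450.0, 40.0]

gme = geometric_mean_ensemble(mechanistic, empirical)
gme_flags = [g > CRITERION for g in gme]
eee_flags = either_exceeds_ensemble(mechanistic, empirical)

print("GME (TER, NER):", error_rates(gme_flags, observed))
print("EEE (TER, NER):", error_rates(eee_flags, observed))
```

    Under these assumptions the EEE can only add exceedance flags relative to either single model, which is consistent with the abstract describing it as the more conservative choice with a low NER.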

  13. An OSSE Study for Deep Argo Array using the GFDL Ensemble Coupled Data Assimilation System

    Science.gov (United States)

    Chang, You-Soon; Zhang, Shaoqing; Rosati, Anthony; Vecchi, Gabriel A.; Yang, Xiaosong

    2018-03-01

    An observing system simulation experiment (OSSE) using an ensemble coupled data assimilation system was designed to investigate the impact of deep ocean Argo profile assimilation in a biased numerical climate system. Based on the modern Argo observational array and an artificial extension to full depth, "observations" drawn from one coupled general circulation model (CM2.0) were assimilated into another model (CM2.1). Our results showed that coupled data assimilation with simultaneous atmospheric and oceanic constraints plays a significant role in preventing deep ocean drift. However, the extension of the Argo array to full depth did not significantly improve the quality of the oceanic climate estimation within the bias magnitude in the twin experiment. Even in the "identical" twin experiment for the deep Argo array from the same model (CM2.1) with the assimilation model, no significant changes were shown in the deep ocean, such as in the Atlantic meridional overturning circulation and the Antarctic bottom water cell. The small ensemble spread and corresponding weak constraints by the deep Argo profiles with medium spatial and temporal resolution may explain why the deep Argo profiles did not improve the deep ocean features in the assimilation system. Additional studies using different assimilation methods with improved spatial and temporal resolution of the deep Argo array are necessary in order to more thoroughly understand the impact of the deep Argo array on the assimilation system.

  14. A multidimensional pseudospectral method for optimal control of quantum ensembles

    International Nuclear Information System (INIS)

    Ruths, Justin; Li, Jr-Shin

    2011-01-01

    In our previous work, we have shown that the pseudospectral method is an effective and flexible computation scheme for deriving pulses for optimal control of quantum systems. In practice, however, quantum systems often exhibit variation in the parameters that characterize the system dynamics. This leads us to consider the control of an ensemble (or continuum) of quantum systems indexed by the system parameters that show variation. We cast the design of pulses as an optimal ensemble control problem and demonstrate a multidimensional pseudospectral method with several challenging examples of both closed and open quantum systems from nuclear magnetic resonance spectroscopy in liquid. We give particular attention to the ability to derive experimentally viable pulses of minimum energy or duration.

  15. The Predictability of Dry-Season Precipitation in Tropical West Africa

    Science.gov (United States)

    Knippertz, P.; Davis, J.; Fink, A. H.

    2012-04-01

    Precipitation during the boreal winter dry season in tropical West Africa is rare but occasionally connected to high impacts for the local population. Previous work has shown that these events are usually connected to a trough over northwestern Africa, an extensive cloud plume on its eastern side, unusual precipitation at the northern and western fringes of the Sahara, and reduced surface pressure over the southern Sahara and Sahel, which allows an inflow of moist southerlies from the Gulf of Guinea to feed the unusual dry-season rainfalls. These results also suggest that the extratropical influence enhances the predictability of these events on the synoptic timescale. Here we further investigate this question for the 11 dry seasons (November-March) 1998/99-2008/09 using rainfall estimates from TRMM (Tropical Rainfall Measuring Mission) and GPCP (Global Precipitation Climatology Project), and operational ensemble predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF). All fields are averaged over the study area 7.5-15°N, 10°W-10°E that spans most of southern West Africa. For each 0000 UTC analysis time, the daily precipitation estimates are accumulated to pentads and compared with 120-hour predictions starting at the same time. Compared to TRMM, the ensemble mean shows a weak positive bias, whereas there is a substantial negative bias with regard to GPCP. Temporal correlations reach a high value of 0.8 for both datasets, showing similar synoptic variability despite the differences in total amount. Standard probabilistic evaluation methods such as relative operating characteristic (ROC) diagrams indicate remarkably good reliability, resolution and skill, particularly for lower precipitation thresholds. Not surprisingly, forecasts cluster at low probabilities for higher thresholds, but the reliability and ROC score are still reasonably high. The results show that global ensemble prediction systems are capable of predicting dry-season rainfall events
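
    The pentad comparison described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' processing chain: pentads are taken as running 5-day sums starting at each day, the bias as the mean forecast-minus-observation difference, and the temporal correlation as the Pearson coefficient. The function names and the sample numbers are hypothetical.

```python
import math


def pentad_sums(daily):
    """Running 5-day accumulations, one per possible start day."""
    return [sum(daily[i:i + 5]) for i in range(len(daily) - 4)]


def mean_bias(forecast, observed):
    """Mean of forecast minus observation; positive means over-forecasting."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical area-averaged daily rainfall (mm) and matching
# 120-hour ensemble-mean accumulations (mm), for illustration only.
daily_obs = [0.0, 0.0, 2.0, 0.0, 1.0, 0.0, 3.0, 0.0]
obs_pentads = pentad_sums(daily_obs)
forecast_pentads = [4.0, 4.0, 7.0, 5.0]

print("bias:", mean_bias(forecast_pentads, obs_pentads))
print("correlation:", pearson(forecast_pentads, obs_pentads))
```

    A constant offset between forecast and observation changes the bias but not the correlation, which is how the abstract can report similar correlations against TRMM and GPCP despite biases of opposite sign.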

  16. The classicality and quantumness of a quantum ensemble

    International Nuclear Information System (INIS)

    Zhu Xuanmin; Pang Shengshi; Wu Shengjun; Liu Quanhui

    2011-01-01

    In this Letter, we investigate the classicality and quantumness of a quantum ensemble. We define a quantity called ensemble classicality based on classical cloning strategy (ECCC) to characterize how classical a quantum ensemble is. An ensemble of commuting states has a unit ECCC, while a general ensemble can have an ECCC less than 1. We also study how quantum an ensemble is by defining a related quantity called quantumness. We find that the classicality of an ensemble is closely related to how perfectly the ensemble can be cloned, and that the quantumness of the ensemble used in a quantum key distribution (QKD) protocol is exactly the attainable lower bound of the error rate in the sifted key. - Highlights: → A quantity is defined to characterize how classical a quantum ensemble is. → The classicality of an ensemble is closely related to the cloning performance. → Another quantity is also defined to investigate how quantum an ensemble is. → This quantity gives the lower bound of the error rate in a QKD protocol.

  17. Ensemble Streamflow Forecast Improvements in NYC's Operations Support Tool

    Science.gov (United States)

    Wang, L.; Weiss, W. J.; Porter, J.; Schaake, J. C.; Day, G. N.; Sheer, D. P.

    2013-12-01

    Like most other water supply utilities, New York City's Department of Environmental Protection (DEP) has operational challenges associated with drought and wet weather events. During drought conditions, DEP must maintain water supply reliability to 9 million customers as well as meet environmental release requirements downstream of its reservoirs. During and after wet weather events, DEP must maintain turbidity compliance in its unfiltered Catskill and Delaware reservoir systems and minimize spills to mitigate downstream flooding. Proactive reservoir management - such as release restrictions to prepare for a drought or preventative drawdown in advance of a large storm - can alleviate negative impacts associated with extreme events. It is important for water managers to understand the risks associated with proactive operations so unintended consequences such as endangering water supply reliability with excessive drawdown prior to a storm event are minimized. Probabilistic hydrologic forecasts are a critical tool in quantifying these risks and allow water managers to make more informed operational decisions. DEP has recently completed development of an Operations Support Tool (OST) that integrates ensemble streamflow forecasts, real-time observations, and a reservoir system operations model into a user-friendly graphical interface that allows its water managers to take robust and defensible proactive measures in the face of challenging system conditions. Since initial development of OST was first presented at the 2011 AGU Fall Meeting, significant improvements have been made to the forecast system. First, the monthly AR1 forecasts ('Hirsch method') were upgraded with a generalized linear model (GLM) utilizing historical daily correlations ('Extended Hirsch method' or 'eHirsch'). 
    The development of eHirsch forecasts improved predictive skill over the Hirsch method in the first week to a month from the forecast date and produced more realistic hydrographs on the tail

  18. Predicting Heat Stress to Inform Reef Management: NOAA Coral Reef Watch's 4-Month Coral Bleaching Outlook

    Directory of Open Access Journals (Sweden)

    Gang Liu

    2018-03-01

    The U.S. National Oceanic and Atmospheric Administration's (NOAA) Coral Reef Watch (CRW) operates a global 4-Month Coral Bleaching Outlook system for shallow-water coral reefs in collaboration with NOAA's National Centers for Environmental Prediction (NCEP). The Outlooks are generated by applying the algorithm used in CRW's operational satellite coral bleaching heat stress monitoring, with slight modifications, to the sea surface temperature (SST) predictions from NCEP's operational Climate Forecast System Version 2 (CFSv2). Once a week, the probability of heat stress capable of causing mass coral bleaching is predicted for 4 months in advance. Each day, CFSv2 generates an ensemble of 16 forecasts, with nine runs out to 45 days, three runs out to 3 months, and four runs out to 9 months. This results in 28–112 ensemble members produced each week. A composite for each predicted week is derived from daily predictions within each ensemble member. The probability of each of four heat stress ranges (Watch and higher, Warning and higher, Alert Level 1 and higher, and Alert Level 2) is determined from all the available ensemble members for the week to form the weekly probabilistic Outlook. The probabilistic 4-Month Outlook is the highest weekly probability predicted among all the weekly Outlooks during a 4-month period for each of the stress ranges. An initial qualitative skill analysis of the Outlooks for 2011–2015, compared with CRW's satellite-based coral bleaching heat stress products, indicated the Outlook has performed well with high hit rates and low miss rates for most coral reef areas. Regions identified with high false alarm rates will guide future improvements. This Outlook system, as the first and only freely available global coral bleaching prediction system, has been providing critical early warning to marine resource managers, scientists, and decision makers around the world to guide management, protection, and monitoring of coral reefs
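
    The member counts quoted in this abstract follow from simple arithmetic, which the Python sketch below works through. It is an illustration under assumptions, not CRW's operational code: 3 and 9 months are approximated as 90 and 270 days, the weekly probability is taken as the fraction of available members reaching a stress level, and all function names and sample probabilities are hypothetical.

```python
# CFSv2 runs per day by forecast length, as stated in the abstract.
# Assumption: 3 months ~ 90 days and 9 months ~ 270 days.
RUNS_PER_DAY = {45: 9, 90: 3, 270: 4}
DAYS_PER_WEEK = 7


def weekly_members(lead_days):
    """Runs from one week of daily ensembles that reach the given lead time."""
    per_day = sum(runs for length, runs in RUNS_PER_DAY.items()
                  if length >= lead_days)
    return DAYS_PER_WEEK * per_day


def stress_probability(member_flags):
    """Assumed weekly probability: share of members reaching the stress level."""
    return sum(member_flags) / len(member_flags)


def four_month_outlook(weekly_probabilities):
    """Highest weekly probability within the 4-month period, per the abstract."""
    return max(weekly_probabilities)


# All 16 daily runs cover short leads: 7 * 16 members.
print("members at 30 days:", weekly_members(30))
# Only the four 9-month runs cover leads beyond 3 months: 7 * 4 members.
print("members at 120 days:", weekly_members(120))
# Hypothetical weekly probabilities for one stress range.
print("4-month outlook:", four_month_outlook([0.10, 0.60, 0.30]))
```

    Under these assumptions the two extremes reproduce the 28–112 range stated in the abstract, with the number of usable members shrinking as the lead time grows.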

  19. Relaxation in a two-body Fermi-Pasta-Ulam system in the canonical ensemble

    Science.gov (United States)

    Sen, Surajit; Barrett, Tyler

    The study of the dynamics of the Fermi-Pasta-Ulam (FPU) chain remains a challenging problem. Inspired by the recent work of Onorato et al. on thermalization in the FPU system, we report a study of relaxation processes in a two-body FPU system in the canonical ensemble. The studies have been carried out using the Recurrence Relations Method introduced by Zwanzig, Mori, Lee and others. We have obtained exact analytical expressions for the first thirteen levels of the continued fraction representation of the Laplace transformed velocity autocorrelation function of the system. Using simple and reasonable extrapolation schemes and known limits we are able to estimate the relaxation behavior of the oscillators in the two-body FPU system and recover the expected behavior in the harmonic limit. Generalizations of the calculations to larger systems will be discussed.

  20. Bio-Optical Data Assimilation With Observational Error Covariance Derived From an Ensemble of Satellite Images

    Science.gov (United States)

    Shulman, Igor; Gould, Richard W.; Frolov, Sergey; McCarthy, Sean; Penta, Brad; Anderson, Stephanie; Sakalaukus, Peter

    2018-03-01

    An ensemble-based approach to specify observational error covariance in the data assimilation of satellite bio-optical properties is proposed. The observational error covariance is derived from statistical properties of the generated ensemble of satellite MODIS-Aqua chlorophyll (Chl) images. The proposed observational error covariance is used in the Optimal Interpolation scheme for the assimilation of MODIS-Aqua Chl observations. The forecast error covariance is specified in the subspace of the multivariate (bio-optical, physical) empirical orthogonal functions (EOFs) estimated from a month-long model run. The assimilation of surface MODIS-Aqua Chl improved surface and subsurface model Chl predictions. Comparisons with surface and subsurface water samples demonstrate that the data assimilation run with the proposed observational error covariance has a higher RMSE than the data assimilation run with an "optimistic" assumption about observational errors (10% of the ensemble mean), but has a smaller or comparable RMSE relative to the data assimilation run with an assumption that observational errors are equal to 35% of the ensemble mean (the target error for the satellite data product for chlorophyll). Also, with the assimilation of the MODIS-Aqua Chl data, the RMSE between observed and model-predicted fractions of diatoms to the total phytoplankton is reduced by a factor of two in comparison to the nonassimilative run.