Combining 2-m temperature nowcasting and short range ensemble forecasting
A. Kann
2011-12-01
During recent years, numerical ensemble prediction systems have become an important tool for estimating the uncertainties of dynamical and physical processes as represented in numerical weather models. The latest generation of limited area ensemble prediction systems (LAM-EPSs) allows for probabilistic forecasts at high resolution in both space and time. However, these systems still suffer from systematic deficiencies. Especially for nowcasting (0–6 h) applications the ensemble spread is smaller than the actual forecast error. This paper tries to generate probabilistic short range 2-m temperature forecasts by combining a state-of-the-art nowcasting method and a limited area ensemble system, and compares the results with statistical methods. The Integrated Nowcasting Through Comprehensive Analysis (INCA) system, which has been in operation at the Central Institute for Meteorology and Geodynamics (ZAMG) since 2006 (Haiden et al., 2011), provides short range deterministic forecasts at high temporal (15 min–60 min) and spatial (1 km) resolution. An INCA Ensemble (INCA-EPS) of 2-m temperature forecasts is constructed by applying a dynamical approach, a statistical approach, and a combined dynamic-statistical method. The dynamical method takes uncertainty information (i.e. ensemble variance) from the operational limited area ensemble system ALADIN-LAEF (Aire Limitée Adaptation Dynamique Développement InterNational Limited Area Ensemble Forecasting), which is running operationally at ZAMG (Wang et al., 2011). The purely statistical method assumes a well-calibrated spread-skill relation and applies ensemble spread according to the skill of the INCA forecast of the most recent past. The combined dynamic-statistical approach adapts the ensemble variance gained from ALADIN-LAEF with non-homogeneous Gaussian regression (NGR), which yields a statistical correction of the first and second moment (mean bias and dispersion) for Gaussian distributed continuous variables.
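As a rough illustration of the NGR step described in this abstract, the sketch below maps an ensemble mean and variance to a corrected Gaussian and scores it with the closed-form CRPS of a normal distribution. The coefficients a, b, c, d and the function names are hypothetical. In practice the coefficients would be fitted on training data (commonly by minimizing the CRPS), and nothing here is taken from the INCA-EPS implementation.

```python
import math


def ngr_predictive(ens_mean, ens_var, a, b, c, d):
    """Return (mu, sigma) of the NGR Gaussian.

    mu = a + b * ens_mean corrects the first moment (bias);
    sigma^2 = c + d * ens_var corrects the second moment (dispersion).
    """
    mu = a + b * ens_mean
    var = c + d * ens_var
    if var <= 0.0:
        raise ValueError("c + d * ens_var must be positive")
    return mu, math.sqrt(var)


def gaussian_crps(mu, sigma, obs):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for one observation."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical coefficients and values, for illustration only.
mu, sigma = ngr_predictive(ens_mean=2.0, ens_var=0.25, a=0.3, b=1.0, c=0.5, d=2.0)
print(mu, sigma, gaussian_crps(mu, sigma, obs=3.1))
```

A value of d above 1 inflates an underdispersive ensemble variance, which is the situation the abstract describes for nowcasting lead times.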
Viney, N.R.; Bormann, H.; Breuer, L.; Bronstert, A.; Croke, B.F.W.; Frede, H.; Graff, T.; Hubrechts, L.; Huisman, J.A.; Jakeman, A.J.; Kite, G.W.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Willems, P.
2009-01-01
This paper reports on a project to compare predictions from a range of catchment models applied to a mesoscale river basin in central Germany and to assess various ensemble predictions of catchment streamflow. The models encompass a large range in inherent complexity and input requirements. In approximate order of decreasing complexity, they are DHSVM, MIKE-SHE, TOPLATS, WASIM-ETH, SWAT, PRMS, SLURP, HBV, LASCAM and IHACRES. The models are calibrated twice using different sets of input data. The two predictions from each model are then combined by simple averaging to produce a single-model ensemble. The 10 resulting single-model ensembles are combined in various ways to produce multi-model ensemble predictions. Both the single-model ensembles and the multi-model ensembles are shown to give predictions that are generally superior to those of their respective constituent models, both during a 7-year calibration period and a 9-year validation period. This occurs despite a considerable disparity in performance of the individual models. Even the weakest of models is shown to contribute useful information to the ensembles they are part of. The best model combination methods are a trimmed mean (constructed using the central four or six predictions each day) and a weighted mean ensemble (with weights calculated from calibration performance) that places relatively large weights on the better performing models. Conditional ensembles, in which separate model weights are used in different system states (e.g. summer and winter, high and low flows) generally yield little improvement over the weighted mean ensemble. However a conditional ensemble that discriminates between rising and receding flows shows moderate improvement. An analysis of ensemble predictions shows that the best ensembles are not necessarily those containing the best individual models. Conversely, it appears that some models that predict well individually do not necessarily combine well with other models in
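The two best-performing combination methods named in this abstract, the trimmed mean and the weighted mean, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the weights are passed in directly rather than derived from calibration performance, and the trimming removes an equal number of predictions from each end.

```python
def trimmed_mean(predictions, keep):
    """Average the central `keep` values of one day's model predictions."""
    n = len(predictions)
    if keep <= 0 or keep > n or (n - keep) % 2 != 0:
        raise ValueError("keep must be positive, at most n, and leave an even remainder")
    drop = (n - keep) // 2
    central = sorted(predictions)[drop:n - drop]
    return sum(central) / keep


def weighted_mean(predictions, weights):
    """Weighted average; weights are normalized so they need not sum to 1."""
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Ten hypothetical daily streamflow predictions, one of them an outlier.
day = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
print(trimmed_mean(day, keep=4), trimmed_mean(day, keep=6))
```

The trimmed mean ignores the outlying prediction entirely, which is one plausible reason such a combination can outperform a plain average when individual models differ widely in skill.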
Neuronal and muscular electrical signals contain useful information about the neuromuscular system, which researchers have used to investigate the relationship between various neurological disorders and the neuromuscular system. However, neuromuscular signals can be critically contaminated by cardiac electrical activity (CEA) such as the electrocardiogram (ECG), which confounds data analysis. The purpose of our study is to provide a method for removing cardiac electrical artifacts from recorded neuromuscular signals. We propose a new method for cardiac artifact removal which modifies the algorithm combining ensemble empirical mode decomposition (EEMD) and independent component analysis (ICA). We compare our approach with a cubic smoothing spline method and the previous combined EEMD and ICA for various signal-to-noise ratio measures in simulated noisy physiological signals using a surface electromyogram (sEMG). Finally, we apply the proposed method to two real-life sets of data: sEMG with ECG artifacts, and ambulatory dog cardiac autonomic nervous signals measured from the ganglia near the heart, which are also contaminated with CEA. Our method can not only extract and remove artifacts, but can also preserve the spectral content of the neuromuscular signals.
On Ensemble Nonlinear Kalman Filtering with Symmetric Analysis Ensembles
Luo, Xiaodong
2010-09-19
The ensemble square root filter (EnSRF) [1, 2, 3, 4] is a popular method for data assimilation in high dimensional systems (e.g., geophysics models). Essentially the EnSRF is a Monte Carlo implementation of the conventional Kalman filter (KF) [5, 6]. It differs from the KF mainly at the prediction steps, where ensembles of the system state, rather than its means and covariance matrices, are propagated forward. In doing this, the EnSRF is computationally more efficient than the KF, since propagating a covariance matrix forward in high dimensional systems is prohibitively expensive. In addition, the EnSRF is also very convenient to implement. By propagating the ensembles of the system state, the EnSRF can be directly applied to nonlinear systems without any change in comparison to the assimilation procedures in linear systems. However, by adopting the Monte Carlo method, the EnSRF also incurs certain sampling errors. One way to alleviate this problem is to introduce certain symmetry to the ensembles, which can reduce the sampling errors and spurious modes in evaluation of the means and covariances of the ensembles [7]. In this contribution, we present two methods to produce symmetric ensembles. One is based on the unscented transform [8, 9], which leads to the unscented Kalman filter (UKF) [8, 9] and its variant, the ensemble unscented Kalman filter (EnUKF) [7]. The other is based on Stirling’s interpolation formula (SIF), which results in the divided difference filter (DDF) [10]. Here we propose a simplified divided difference filter (sDDF) in the context of ensemble filtering. The similarity and difference between the sDDF and the EnUKF will be discussed. Numerical experiments will also be conducted to investigate the performance of the sDDF and the EnUKF, and compare them to a well‐established EnSRF, the ensemble transform Kalman filter (ETKF) [2].
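To make the symmetry idea in this abstract concrete, the sketch below builds a scalar ensemble from mirrored perturbation pairs. It is neither the unscented transform nor the sDDF. It is only a one-dimensional illustration, with hypothetical function names, of why a symmetric ensemble reproduces the prescribed mean exactly and has no spurious skewness.

```python
import random


def symmetric_ensemble(mean, std, n_pairs, rng):
    """Draw n_pairs perturbations and add each one with both signs."""
    members = []
    for _ in range(n_pairs):
        d = rng.gauss(0.0, std)
        members.append(mean + d)
        members.append(mean - d)
    return members


def sample_mean(xs):
    return sum(xs) / len(xs)


def third_central_moment(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 3 for x in xs) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
ens = symmetric_ensemble(mean=1.5, std=2.0, n_pairs=10, rng=rng)
# Both values are zero up to floating-point rounding.
print(sample_mean(ens) - 1.5, third_central_moment(ens))
```

An ensemble of 20 independent draws would generally show a nonzero sampling error in both quantities. The mirrored construction removes that error by design, at the cost of halving the number of independent perturbations.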
Space Applications for Ensemble Detection and Analysis Project
National Aeronautics and Space Administration — Ensemble Detection is both a measurement technique and analysis tool. Like a prism that separates light into spectral bands, an ensemble detector mixes a signal...
Gradient Flow Analysis on MILC HISQ Ensembles
Bazavov, A; Brown, N; DeTar, C; Foley, J; Gottlieb, Steven; Heller, U M; Hetrick, J E; Komijani, J; Laiho, J; Levkova, L; Oktay, M; Sugar, R L; Toussaint, D; Van de Water, R S; Zhou, R
2014-01-01
We report on a preliminary scale determination with gradient-flow techniques on the $N_f = 2 + 1 + 1$ HISQ ensembles generated by the MILC collaboration. The ensembles include four lattice spacings, ranging from 0.15 to 0.06 fm, and both physical and unphysical values of the quark masses. The scales $\sqrt{t_0}/a$ and $w_0/a$ are computed using Symanzik flow and the cloverleaf definition of $\langle E \rangle$ on each ensemble. Then both scales and the meson masses $aM_\pi$ and $aM_K$ are adjusted for mistunings in the charm mass. Using a combination of continuum chiral perturbation theory and a Taylor series ansatz in the lattice spacing, the results are simultaneously extrapolated to the continuum and interpolated to physical quark masses. Our preliminary results are $\sqrt{t_0} = 0.1422(7)$ fm and $w_0 = 0.1732(10)$ fm. We also find the continuum mass-dependence of $w_0$.
Impact of hybrid GSI analysis using ETR ensembles
V S Prasad; C J Johny
2016-04-01
Performance of a hybrid assimilation system combining 3D Var based NGFS (NCMRWF Global Forecast System) with ETR (Ensemble Transform with Rescaling) based Global Ensemble Forecast (GEFS) of resolution T-190L28 is investigated. The experiment is conducted for a period of one week in June 2013 and forecast skills over different spatial domains are compared with respect to mean analysis state. Rainfall forecast is verified over the Indian region against combined observations of IMD and NCMRWF. Hybrid assimilation produced marginal improvements in overall forecast skill in comparison with 3D Var. The hybrid experiment made significant improvement in wind forecasts in all the regions on verification against mean analysis. The verification of forecasts with radiosonde observations also shows improvement in wind forecasts with the hybrid assimilation. On verification against observations, the hybrid experiment shows more improvement in temperature and wind forecasts at upper levels. Both hybrid and operational 3D Var failed in prediction of the extreme rainfall event over Uttarakhand on 17 June, 2013.
Re, Matteo; Valentini, Giorgio
2012-03-01
proposed to explain the characteristics and the successful application of ensembles to different application domains. For instance, Allwein, Schapire, and Singer interpreted the improved generalization capabilities of ensembles of learning machines in the framework of large margin classifiers [4,177], Kleinberg in the context of stochastic discrimination theory [112], and Breiman and Friedman in the light of the bias-variance analysis borrowed from classical statistics [21,70]. Empirical studies showed that both in classification and regression problems, ensembles improve on single learning machines, and moreover large experimental studies compared the effectiveness of different ensemble methods on benchmark data sets [10,11,49,188]. The interest in this research area is motivated also by the availability of very fast computers and networks of workstations at a relatively low cost that allow the implementation and the experimentation of complex ensemble methods using off-the-shelf computer platforms. However, as explained in Section 26.2 there are deeper reasons to use ensembles of learning machines, motivated by the intrinsic characteristics of the ensemble methods. The main aim of this chapter is to introduce ensemble methods and to provide an overview and a bibliography of the main areas of research, without pretending to be exhaustive or to explain the detailed characteristics of each ensemble method. The paper is organized as follows. In the next section, the main theoretical and practical reasons for combining multiple learners are introduced. Section 26.3 depicts the main taxonomies on ensemble methods proposed in the literature. In Section 26.4 and 26.5, we present an overview of the main supervised ensemble methods reported in the literature, adopting a simple taxonomy, originally proposed in Ref. [201]. Applications of ensemble methods are only marginally considered, but a specific section on some relevant applications of ensemble methods in astronomy and
An educational model for ensemble streamflow simulation and uncertainty analysis
A. AghaKouchak
2013-02-01
This paper presents the hands-on modeling toolbox, HBV-Ensemble, designed as a complement to theoretical hydrology lectures, to teach hydrological processes and their uncertainties. The HBV-Ensemble can be used for in-class lab practices and homework assignments, and assessment of students' understanding of hydrological processes. Using this modeling toolbox, students can gain more insights into how hydrological processes (e.g., precipitation, snowmelt and snow accumulation, soil moisture, evapotranspiration and runoff generation) are interconnected. The educational toolbox includes a MATLAB Graphical User Interface (GUI) and an ensemble simulation scheme that can be used for teaching uncertainty analysis, parameter estimation, ensemble simulation and model sensitivity. HBV-Ensemble was administered in a class for both in-class instruction and a final project, and students submitted their feedback about the toolbox. The results indicate that this educational software had a positive impact on students' understanding and knowledge of uncertainty in hydrological modeling.
Vrugt, Jasper A [Los Alamos National Laboratory]; Wohling, Thomas [NON LANL]
2008-01-01
Most studies in vadose zone hydrology use a single conceptual model for predictive inference and analysis. Focusing on the outcome of a single model is prone to statistical bias and underestimation of uncertainty. In this study, we combine multi-objective optimization and Bayesian Model Averaging (BMA) to generate forecast ensembles of soil hydraulic models. To illustrate our method, we use observed tensiometric pressure head data at three different depths in a layered vadose zone of volcanic origin in New Zealand. A set of seven different soil hydraulic models is calibrated using a multi-objective formulation with three different objective functions that each measure the mismatch between observed and predicted soil water pressure head at one specific depth. The Pareto solution space corresponding to these three objectives is estimated with AMALGAM, and used to generate four different model ensembles. These ensembles are post-processed with BMA and used for predictive analysis and uncertainty estimation. Our most important conclusions for the vadose zone under consideration are: (1) the mean BMA forecast exhibits predictive capabilities similar to those of the best individual performing soil hydraulic model, (2) the size of the BMA uncertainty ranges increases with increasing depth and dryness in the soil profile, (3) the best performing ensemble corresponds to the compromise (or balanced) solution of the three-objective Pareto surface, and (4) the combined multi-objective optimization and BMA framework proposed in this paper is very useful to generate forecast ensembles of soil hydraulic models.
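The BMA post-processing mentioned in this abstract produces a weighted mixture of the member forecasts. A minimal sketch of the mixture mean and variance is given below, assuming Gaussian member distributions. The weights and standard deviations are supplied by hand here, whereas in a real application they would be estimated from training data (for example by expectation-maximization). All names and numbers are illustrative and not taken from the study.

```python
def bma_mean_and_variance(forecasts, weights, sigmas):
    """Mean and variance of a BMA mixture of Gaussian member forecasts.

    variance = between-model spread + weighted within-model variance.
    """
    if not (len(forecasts) == len(weights) == len(sigmas)):
        raise ValueError("inputs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [x / total for x in weights]
    mean = sum(wi * fi for wi, fi in zip(w, forecasts))
    between = sum(wi * (fi - mean) ** 2 for wi, fi in zip(w, forecasts))
    within = sum(wi * si ** 2 for wi, si in zip(w, sigmas))
    return mean, between + within


# Two hypothetical pressure-head forecasts with equal weights.
print(bma_mean_and_variance([1.0, 3.0], [0.5, 0.5], [1.0, 1.0]))
```

The between-model term is what a single-model analysis omits, which is consistent with the abstract's remark that relying on one model tends to underestimate uncertainty.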
A multi-model ensemble method that combines imperfect models through learning
Berge, L.A.; F. M. Selten; Wiegerinck, W.; Duane, G. S.
2010-01-01
In the current multi-model ensemble approach climate model simulations are combined a posteriori. In the method of this study the models in the ensemble exchange information during simulations and learn from historical observations to combine their strengths into a best representation of the observed climate. The method is developed and tested in the context of small chaotic dynamical systems, like the Lorenz 63 system. Imperfect models are created by perturbing the standard parameter ...
J. I. Rubin; Reid, J. S.; Hansen, J A; Anderson, J. L.; Collins, N.; Hoar, T. J.; Hogan, T; Lynch, P.; McLay, J; Reynolds, C. A.; W. R. Sessions; D. L. Westphal; Zhang, J.
2015-01-01
An ensemble-based forecast and data assimilation system has been developed for use in Navy aerosol forecasting. The system makes use of an ensemble of the Navy Aerosol Analysis Prediction System (ENAAPS) at 1° × 1°, combined with an Ensemble Adjustment Kalman Filter from NCAR's Data Assimilation Research Testbed (DART). The base ENAAPS-DART system discussed in this work utilizes the Navy Operational Global Analysis Prediction System (NOGAPS) meteorological ensemble to ...
Ensemble vs. time averages in financial time series analysis
Seemann, Lars; Hua, Jia-Chen; McCauley, Joseph L.; Gunaratne, Gemunu H.
2012-12-01
Empirical analysis of financial time series suggests that the underlying stochastic dynamics are not only non-stationary, but also exhibit non-stationary increments. However, financial time series are commonly analyzed using the sliding interval technique that assumes stationary increments. We propose an alternative approach that is based on an ensemble over trading days. To determine the effects of time averaging techniques on analysis outcomes, we create an intraday activity model that exhibits periodic variable diffusion dynamics and we assess the model data using both ensemble and time averaging techniques. We find that ensemble averaging techniques detect the underlying dynamics correctly, whereas sliding interval approaches fail. As many traded assets exhibit characteristic intraday volatility patterns, our work implies that ensemble averaging approaches will yield new insight into the study of financial markets’ dynamics.
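The contrast between ensemble and time averages drawn in this abstract can be shown with a toy simulation. The sketch below is not the authors' variable diffusion model. It assumes a hypothetical two-level intraday volatility pattern and zero-mean Gaussian increments, and compares the variance estimated across days at a fixed time of day with the variance estimated over a sliding window.

```python
import random

STEPS_PER_DAY = 10


def sigma(t):
    """Hypothetical intraday volatility: quiet first half, active second half."""
    return 1.0 if t < STEPS_PER_DAY // 2 else 3.0


def simulate_increments(n_days, rng):
    """One list of zero-mean increments per trading day."""
    return [[rng.gauss(0.0, sigma(t)) for t in range(STEPS_PER_DAY)]
            for _ in range(n_days)]


def ensemble_variance(days):
    """Mean squared increment at each time of day, averaged across days."""
    n = len(days)
    return [sum(day[t] ** 2 for day in days) / n for t in range(STEPS_PER_DAY)]


def sliding_variance(days, window):
    """Mean squared increment over a sliding window of the concatenated series."""
    series = [x for day in days for x in day]
    return [sum(v * v for v in series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


days = simulate_increments(2000, random.Random(1))
ens = ensemble_variance(days)
sliding = sliding_variance(days, window=STEPS_PER_DAY)
print([round(v, 2) for v in ens])
print(round(sum(sliding) / len(sliding), 2))
```

With these settings the ensemble estimate should recover the two volatility levels (variances near 1 and 9), while a one-day sliding window blends them into a single value near 5 and so hides the intraday pattern.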
Ensemble Methods in Data Mining Improving Accuracy Through Combining Predictions
Seni, Giovanni
2010-01-01
This book is aimed at novice and advanced analytic researchers and practitioners -- especially in Engineering, Statistics, and Computer Science. Those with little exposure to ensembles will learn why and how to employ this breakthrough method, and advanced practitioners will gain insight into building even more powerful models. Throughout, snippets of code in R are provided to illustrate the algorithms described and to encourage the reader to try the techniques. The authors are industry experts in data mining and machine learning who are also adjunct professors and popular speakers. Although e
An educational model for ensemble streamflow simulation and uncertainty analysis
A. AghaKouchak
2012-06-01
This paper presents a hands-on modeling toolbox, HBV-Ensemble, designed as a complement to theoretical hydrology lectures, to teach hydrological processes and their uncertainties. The HBV-Ensemble can be used for in-class lab practices and homework assignments, and assessment of students' understanding of hydrological processes. Using this model, students can gain more insights into how hydrological processes (e.g., precipitation, snowmelt and snow accumulation, soil moisture, evapotranspiration and runoff generation) are interconnected. The model includes a MATLAB Graphical User Interface (GUI) and an ensemble simulation scheme that can be used not only for hydrological processes, but also for teaching uncertainty analysis, parameter estimation, ensemble simulation and model sensitivity.
Maximization of seasonal forecasts performance combining Grand Multi-Model Ensembles
Alessandri, Andrea; De Felice, Matteo; Catalano, Franco; Lee, Doo Young; Yoo, Jin Ho; Lee, June-Yi; Wang, Bin
2014-05-01
Multi-Model Ensembles (MMEs) are powerful tools in dynamical climate prediction as they account for the overconfidence and the uncertainties related to single-model errors. Previous works suggested that the potential benefit that can be expected from using an MME increases with the independence of the contributing Seasonal Prediction Systems. In this work we combine the two Multi-Model Ensemble (MME) Seasonal Prediction Systems (SPSs) independently developed by the European (ENSEMBLES) and by the Asian-Pacific (CliPAS/APCC) communities. To this aim, all the possible multi-model combinations obtained by putting together the 5 models from ENSEMBLES and the 11 models from CliPAS/APCC have been evaluated. The grand ENSEMBLES-CliPAS/APCC Multi-Model significantly enhances the skill compared to previous estimates from the contributing MMEs. The combinations of SPSs maximizing the skill that is currently attainable for specific predictands/phenomena are evaluated. Our results show that, in general, the better combinations of SPSs are obtained by mixing ENSEMBLES and CliPAS/APCC models and that only a limited number of SPSs is required to obtain the maximum performance. The number and selection of models that perform better usually differ depending on the region/phenomenon under consideration. As an example, for the tropical Pacific the maximum performance is obtained with a combination of only 5 to 6 SPSs from the grand ENSEMBLES-CliPAS/APCC MME. With particular focus over the tropical Pacific, the relationship between performance and bias of the grand-MME combinations is evaluated. The skill of the grand-MME combinations over Euro-Mediterranean and East-Asia regions is further evaluated as a function of the capability of the selected contributing SPSs to forecast anomalies of the Polar/Siberian highs during winter and of the Asian summer monsoon precipitation during summer. Our results indicate that combining SPSs from independent MME sources is a good
Sebastian Sippel
2015-09-01
In conclusion, our study shows that EVT and empirical estimates based on numerical simulations can indeed be used to productively inform each other, for instance to derive appropriate EVT parameters for short observational time series. Further, the combination of ensemble simulations with EVT allows us to significantly reduce the number of simulations needed for statements about the tails.
ANALYSIS OF SST IMAGES BY WEIGHTED ENSEMBLE TRANSFORM KALMAN FILTER
Sai, Gorthi; Beyou, Sébastien; Memin, Etienne
2011-01-01
This paper presents a novel, efficient scheme for the analysis of Sea Surface Temperature (SST) ocean images. We consider the estimation of the velocity fields and vorticity values from a sequence of oceanic images. The contribution of this paper lies in proposing a novel, robust and simple approach based on the Weighted Ensemble Transform Kalman filter (WETKF) data assimilation technique for the analysis of real SST images, that may contain coast regions or large areas o...
Nasseri, M.; Zahraie, B.; Ajami, N. K.; Solomatine, D. P.
2014-04-01
Multi-model (ensemble, or committee) techniques have been shown to be an effective way to improve hydrological prediction performance and provide uncertainty information. This paper presents two novel multi-model ensemble techniques, one probabilistic, Modified Bootstrap Ensemble Model (MBEM), and one possibilistic, FUzzy C-means Ensemble based on data Pattern (FUCEP). The paper also explores utilization of the Ordinary Kriging (OK) method as a multi-model combination scheme for hydrological simulation/prediction. These techniques are compared against Bayesian Model Averaging (BMA) and Weighted Average (WA) methods to demonstrate their effectiveness. The techniques are applied to the three monthly water balance models used to generate stream flow simulations for two mountainous basins in the South-West of Iran. For both basins, the results demonstrate that MBEM and FUCEP generate more skillful and reliable probabilistic predictions, outperforming all the other techniques. We have also found that OK, as a simple combination method, did not demonstrate any improved skill over the WA scheme for either of the basins.
Thiboult, Antoine; Anctil, François; Boucher, Marie-Amélie
2015-04-01
Hydrological ensemble prediction systems offer the possibility to dynamically assess forecast uncertainty. An ensemble may be issued wherever the uncertainty is situated along the meteorological chain. We commonly identify three main sources of uncertainty: meteorological forcing, hydrological initial conditions, and structural and parameter uncertainty. To address these uncertainties, different techniques have been developed. Meteorological ensemble prediction systems have gained popularity among researchers and operational forecasters because they allow forcing uncertainties to be accounted for. Many data assimilation techniques have been applied to hydrology to reinitialize model states in order to issue more accurate and sharper predictive density functions. Lastly, multimodel simulation avoids the pitfall of searching for a single best parameter set and structure. Knowledge about these individual techniques is becoming extensive, and many individual applications can be found in the literature. Even though they have been shown to improve upon traditional forecasting, they frequently fail to issue fully reliable hydrological forecasts because not all sources of uncertainty are tackled. Therefore, an improvement can be obtained by combining them, as this provides a more comprehensive handling of errors. Moreover, using these techniques separately or in combination makes it possible not only to issue more reliable forecasts but also to identify explicitly the amount of total uncertainty that each technique accounts for. Finally, these sources of error can be characterized in terms of magnitude and lead time influence. As these techniques are frequently used alone, they are usually tuned to perform individually. To reach optimal performance, they should be set jointly. Among them, the data assimilation technique offers great flexibility in its setup and therefore must be configured with the other ensemble techniques in mind. This question is also raised for the hydrological model selection
Wu, Zhiyong; Wu, Juan; Lu, Guihua
2015-11-01
Coupled hydrological and atmospheric modeling is an effective tool for providing advanced flood forecasting. However, the uncertainties in precipitation forecasts are still considerable. To address uncertainties, a one-way coupled atmospheric-hydrological modeling system, with a combination of high-resolution and ensemble precipitation forecasting, has been developed. It consists of three high-resolution single models and four sets of ensemble forecasts from the THORPEX Interactive Grand Global Ensemble database. The former provides higher forecasting accuracy, while the latter provides the range of forecasts. The combined precipitation forecasting was then implemented to drive the Chinese National Flood Forecasting System in the 2007 and 2008 Huai River flood hindcast analysis. The encouraging results demonstrated that the system can clearly give a set of forecasting hydrographs for a flood event and has a promising relative stability in discharge peaks and timing for warning purposes. It not only gives a deterministic prediction, but also generates probability forecasts. Even though the signal was not persistent until four days before the peak discharge was observed in the 2007 flood event, the visualization based on threshold exceedance provided clear and concise essential warning information at an early stage. Forecasters could better prepare for the possibility of a flood at an early stage, and then issue an actual warning if the signal strengthened. This process may provide decision support for civil protection authorities. In future studies, different weather forecasts will be assigned various weight coefficients to represent the covariance of predictors and the extremes of distributions.
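The threshold-exceedance view mentioned in this abstract reduces, in its simplest form, to counting the ensemble members above a warning level. The sketch below assumes equally weighted members and uses hypothetical discharge values and a hypothetical function name. It does not reproduce the forecasting system described above.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    return sum(1 for m in members if m > threshold) / len(members)


# Hypothetical peak discharges (m^3/s) from eight ensemble members.
peaks = [4200.0, 5100.0, 6300.0, 4800.0, 7000.0, 5600.0, 3900.0, 6100.0]
print(exceedance_probability(peaks, threshold=5000.0))
```

Tracking this fraction over successive forecast cycles is one simple way to see whether a flood signal is strengthening as the event approaches.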
Yu, Kai; Schwaighofer, Anton; Tresp, Volker; Ma, Wei-Ying; Zhang, Hongjiang
2012-01-01
Collaborative filtering (CF) and content-based filtering (CBF) have been widely used in information filtering applications. Both approaches have their strengths and weaknesses, which is why researchers have developed hybrid systems. This paper proposes a novel approach to unify CF and CBF in a probabilistic framework, named collaborative ensemble learning. It uses probabilistic SVMs to model each user's profile (as CBF does). At the prediction phase, it combines a society of users' profiles, rep...
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintaining grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
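For readers unfamiliar with the NRMSE metric used in this abstract, a minimal sketch follows. Normalization conventions vary (plant capacity, observed range, or observed mean are all in use), and the abstract does not say which one was applied. The code therefore takes the normalizing constant as an explicit argument, and the use of the 51-kW capacity in the example is an assumption. The sample values are invented.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error between paired forecasts and observations."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast))


def nrmse(forecast, observed, norm):
    """RMSE divided by a caller-supplied normalizing constant."""
    if norm <= 0:
        raise ValueError("norm must be positive")
    return rmse(forecast, observed) / norm


# Hypothetical hourly power values in kW, normalized by an assumed 51-kW capacity.
forecast_kw = [10.0, 20.0, 30.0]
observed_kw = [10.0, 20.0, 36.0]
print(nrmse(forecast_kw, observed_kw, norm=51.0))
```

Because the result depends on the chosen constant, NRMSE values are comparable across studies only when the same normalization is used.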
Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis
Łukasz Augustyniak
2015-12-01
We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. Importantly, the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicons' predictions as input, and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.
Ovis: A framework for visual analysis of ocean forecast ensembles
Hollt, Thomas
2014-08-01
We present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations of the sea surface height that is used in ocean forecasting. The position of eddies can be derived directly from the sea surface height and our visualization approach enables their interactive exploration and analysis. The behavior of eddies is important in different application settings of which we present two in this paper. First, we show an application for interactive planning of placement as well as operation of off-shore structures using real-world ensemble simulation data of the Gulf of Mexico. Off-shore structures, such as those used for oil exploration, are vulnerable to hazards caused by eddies, and the oil and gas industry relies on ocean forecasts for efficient operations. We enable analysis of the spatial domain, as well as the temporal evolution, for planning the placement and operation of structures. Eddies are also important for marine life. They transport water over large distances and with it also heat and other physical properties as well as biological organisms. In the second application we present the usefulness of our tool, which could be used for planning the paths of autonomous underwater vehicles, so called gliders, for marine scientists to study simulation data of the largely unexplored Red Sea. © 1995-2012 IEEE.
Effective Visualization of Temporal Ensembles.
Hao, Lihua; Healey, Christopher G; Bass, Steffen A
2016-01-01
An ensemble is a collection of related datasets, called members, built from a series of runs of a simulation or an experiment. Ensembles are large, temporal, multidimensional, and multivariate, making them difficult to analyze. Another important challenge is visualizing ensembles that vary both in space and time. Initial visualization techniques displayed ensembles with a small number of members, or presented an overview of an entire ensemble, but without potentially important details. Recently, researchers have suggested combining these two directions, allowing users to choose subsets of members to visualize. This manual selection process places the burden on the user to identify which members to explore. We first introduce a static ensemble visualization system that automatically helps users locate interesting subsets of members to visualize. We next extend the system to support analysis and visualization of temporal ensembles. We employ 3D shape comparison, cluster tree visualization, and glyph based visualization to represent different levels of detail within an ensemble. This strategy is used to provide two approaches for temporal ensemble analysis: (1) segment based ensemble analysis, to capture important shape transition time-steps, cluster groups of similar members, and identify common shape changes over time across multiple members; and (2) time-step based ensemble analysis, which assumes ensemble members are aligned in time by combining similar shapes at common time-steps. Both approaches enable users to interactively visualize and analyze a temporal ensemble from different perspectives at different levels of detail. We demonstrate our techniques on an ensemble studying matter transition from hadronic gas to quark-gluon plasma during gold-on-gold particle collisions. PMID:26529728
We propose a novel computer-aided detection (CAD) framework of breast masses in mammography. To increase detection sensitivity for various types of mammographic masses, we propose the combined use of different detection algorithms. In particular, we develop a region-of-interest combination mechanism that integrates detection information gained from unsupervised and supervised detection algorithms. Also, to significantly reduce the number of false-positive (FP) detections, a new ensemble classification algorithm is developed. Extensive experiments have been conducted on a benchmark mammogram database. Results show that our combined detection approach can considerably improve the detection sensitivity with only a small penalty in FP rate, compared to representative detection algorithms previously developed for mammographic CAD systems. The proposed ensemble classification solution also has a dramatic impact on the reduction of FP detections: by as much as 70% (from 15 to 4.5 per image) at a cost of only 4.6% in sensitivity (from 90.0% to 85.4%). Moreover, our proposed CAD method performs as well as or better than (70.7% and 80.0% at 1.5 and 3.5 FPs per image, respectively) the mammography CAD algorithms previously reported in the literature.
Ensemble approach combining multiple methods improves human transcription start site prediction
Dineen, David G
2010-11-30
Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques and result in different prediction sets. Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.
Senjean, Bruno; Alam, Md Mehboob; Knecht, Stefan; Fromager, Emmanuel
2015-01-01
The combination of a recently proposed linear interpolation method (LIM) [Senjean et al., Phys. Rev. A 92, 012518 (2015)], which enables the calculation of weight-independent excitation energies in range-separated ensemble density-functional approximations, with the extrapolation scheme of Savin [J. Chem. Phys. 140, 18A509 (2014)] is presented in this work. It is shown that LIM excitation energies vary quadratically with the inverse of the range-separation parameter mu when the latter is large. As a result, the extrapolation scheme, which is usually applied to long-range interacting energies, can be adapted straightforwardly to LIM. This extrapolated LIM (ELIM) has been tested on a small test set consisting of He, Be, H2 and HeH+. Relatively accurate results have been obtained for the first singlet excitation energies with the typical mu=0.4 value. The improvement of LIM after extrapolation is remarkable, in particular for the doubly-excited 2^1Sigma+g state in the stretched H2 molecule. Three-state ensemble ...
Xian, Lu; He, Kaijian; Lai, Kin Keung
2016-07-01
In recent years, the increasing volatility of the gold price has received growing attention from academia and industry alike. Due to the complexity and significant fluctuations observed in the gold market, however, most current approaches have failed to produce robust and consistent modeling and forecasting results. Ensemble Empirical Mode Decomposition (EEMD) and Independent Component Analysis (ICA) are novel data analysis methods that can deal with nonlinear and non-stationary time series. This study introduces a new methodology which combines the two methods and applies it to gold price analysis. This includes three steps: firstly, the original gold price series is decomposed into several Intrinsic Mode Functions (IMFs) by EEMD. Secondly, the IMFs are further processed, with unimportant ones re-grouped, and a new set of data called Virtual Intrinsic Mode Functions (VIMFs) is reconstructed. Finally, ICA is used to decompose the VIMFs into statistically Independent Components (ICs). The decomposition results reveal that the gold price series can be represented by a linear combination of ICs. Furthermore, the economic meanings of the ICs are analyzed and discussed in detail, according to the change trend and the ICs' transformation coefficients. The analyses not only explain the inner driving factors and their impacts but also examine in depth how these factors affect the gold price. At the same time, regression analysis has been conducted to verify our analysis. Results from the empirical studies in the gold markets show that EEMD-ICA serves as an effective technique for gold price analysis from a new perspective.
Time and ensemble averaging in time series analysis
Latka, Miroslaw; Jernajczyk, Wojciech; West, Bruce J
2010-01-01
In many applications expectation values are calculated by partitioning a single experimental time series into an ensemble of data segments of equal length. Such a single trajectory ensemble (STE) is a counterpart to a multiple trajectory ensemble (MTE) used whenever independent measurements or realizations of a stochastic process are available. The equivalence of STE and MTE for stationary systems was postulated by Wang and Uhlenbeck in their classic paper on Brownian motion (Rev. Mod. Phys. 17, 323 (1945)) but surprisingly has not yet been proved. Using the stationary and ergodic paradigm of statistical physics -- the Ornstein-Uhlenbeck (OU) Langevin equation, we revisit Wang and Uhlenbeck's postulate. In particular, we find that the variance of the solution of this equation is different for these two ensembles. While the variance calculated using the MTE quantifies the spreading of independent trajectories originating from the same initial point, the variance for STE measures the spreading of two correlated r...
Rubin, Juli I.; Reid, Jeffrey S.; Hansen, James A.; Anderson, Jeffrey L.; Collins, Nancy; Hoar, Timothy J.; Hogan, Timothy; Lynch, Peng; McLay, Justin; Reynolds, Carolyn A.; Sessions, Walter R.; Westphal, Douglas L.; Zhang, Jianglong
2016-03-01
An ensemble-based forecast and data assimilation system has been developed for use in Navy aerosol forecasting. The system makes use of an ensemble of the Navy Aerosol Analysis Prediction System (ENAAPS) at 1 × 1°, combined with an ensemble adjustment Kalman filter from NCAR's Data Assimilation Research Testbed (DART). The base ENAAPS-DART system discussed in this work utilizes the Navy Operational Global Analysis Prediction System (NOGAPS) meteorological ensemble to drive offline NAAPS simulations coupled with the DART ensemble Kalman filter architecture to assimilate bias-corrected MODIS aerosol optical thickness (AOT) retrievals. This work outlines the optimization of the 20-member ensemble system, including consideration of meteorology and source-perturbed ensemble members as well as covariance inflation. Additional tests with 80 meteorological and source members were also performed. An important finding of this work is that an adaptive covariance inflation method, which has not been previously tested for aerosol applications, was found to perform better than a temporally and spatially constant covariance inflation. Problems were identified with the constant inflation in regions with limited observational coverage. The second major finding of this work is that combined meteorology and aerosol source ensembles are superior to either in isolation and that both are necessary to produce a robust system with sufficient spread in the ensemble members as well as realistic correlation fields for spreading observational information. The inclusion of aerosol source ensembles improves correlation fields for large aerosol source regions, such as smoke and dust in Africa, by statistically separating freshly emitted from transported aerosol species. However, the source ensembles have limited efficacy during long-range transport. Conversely, the meteorological ensemble generates sufficient spread at the synoptic scale to enable observational impact through the ensemble data
Climate Prediction Center (CPC) Ensemble Canonical Correlation Analysis Forecast of Temperature
National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) temperature forecast is a 90-day (seasonal) outlook of US surface temperature anomalies. The ECCA uses Canonical...
National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) precipitation forecast is a 90-day (seasonal) outlook of US surface precipitation anomalies. The ECCA uses...
Nanoelectrode ensemble based on multiwalled carbon nanotubes for electrochemical analysis
Музика, Катерина Миколаївна; Білаш, Олена Михайлівна
2012-01-01
A technique for developing nanoelectrode ensembles (NEEs) based on multiwalled carbon nanotubes has been demonstrated. The obtained NEE has a higher Faraday/capacitive current ratio than conventional electrodes of the same area, indicating a lower detection limit for redox-active compounds.
Li, Y.; Kinzelbach, W.; Zhou, J.; Cheng, G. D.; Li, X.
2011-04-01
The hydrologic model HYDRUS-1D and the crop growth model WOFOST were coupled to efficiently manage water resources in agriculture and improve the prediction of crop production through the accurate estimation of actual transpiration with the root water uptake method and a soil moisture profile computed with the Richards equation during crop growth. The results of the coupled model are validated by experimental studies of irrigated-maize done in the middle reaches of northwest China's Heihe River, a semi-arid to arid region. Good agreement was achieved between the simulated evapotranspiration, soil moisture and crop production and their respective field measurements made under maize crop. However, for regions without detailed observation, the results of the numerical simulation could be unreliable for policy and decision making owing to the uncertainty of model boundary conditions and parameters. So, we developed the method of combining model simulation and ensemble forecasting to analyse and predict the probability of crop production. In our studies, the uncertainty analysis was used to reveal the risk of facing a loss of crop production as irrigation decreases. The global sensitivity analysis was used to test the coupled model and further quantitatively analyse the impact of the uncertainty of coupled model parameters and environmental scenarios on crop production. This method could be used for estimation in regions with no or reduced data availability.
Gang Zhang; Yonghui Huang; Ling Zhong; Shanxing Ou; Yi Zhang; Ziping Li
2015-01-01
Objective. This study aims to establish a model to analyze clinical experience of TCM veteran doctors. We propose an ensemble learning based framework to analyze clinical records with ICD-10 labels information for effective diagnosis and acupoints recommendation. Methods. We propose an ensemble learning framework for the analysis task. A set of base learners composed of decision tree (DT) and support vector machine (SVM) are trained by bootstrapping the training dataset. The base learners are...
Luo, Xiaodong; Bhakta, Tuhin; Jakobsen, Morten; Nævdal, Geir
2016-01-01
In this work we propose an ensemble 4D seismic history matching framework for reservoir characterization. Compared to similar existing frameworks in the reservoir engineering community, the proposed one consists of some relatively new ingredients, in terms of the type of seismic data chosen, wavelet multiresolution analysis for the chosen seismic data and related data noise estimation, and the use of recently developed iterative ensemble history matching algorithms. Typical seismic data used f...
Fox, Neil I.; Micheas, Athanasios C.; Peng, Yuqiang
2016-07-01
This paper introduces the use of Bayesian full Procrustes shape analysis in object-oriented meteorological applications. In particular, the Procrustes methodology is used to generate mean forecast precipitation fields from a set of ensemble forecasts. This approach has advantages over other ensemble averaging techniques in that it can produce a forecast that retains the morphological features of the precipitation structures and present the range of forecast outcomes represented by the ensemble. The production of the ensemble mean avoids the problems of smoothing that result from simple pixel or cell averaging, while producing credible sets that retain information on ensemble spread. Also in this paper, the full Bayesian Procrustes scheme is used as an object verification tool for precipitation forecasts. This is an extension of a previously presented Procrustes shape analysis based verification approach into a full Bayesian format designed to handle the verification of precipitation forecasts that match objects from an ensemble of forecast fields to a single truth image. The methodology is tested on radar reflectivity nowcasts produced in the Warning Decision Support System - Integrated Information (WDSS-II) by varying parameters in the K-means cluster tracking scheme.
Analysis of ensemble quality of initialized hindcasts in the global coupled climate model MPI-ESM
Brune, Sebastian; Düsterhus, Andre; Baehr, Johanna
2016-04-01
Global coupled climate models have been used to generate long-term projections of potential climate changes for the next century. On much shorter timescales, numerical weather prediction systems forecast the atmospheric state for the next days. The first approach depends largely on the boundary conditions, i.e., the applied external forcings, while the second depends largely on the initial conditions, i.e., the observed atmospheric state. For medium range climate predictions, on interannual to decadal time scales, both initial and boundary conditions are thought to influence the climate state, because the ocean is expected to have a much larger deterministic timescale than the atmosphere. The respective climate model needs to resemble the observed climate state and its tendency at the start of the prediction. This is realized by incorporating observations into both the oceanic and atmospheric components of the climate model leading to an initialized simulation. Here, we analyze the quality of an initialized ensemble generated with the global coupled Max Planck Institute for Meteorology Earth System Model (MPI-ESM). For every year in the period 1960 to 2014, we initialize an ensemble that is run out to 10 years in length. This hindcast ensemble is conducted within the MiKlip framework for interannual to decadal climate prediction. In this context, the initialization of the oceanic component of the model ensemble is thought to impact the model state within the first years of prediction; however, it remains poorly known for how much longer this impact can be detected. In our analysis we focus on the North Atlantic ocean variability and assess the evolution in time of both the probability density function (PDF) and the spread-error-ratio of the ensemble. Firstly, by comparing these characteristics of the initialized ensemble with an uninitialized ensemble we aim to (1) measure the difference in the initialized and uninitialized ensemble, (2) assess the evolution of this
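The spread-error ratio mentioned in this abstract can be sketched as follows. The abstract does not give the authors' exact definition, so this uses one common convention as an assumption: the square root of the time-mean ensemble variance divided by the root mean squared error of the ensemble mean. The function name and the data layout (one list of member values per verification time) are hypothetical.

```python
import math
from statistics import mean, variance


def spread_error_ratio(ensemble, observations):
    """Ratio of ensemble spread to the RMSE of the ensemble mean.

    ensemble: one list of member values per verification time
              (each with at least two members).
    observations: one verifying value per verification time.
    """
    # Time-mean of the sample variance across members (n - 1 denominator).
    mean_var = mean(variance(members) for members in ensemble)
    # Mean squared error of the ensemble mean against the observations.
    mse = mean((mean(members) - obs) ** 2
               for members, obs in zip(ensemble, observations))
    if mse == 0.0:
        raise ZeroDivisionError("ensemble mean matches observations exactly")
    return math.sqrt(mean_var / mse)
```

Under this convention, values near 1 are usually read as a spread consistent with the forecast error, and values well below 1 as an underdispersive ensemble.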
The analysis of ensembles of moderately saturated interstellar lines
Jenkins, E. B.
1986-01-01
It is shown that the combined equivalent widths for a large population of Gaussian-like interstellar line components, each with different central optical depths tau(0) and velocity dispersions b, exhibit a curve of growth (COG) which closely mimics that of a single, pure Gaussian distribution in velocity. Two parametric distribution functions for the line populations are considered: a bivariate Gaussian for tau(0) and b and a power law distribution for tau(0) combined with a Gaussian dispersion for b. First, COGs for populations having an extremely large number of nonoverlapping components are derived, and the implications are shown by focusing on the doublet-ratio analysis for a pair of lines whose f-values differ by a factor of two. The consequences of having, instead of an almost infinite number of lines, a relatively small collection of components added together for each member of a doublet are examined. The theory of how the equivalent widths grow for populations of overlapping Gaussian profiles is developed. Examples of the composite COG analysis applied to existing collections of high-resolution interstellar line data are presented.
A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico
Hoteit, Ibrahim
2013-09-01
This paper describes the development of an advanced ensemble Kalman filter (EnKF)-based ocean data assimilation system for prediction of the evolution of the loop current in the Gulf of Mexico (GoM). The system integrates the Data Assimilation Research Testbed (DART) assimilation package with the Massachusetts Institute of Technology ocean general circulation model (MITgcm). The MITgcm/DART system supports the assimilation of a wide range of ocean observations and uses an ensemble approach to solve the nonlinear assimilation problems. The GoM prediction system was implemented with an eddy-resolving 1/10th degree configuration of the MITgcm. Assimilation experiments were performed over a 6-month period between May and October during a strong loop current event in 1999. The model was sequentially constrained with weekly satellite sea surface temperature and altimetry data. Experimental results suggest that the ensemble-based assimilation system shows a high predictive skill in the GoM, with estimated ensemble spread mainly concentrated around the front of the loop current. Further analysis of the system estimates demonstrates that the ensemble assimilation accurately reproduces the observed features without imposing any negative impact on the dynamical balance of the system. Results from sensitivity experiments with respect to the ensemble filter parameters are also presented and discussed. © 2013 Elsevier B.V.
Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M
2014-01-01
The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method. PMID:25571123
Analysis of the interface variability in NMR structure ensembles of protein-protein complexes.
Calvanese, Luisa; D'Auria, Gabriella; Vangone, Anna; Falcigno, Lucia; Oliva, Romina
2016-06-01
NMR structures consist of ensembles of conformers, all satisfying the experimental restraints, which exhibit a certain degree of structural variability. We analyzed here the interface in NMR ensembles of protein-protein heterodimeric complexes and found it to span a wide range of different conservations. The different exhibited conservations do not simply correlate with the size of the systems/interfaces, and are most probably the result of an interplay between different factors, including the quality of experimental data and the intrinsic complex flexibility. In any case, this information is not to be missed when NMR structures of protein-protein complexes are analyzed; especially considering that, as we also show here, the first NMR conformer is usually not the one which best reflects the overall interface. To quantify the interface conservation and to analyze it, we used an approach originally conceived for the analysis and ranking of ensembles of docking models, which has now been extended to directly deal with NMR ensembles. We propose this approach, based on the conservation of the inter-residue contacts at the interface, both for the analysis of the interface in whole ensembles of NMR complexes and for the possible selection of a single conformer as the best representative of the overall interface. In order to make the analyses automatic and fast, we made the protocol available as a web tool at: https://www.molnac.unisa.it/BioTools/consrank/consrank-nmr.html. PMID:26968364
Statistical mechanical analysis of a hierarchical random code ensemble in signal processing
Obuchi, Tomoyuki [Department of Earth and Space Science, Faculty of Science, Osaka University, Toyonaka 560-0043 (Japan); Takahashi, Kazutaka [Department of Physics, Tokyo Institute of Technology, Tokyo 152-8551 (Japan); Takeda, Koujin, E-mail: takeda@sp.dis.titech.ac.jp [Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama 226-8502 (Japan)
2011-02-25
We study a random code ensemble with a hierarchical structure, which is closely related to the generalized random energy model with discrete energy values. Based on this correspondence, we analyze the hierarchical random code ensemble by using the replica method in two situations: lossy data compression and channel coding. In both situations, the exponents of large deviation analysis characterizing the performance of the ensemble, the distortion rate of lossy data compression and the error exponent of channel coding in Gallager's formalism, are accessible by a generating function of the generalized random energy model. We argue that the transitions of those exponents observed in the preceding work can be interpreted as phase transitions with respect to the replica number. We also show that replica symmetry breaking plays an essential role in these transitions.
Uncertainty analysis in building ensemble of RCMs, on water cycle in South East of Spain
García Galiano, Sandra; Olmos Giménez, Patricia; Giraldo Osorio, Juan Diego
2014-05-01
the influence of seasonal and annual variation of the corresponding variables, and is built at each site. A sensitivity analysis of the ensemble-building method for the meteorological variables is performed to justify the more robust and parsimonious methodology. Finally, the impacts on runoff and its trend were assessed from historical data and from climate projections obtained with the selected RCM ensemble method. Significant decreases in runoff were identified for the plausible 2050 scenarios, with consequent negative impacts on the regional economy.
Ensemble-trained source apportionment of fine particulate matter and method uncertainty analysis
Balachandran, Sivaraman; Pachon, Jorge E.; Hu, Yongtao; Lee, Dongho; Mulholland, James A.; Russell, Armistead G.
2012-12-01
An ensemble-based approach is applied to better estimate source impacts on fine particulate matter (PM2.5) and quantify uncertainties in various source apportionment (SA) methods. The approach combines source impacts from applications of four individual SA methods: three receptor-based models and one chemical transport model (CTM). Receptor models used are the chemical mass balance methods CMB-LGO (Chemical Mass Balance-Lipschitz global optimizer) and CMB-MM (molecular markers) as well as a factor analytic method, Positive Matrix Factorization (PMF). The CTM used is the Community Multiscale Air Quality (CMAQ) model. New source impact estimates and uncertainties in these estimates are calculated in a two-step process. First, an ensemble average is calculated for each source category using results from applying the four individual SA methods. The root mean square error (RMSE) between each method with respect to the average is calculated for each source category; the RMSE is then taken to be the updated uncertainty for each individual SA method. Second, these new uncertainties are used to re-estimate ensemble source impacts and uncertainties. The approach is applied to data from daily PM2.5 measurements at the Atlanta, GA, Jefferson Street (JST) site in July 2001 and January 2002. The procedure provides updated uncertainties for the individual SA methods that are calculated in a consistent way across methods. Overall, the ensemble has lower relative uncertainties as compared to the individual SA methods. Calculated CMB-LGO uncertainties tend to decrease from initial estimates, while PMF and CMB-MM uncertainties increase. Estimated CMAQ source impact uncertainties are comparable to other SA methods for gasoline vehicles and SOC but are larger than other methods for other sources. In addition to providing improved estimates of source impact uncertainties, the ensemble estimates do not have unrealistic extremes as compared to individual SA methods and avoids zero impact
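The two-step procedure described in this abstract can be sketched for a single source category as follows. The abstract does not say how the updated uncertainties enter the second step, so the inverse-variance weighting used here is an assumption, as are the function name, the data layout, and the small floor that guards against a zero RMSE.

```python
import math


def ensemble_reweight(impacts, floor=1e-9):
    """Two-step ensemble estimate for one source category.

    impacts: dict mapping a method name to its list of daily source impacts.
    Returns (per-method RMSE, weighted daily ensemble, ensemble uncertainty).
    """
    methods = list(impacts)
    n_days = len(impacts[methods[0]])

    # Step 1: unweighted ensemble average for each day.
    avg = [sum(impacts[m][d] for m in methods) / len(methods)
           for d in range(n_days)]

    # RMSE of each method about that average, taken as its updated uncertainty.
    sigma = {
        m: math.sqrt(sum((impacts[m][d] - avg[d]) ** 2
                         for d in range(n_days)) / n_days)
        for m in methods
    }

    # Step 2 (assumed form): inverse-variance weighted re-estimate.
    weights = {m: 1.0 / max(sigma[m], floor) ** 2 for m in methods}
    total = sum(weights.values())
    weighted = [sum(weights[m] * impacts[m][d] for m in methods) / total
                for d in range(n_days)]
    ensemble_sigma = math.sqrt(1.0 / total)
    return sigma, weighted, ensemble_sigma
```

In this sketch a method that strays far from the first-pass average receives a large RMSE and therefore a small weight in the second pass, which is one way the ensemble can damp the extremes of individual methods.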
Pathway analysis in attention deficit hyperactivity disorder: An ensemble approach.
Mooney, Michael A; McWeeney, Shannon K; Faraone, Stephen V; Hinney, Anke; Hebebrand, Johannes; Nigg, Joel T; Wilmot, Beth
2016-09-01
Despite a wealth of evidence for the role of genetics in attention deficit hyperactivity disorder (ADHD), specific and definitive genetic mechanisms have not been identified. Pathway analyses, a subset of gene-set analyses, extend the knowledge gained from genome-wide association studies (GWAS) by providing functional context for genetic associations. However, there are numerous methods for association testing of gene sets and no real consensus regarding the best approach. The present study applied six pathway analysis methods to identify pathways associated with ADHD in two GWAS datasets from the Psychiatric Genomics Consortium. Methods that utilize genotypes to model pathway-level effects identified more replicable pathway associations than methods using summary statistics. In addition, pathways implicated by more than one method were significantly more likely to replicate. A number of brain-relevant pathways, such as RhoA signaling, glycosaminoglycan biosynthesis, fibroblast growth factor receptor activity, and pathways containing potassium channel genes, were nominally significant by multiple methods in both datasets. These results support previous hypotheses about the role of regulation of neurotransmitter release, neurite outgrowth and axon guidance in contributing to the ADHD phenotype and suggest the value of cross-method convergence in evaluating pathway analysis results. © 2016 Wiley Periodicals, Inc. PMID:27004716
A. Riccio
2007-04-01
In this paper we present an approach for the statistical analysis of multi-model ensemble results. The models considered here are operational long-range transport and dispersion models, also used for the real-time simulation of pollutant dispersion or the accidental release of radioactive nuclides.
We first introduce the theoretical basis (rooted in Bayes' theorem) and then apply this approach to the analysis of model results obtained during the ETEX-1 exercise. We recover some interesting results, supporting the heuristic approach called "median model", originally introduced in Galmarini et al. (2004a, b).
This approach also provides a way to systematically reduce (and quantify) model uncertainties, thus supporting the decision-making process and/or regulatory-purpose activities in a very effective manner.
Luo, Xiaodong; Jakobsen, Morten; Nævdal, Geir
2016-01-01
In this work we propose an ensemble 4D seismic history matching framework for reservoir characterization. Compared to similar existing frameworks in the reservoir engineering community, the proposed one contains some relatively new ingredients: the choice of seismic data type, wavelet multiresolution analysis of the chosen seismic data with the related data noise estimation, and the use of recently developed iterative ensemble history matching algorithms. Typical seismic data used for history matching, such as acoustic impedance, are inverted quantities, and extra uncertainties may arise during the inversion processes. In the proposed framework we avoid such intermediate inversion processes. In addition, we adopt a wavelet-based sparse representation to reduce data size. Concretely, we use intercept and gradient attributes derived from amplitude versus angle (AVA) data, apply multilevel discrete wavelet transforms (DWT) to the attribute data, and estimate the noise level of the resulting wavelet coeffici...
Morphing ensemble Kalman filters
Beezley, Jonathan D.; Mandel, Jan
2008-01-01
A new type of ensemble filter is proposed, which combines an ensemble Kalman filter (EnKF) with the ideas of morphing and registration from image processing. This results in filters suitable for non-linear problems whose solutions exhibit moving coherent features, such as thin interfaces in wildfire modelling. The ensemble members are represented as the composition of one common state with a spatial transformation, called registration mapping, plus a residual. A fully automatic registration m...
Morphing Ensemble Kalman Filters
Beezley, Jonathan D.; Mandel, Jan
2007-01-01
A new type of ensemble filter is proposed, which combines an ensemble Kalman filter (EnKF) with the ideas of morphing and registration from image processing. This results in filters suitable for nonlinear problems whose solutions exhibit moving coherent features, such as thin interfaces in wildfire modeling. The ensemble members are represented as the composition of one common state with a spatial transformation, called registration mapping, plus a residual. A fully automatic registration met...
Flicek, Paul; Amode, M Ridwan; Barrell, Daniel; Beal, Kathryn; Brent, Simon; Carvalho-Silva, Denise; Clapham, Peter; Coates, Guy; Fairley, Susan; Fitzgerald, Stephen; Gil, Laurent; Gordon, Leo; Hendrix, Maurice; Hourlier, Thibaut; Johnson, Nathan; Kähäri, Andreas K; Keefe, Damian; Keenan, Stephen; Kinsella, Rhoda; Komorowska, Monika; Koscielny, Gautier; Kulesha, Eugene; Larsson, Pontus; Longden, Ian; McLaren, William; Muffato, Matthieu; Overduin, Bert; Pignatelli, Miguel; Pritchard, Bethan; Riat, Harpreet Singh; Ritchie, Graham R S; Ruffier, Magali; Schuster, Michael; Sobral, Daniel; Tang, Y Amy; Taylor, Kieron; Trevanion, Stephen; Vandrovcova, Jana; White, Simon; Wilson, Mark; Wilder, Steven P; Aken, Bronwen L; Birney, Ewan; Cunningham, Fiona; Dunham, Ian; Durbin, Richard; Fernández-Suarez, Xosé M; Harrow, Jennifer; Herrero, Javier; Hubbard, Tim J P; Parker, Anne; Proctor, Glenn; Spudich, Giulietta; Vogel, Jan; Yates, Andy; Zadissa, Amonida; Searle, Stephen M J
2012-01-01
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project. PMID:22086963
Hacker, Joshua
2013-04-01
Ensemble sensitivity analysis (ESA) is emerging as a viable alternative to adjoint sensitivity. Several open issues face ESA for forecasts dominated by mesoscale phenomena, including (1) sampling error arising from finite-sized ensembles causing over-estimated sensitivities, and (2) violation of linearity assumptions for strongly nonlinear flows. In an effort to use ESA for predictability studies and observing network design in complex terrain, we present results from experiments designed to address these open issues. Sampling error in ESA arises in two places. First, when hypothetical observations are introduced to test the sensitivity estimates for linearity. Here the same localization that was used in the filter itself can be simply applied. Second and more critical, localization should be considered within the sensitivity calculations. Sensitivity to hypothetical observations, estimated without re-running the ensemble, includes regression of a sample of a final-time (forecast) metric onto a sample of initial states. Derivation to include localization results in two localization coefficients (or factors) applied in separate regression steps. Because the forecast metric is usually a sum, and can also include a sum over a spatial region and multiple physical variables, a spatial localization function is difficult to specify. We present results from experiments to empirically estimate localization factors for ESA to test hypothetical observations for mesoscale data assimilation in complex terrain. Localization factors are first derived for an ensemble filter following the empirical localization methodology. Sensitivities for a fog event over Salt Lake City, and a Colorado downslope wind event, are tested for linearity by approximating assimilation of perfect observations at points of maximum sensitivity, both with and without localization. Observation sensitivity is then estimated, with and without localization, and tested for linearity. The validity of the
Hybrid Data Assimilation without Ensemble Filtering
Todling, Ricardo; Akkraoui, Amal El
2014-01-01
The Global Modeling and Assimilation Office is preparing to upgrade its three-dimensional variational system to a hybrid approach in which the ensemble is generated using a square-root ensemble Kalman filter (EnKF) and the variational problem is solved using the Grid-point Statistical Interpolation system. As in most EnKF applications, we found it necessary to employ a combination of multiplicative and additive inflations to compensate for sampling and modeling errors, respectively; to keep the small-member ensemble solution close to the variational solution, we also found it necessary to re-center the members of the ensemble about the variational analysis. During tuning of the filter we found re-centering and additive inflation to play a considerably larger role than expected, particularly in a dual-resolution context when the variational analysis is run at higher resolution than the ensemble. This led us to consider a hybrid strategy in which the members of the ensemble are generated by simply converting the variational analysis to the resolution of the ensemble and applying additive inflation, thus bypassing the EnKF. Comparisons of this so-called filter-free hybrid procedure with an EnKF-based hybrid procedure and a traditional, non-hybrid control scheme show that both hybrid strategies provide equally significant improvement over the control; more interestingly, the filter-free procedure was found to give qualitatively similar results to the EnKF-based procedure.
Flicek, Paul; Amode, M. Ridwan; Barrell, Daniel; Beal, Kathryn; Brent, Simon; Carvalho-Silva, Denise; Clapham, Peter; Coates, Guy; Fairley, Susan; Fitzgerald, Stephen; Gil, Laurent; Gordon, Leo; Hendrix, Maurice; Hourlier, Thibaut; Johnson, Nathan
2011-01-01
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website ...
A glacial systems model configured for large ensemble analysis of Antarctic deglaciation
R. Briggs
2013-04-01
This article describes the Memorial University of Newfoundland/Penn State University (MUN/PSU) glacial systems model (GSM) that has been developed specifically for large-ensemble data-constrained analysis of past Antarctic Ice Sheet evolution. Our approach emphasizes the introduction of a large set of model parameters to explicitly account for the uncertainties inherent in the modelling of such a complex system. At the core of the GSM is a 3-D thermo-mechanically coupled ice sheet model that solves both the shallow ice and shallow shelf approximations. This enables the different stress regimes of ice sheet, ice shelves, and ice streams to be represented. The grounding line is modelled through an analytical sub-grid flux parametrization. To this dynamical core the following have been added: a heavily parametrized basal drag component; a visco-elastic isostatic adjustment solver; a diverse set of climate forcings (to remove any reliance on any single method); tidewater and ice shelf calving functionality; and a new physically-motivated empirically-derived sub-shelf melt (SSM) component. To assess the accuracy of the latter, we compare predicted SSM values against a compilation of published observations. Within parametric and observational uncertainties, computed SSM for the present day ice sheet is in accord with observations for all but the Filchner ice shelf. The GSM has 31 ensemble parameters that are varied to account (in part) for the uncertainty in the ice-physics, the climate forcing, and the ice-ocean interaction. We document the parameters and parametric sensitivity of the model to motivate the choice of ensemble parameters in a quest to approximately bound reality (within the limits of 31 parameters).
Huisman, J.A.; Breuer, L.; Bormann, H.; Bronstert, A.; Croke, B.F.W.; Frede, H.-G.; Graff, T.; Hubrechts, L.; Jakeman, A.J.; Kite, G.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Viney, N.R.; Willems, P.
2009-01-01
An ensemble of 10 hydrological models was applied to the same set of land use change scenarios. There was general agreement about the direction of changes in the mean annual discharge and 90% discharge percentile predicted by the ensemble members, although a considerable range in the magnitude of predictions for the scenarios and catchments under consideration was obvious. Differences in the magnitude of the increase were attributed to the different mean annual actual evapotranspiration rates for each land use type. The ensemble of model runs was further analyzed with deterministic and probabilistic ensemble methods. The deterministic ensemble method based on a trimmed mean resulted in a single somewhat more reliable scenario prediction. The probabilistic reliability ensemble averaging (REA) method allowed a quantification of the model structure uncertainty in the scenario predictions. It was concluded that the use of a model ensemble has greatly increased our confidence in the reliability of the model predictions. © 2008 Elsevier Ltd.
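The trimmed-mean idea mentioned in this abstract can be illustrated with a short sketch. The function name `trimmed_mean` and the choice to drop one member at each end are assumptions for illustration; the abstract does not state how many members were trimmed.

```python
import numpy as np

def trimmed_mean(predictions, n_trim=1):
    """Mean of the ensemble after dropping the n_trim lowest and
    n_trim highest member predictions (n_trim is an assumed setting)."""
    values = np.sort(np.asarray(predictions, dtype=float))
    if 2 * n_trim >= values.size:
        raise ValueError("n_trim removes every ensemble member")
    return values[n_trim:values.size - n_trim].mean()

# Hypothetical changes in mean annual discharge (%) from ten models.
changes = [4.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 25.0]
robust_change = trimmed_mean(changes, n_trim=1)
```

Dropping the extremes keeps a single outlying model, such as the last made-up value above, from dominating the deterministic ensemble prediction.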
National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...
Wu, Zhaohua; Feng, Jiaxin; Qiao, Fangli; Tan, Zhe-Min
2016-01-01
In this big data era, it is more urgent than ever to solve two major issues: (i) fast data transmission methods that can facilitate access to data from non-local sources and (ii) fast and efficient data analysis methods that can reveal the key information from the available data for particular purposes. Although approaches in different fields to address these two questions may differ significantly, the common part must involve data compression techniques and a fast algorithm. This paper introduces the recently developed adaptive and spatio-temporally local analysis method, namely the fast multidimensional ensemble empirical mode decomposition (MEEMD), for the analysis of a large spatio-temporal dataset. The original MEEMD uses ensemble empirical mode decomposition to decompose time series at each spatial grid and then pieces together the temporal–spatial evolution of climate variability and change on naturally separated timescales, which is computationally expensive. By taking advantage of the high efficiency of the expression using principal component analysis/empirical orthogonal function analysis for spatio-temporally coherent data, we design a lossy compression method for climate data to facilitate its non-local transmission. We also explain the basic principles behind the fast MEEMD through decomposing principal components instead of original grid-wise time series to speed up computation of MEEMD. Using a typical climate dataset as an example, we demonstrate that our newly designed methods can (i) compress data with a compression rate of one to two orders; and (ii) speed-up the MEEMD algorithm by one to two orders. PMID:26953173
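The lossy compression step described in this abstract rests on keeping only the leading principal components of a space-time field. A minimal sketch with NumPy follows; the function names are hypothetical, and the ensemble empirical mode decomposition of the retained principal components, which is the second half of the fast MEEMD, is not shown.

```python
import numpy as np

def pca_compress(field, n_modes):
    """Keep the n_modes leading PCs/EOFs of a field of shape (n_time, n_grid)."""
    field = np.asarray(field, dtype=float)
    mean = field.mean(axis=0)
    # SVD of the anomalies: columns of u scaled by s are the principal
    # component time series, rows of vt are the spatial patterns (EOFs).
    u, s, vt = np.linalg.svd(field - mean, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    eofs = vt[:n_modes]
    return mean, pcs, eofs

def pca_reconstruct(mean, pcs, eofs):
    """Rebuild the (approximate) field from the stored pieces."""
    return mean + pcs @ eofs

# Synthetic field built from two spatial patterns, so two modes suffice.
rng = np.random.default_rng(0)
time = np.linspace(0.0, 6.0, 120)
pattern_a, pattern_b = rng.normal(size=(2, 500))
field = np.outer(np.sin(time), pattern_a) + np.outer(np.cos(time), pattern_b)
mean, pcs, eofs = pca_compress(field, n_modes=2)
approx = pca_reconstruct(mean, pcs, eofs)
```

Storage drops from n_time x n_grid values to roughly n_modes x (n_time + n_grid) plus the mean field. For real climate data the truncation discards some variance, which is why the compression is lossy.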
Data-worth analysis through probabilistic collocation-based Ensemble Kalman Filter
Dai, Cheng; Xue, Liang; Zhang, Dongxiao; Guadagnini, Alberto
2016-09-01
We propose a new and computationally efficient data-worth analysis and quantification framework keyed to the characterization of target state variables in groundwater systems. We focus on dynamically evolving plumes of dissolved chemicals migrating in randomly heterogeneous aquifers. An accurate prediction of the detailed features of solute plumes requires collecting a substantial amount of data. Otherwise, constraints dictated by the availability of financial resources and ease of access to the aquifer system suggest the importance of assessing the expected value of data before these are actually collected. Data-worth analysis is targeted to the quantification of the impact of new potential measurements on the expected reduction of predictive uncertainty based on a given process model. Integration of the Ensemble Kalman Filter method within a data-worth analysis framework enables us to assess data worth sequentially, which is a key desirable feature for monitoring scheme design in a contaminant transport scenario. However, it is remarkably challenging because of the (typically) high computational cost involved, considering that repeated solutions of the inverse problem are required. As a computationally efficient scheme, we embed in the data-worth analysis framework a modified version of the Probabilistic Collocation Method-based Ensemble Kalman Filter proposed by Zeng et al. (2011) so that we take advantage of the ability to assimilate data sequentially in time through a surrogate model constructed via the polynomial chaos expansion. We illustrate our approach on a set of synthetic scenarios involving solute migrating in a two-dimensional random permeability field. Our results demonstrate the computational efficiency of our approach and its ability to quantify the impact of the design of the monitoring network on the reduction of uncertainty associated with the characterization of a migrating contaminant plume.
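For orientation, the analysis step of a plain stochastic (perturbed-observation) EnKF can be sketched as below. This is the textbook form with a linear observation operator, not the Probabilistic Collocation Method-based variant used in the paper, and every name in the code is chosen for illustration.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_operator, obs_error_cov, rng):
    """One stochastic EnKF analysis step.

    ensemble:      (n_state, n_members) forecast ensemble
    obs:           (n_obs,) observation vector
    obs_operator:  (n_obs, n_state) linear observation operator H
    obs_error_cov: (n_obs, n_obs) observation error covariance R
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    predicted = obs_operator @ ensemble
    predicted_anom = obs_operator @ anomalies

    # Sample covariances estimated from the ensemble.
    cov_xy = anomalies @ predicted_anom.T / (n_members - 1)
    cov_yy = predicted_anom @ predicted_anom.T / (n_members - 1)
    gain = cov_xy @ np.linalg.inv(cov_yy + obs_error_cov)

    # Each member assimilates its own randomly perturbed copy of the data.
    noise = rng.multivariate_normal(np.zeros(obs.size), obs_error_cov,
                                    size=n_members).T
    return ensemble + gain @ (obs[:, None] + noise - predicted)

# One-variable toy problem: a broad prior and an accurate observation.
rng = np.random.default_rng(0)
prior = rng.normal(5.0, 2.0, size=(1, 200))
posterior = enkf_update(prior, np.array([1.0]), np.array([[1.0]]),
                        np.array([[0.01]]), rng)
```

A data-worth analysis in this spirit would compare the ensemble spread before and after assimilating a candidate measurement. The repeated updates this requires are the cost that the surrogate model in the paper is meant to reduce.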
Camporese, M.; Paniconi, C.; Putti, M.; Salandin, P.
2007-12-01
Hydrologic models can benefit greatly from data assimilation algorithms, which update the modeled system state by incorporating into the model solution information from experimental measurements of various quantities as soon as the data become available. In this context, data assimilation seems well suited to coupled surface-subsurface models, which, by treating the watershed as the ensemble of surface and subsurface domains, allow a more accurate description of hydrological processes at the catchment scale, where soil moisture largely influences the partitioning of rain between runoff and infiltration and thus controls the flow at the outlet. The need for a better determination of the variables of interest (streamflow at the outlet section, water table, soil water content, etc.) has led to many efforts focused on the development of coupled numerical models, together with field and laboratory observations. Nevertheless, uncertainty in the schematic description of physical processes and inaccuracies in source data collection induce errors in the model predictions. The ensemble Kalman filter (EnKF) extends the classic Kalman filter to nonlinear problems by means of a Monte Carlo approach. A sequential assimilation procedure based on EnKF is developed and integrated in a process-based numerical model, which couples a three-dimensional finite element Richards equation solver for variably saturated porous media and a finite difference diffusion wave approximation based on digital elevation data for surface water dynamics. A detailed analysis of the data assimilation algorithm behavior within the coupled model has been carried out on a synthetic 1D test case in order to verify the correct implementation and derive a series of fundamental parameters, such as the minimum ensemble size that can ensure sufficient accuracy in the statistical estimates. The assimilation frequency, as well as the effects
Sturm, Irene; Treder, Matthias S.; Miklody, Daniel;
2015-01-01
When listening to ensemble music even non-musicians can follow single instruments effortlessly. Electrophysiological indices for neural sensory encoding of separate streams have been described using oddball paradigms which utilize brain reactions to sound events that deviate from a repeating standard pattern. Obviously, these paradigms put constraints on the compositional complexity of the musical stimulus. Here, we apply a regression-based method of multivariate EEG analysis in order to reveal the neural encoding of separate voices of naturalistic ensemble music that is based on cortical responses to tone onsets, such as N1/P2 ERP components. Music clips (resembling minimalistic electro-pop) were presented to 11 subjects, either in an ensemble version (drums, bass, keyboard) or in the corresponding three solo versions. For each instrument we train a spatio-temporal regression filter that...
Ensemble clustering in deterministic ensemble Kalman filters
Javier Amezcua
2012-07-01
Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M–1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but a transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and an EnSRF modified by resampling through random rotations. The modified EnSRF alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories, and it cannot use some of the algorithms that enhance the performance of the standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.
2002-01-01
About the NYYD Ensemble duo Traksmann - Lukk and E.-S. Tüür's work "Symbiosis", which is also recorded on the recently released NYYD Ensemble CD. On 2 March in the small hall of the Rakvere Theatre and on 3 March at the Rotermann Salt Storage (Rotermanni Soolaladu), with works by Tüür, Kaumann, Berio, Reich, Yun, Hauta-aho, Buckinx
Ensemble habitat mapping of invasive plant species
Stohlgren, T.J.; Ma, P.; Kumar, S.; Rocca, M.; Morisette, J.T.; Jarnevich, C.S.; Benson, N.
2010-01-01
Ensemble species distribution models combine the strengths of several species environmental matching models, while minimizing the weakness of any one model. Ensemble models may be particularly useful in risk analysis of recently arrived, harmful invasive species because species may not yet have spread to all suitable habitats, leaving species-environment relationships difficult to determine. We tested five individual models (logistic regression, boosted regression trees, random forest, multivariate adaptive regression splines (MARS), and maximum entropy model or Maxent) and ensemble modeling for selected nonnative plant species in Yellowstone and Grand Teton National Parks, Wyoming; Sequoia and Kings Canyon National Parks, California, and areas of interior Alaska. The models are based on field data provided by the park staffs, combined with topographic, climatic, and vegetation predictors derived from satellite data. For the four invasive plant species tested, ensemble models were the only models that ranked in the top three models for both field validation and test data. Ensemble models may be more robust than individual species-environment matching models for risk analysis. © 2010 Society for Risk Analysis.
Bayesian ensemble refinement by replica simulations and reweighting
Hummer, Gerhard; Köfinger, Jürgen
2015-12-01
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
High-resolution knee joint vibroarthrographic (VAG) signals can help physicians accurately evaluate the pathological condition of a degenerative knee joint, in order to prevent unnecessary exploratory surgery. Artifact cancellation is vital to preserve the quality of VAG signals prior to further computer-aided analysis. This paper describes a novel method that effectively utilizes ensemble empirical mode decomposition (EEMD) and detrended fluctuation analysis (DFA) algorithms for the removal of baseline wander and white noise in VAG signal processing. The EEMD method first successively decomposes the raw VAG signal into a set of intrinsic mode functions (IMFs) with fast and low oscillations, until the monotonic baseline wander remains in the last residue. Then, the DFA algorithm is applied to compute the fractal scaling index parameter for each IMF, in order to identify the anti-correlation and the long-range correlation components. Next, the DFA algorithm can be used to identify the anti-correlated and the long-range correlated IMFs, which assists in reconstructing the artifact-reduced VAG signals. Our experimental results showed that the combination of EEMD and DFA algorithms was able to provide averaged signal-to-noise ratio (SNR) values of 20.52 dB (standard deviation: 1.14 dB) and 20.87 dB (standard deviation: 1.89 dB) for 45 normal signals in healthy subjects and 20 pathological signals in symptomatic patients, respectively. The combination of EEMD and DFA algorithms can ameliorate the quality of VAG signals with great SNR improvements over the raw signal, and the results were also superior to those achieved by wavelet matching pursuit decomposition and time-delay neural filter. (paper)
Fernandez, J.; Cofino, A.S. [University of Cantabria, Department of Applied Mathematics and Computing Sciences, Santander (Spain); Primo, C. [European Centre for Medium-Range Weather Forecasts, Reading (United Kingdom); Gutierrez, J.M.; Rodriguez, M.A. [Instituto de Fisica de Cantabria, CSIC-UC, Santander (Spain)
2009-08-15
In a recent paper, Gutierrez et al. (Nonlinear Process Geophys 15(1):109-114, 2008) introduced a new characterization of spatiotemporal error growth - the so called mean-variance logarithmic (MVL) diagram - and applied it to study ensemble prediction systems (EPS); in particular, they analyzed single-model ensembles obtained by perturbing the initial conditions. In the present work, the MVL diagram is applied to multi-model ensembles analyzing also the effect of model formulation differences. To this aim, the MVL diagram is systematically applied to the multi-model ensemble produced in the EU-funded DEMETER project. It is shown that the shared building blocks (atmospheric and ocean components) impose similar dynamics among different models and, thus, contribute to poorly sampling the model formulation uncertainty. This dynamical similarity should be taken into account, at least as a pre-screening process, before applying any objective weighting method. (orig.)
Combination Clustering Analysis Method and its Application
Bang-Chun Wen; Li-Yuan Dong; Qin-Liang Li; Yang Liu
2013-01-01
The traditional clustering analysis method cannot automatically determine the optimal number of clusters. In this study, we propose a new method, combination clustering analysis, to solve this problem. The method was applied to 25 kinds of automobile data samples, and the correctness of the analysis result was verified. The results showed that the combination clustering analysis method could objectively determine the number of clustering firs...
Sea surface temperature predictions using a multi-ocean analysis ensemble scheme
Zhang, Ying; Zhu, Jieshun; Li, Zhongxian; Chen, Haishan; Zeng, Gang
2016-04-01
This study examined the global sea surface temperature (SST) predictions by a so-called multiple-ocean analysis ensemble (MAE) initialization method which was applied in the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2). Different from most operational climate prediction practices which are initialized by a specific ocean analysis system, the MAE method is based on multiple ocean analyses. In the paper, the MAE method was first justified by analyzing the ocean temperature variability in four ocean analyses which all are/were applied for operational climate predictions either at the European Centre for Medium-range Weather Forecasts or at NCEP. It was found that these systems exhibit substantial uncertainties in estimating the ocean states, especially at the deep layers. Further, a set of MAE hindcasts was conducted based on the four ocean analyses with CFSv2, starting from each April during 1982-2007. The MAE hindcasts were verified against a subset of hindcasts from the NCEP CFS Reanalysis and Reforecast (CFSRR) Project. Comparisons suggested that MAE shows better SST predictions than CFSRR over most regions where ocean dynamics plays a vital role in SST evolutions, such as the El Niño and Atlantic Niño regions. Furthermore, significant improvements were also found in summer precipitation predictions over the equatorial eastern Pacific and Atlantic oceans, for which the local SST prediction improvements should be responsible. The prediction improvements by MAE imply a problem for most current climate predictions which are based on a specific ocean analysis system. That is, their predictions would drift towards states biased by errors inherent in their ocean initialization system, and thus have large prediction errors. In contrast, MAE arguably has an advantage by sampling such structural uncertainties, and could efficiently cancel these errors out in their predictions.
Jerez, Sonia; Montavez, Juan P.; Gomez-Navarro, Juan J.; Jimenez-Guerrero, Pedro; Lorente, Raquel; Garcia-Valero, Juan A.; Jimenez, Pedro A.; Gonzalez-Rouco, Jose F.; Zorita, Eduardo
2010-05-01
Regional climate change projections are affected by several sources of uncertainty. Some of them come from Global Circulation Models and scenarios; others come from the downscaling process. In the case of dynamical downscaling, mainly using Regional Climate Models (RCM), the sources of uncertainty may involve nesting strategies, related to the domain position and resolution, soil characterization, internal variability, methods of solving the equations, and the configuration of model physics. Therefore, a probabilistic approach seems advisable when projecting regional climate change. This problem is usually addressed by performing an ensemble of simulations. The aim of this study is to evaluate the range of uncertainty in regional climate projections associated with changing the physical configuration in an RCM (MM5), as well as the model's capability to reproduce the observed climate. This study is performed over the Iberian Peninsula and focuses on the reproduction of the Probability Density Functions (PDFs) of daily mean temperature. The experiments consist of a multi-physics ensemble of high resolution climate simulations (30 km over the target region) for the periods 1970-1999 (present) and 2070-2099 (future). Two sets of simulations for the present have been performed using ERA40 (MM5-ERA40) and ECHAM5-3CM run1 (MM5-E5-PR) as boundary conditions. The future experiments are driven by ECHAM5-A2-run1 (MM5-E5-A2). The ensemble has a total of eight members, as the result of combining the schemes for PBL (MRF and ETA), cumulus (GRELL and Kain-Fritsch) and microphysics (Simple-Ice and Mixed phase). In a previous work this multi-physics ensemble was analyzed focusing on the seasonal mean values of both temperature and precipitation. The main results indicate that those physics configurations that better reproduce the observed climate project the most dramatic changes for the future (i.e., the largest temperature increase and precipitation decrease). Among the
van Driel, A. F.; Nikolaev, I. S.; Vergeer, P.; Lodahl, Peter; Vanmaelkelbergh, D.; Vos, W.L.
2007-01-01
We present a statistical analysis of time-resolved spontaneous emission decay curves from ensembles of emitters, such as semiconductor quantum dots, with the aim to interpret ubiquitous non-single-exponential decay. Contrary to what is widely assumed, the density of excited emitters and the intensity in an emission decay curve are not proportional, but the density is a time-integral of the intensity. The integral relation is crucial to correctly interpret non-single-exponential decay. We deri...
MAVENs: Motion analysis and visualization of elastic networks and structural ensembles
Zimmermann Michael T
2011-06-01
Background: The ability to generate, visualize, and analyze motions of biomolecules has made a significant impact upon modern biology. Molecular Dynamics has gained substantial use, but remains computationally demanding and difficult to set up for many biologists. Elastic network models (ENMs) are an alternative and have been shown to generate the dominant equilibrium motions of biomolecules quickly and efficiently. These dominant motions have been shown to be functionally relevant and also to indicate the likely direction of conformational changes. Most structures have a small number of dominant motions. Comparing computed motions to the structure's conformational ensemble derived from a collection of static structures or frames from an MD trajectory is an important way to understand functional motions as well as evaluate the models. Modes of motion computed from ENMs can be visualized to gain functional and mechanistic understanding and to compute useful quantities such as average positional fluctuations, internal distance changes, collectiveness of motions, and directional correlations within the structure. Results: Our new software, MAVEN, aims to bring ENMs and their analysis to a broader audience by integrating methods for their generation and analysis into a user friendly environment that automates many of the steps. Models can be constructed from raw PDB files or density maps, using all available atomic coordinates or by employing various coarse-graining procedures. Visualization can be performed either with our software or exported to molecular viewers. Mixed resolution models allow one to study atomic effects on the system while retaining much of the computational speed of the coarse-grained ENMs. Analysis options are available to further aid the user in understanding the computed motions and their importance for its function. Conclusion: MAVEN has been developed to simplify ENM generation, allow for diverse models to be used, and
An Introduction to Ensemble Methods for Data Analysis (Revised July, 2004)
Berk, Richard
2004-01-01
This paper provides an introduction to ensemble statistical procedures as a special case of algorithmic methods. The discussion begins with classification and regression trees (CART) as a didactic device to introduce many of the key issues. Following the material on CART is a consideration of cross-validation, bagging, random forests and boosting. Major points are illustrated with analyses of real data.
Ensemble learning incorporating uncertain registration.
Simpson, Ivor J A; Woolrich, Mark W; Andersson, Jesper L R; Groves, Adrian R; Schnabel, Julia A
2013-04-01
This paper proposes a novel approach for improving the accuracy of statistical prediction methods in spatially normalized analysis. This is achieved by incorporating registration uncertainty into an ensemble learning scheme. A probabilistic registration method is used to estimate a distribution of probable mappings between subject and atlas space. This allows the estimation of the distribution of spatially normalized feature data, e.g., grey matter probability maps. From this distribution, samples are drawn for use as training examples. This allows the creation of multiple predictors, which are subsequently combined using an ensemble learning approach. Furthermore, extra testing samples can be generated to measure the uncertainty of prediction. This is applied to separating subjects with Alzheimer's disease from normal controls using a linear support vector machine on a region of interest in magnetic resonance images of the brain. We show that our proposed method leads to an improvement in discrimination using voxel-based morphometry and deformation tensor-based morphometry over bootstrap aggregating, a common ensemble learning framework. The proposed approach also generates more reasonable soft-classification predictions than bootstrap aggregating. We expect that this approach could be applied to other statistical prediction tasks where registration is important. PMID:23288332
Simona Temereanca
2014-02-01
Understanding how ensembles of neurons represent and transmit information in the patterns of their joint spiking activity is a fundamental question in computational neuroscience. At present, analyses of spiking activity from neuronal ensembles are limited because multivariate point process (MPP) models cannot represent simultaneous occurrences of spike events at an arbitrarily small time resolution. Solo recently reported a simultaneous-event multivariate point process (SEMPP) model to correct this key limitation. In this paper, we show how Solo's discrete-time formulation of the SEMPP model can be efficiently fit to ensemble neural spiking activity using a multinomial generalized linear model (mGLM). Unlike existing approximate procedures for fitting the discrete-time SEMPP model, the mGLM is an exact algorithm. The MPP time-rescaling theorem can be used to assess model goodness-of-fit. We also derive a new marked point-process (MkPP) representation of the SEMPP model that leads to new thinning and time-rescaling algorithms for simulating an SEMPP stochastic process. These algorithms are much simpler than multivariate extensions of algorithms for simulating a univariate point process, and could not be arrived at without the MkPP representation. We illustrate the versatility of the SEMPP model by analyzing neural spiking activity from pairs of simultaneously-recorded rat thalamic neurons stimulated by periodic whisker deflections, and by simulating SEMPP data. In the data analysis example, the SEMPP model demonstrates that whisker motion significantly modulates simultaneous spiking activity at the one millisecond time scale and that the stimulus effect is more than one order of magnitude greater for simultaneous activity compared with non-simultaneous activity. Together, the mGLM, the MPP time-rescaling theorem and the MkPP representation of the SEMPP model offer a theoretically sound, practical tool for measuring joint spiking propensity in a
A Classifier Ensemble of Binary Classifier Ensembles
Sajad Parvin
2011-09-01
This paper proposes an innovative combinational algorithm to improve performance in multiclass classification domains. Because a more accurate classifier yields better classification performance, researchers have tended to focus on improving the accuracy of individual classifiers. However, turning to the single best classifier is not always the best way to obtain the best classification quality. An alternative is to use many inaccurate or weak classifiers, each specialized for a sub-space of the problem space, and to use their consensus vote as the final classifier. This paper therefore proposes a heuristic classifier ensemble to improve the performance of classification learning. It deals specifically with multiclass problems, whose aim is to learn the boundaries of each class from many other classes. Based on the concept of multiclass problems, classifiers are divided into two categories: pairwise classifiers and multiclass classifiers. The aim of a pairwise classifier is to separate one class from another. Because pairwise classifiers are trained only to discriminate between two classes, their decision boundaries are simpler and more effective than those of multiclass classifiers. The main idea behind the proposed method is to focus the classifiers on the erroneous regions of the problem space and to use the pairwise classification concept instead of the multiclass classification concept. Although using pairwise classification instead of multiclass classification is not new, we propose a new pairwise classifier ensemble with a much lower order. In this paper, the most confused classes are first determined and then some ensembles of classifiers are created. The classifiers of each of these ensembles work jointly using weighted majority votes. The results of these ensembles
Papiotis, Panos; Marchini, Marco; Perez-Carrillo, Alfonso; Maestre, Esteban
2014-01-01
In a musical ensemble such as a string quartet, the musicians interact and influence each other's actions in several aspects of the performance simultaneously in order to achieve a common aesthetic goal. In this article, we present and evaluate a computational approach for measuring the degree to which these interactions exist in a given performance. We recorded a number of string quartet exercises under two experimental conditions (solo and ensemble), acquiring both audio and bowing motion data. Numerical features in the form of time series were extracted from the data as performance descriptors representative of four distinct dimensions of the performance: Intonation, Dynamics, Timbre, and Tempo. Four different interdependence estimation methods (two linear and two nonlinear) were applied to the extracted features in order to assess the overall level of interdependence between the four musicians. The obtained results suggest that it is possible to correctly discriminate between the two experimental conditions by quantifying interdependence between the musicians in each of the studied performance dimensions; the nonlinear methods appear to perform best for most of the numerical features tested. Moreover, by using the solo recordings as a reference to which the ensemble recordings are contrasted, it is feasible to compare the amount of interdependence that is established between the musicians in a given performance dimension across all exercises, and relate the results to the underlying goal of the exercise. We discuss our findings in the context of ensemble performance research, the current limitations of our approach, and the ways in which it can be expanded and consolidated. PMID:25228894
Panos ePapiotis
2014-09-01
In a musical ensemble such as a string quartet, the musicians interact and influence each other’s actions in several aspects of the performance simultaneously in order to achieve a common aesthetic goal. In this article, we present and evaluate a computational approach for measuring the degree to which these interactions exist in a given performance. We recorded a number of string quartet exercises under two experimental conditions (solo and ensemble), acquiring both audio and bowing motion data. Numerical features in the form of time series were extracted from the data as performance descriptors representative of four distinct dimensions of the performance: Intonation, Dynamics, Timbre and Tempo. Four different interdependence estimation methods (two linear and two nonlinear) were applied to the extracted features in order to assess the overall level of interdependence between the four musicians. The obtained results suggest that it is possible to correctly discriminate between the two experimental conditions by quantifying interdependence between the musicians in each of the studied performance dimensions; the nonlinear methods appear to perform best for most of the numerical features tested. Moreover, by using the solo recordings as a reference to which the ensemble recordings are contrasted, it is feasible to compare the amount of interdependence that is established between the musicians in a given performance dimension across all exercises, and relate the results to the underlying goal of the exercise. We discuss our findings in the context of ensemble performance research, the current limitations of our approach, and the ways in which it can be expanded and consolidated.
Convergence analysis of combinations of different methods
Kang, Y. [Clarkson Univ., Potsdam, NY (United States)
1994-12-31
This paper provides a convergence analysis for combinations of different numerical methods for solving systems of differential equations. The author proves that combinations of two convergent linear multistep methods or Runge-Kutta methods produce a new convergent method of which the order is equal to the smaller order of the two original methods.
Kasiviswanathan, K.; Sudheer, K.
2013-05-01
Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists, owing to their potential for accurate prediction of flood flows as compared to conceptual or physics based hydrologic models. The ANN approximates the non-linear functional relationship between the complex hydrologic variables in arriving at the river flow forecast values. Despite a large number of applications, there is still some criticism that the ANN's point prediction lacks reliability since the uncertainty of predictions is not quantified, which limits its use in practical applications. A major concern in applying traditional uncertainty analysis techniques to the neural network framework is its parallel computing architecture with large degrees of freedom, which makes the uncertainty assessment a challenging task. Very few studies have considered assessment of the predictive uncertainty of ANN based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: at stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain the optimal set of weights and biases, and during stage 2, the optimal variability of the ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the 2nd stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) maximum number of measured data points falling within the estimated prediction interval and (iii) minimum width of the prediction interval. The method is illustrated using a real world case study of an Indian basin. The method was able to produce an ensemble that has an average prediction interval width of 23.03 m3/s, with 97.17% of the total validation data points (measured) lying within the interval. The derived
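The three stage-2 objectives named in this abstract (residual variance of the ensemble mean, share of measured points inside the prediction interval, and interval width) can be illustrated with a short sketch. It is not the authors' implementation: the function name `interval_metrics`, the use of ensemble percentiles as interval bounds, and the synthetic flow series are all assumptions made for illustration.

```python
import numpy as np

def interval_metrics(ensemble, observed, lower_q=2.5, upper_q=97.5):
    """Score an ensemble of predictions against measurements.

    ensemble: array of shape (n_members, n_times)
    observed: array of shape (n_times,)
    The interval bounds are taken as ensemble percentiles (an assumption).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    observed = np.asarray(observed, dtype=float)
    lower = np.percentile(ensemble, lower_q, axis=0)
    upper = np.percentile(ensemble, upper_q, axis=0)
    # (ii) percentage of measured points lying inside the interval
    inside = (observed >= lower) & (observed <= upper)
    coverage = 100.0 * inside.mean()
    # (iii) average width of the prediction interval
    avg_width = float((upper - lower).mean())
    # (i) variance of the residuals of the ensemble mean
    residual_var = float(np.var(observed - ensemble.mean(axis=0)))
    return coverage, avg_width, residual_var

# Synthetic example: 50 hypothetical members around a made-up flow series.
rng = np.random.default_rng(0)
truth = 100.0 + 20.0 * np.sin(np.linspace(0.0, 6.0, 200))
members = truth + rng.normal(0.0, 5.0, size=(50, 200))
measured = truth + rng.normal(0.0, 5.0, size=200)
coverage, avg_width, residual_var = interval_metrics(members, measured)
print(f"coverage={coverage:.1f}%  width={avg_width:.2f}  resid_var={residual_var:.2f}")
```

A multi-objective optimizer would trade these quantities off against each other: widening the interval raises coverage but worsens the width objective.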
Xue, Xin; Wei, Jin-Lian; Xu, Li-Li; Xi, Mei-Yang; Xu, Xiao-Li; Liu, Fang; Guo, Xiao-Ke; Wang, Lei; Zhang, Xiao-Jin; Zhang, Ming-Ye; Lu, Meng-Chen; Sun, Hao-Peng; You, Qi-Dong
2013-10-28
Protein-protein interactions (PPIs) play a crucial role in cellular function and form the backbone of almost all biochemical processes. In recent years, protein-protein interaction inhibitors (PPIIs) have represented a treasure trove of potential new drug targets. Unfortunately, there are few successful drugs of PPIIs on the market. Structure-based pharmacophore (SBP) combined with docking has been demonstrated as a useful Virtual Screening (VS) strategy in drug development projects. However, the combination of target complexity and poor binding affinity prediction has thwarted the application of this strategy in the discovery of PPIIs. Here we report an effective VS strategy on p53-MDM2 PPI. First, we built a SBP model based on p53-MDM2 complex cocrystal structures. The model was then simplified by using a Receptor-Ligand complex-based pharmacophore model considering the critical binding features between MDM2 and its small molecular inhibitors. Cascade docking was subsequently applied to improve the hit rate. Based on this strategy, we performed VS on NCI and SPECS databases and successfully discovered 6 novel compounds from 15 hits with the best, compound 1 (NSC 5359), K(i) = 180 ± 50 nM. These compounds can serve as lead compounds for further optimization. PMID:24050442
A Flexible Approach for the Statistical Visualization of Ensemble Data
Potter, K. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute; Wilson, A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bremer, P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Williams, Dean N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, V. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute; Johnson, C. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute
2009-09-29
Scientists are increasingly moving towards ensemble data sets to explore relationships present in dynamic systems. Ensemble data sets combine spatio-temporal simulation results generated using multiple numerical models, sampled input conditions and perturbed parameters. While ensemble data sets are a powerful tool for mitigating uncertainty, they pose significant visualization and analysis challenges due to their complexity. We present a collection of overview and statistical displays linked through a high level of interactivity to provide a framework for gaining key scientific insight into the distribution of the simulation results as well as the uncertainty associated with the data. In contrast to methods that present large amounts of diverse information in a single display, we argue that combining multiple linked statistical displays yields a clearer presentation of the data and facilitates a greater level of visual data analysis. We demonstrate this approach using driving problems from climate modeling and meteorology and discuss generalizations to other fields.
Van Driel, A.F.; Nikolaev, I.S.; Vergeer, P.; Lodahl, Peter; Vanmaelkelbergh, D.; Vos, W.L.
2007-01-01
We present a statistical analysis of time-resolved spontaneous emission decay curves from ensembles of emitters, such as semiconductor quantum dots, with the aim of interpreting ubiquitous non-single-exponential decay. Contrary to what is widely assumed, the density of excited emitters and the...... intensity in an emission decay curve are not proportional, but the density is a time integral of the intensity. The integral relation is crucial to correctly interpret non-single-exponential decay. We derive the proper normalization for both a discrete and a continuous distribution of rates, where every...... single number, but is also distributed. We derive a practical description of non-single-exponential emission decay curves in terms of a single distribution of decay rates; the resulting distribution is identified as the distribution of total decay rates weighted with the radiative rates. We apply our...
Spin–Orbit Alignment of Exoplanet Systems: Ensemble Analysis Using Asteroseismology
Campante, T. L.; Lund, M. N.; Kuszlewicz, James S.;
2016-01-01
The angle ψ between a planet’s orbital axis and the spin axis of its parent star is an important diagnostic of planet formation, migration, and tidal evolution. We seek empirical constraints on ψ by measuring the stellar inclination $i_{\rm s}$ via asteroseismology for an ensemble of 25 solar-type hosts...... observed with NASA’s Kepler satellite. Our results for $i_{\rm s}$ are consistent with alignment at the 2σ level for all stars in the sample, meaning that the system surrounding the red-giant star Kepler-56 remains as the only unambiguous misaligned multiple-planet system detected to date. The availability of a...... measurement of the projected spin–orbit angle λ for two of the systems allows us to estimate ψ. We find that the orbit of the hot Jupiter HAT-P-7b is likely to be retrograde ($\psi = 116.4^{+30.2}_{-14.7}$ deg), whereas that of Kepler-25c...
Spin-orbit alignment of exoplanet systems: ensemble analysis using asteroseismology
Campante, T L; Kuszlewicz, J S; Davies, G R; Chaplin, W J; Albrecht, S; Winn, J N; Bedding, T R; Benomar, O; Bossini, D; Handberg, R; Santos, A R G; Van Eylen, V; Basu, S; Christensen-Dalsgaard, J; Elsworth, Y P; Hekker, S; Hirano, T; Huber, D; Karoff, C; Kjeldsen, H; Lundkvist, M S; North, T S H; Aguirre, V Silva; Stello, D; White, T R
2016-01-01
The angle $\psi$ between a planet's orbital axis and the spin axis of its parent star is an important diagnostic of planet formation, migration, and tidal evolution. We seek empirical constraints on $\psi$ by measuring the stellar inclination $i_{\rm s}$ via asteroseismology for an ensemble of 25 solar-type hosts observed with NASA's Kepler satellite. Our results for $i_{\rm s}$ are consistent with alignment at the 2-$\sigma$ level for all stars in the sample, meaning that the system surrounding the red-giant star Kepler-56 remains as the only unambiguous misaligned multiple-planet system detected to date. The availability of a measurement of the projected spin-orbit angle $\lambda$ for two of the systems allows us to estimate $\psi$. We find that the orbit of the hot-Jupiter HAT-P-7b is likely to be retrograde ($\psi=116.4^{+30.2}_{-14.7}\:{\rm deg}$), whereas that of Kepler-25c seems to be well aligned with the stellar spin axis ($\psi=12.6^{+6.7}_{-11.0}\:{\rm deg}$). While the latter result is in apparent...
Hierarchical Bayes Ensemble Kalman Filtering
Tsyrulnikov, Michael
2015-01-01
Ensemble Kalman filtering (EnKF), when applied to high-dimensional systems, suffers from an inevitably small affordable ensemble size, which results in poor estimates of the background error covariance matrix ${\bf B}$. The common remedy is a kind of regularization, usually an ad-hoc spatial covariance localization (tapering) combined with artificial covariance inflation. Instead of using an ad-hoc regularization, we adopt the idea by Myrseth and Omre (2010) and explicitly admit that the ${\bf B}$ matrix is unknown and random and estimate it along with the state (${\bf x}$) in an optimal hierarchical Bayes analysis scheme. We separate forecast errors into predictability errors (i.e. forecast errors due to uncertainties in the initial data) and model errors (forecast errors due to imperfections in the forecast model) and include the two respective components ${\bf P}$ and ${\bf Q}$ of the ${\bf B}$ matrix into the extended control vector $({\bf x},{\bf P},{\bf Q})$. Similarly, we break the traditional backgrou...
Ensemble algorithms in reinforcement learning
Wiering, Marco A; van Hasselt, Hado
2008-01-01
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms. We designed and imple
Ensemble algorithms in reinforcement learning.
Wiering, Marco A; van Hasselt, Hado
2008-08-01
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms. We designed and implemented four different ensemble methods combining the following five different RL algorithms: Q-learning, Sarsa, actor-critic (AC), QV-learning, and AC learning automaton. The intuitively designed ensemble methods, namely, majority voting (MV), rank voting, Boltzmann multiplication (BM), and Boltzmann addition, combine the policies derived from the value functions of the different RL algorithms, in contrast to previous work where ensemble methods have been used in RL for representing and learning a single value function. We show experiments on five maze problems of varying complexity; the first problem is simple, but the other four maze tasks are of a dynamic or partially observable nature. The results indicate that the BM and MV ensembles significantly outperform the single RL algorithms. PMID:18632380
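Two of the combination rules named in this abstract, majority voting (MV) and Boltzmann multiplication (BM), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each algorithm is assumed to expose one action-value vector for the current state, and the function names, the temperature parameter and the example values are hypothetical.

```python
import numpy as np

def boltzmann(preferences, temperature=1.0):
    """Softmax (Boltzmann) distribution over action preferences."""
    z = np.asarray(preferences, dtype=float) / temperature
    z = z - z.max()  # subtract the maximum for numerical stability
    p = np.exp(z)
    return p / p.sum()

def majority_voting(values_per_algorithm):
    """Count, per action, how many algorithms rank it as their greedy choice."""
    values = np.asarray(values_per_algorithm, dtype=float)
    greedy = values.argmax(axis=1)
    return np.bincount(greedy, minlength=values.shape[1])

def boltzmann_multiplication(values_per_algorithm, temperature=1.0):
    """Multiply the algorithms' Boltzmann policies and renormalize."""
    policies = np.array([boltzmann(v, temperature) for v in values_per_algorithm])
    product = policies.prod(axis=0)
    return product / product.sum()

# Three hypothetical algorithms scoring three actions in one state.
values = [[1.0, 2.0, 0.5],
          [0.2, 0.9, 1.0],
          [0.1, 0.8, 0.3]]
votes = majority_voting(values)               # array([0, 2, 1])
bm_policy = boltzmann_multiplication(values)  # probabilities over the 3 actions
print(votes, bm_policy.round(3))
```

The vote counts or the BM probabilities would then drive action selection, for example greedily or by sampling; that final step is left out of the sketch.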
Research on ocean internal waves using seismic oceanography is a frontier issue both for marine geophysicists and physical oceanographers. Images of the ocean water layer obtained by conventional processing of multichannel seismic reflection data can show the overall patterns of internal waves. However, in order to extract more information from the seismic data, new tools need to be developed. Here, we use the ensemble empirical mode decomposition (EEMD) method to decompose vertical displacement data from seismic sections and apply this method to a seismic section from the northeastern South China Sea, where clear internal waves are observed. Compared with the conventional empirical mode decomposition method, EEMD has greatly reduced the scale mixing problems induced in the decomposition results. The results obtained show that the internal waves in this area are composed of different characteristic wavelengths at different depths. The depth range of 200–1050 m contains internal waves with a wavelength of 1.25 km that are very well coupled in the vertical direction. The internal waves with a wavelength of 3 km, in the depth range of 200–600 m, are also well coupled, but in an oblique direction; this suggests that the propagation speed of internal waves of this scale changes with depth in this area. Finally, the internal waves with a wavelength of 6.5 km, observed in the depth range of 200–800 m, are separated into two parts with a phase difference of about 90°, by a clear interface at a depth of 650 m; this allows us to infer an oblique propagation of wave energy of this scale.
Xue, Yan; Balmaseda, Magdalena A.; Boyer, Tim; Ferry, Nicolas; Good, Simon; Ishikawa, Ichiro; Rienecker, Michele; Rosati, Tony; Yin, Yonghong; Kumar, Arun
2012-01-01
Upper ocean heat content (HC) is one of the key indicators of climate variability on many time-scales extending from seasonal to interannual to long-term climate trends. For example, HC in the tropical Pacific provides information on thermocline anomalies that is critical for the long-lead forecast skill of ENSO. Since HC variability is also associated with SST variability, a better understanding and monitoring of HC variability can help us understand and forecast SST variability associated with ENSO and other modes such as Indian Ocean Dipole (IOD), Pacific Decadal Oscillation (PDO), Tropical Atlantic Variability (TAV) and Atlantic Multidecadal Oscillation (AMO). An accurate ocean initialization of HC anomalies in coupled climate models could also contribute to skill in decadal climate prediction. Errors and uncertainties in the estimation of HC variability can be affected by many factors, including uncertainties in surface forcings, ocean model biases, and deficiencies in data assimilation schemes. Changes in observing systems can also leave an imprint on the estimated variability. The availability of multiple operational ocean analyses (ORA) that are routinely produced by operational and research centers around the world provides an opportunity to assess uncertainties in HC analyses and to help identify gaps in observing systems as they impact the quality of ORAs and therefore climate model forecasts. A comparison of ORAs also gives an opportunity to identify deficiencies in data assimilation schemes, and can be used as a basis for development of real-time multi-model ensemble HC monitoring products. The OceanObs09 Conference called for an intercomparison of ORAs and use of ORAs for global ocean monitoring. As a follow up, we intercompared HC variations from ten ORAs -- two objective analyses based on in-situ data only and eight model analyses based on ocean data assimilation systems. The mean, annual cycle, interannual variability and long-term trend of HC have
Energy Analysis in Combined Reforming of Propane
K. Moon
2013-01-01
Combined (steam and CO2) reforming is one of the methods to produce syngas for different applications. An energy requirement analysis, spanning steam reforming to dry reforming with intermediate steps of steam reduction and equivalent CO2 addition to the feed fuel for syngas generation, has been done to identify conditions for optimum process operation. Thermodynamic equilibrium data for combined reforming were generated for the temperature range of 400–1000°C at 1 bar pressure and combined oxidant (CO2 + H2O) stream to propane (fuel) ratios of 3, 6, and 9 by employing the Gibbs free energy minimization algorithm of HSC Chemistry software 5.1. The total energy requirement, including preheating and reaction enthalpy, was calculated using the equilibrium product composition. Carbon and methane formation was significantly lower in combined reforming than in pure dry reforming, while the energy requirements were lower than for pure steam reforming. Temperatures of minimum energy requirement were found in the data analysis of combined reforming, which were optimum for the process.
ZHOU Fu-chang; CHEN Jin; HE Jun; BI Guo; LI Fu-cai; ZHANG Gui-cai
2005-01-01
The vibration signals of rolling element bearings are produced by a combination of periodic and random processes due to the machine's rotation cycle and interaction with the real world. The combination of such components can give rise to signals which have periodically time-varying ensemble statistics and are best considered as cyclostationary. When an early fault occurs, the background noise is very heavy, and it is difficult to disclose the latent periodic components successfully using cyclostationary analysis alone. In this paper the degree of cyclostationarity is combined with wavelet filtering for detection of rolling element bearing early faults. The parameters of the wavelet filter are optimized using the proposed entropy minimization rule. This method is shown to be effective in detecting rolling element bearing early faults when cyclostationary analysis by itself fails.
A Selective Fuzzy Clustering Ensemble Algorithm
Kai Li; Peng Li
2013-01-01
To improve the performance of the clustering ensemble method, a selective fuzzy clustering ensemble algorithm is proposed. It mainly includes selection of clustering ensemble members and combination of clustering results. In the process of member selection, a measure is defined to select the better clustering members. Then some selected clustering members are viewed as a hyper-graph in order to select the more influential hyper-edges (or features) and to weight the selected features. For proce...
CINR difference analysis of optimal combining versus maximal ratio combining
Burke, J. P.; Zeidler, J R; Rao, B D
2005-01-01
The statistical gain differences between two common spatial combining algorithms, optimum combining (OC) and maximal ratio combining (MRC), are analyzed using a gain ratio method. Using the receive carrier-to-interference plus noise ratio (CINR), the gain ratio CINR_OC/CINR_MRC is evaluated in a flat Rayleigh fading communications system with multiple interferers. Exact analytical solutions are derived for the probability density function (PDF) and the average gain ratio with one interferer. Whe...
Analysis of fractals with combined partition
Dedovich, T. G.; Tokarev, M. V.
2016-03-01
The space-time properties in the general theory of relativity, as well as the discreteness and non-Archimedean property of space in the quantum theory of gravitation, are discussed. It is emphasized that the properties of bodies in non-Archimedean spaces coincide with the properties of the field of p-adic numbers and fractals. It is suggested that parton showers, used for describing interactions between particles and nuclei at high energies, have a fractal structure. A mechanism of fractal formation with combined partition is considered. The modified SePaC method is offered for the analysis of such fractals. The BC, PaC, and SePaC methods for determining a fractal dimension and other fractal characteristics (numbers of levels and values of a base of forming a fractal) are considered. It is found that the SePaC method has advantages for the analysis of fractals with combined partition.
A mollified Ensemble Kalman filter
Bergemann, Kay
2010-01-01
It is well recognized that discontinuous analysis increments of sequential data assimilation systems, such as ensemble Kalman filters, might lead to spurious high frequency adjustment processes in the model dynamics. Various methods have been devised to continuously spread out the analysis increments over a fixed time interval centered about analysis time. Among these techniques are nudging and incremental analysis updates (IAU). Here we propose another alternative, which may be viewed as a hybrid of nudging and IAU and which arises naturally from a recently proposed continuous formulation of the ensemble Kalman analysis step. A new slow-fast extension of the popular Lorenz-96 model is introduced to demonstrate the properties of the proposed mollified ensemble Kalman filter.
J. Dietrich
2008-03-01
Flood forecasts are essential to issue reliable flood warnings and to initiate flood control measures on time. The accuracy and the lead time of the predictions for head waters primarily depend on the meteorological forecasts. Ensemble forecasts are a means of framing the uncertainty of the potential future development of the hydro-meteorological situation.
This contribution presents a flood management strategy based on probabilistic hydrological forecasts driven by operational meteorological ensemble prediction systems. The meteorological ensemble forecasts are transformed into discharge ensemble forecasts by a rainfall-runoff model. Exceedance probabilities for critical discharge values and probabilistic maps of inundation areas can be computed and presented to decision makers. These results can support decision makers in issuing flood alerts. The flood management system integrates ensemble forecasts with different spatial resolution and different lead times. The hydrological models are controlled in an adaptive way, mainly depending on the lead time of the forecast, the expected magnitude of the flood event and the availability of measured data.
The aforementioned flood forecast techniques have been applied to a case study. The Mulde River Basin (South-Eastern Germany, Czech Republic) has often been affected by severe flood events, including local flash floods. Hindcasts for the large-scale extreme flood in August 2002 have been computed using meteorological predictions from both the COSMO-LEPS ensemble prediction system and the deterministic COSMO-DE local model. The temporal evolution of (a) the meteorological forecast uncertainty and (b) the probability of exceeding flood alert levels is discussed. Results from the hindcast simulations demonstrate that the systems would have predicted a high probability of an extreme flood event had they already been operational in 2002. COSMO-LEPS showed a reasonably good
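The exceedance probabilities mentioned in this abstract can be sketched as a simple count over discharge ensemble members. This is only an illustration: it assumes equally weighted members, and the discharge values, the alert level, and the function name are all hypothetical.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members whose peak discharge exceeds the threshold.

    Assumes every member is equally likely; a weighted ensemble would need weights.
    """
    if not members:
        raise ValueError("ensemble is empty")
    peaks = [max(series) for series in members]
    return sum(peak > threshold for peak in peaks) / len(peaks)

# Hypothetical discharge ensemble in m^3/s: 4 members, 3 lead times each
ensemble = [
    [120.0, 310.0, 280.0],
    [110.0, 190.0, 170.0],
    [130.0, 420.0, 390.0],
    [100.0, 150.0, 140.0],
]
alert_level = 300.0  # hypothetical flood alert level
p_alert = exceedance_probability(ensemble, alert_level)
```

Two of the four hypothetical members peak above the alert level, so `p_alert` is 0.5.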
Combined XRF and PIXE analysis of flour
Combined X-Ray Fluorescence (XRF) and Proton Induced X-ray Emission (PIXE) techniques were used for the determination of trace and minor elements in two different samples of flour purchased at the local market. The significance of some of the elements found in the samples is discussed from the viewpoint of nutrition. It is also shown that XRF can be a useful complementary technique for PIXE analysis of flour
The Fukushima-137Cs deposition case study: properties of the multi-model ensemble
In this paper we analyse the properties of an eighteen-member ensemble generated by the combination of five atmospheric dispersion modelling systems and six meteorological data sets. The models have been applied to the total deposition of 137Cs following the nuclear accident at the Fukushima power plant in March 2011. The analysis aims to determine whether the ensemble is reliable and sufficiently diverse, and whether its accuracy and precision can be improved. Although ensemble practice is becoming more and more popular in many geophysical applications, good practice guidelines are missing as to how models should be combined for the ensembles to offer an improvement over single model realisations. We show that the models in the ensemble share large portions of bias and variance, and make use of several techniques to further show that subsets of models can explain the same amount of variance as the full ensemble mean with the advantage of being poorly correlated, which saves computational resources and reduces noise (and thus improves accuracy). We further propose and discuss two methods for selecting subsets of skilful and diverse members, and prove that, for the present analysis, their mean outscores the full ensemble mean in terms of both accuracy (error) and precision (variance).
Mandel, Jan; Kondratenko, Volodymyr Y
2010-01-01
We propose a new type of the Ensemble Kalman Filter (EnKF), which uses the Fast Fourier Transform (FFT) for covariance estimation from a very small ensemble with automatic tapering, and for a fast computation of the analysis ensemble by convolution, avoiding the need to solve a sparse system with the tapered matrix. The FFT EnKF is combined with the morphing EnKF to enable the correction of position errors, in addition to amplitude errors, and demonstrated on WRF-Fire, the Weather Research Forecasting (WRF) model coupled with a fire spread model implemented by the level set method.
Jogendra Kushwah
2013-06-01
Classifying cancer diseases from free radical gene data is a challenging task in biomedical data engineering. Various classifiers have been used to improve the gene-selection-based classification of cancer diseases, but their classifications have not been validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest. The random forest is itself an ensemble technique in which a number of tree classifiers are combined, each assigning a class at its leaf nodes. In this paper we combine a neural network with a random forest ensemble classifier to classify selected cancer genes for the diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow a neural network input-output paradigm where the members of the ensemble are selected from a set of neural network classifiers; here the number of classifiers is determined while the forest is grown. Furthermore, the proposed method produces an ensemble that is not only accurate but also diverse, ensuring the two important properties that should characterize an ensemble classifier. For the empirical evaluation of the proposed method we used the UCI cancer diseases data set. Our experimental results show better performance in comparison with random forest classification.
Bouallegue, Zied Ben; Theis, Susanne E; Pinson, Pierre
2015-01-01
Probabilistic forecasts in the form of an ensemble of scenarios are required for complex decision making processes. Ensemble forecasting systems provide such products, but the spatio-temporal structure of the forecast uncertainty is lost when statistical calibration of the ensemble forecasts is applied for each lead time and location independently. Non-parametric approaches allow the reconstruction of spatio-temporal joint probability distributions at a low computational cost. For example, the ensemble copula coupling (ECC) method consists in rebuilding the multivariate aspect of the forecast from the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error. The new approach, which preserves the dynamical development of the ensemble members, is called dynamic ensemble copula coupling (...
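The standard ECC reordering step referred to in this abstract can be sketched as follows: at each lead time, the calibrated values are sorted and handed to the members in the rank order of the raw ensemble. This is a minimal sketch of plain ECC only, not of the dynamic variant proposed in the study; the wind speed numbers and the function name are hypothetical.

```python
def ecc_reorder(raw_members, calibrated_values):
    """Give calibrated values the rank order of the raw ensemble.

    Both arguments hold one value per member for a single lead time and location.
    """
    if len(raw_members) != len(calibrated_values):
        raise ValueError("raw and calibrated ensembles must have the same size")
    # Member indices ordered from lowest to highest raw forecast value
    order = sorted(range(len(raw_members)), key=lambda i: raw_members[i])
    sorted_calibrated = sorted(calibrated_values)
    reordered = [0.0] * len(raw_members)
    for rank, member in enumerate(order):
        reordered[member] = sorted_calibrated[rank]
    return reordered

# Hypothetical wind speeds (m/s): 2 lead times x 3 members
raw = [[3.1, 2.0, 4.5], [5.0, 5.5, 4.0]]
calibrated = [[2.2, 3.0, 3.9], [4.1, 4.8, 5.6]]
scenarios = [ecc_reorder(r, c) for r, c in zip(raw, calibrated)]
```

Each member keeps its relative position at every lead time, so a member that was the windiest in the raw ensemble remains the windiest after calibration.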
Measuring social interaction in music ensembles.
Volpe, Gualtiero; D'Ausilio, Alessandro; Badino, Leonardo; Camurri, Antonio; Fadiga, Luciano
2016-05-01
Music ensembles are an ideal test-bed for quantitative analysis of social interaction. Music is an inherently social activity, and music ensembles offer a broad variety of scenarios which are particularly suitable for investigation. Small ensembles, such as string quartets, are deemed a significant example of self-managed teams, where all musicians contribute equally to a task. In bigger ensembles, such as orchestras, the relationship between a leader (the conductor) and a group of followers (the musicians) clearly emerges. This paper presents an overview of recent research on social interaction in music ensembles with a particular focus on (i) studies from cognitive neuroscience; and (ii) studies adopting a computational approach for carrying out automatic quantitative analysis of ensemble music performances. PMID:27069054
Evaluation of LDA Ensembles Classifiers for Brain Computer Interface
The Brain Computer Interface (BCI) translates brain activity into computer commands. To increase the performance of the BCI in decoding the user's intentions, it is necessary to improve the feature extraction and classification techniques. In this article the performance of an ensemble of three linear discriminant analysis (LDA) classifiers is studied. An ensemble-based system can theoretically achieve better classification results than its individual counterpart, depending on the algorithm used to generate the individual classifiers and the procedure used to combine their outputs. Classic ensemble algorithms such as bagging and boosting are discussed here. For the BCI application, it was concluded that the results obtained using ER and AUC as performance indices do not give enough information to establish which configuration is better.
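One common way to combine the outputs of several classifiers is a majority vote, as used in bagging. The abstract does not state which combining rule was evaluated, so the rule below is an assumption, and the labels and function name are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per trial.

    `predictions` holds one list of labels per classifier, all of equal length.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three classifiers on four trials
clf_outputs = [
    ["left", "right", "left", "right"],
    ["left", "left", "left", "right"],
    ["right", "right", "left", "left"],
]
combined = majority_vote(clf_outputs)
```

With three classifiers and two classes a tie cannot occur, which is one reason odd-sized ensembles are convenient for binary problems.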
Thermodynamic Analysis of Combined Cycle Power Plant
A.K.Tiwari,
2010-04-01
The Air Bottoming Cycle (ABC) can replace the heat recovery steam generator and the steam turbine of the conventional combined cycle plant. The exhaust energy of the topping gas turbine of the existing combined cycle is sent to a gas-air heat exchanger, which heats the air in the secondary gas turbine cycle. In the 1980s the ABC was proposed as an alternative to the conventional steam bottoming cycle. In spite of the cost of reducing hardware installations it could achieve a thermal efficiency of 80%. The complete thermodynamic analysis of the system has been performed using a specially designed programme that allows the main independent variables to be varied. The results show that the gain in net work output as well as efficiency of the combined cycle is 35% to 68%.
Medical images usually suffer from a partial volume effect (PVE), which may degrade the accuracy of any quantitative information extracted from the images. Our aim was to recreate accurate radioactivity concentration and time-activity curves (TACs) by microPET R4 quantification using ensemble learning independent component analysis (EL-ICA). We designed a digital cardiac phantom for this simulation and in order to evaluate the ability of EL-ICA to correct the PVE, the simulated images were convoluted using a Gaussian function (FWHM = 1-4 mm). The robustness of the proposed method towards noise was investigated by adding statistical noise (SNR = 2-16). During further evaluation, another set of cardiac phantoms were generated from the reconstructed images, and Poisson noise at different levels was added to the sinogram. In real experiments, four rat microPET images and a number of arterial blood samples were obtained; these were used to estimate the metabolic rate of FDG (MRFDG). Input functions estimated using the FastICA method were used for comparison. The results showed that EL-ICA could correct PVE in both the simulated and real cases. After correcting for the PVE, the errors for MRFDG, when estimated by the EL-ICA method, were smaller than those when TACs were directly derived from the PET images and when the FastICA approach was used.
Morzfeld, Matthias
2015-01-01
In data assimilation one updates the state of a numerical model with information from sparse and noisy observations of the model's state. A popular approach to data assimilation in geophysical applications is the ensemble Kalman filter (EnKF). An alternative approach is particle filtering and, recently, much theoretical work has been done to understand the abilities and limitations of particle filters. Here we extend this work to EnKF. First we explain that EnKF and particle filters solve different problems: the EnKF approximates a specific marginal of the joint posterior of particle filters. We then perform a linear analysis of the EnKF as a sequential sampling algorithm for the joint posterior (i.e. as a particle filter), and show that the EnKF collapses on this problem in the exact same way and under similar conditions as particle filters. However, it is critical to realize that the collapse of the EnKF on the joint posterior does not imply its collapse on the marginal posterior. This raises the question, ...
PHARMACOECONOMIC ANALYSIS OF ANTIHYPERTENSIVE DRUG COMBINATIONS USE
E. I. Tarlovskaya
2015-09-01
Aim. To perform a pharmacoeconomic analysis of two drug combinations of an ACE inhibitor (enalapril) and a diuretic. Material and methods. Patients with arterial hypertension degree 2 and diabetes mellitus type 2 without ischemic heart disease (n=56) were included in the study. Blood pressure (BP) dynamics and the cost/effectiveness ratio were evaluated. Results. In group A (fixed combination of original enalapril/hydrochlorothiazide), 61% of patients achieved the target BP level with the initial dose, and the remaining 39% with a double dose. In group B (non-fixed combination of generic enalapril/indapamide), 60% of patients achieved the target BP with the initial dose of drugs, 33% with a double dose of the ACE inhibitor, and 7% with additional amlodipine administration. In patients of group A the systolic BP (SBP) reduction was 45.82±1.23 mm Hg by the 12th week vs. 40.0±0.81 mm Hg in patients of group B; the diastolic BP (DBP) reduction was 22.47±1.05 mm Hg and 18.76±0.70 mm Hg, respectively, by the 12th week of treatment. In the first month of treatment the cost of achieving the target BP was 298.62 rubles per patient in group A and 299.50 rubles in group B; by the 12th week of treatment the costs were 629.45 and 631.22 rubles, respectively. The costs of reducing SBP and DBP by 1 mm Hg during 12 weeks of therapy were 13 and 27 rubles per patient, respectively, in group A, and 16 and 34 rubles per patient, respectively, in group B. Conclusion. The original fixed combination (enalapril+hydrochlorothiazide) proved to be more clinically effective and more cost effective in the treatment of hypertensive patients than the non-fixed combination of generic drugs (enalapril+indapamide).
Supervised Ensemble Classification of Kepler Variable Stars
Bass, Gideon
2016-01-01
Variable star analysis and classification is an important task in the understanding of stellar features and processes. While historically classifications have been done manually by highly skilled experts, the recent and rapid expansion in the quantity and quality of data has demanded new techniques, most notably automatic classification through supervised machine learning. We present an expansion of existing work in the field by analyzing variable stars in the Kepler field using an ensemble approach, combining multiple characterization and classification techniques to produce improved classification rates. Classifications for each of the roughly 150,000 stars observed by Kepler are produced, separating the stars into one of 14 variable star classes.
A Gaussian mixture ensemble transform filter
Reich, Sebastian
2011-01-01
We generalize the popular ensemble Kalman filter to an ensemble transform filter where the prior distribution can take the form of a Gaussian mixture or a Gaussian kernel density estimator. The design of the filter is based on a continuous formulation of the Bayesian filter analysis step. We call the new filter algorithm the ensemble Gaussian mixture filter (EGMF). The EGMF is implemented for three simple test problems (Brownian dynamics in one dimension, Langevin dynamics in two dimensions, ...
The Ensembl Variant Effect Predictor.
McLaren, William; Gil, Laurent; Hunt, Sarah E; Riat, Harpreet Singh; Ritchie, Graham R S; Thormann, Anja; Flicek, Paul; Cunningham, Fiona
2016-01-01
The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs. PMID:27268795
Laursen, Bo
1998-01-01
In this article the author proposes a solution to the classical problem in European lexical semantics of delimiting lexical fields, a problem that most field-oriented semanticists involved in practical lexico-semantic analysis have found themselves confronted with. What are the criteria for sayin...
ZHENG Xiaogu; WU Guocan; ZHANG Shupeng; LIANG Xiao; DAI Yongjiu; LI Yong
2013-01-01
Correctly estimating the forecast error covariance matrix is a key step in any data assimilation scheme. If it is not correctly estimated, the assimilated states could be far from the true states. A popular method to address this problem is error covariance matrix inflation, that is, multiplying the forecast error covariance matrix by an appropriate factor. In this paper, analysis states are used to construct the forecast error covariance matrix, and an adaptive estimation procedure associated with the error covariance matrix inflation technique is developed. The proposed assimilation scheme was tested on the Lorenz-96 model and a 2D Shallow Water Equation model, both of which are associated with spatially correlated observational systems. The experiments showed that by introducing the proposed structure of the forecast error covariance matrix and applying its adaptive estimation procedure, the assimilation results were further improved.
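Multiplicative covariance inflation, as described in this abstract, can be sketched by scaling the ensemble perturbations about their mean. The sketch below uses a fixed, hand-picked factor; the adaptive estimation of that factor, which is the paper's contribution, is not reproduced here, and the ensemble values and function name are hypothetical.

```python
import numpy as np

def inflate(ensemble, factor):
    """Multiply the sample covariance of an ensemble by `factor`.

    `ensemble` has shape (n_state, n_members). Scaling the perturbations about
    the ensemble mean by sqrt(factor) scales their covariance by `factor`
    while leaving the mean unchanged.
    """
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + np.sqrt(factor) * (ensemble - mean)

rng = np.random.default_rng(1)
forecast = rng.standard_normal((3, 20))   # hypothetical: 3 state variables, 20 members
inflated = inflate(forecast, 1.5)         # 1.5 is an arbitrary illustrative factor
```

Working on the perturbations avoids forming the covariance matrix explicitly, which matters when the state dimension is large.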
Roberto F Galán
2010-09-01
We have combined neurophysiologic recording, statistical analysis, and computational modeling to investigate the dynamics of the respiratory network in the brainstem. Using a multielectrode array, we recorded ensembles of respiratory neurons in perfused in situ rat preparations that produce spontaneous breathing patterns, focusing on inspiratory pre-motor neurons. We compared firing rates and neuronal synchronization among these neurons before and after a brief hypoxic stimulus. We observed a significant decrease in the number of spikes after stimulation, in part due to a transient slowing of the respiratory pattern. However, the median interspike interval did not change, suggesting that the firing threshold of the neurons was not affected but rather the synaptic input was. A bootstrap analysis of synchrony between spike trains revealed that, both before and after brief hypoxia, up to 45 % (but typically less than 5 %) of coincident spikes across neuronal pairs was not explained by chance. Most likely, this synchrony resulted from common synaptic input to the pre-motor population, an example of stochastic synchronization. After brief hypoxia most pairs were less synchronized, although some were more, suggesting that the respiratory network was “rewired” transiently after the stimulus. To investigate this hypothesis, we created a simple computational model with feed-forward divergent connections along the inspiratory pathway. Assuming that (1) the number of divergent projections was not the same for all presynaptic cells, but rather spanned a wide range, and (2) the stimulus increased inhibition at the top of the network, this model reproduced the reduction in firing rate and bootstrap-corrected synchrony subsequent to hypoxic stimulation observed in our experimental data.
Multinomial logistic regression ensembles.
Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J
2013-05-01
This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models the proposed method can handle a huge database without a constraint needed for analyzing high-dimensional data, and the random partition can improve the prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance including overall prediction accuracy, sensitivity, and specificity for each category is evaluated on two real data sets and simulation data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to a single multinomial logit model and it shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and random multinomial logit model. PMID:23611203
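The random partition of predictors described in this abstract can be sketched as below. Fitting the multinomial logit base classifiers is omitted, and combining them by averaging class probabilities is an assumption made for illustration; the abstract says the authors implemented their method in R, whereas this sketch uses Python, and all names and numbers are hypothetical.

```python
import random

def random_partition(n_features, n_subsets, seed=None):
    """Split feature indices into mutually exclusive subsets of near-equal size."""
    indices = list(range(n_features))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list assigns every index to exactly one subset
    return [sorted(indices[k::n_subsets]) for k in range(n_subsets)]

def average_probabilities(per_model_probs):
    """Average the class-probability vectors produced by the base classifiers."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]

subsets = random_partition(n_features=10, n_subsets=3, seed=0)

# Hypothetical class probabilities from three base models for one observation
avg = average_probabilities([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.7, 0.2, 0.1]])
predicted_class = avg.index(max(avg))
```

Because each base model sees only its own subset of predictors, no single model has to handle the full high-dimensional feature space.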
Evaluation of an ensemble-based incremental variational data assimilation
Yang, Yin; Robinson, Cordelia; Heitz, Dominique; Mémin, Etienne
2014-01-01
In this work, we aim at studying ensemble based optimal control strategies for data assimilation. Such formulation nicely combines the ingredients of ensemble Kalman filters and variational data assimilation (4DVar). In the same way as variational assimilation schemes, it is formulated as the minimization of an objective function, but similarly to ensemble filter, it introduces in its objective function an empirical ensemble-based background-error covariance and works in an off-line smoothing...
Geophysical inversion with a neighbourhood algorithm-II. Appraising the ensemble
Sambridge, Malcolm
1999-09-01
Monte Carlo direct search methods, such as genetic algorithms, simulated annealing, etc., are often used to explore a finite-dimensional parameter space. They require the solving of the forward problem many times, that is, making predictions of observables from an earth model. The resulting ensemble of earth models represents all `information' collected in the search process. Search techniques have been the subject of much study in geophysics; less attention is given to the appraisal of the ensemble. Often inferences are based on only a small subset of the ensemble, and sometimes a single member. This paper presents a new approach to the appraisal problem. To our knowledge this is the first time the general case has been addressed, that is, how to infer information from a complete ensemble, previously generated by any search method. The essence of the new approach is to use the information in the available ensemble to guide a resampling of the parameter space. This requires no further solving of the forward problem, but from the new `resampled' ensemble we are able to obtain measures of resolution and trade-off in the model parameters, or any combinations of them. The new ensemble inference algorithm is illustrated on a highly non-linear wave-form inversion problem. It is shown how the computation time and memory requirements scale with the dimension of the parameter space and size of the ensemble. The method is highly parallel, and may easily be distributed across several computers. Since little is assumed about the initial ensemble of earth models, the technique is applicable to a wide variety of situations. For example, it may be applied to perform `error analysis' using the ensemble generated by a genetic algorithm, or any other direct search method.
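The idea of reusing an existing ensemble without further forward solves can be illustrated with a nearest-neighbour surrogate: the misfit at any new point is taken from the closest model already in the ensemble. This is a simplified sketch under that assumption, not the paper's resampling algorithm; it also assumes the parameter axes are scaled comparably, and the models, misfits, and function names are hypothetical.

```python
import math

def nearest_member(point, models):
    """Index of the ensemble model closest to `point` (Euclidean distance)."""
    return min(range(len(models)), key=lambda i: math.dist(point, models[i]))

def approximate_misfit(point, models, misfits):
    """Surrogate misfit at `point`: the stored misfit of the nearest model.

    No forward problem is solved; only the stored ensemble is consulted.
    """
    return misfits[nearest_member(point, models)]

# Hypothetical two-parameter ensemble with previously computed misfits
models = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
misfits = [2.5, 0.4, 1.1]
value = approximate_misfit((0.9, 0.2), models, misfits)
```

The surrogate is piecewise constant over the parameter space, so its cost depends only on the ensemble size and dimension, not on the cost of the forward problem.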
Sanchez, E.; Zaninelli, P.; Carril, A.; Menendez, C.; Dominguez, M.
2012-04-01
An ensemble of seven regional climate models (RCM) included in the European CLARIS-LPB project (A Europe-South America Network for Climate Change Assessment and Impact Studies in La Plata Basin) are used to study how some features related to climatic extremes are projected to be changed by the end of XXIst century. These RCMs are forced by different IPCC-AR4 global climate models (IPSL, ECHAM5 and HadCM3), covering three different 30-year periods: present (1960-1990), near future (2010-2040) and distant future (2070-2100), with 50km of horizontal resolution. These regional climate models have previously been forced with ERA-Interim reanalysis, in a consistent procedure with CORDEX (A COordinated Regional climate Downscaling EXperiment) initiative for the South-America domain. The analysis shows a good agreement among them and the available observational databases to describe the main features of the mean climate of the continent. Here we focus our analysis on some topics of interest related to extreme events, such as the development of diagnostics related to dry-spells length, the structure of the frequency distribution functions over several subregions defined by more or less homogeneous climatic conditions (four sub-basins over the La Plata Basin, the southern part of the Amazon basin, Northeast Brazil, and the South Atlantic Convergence Zone (SACZ)), the structure of the annual cycle and their main features and relation with the length of the seasons, or the frequency of anomalous hot or cold events. One shortcoming that must be considered is the lack of observational databases with both time and spatial frequency to validate model outputs. At the same time, one challenging issue of this study is the regional modelling description of a continent where a huge variety of climates are present, from desert to mountain conditions, and from tropical to subtropical regimes. Another basic objective of this preliminary work is also to obtain a measure of the spread among
Nonlinear stability and ergodicity of ensemble based Kalman filters
Tong, Xin T.; Majda, Andrew J.; Kelly, David
2016-02-01
The ensemble Kalman filter (EnKF) and ensemble square root filter (ESRF) are data assimilation methods used to combine high dimensional, nonlinear dynamical models with observed data. Despite their widespread usage in climate science and oil reservoir simulation, very little is known about the long-time behavior of these methods and why they are effective when applied with modest ensemble sizes in large dimensional turbulent dynamical systems. By following the basic principles of energy dissipation and controllability of filters, this paper establishes a simple, systematic and rigorous framework for the nonlinear analysis of EnKF and ESRF with arbitrary ensemble size, focusing on the dynamical properties of boundedness and geometric ergodicity. The time uniform boundedness guarantees that the filter estimate will not diverge to machine infinity in finite time, which is a potential threat for EnKF and ESRF known as catastrophic filter divergence. Geometric ergodicity ensures in addition that the filter has a unique invariant measure and that initialization errors will dissipate exponentially in time. We establish these results by introducing a natural notion of observable energy dissipation. The time uniform bound is achieved through a simple Lyapunov function argument. This result applies to systems with complete observations and strong kinetic energy dissipation, but also to concrete examples with incomplete observations. With the Lyapunov function argument established, the geometric ergodicity is obtained by verifying the controllability of the filter processes; in particular, such analysis for ESRF relies on a careful multivariate perturbation analysis of the covariance eigen-structure.
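For readers unfamiliar with the filter whose stability is analysed here, one analysis step of the stochastic (perturbed-observation) EnKF can be sketched as follows. This is a textbook form under assumed settings (linear observation operator, Gaussian observation noise); it is not taken from the paper, and the two-variable example and function name are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, R, rng):
    """One stochastic EnKF analysis step.

    ensemble: (n_state, n_members) forecast ensemble
    obs:      (n_obs,) observation vector
    H:        (n_obs, n_state) linear observation operator
    R:        (n_obs, n_obs) observation error covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_members - 1)        # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)         # Kalman gain
    # Each member assimilates its own noisy copy of the observation
    noise = rng.multivariate_normal(np.zeros(len(obs)), R, size=n_members).T
    perturbed = obs[:, None] + noise
    return ensemble + K @ (perturbed - H @ ensemble)

rng = np.random.default_rng(2)
prior = rng.normal(loc=5.0, scale=2.0, size=(2, 50))     # 2 variables, 50 members
H = np.array([[1.0, 0.0]])                               # observe the first variable only
R = np.array([[0.01]])
posterior = enkf_analysis(prior, np.array([1.0]), H, R, rng)
```

The second variable is unobserved, so it is updated only through its sample correlation with the first, which is the mechanism that makes small ensembles prone to spurious corrections.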
A Selective Fuzzy Clustering Ensemble Algorithm
Kai Li
2013-12-01
To improve the performance of the clustering ensemble method, a selective fuzzy clustering ensemble algorithm is proposed. It mainly includes the selection of clustering ensemble members and the combination of clustering results. In the process of member selection, a measure is defined to select the better clustering members. Then some selected clustering members are viewed as a hyper-graph in order to select the more influential hyper-edges (or features) and to weight the selected features. For processing hyper-edges with fuzzy membership, the CSPA and MCLA consensus functions are generalized. In the experiments, some UCI data sets are chosen to test the presented algorithm’s performance. The experimental results show that the proposed ensemble method obtains better clustering ensemble results.
Ben Bouallègue, Zied; Heppelmann, Tobias; Theis, Susanne E.;
2015-01-01
is applied for each lead time and location independently. Non-parametric approaches allow the reconstruction of spatio-temporal joint probability distributions at a low computational cost. For example, the ensemble copula coupling (ECC) method consists in rebuilding the multivariate aspect of the forecast...... from the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error....... The new approach, which preserves the dynamical development of the ensemble members, is called dynamic ensemble copula coupling (d-ECC). The ensemble based empirical copulas, ECC and d-ECC, are applied to wind forecasts from the high resolution ensemble system COSMO-DE-EPS run operationally at the German...
魏巍; 郭晨
2012-01-01
To improve the recognition rate of off-line handwritten Manchu characters, a recognition method based on an ensemble of back-propagation (BP) neural network classifiers with combined features is presented. First, the scanned image of handwritten Manchu text is preprocessed and segmented into Manchu character units. Second, the projection features, chain code features, and endpoint and cross-point features of each character unit are extracted, and classification is performed on these three feature types and on their combinations. Finally, the recognition results are post-processed with a hidden Markov model, which further improves the recognition accuracy. The experimental results show that the recognition rate of the ensemble classifier is higher than that of any single feature, and that the more feature types the ensemble includes, the better the recognition.
Meier, P.; Tilmant, A.; Boucher, M.; Anctil, F.
2012-12-01
In a reservoir system, benefits are usually increased if the system is operated in a coordinated manner. However, despite ever increasing computational power available to users, the optimization of a large system of reservoirs and hydropower stations remains a challenge, especially if uncertainties are included. When applying optimization methods, such as stochastic dynamic programming, the size of a problem becomes quickly too large to be solved. This situation is also known as the curse of dimensionality which limits the applicability of SDP to systems involving only two to three reservoirs. The fact that by design most reservoirs serve multiple purposes adds another difficulty when the operation is to be optimized. A method which is able to address the optimization of multi-purpose reservoirs even in large systems is stochastic dual dynamic programming (SDDP). This approximative dynamic programming technique represents the future benefit function with a number of hyperplanes. The SDDP model developed in this study maximizes the expected net benefits associated with the operation of a reservoir system on a midterm horizon (several years, monthly time step). SDDP provides, at each time step, estimates of the marginal water value stored in each reservoir. Reservoir operators, however, are interested in day-to-day decisions. To provide an operational optimization framework tailored for short-term decision support, the SDDP optimization can be coupled with a short-term nonlinear programming optimization using hydrological ensemble forecasts. The short-term objective therefore consists of the total electricity production within the forecast horizon and the total value of water stored in all the reservoirs. Thus, maximizing this objective ensures that a short-term decision does not contradict the strategic planning. This optimization framework is implemented for the Gatineau river basin, a sub-basin of the Ottawa river north of the city of Ottawa. The Gatineau river
Layered Ensemble Architecture for Time Series Forecasting.
Rahman, Md Mustafizur; Islam, Md Monirul; Murase, Kazuyuki; Yao, Xin
Time series forecasting (TSF) has been widely used in many application areas such as science, engineering, and finance. The phenomena generating time series are usually unknown and information available for forecasting is only limited to the past values of the series. It is, therefore, necessary to use an appropriate number of past values, termed lag, for forecasting. This paper proposes a layered ensemble architecture (LEA) for TSF problems. Our LEA consists of two layers, each of which uses an ensemble of multilayer perceptron (MLP) networks. While the first ensemble layer tries to find an appropriate lag, the second ensemble layer employs the obtained lag for forecasting. Unlike most previous work on TSF, the proposed architecture considers both accuracy and diversity of the individual networks in constructing an ensemble. LEA trains different networks in the ensemble by using different training sets with an aim of maintaining diversity among the networks. However, it uses the appropriate lag and combines the best trained networks to construct the ensemble. This indicates LEA's emphasis on accuracy of the networks. The proposed architecture has been tested extensively on time series data of neural network (NN)3 and NN5 competitions. It has also been tested on several standard benchmark time series data. In terms of forecasting accuracy, our experimental results have revealed clearly that LEA is better than other ensemble and nonensemble methods. PMID:25751882
The iterative ensemble Kalman smoother (IEnKS) is a data assimilation method meant for efficiently tracking the state of nonlinear geophysical models. It combines an ensemble of model states to estimate the errors similarly to the ensemble square root Kalman filter, with a 4D-variational analysis performed within the ensemble space. As such it belongs to the class of ensemble variational methods. Recently introduced 4DEnVar or the 4D-LETKF can be seen as particular cases of the scheme. The IEnKS was shown to outperform 4D-Var, the ensemble Kalman filter (EnKF) and smoother, with low-order models in all investigated dynamical regimes. Like any ensemble method, it could require the use of localization of the analysis when the state space dimension is high. However, localization for the IEnKS is not as straightforward as for the EnKF. Indeed, localization needs to be defined across time, and it needs to be as much as possible consistent with the dynamical flow within the data assimilation variational window. We show that a Liouville equation governs the time evolution of the localization operator, which is linked to the evolution of the error correlations. It is argued that its time integration strongly depends on the forecast dynamics. Using either covariance localization or domain localization, we propose and test several localization strategies meant to address the issue: (i) a constant and uniform localization, (ii) the propagation through the window of a restricted set of dominant modes of the error covariance matrix, (iii) the approximate propagation of the localization operator using model covariant local domains. These schemes are illustrated on the one-dimensional Lorenz 40-variable model.
Quantitative assessment of climate change risk requires a method for constructing probabilistic time series of changes in physical climate parameters. Here, we develop two such methods, Surrogate/Model Mixed Ensemble (SMME) and Monte Carlo Pattern/Residual (MCPR), and apply them to construct joint probability density functions (PDFs) of temperature and precipitation change over the 21st century for every county in the United States. Both methods produce $likely$ (67% probability) temperature and precipitation projections consistent with the Intergovernmental Panel on Climate Change's interpretation of an equal-weighted Coupled Model Intercomparison Project 5 (CMIP5) ensemble, but also provide full PDFs that include tail estimates. For example, both methods indicate that, under representative concentration pathway (RCP) 8.5, there is a 5% chance that the contiguous United States could warm by at least 8$^\circ$C. Variance decomposition of SMME and MCPR projections indicates that background variability dominates...
Makowski, D; Asseng, S; Ewert, F.; Bassu, S; Durand, J.L.; Li, T; Martre, P; Adam, M.; Aggarwal, P K; Angulo, C; Baron, C; Basso, B; Bertuzzi, P; Biernath, C; Boogaard, H; Boote, K J; Bouman, B; Bregaglio, S; Brisson, N; Buis, S; Cammarano, D; Challinor, A J; Confalonieri, R; Conijn, J G; Corbeels, M; Deryng, D; De Sanctis, G; Doltra, J; Fumoto, T; Gaydon, D; Gayler, S; Goldberg, R; Grant, R F; Grassini, P; Hatfield, J L; Hasegawa, T; Heng, L; Hoek, S; Hooker, J; Hunt, L A; Ingwersen, J; Izaurralde, R C; Jongschaap, R E E; Jones, J W; Kemanian, R A; Kersebaum, K C; Kim, S.-H.; Lizaso, J; Marcaida Ill, M; Müller, C; Nakagawa, H; Naresh Kumar, S; Nendel, C; O'Leary, G J; Olesen, Jørgen Eivind; Oriol, P; Osborne, T M; Palosuo, T; Pravia, M V; Priesack, E; Ripoche, D; Rosenzweig, C; Ruane, A C; Ruget, F; Sau, F; Semenov, M A; Shcherbak, I; Singh, B; Singh, U; Soo, H K; Steduto, P; Stöckle, C; Stratonovitch, P; Streck, T; Supit, I; Tang, L.; Tao, F; Teixeira, E I; Thorburn, P; Timlin, D; Travasso, M; Rötter, R P; Waha, K; Wallach, D; White, J W; Wilkens, P; Williams, J R; Wolf, J.; Yin, X; Yoshida, H; Zhang, Z; Zhu, Y
concentration levels, and can thus be used to calculate temperature and [CO2] thresholds leading to yield loss or yield gain, without re-running the original complex crop models. Our approach is illustrated with three yield datasets simulated by 19 maize models, 26 wheat models, and 13 rice models. Several......Ensembles of process-based crop models are increasingly used to simulate crop growth for scenarios of temperature and/or precipitation changes corresponding to different projections of atmospheric CO2 concentrations. This approach generates large datasets with thousands of simulated crop yield data...... the simulation protocols. Here we demonstrate that statistical models based on random-coefficient regressions are able to emulate ensembles of process-based crop models. An important advantage of the proposed statistical models is that they can interpolate between temperature levels and between CO2...
Supervised online learning with an ensemble of students randomized by the choice of initial conditions is analyzed. For the case of the perceptron learning rule, asymptotically the same improvement in the generalization error of the ensemble compared to the performance of a single student is found as in Gibbs learning. For more optimized learning rules, however, using an ensemble yields no improvement. This is explained by showing that for any learning rule $f$ a transform $\tilde{f}$ exists,...
Schemes for implementation of CNOT gates in atomic ensembles are important for realization of quantum computing. We present here a theoretical scheme of a CNOTN gate with an ensemble of three-level atoms in the lambda configuration and a single two-level control atom. We work in the regime of Rydberg blockade for the ensemble atoms due to excitation of the Rydberg control atom. It is shown that using STIRAP, atoms from one ground state of the ensemble can be adiabatically transferred to the other ground state, depending on the state of the control atom. A thorough analysis of adiabatic conditions for this scheme and the influence of the radiative decay is provided. We show that the CNOTN process is immune to the decay rate of the excited level in ensemble atoms. This work is supported by the ARL, the IARPA LogiQ program, and the AFOSR MURI program.
Reliability analysis of RC containment structures under combined loads
This paper discusses a reliability analysis method and load combination design criteria for reinforced concrete containment structures under combined loads. The probability-based reliability analysis method is briefly described. For load combination design criteria, derivations of the load factors for accidental pressure due to a design basis accident and safe shutdown earthquake (SSE) for three target limit state probabilities are presented.
Ensemble methods for noise in classification problems
Verbaeten, Sofie; Van Assche, Anneleen
Ensemble methods combine a set of classifiers to construct a new classifier that is (often) more accurate than any of its component classifiers. In this paper, we use ensemble methods to identify noisy training examples. More precisely, we consider the problem of mislabeled training examples in classification tasks, and address this problem by pre-processing the training set, i.e. by identifying and removing outliers from the training set. We study a number of filter techniques that are based...
Enhanced ensemble-based 4DVar scheme for data assimilation
Yang, Yin; Robinson, Cordelia; Heitz, Dominique; Mémin, Etienne
Ensemble based optimal control schemes combine the components of ensemble Kalman filters and variational data assimilation (4DVar). They are trendy because they are easier to implement than 4DVar. In this paper, we evaluate a modified version of an ensemble based optimal control strategy for image data assimilation. This modified method is assessed with a Shallow Water model combined with synthetic data and original incomplete experimental depth sensor observations. ...
Control Flow Analysis for SF Combinator Calculus
Lester, Martin
Programs that transform other programs often require access to the internal structure of the program to be transformed. This is at odds with the usual extensional view of functional programming, as embodied by the lambda calculus and SK combinator calculus. The recently-developed SF combinator calculus offers an alternative, intensional model of computation that may serve as a foundation for developing principled languages in which to express intensional computation, including program transfo...
Multilevel ensemble Kalman filtering
Hoel, Håkon; Law, Kody J. H.; Tempone, Raul
This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (ENKF), thereby yielding a multilevel ensemble Kalman filter (MLENKF) which has provably superior asymptotic cost to a given accuracy level. The theoretical results are illustrated numerically.
New York: ACM, 2015, Article No. 17. ISBN 978-1-4503-3393-1. [ECSAW '15. European Conference on Software Architecture Workshops. Dubrovnik (HR), 07.09.2015-08.09.2015] Institutional support: RVO:67985807 Keywords: distributed coordination * architectural adaptation * ensemble-based component systems * component model * emergent architecture * component ensembles * autonomic systems Subject RIV: JC - Computer Hardware; Software
Malignancy and Abnormality Detection of Mammograms using Classifier Ensembling
Nawazish Naveed
Breast cancer detection and diagnosis is a critical and complex procedure that demands a high degree of accuracy. In computer-aided diagnostic systems, breast cancer detection is a two-stage procedure: in the first stage, mammograms are classified as malignant or benign, and in the second stage, the type of abnormality is detected. In this paper, we have developed a novel architecture to enhance the classification of malignant and benign mammograms using multi-classification of malignant mammograms into six abnormality classes. Discrete wavelet transform (DWT) features are extracted from preprocessed images and passed through different classifiers. To improve accuracy, the results generated by the various classifiers are ensembled. A genetic algorithm is used to find optimal weights rather than assigning weights to the results of classifiers on the basis of heuristics. The mammograms declared malignant by the ensemble classifiers are divided into six classes. The ensemble classifiers are further used for multiclassification using the one-against-all technique. The output of all ensemble classifiers is combined by the product, median and mean rules. It has been observed that the accuracy of classification of abnormalities is more than 97% in the case of the mean rule. The Mammographic Image Analysis Society dataset is used for experimentation.
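The product, median and mean rules named in the abstract can be sketched as follows. This is an illustrative example only: the class scores are invented, the function name is hypothetical, and the genetic-algorithm weighting described in the abstract is not reproduced here.

```python
from math import prod
from statistics import mean, median
from typing import Callable, Dict, List, Sequence

# Each rule reduces the scores that all classifiers gave to one class.
RULES: Dict[str, Callable[[Sequence[float]], float]] = {
    "product": prod,
    "median": median,
    "mean": mean,
}


def combine(scores: List[List[float]], rule: str) -> int:
    """Return the index of the winning class.

    scores[i][j] is the score classifier i assigns to class j.
    """
    per_class = list(zip(*scores))  # regroup scores by class
    fused = [RULES[rule](column) for column in per_class]
    return fused.index(max(fused))


# Three hypothetical classifiers scoring three classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.8, 0.1],
]
for name in RULES:
    print(name, combine(scores, name))
```

On this made-up input the mean and product rules pick class 1, while the median rule picks class 0, which shows that the choice of rule can change the decision when the classifiers disagree.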
Numerous documents, letters and recommendations of UNESCO discuss the importance of community in the process of revitalization, protection and preservation of architectural ensembles, especially when located in urban areas. The conservation of a particular area becomes successful when the structural, social, economic and cultural factors are identified and discussed, and the solutions applied. In that sense, this article is the result of research whose object, the Historic Center of Fortaleza-CE-Brazil, was evaluated through a questionnaire administered to its residents, workers and users of services in this area, aimed at diagnosing the historic, artistic and architectural value of what is representative of the 19th century in the region.
2016-01-01
We use the random Green's matrix model to study the scaling properties of the localization transition for scalar waves in a three-dimensional (3D) ensemble of resonant point scatterers. We show that the probability density $p(g)$ of normalized decay rates of quasi-modes $g$ is very broad at the transition and in the localized regime and that it does not obey a single-parameter scaling law. The latter holds, however, for the small-$g$ part of $p(g)$ which we exploit to estimate the critical exponent $\
Towards a GME ensemble forecasting system: Ensemble initialization using the breeding technique
The quantitative forecast of precipitation requires a probabilistic background, particularly with regard to forecast lead times of more than 3 days. As only ensemble simulations can provide useful information on the underlying probability density function, we built a new ensemble forecasting system (GME-EFS) based on the GME model of the German Meteorological Service (DWD). For the generation of appropriate initial ensemble perturbations we chose the breeding technique developed by Toth and Kalnay (1993, 1997), which develops perturbations by estimating the regions of largest model error induced uncertainty. This method is applied and tested in the framework of quasi-operational forecasts for a three-month period in 2007. The performance of the resulting ensemble forecasts is compared to the operational ensemble prediction systems ECMWF EPS and NCEP GFS by means of ensemble spread of free atmosphere parameters (geopotential and temperature) and ensemble skill of precipitation forecasting. This comparison indicates that the GME ensemble forecasting system (GME-EFS) provides reasonable forecasts with a spread skill score comparable to that of the NCEP GFS. An analysis with the continuous ranked probability score exhibits a lack of resolution for the GME forecasts compared to the operational ensembles. However, with significant enhancements during the three-month test period, the first results of our work with the GME-EFS indicate possibilities for further development as well as the potential for later operational usage.
Bacterial light-harvesting pigment-protein complexes are very efficient at converting photons into excitons and transferring them to reaction centers, where the energy is stored in a chemical form. Optical properties of the complexes are known to change significantly in time and also vary from one complex to another; therefore, a detailed understanding of the variations on the level of single complexes and how they accumulate into effects that can be seen on the macroscopic scale is required. While experimental and theoretical methods exist to study the spectral properties of light-harvesting complexes on both individual complex and bulk ensemble levels, they have been developed largely independently of each other. To fill this gap, we simultaneously analyze experimental low-temperature single-complex and bulk ensemble optical spectra of the light-harvesting complex-2 (LH2) chromoproteins from the photosynthetic bacterium Rhodopseudomonas acidophila in order to find a unique theoretical model consistent with both experimental situations. The model, which satisfies most of the observations, combines strong exciton-phonon coupling with significant disorder, characteristic of the proteins. We establish a detailed disorder model that, in addition to containing a C2-symmetrical modulation of the site energies, distinguishes between static intercomplex and slow conformational intracomplex disorders. The model evaluations also verify that, despite best efforts, the single-LH2-complex measurements performed so far may be biased toward complexes with higher Huang-Rhys factors.
The Local Ensemble Transform Kalman Filter (LETKF) with a Global NWP Model on the Cubed Sphere
Shin, Seoleun; Kang, Ji-Sun; Jo, Youngsoon
We develop an ensemble data assimilation system using the four-dimensional local ensemble transform Kalman filter (LETKF) for a global hydrostatic numerical weather prediction (NWP) model formulated on the cubed sphere. Forecast-analysis cycles run stably and thus provide newly updated initial states for the model to produce ensemble forecasts every 6 h. The performance of the LETKF implemented in the global NWP model is verified using the ECMWF reanalysis data and conventional observations. Global mean values of bias and root mean square difference are significantly reduced by the data assimilation. In addition, statistics of forecast and analysis converge well as the forecast-analysis cycles are repeated. These results suggest that the combined system of the LETKF and the global NWP model formulated on the cubed sphere shows promising performance for operational use.
Spectral diagonal ensemble Kalman filters
A new type of ensemble Kalman filter is developed, which is based on replacing the sample covariance in the analysis step by its diagonal in a spectral basis. It is proved that this technique improves the approximation of the covariance when the covariance itself is diagonal in the spectral basis, as is the case, e.g., for a second-order stationary random field and the Fourier basis. The method is extended by wavelets to the case when the state variables are random fields, which are not spatially homogeneous. Efficient implementations by the fast Fourier transform (FFT) and discrete wavelet transform (DWT) are presented for several types of observations, including high-dimensional data given on a part of the domain, such as radar and satellite images. Computational experiments confirm that the method performs well on the Lorenz 96 problem and the shallow water equations with very small ensembles and over multiple analysis cycles.
Analysis of combining ability in soybean cultivars
Eight soybean cultivars (Doko, Bossier, Ocepar-4, BR-15, FT-Cometa, Savana, Paraná and Cristalina) were crossed in a diallel design. Plants of the F1 generation and their parents were evaluated under short-day conditions for the determination of the general (GCA) and specific (SCA) combining ability. The estimated GCA and SCA values were significant for the evaluated traits except for the “total cycle”. Highest GCA effects for the traits “days to flowering”, “plant height”, “insertion height”, “n...
Setup Analysis: Combining SMED with Other Tools
The purpose of this paper is to propose a methodology for setup analysis that can be implemented mainly in small and medium enterprises which are not convinced of the need to improve their setups. The methodology was developed after research which determined the problem. Companies still have difficulties with a long setup time. Many of them do nothing to decrease this time. A long setup is not a sufficient reason for companies to undertake any actions towards the setup time reduction. To encourage companies to implement SMED it is essential to make some analyses of changeovers in order to discover problems. The methodology proposed can really encourage the management to take a decision about the SMED implementation, and that was verified in a production company. The setup analysis methodology is made up of seven steps. Four of them concern a setup analysis in a chosen area of a company, such as a work stand which is a bottleneck with many setups. The goal is to convince the management to begin actions concerning the setups improvement. The last three steps are related to a certain setup and, there, the goal is to reduce the setup time and the risk of problems which can appear during the setup. In this paper, tools such as SMED, Pareto analysis, statistical analysis, FMEA and others were used.
Controlling balance in an ensemble Kalman filter
We present a method to control unbalanced fast dynamics in an ensemble Kalman filter by introducing a weak constraint on the imbalance in a spatially sparse observational network. We show that the balance constraint produces significantly more balanced analyses than ensemble Kalman filters without balance constraints and than filters implementing incremental analysis updates (IAU). Furthermore, our filter with the weak constraint on imbalance produces good rms error statisti...
Development and testing of the GRAPES regional ensemble-3DVAR hybrid data assimilation system
Based on the GRAPES (Global/Regional Assimilation and Prediction System) regional ensemble prediction system and 3DVAR (three-dimensional variational) data assimilation system, which are implemented operationally at the Numerical Weather Prediction Center of the China Meteorological Administration, an ensemble-based 3DVAR (En-3DVAR) hybrid data assimilation system for GRAPES_Meso (the regional mesoscale numerical prediction system of GRAPES) was developed by using the extended control variable technique to implement a hybrid background error covariance that combines the climatological covariance and ensemble-estimated covariance. Considering the problems of the ensemble-based data assimilation part of the system, including the reduction in the degree of geostrophic balance between variables, and the non-smooth analysis increment and its obviously smaller size compared with the 3DVAR data assimilation, corresponding measures were taken to optimize and ameliorate the system. Accordingly, a single pressure observation ensemble-based data assimilation experiment was conducted to ensure that the ensemble-based data assimilation part of the system is correct and reasonable. A number of localization-scale sensitivity tests of the ensemble-based data assimilation were also conducted to determine the most appropriate localization scale. Then, a number of hybrid data assimilation experiments were carried out. The results showed that it was most appropriate to set the weight factor of the ensemble-estimated covariance in the experiments to be 0.8. Compared with the 3DVAR data assimilation, the geopotential height forecast of the hybrid data assimilation experiments improved very little, but the wind forecast improved slightly at each forecast time, especially over 300 hPa. Overall, the hybrid data assimilation demonstrates some advantages over the 3DVAR data assimilation.
Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.
In this paper, we consider supervised learning under the assumption that the available memory is small compared to the dataset size. This general framework is relevant in the context of big data, distributed databases and embedded systems. We investigate a very simple, yet effective, ensemble framework that builds each individual model of the ensemble from a random patch of data obtained by drawing random subsets of both instances and features from the whole dataset. We carry out an extensive...
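Drawing a random patch, i.e. a random subset of both instances and features, can be sketched as below. This is a minimal illustration under stated assumptions: the function names, the fractions, and the toy dataset are hypothetical, and sampling without replacement is an assumption rather than something the abstract specifies.

```python
import random
from typing import List, Tuple


def draw_patch(
    n_instances: int,
    n_features: int,
    instance_fraction: float,
    feature_fraction: float,
    rng: random.Random,
) -> Tuple[List[int], List[int]]:
    """Pick row and column indices for one patch (without replacement)."""
    n_rows = max(1, int(instance_fraction * n_instances))
    n_cols = max(1, int(feature_fraction * n_features))
    rows = sorted(rng.sample(range(n_instances), n_rows))
    cols = sorted(rng.sample(range(n_features), n_cols))
    return rows, cols


def extract_patch(data: List[List[int]], rows: List[int], cols: List[int]) -> List[List[int]]:
    """Copy only the selected rows and columns out of the full dataset."""
    return [[data[r][c] for c in cols] for r in rows]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [[10 * r + c for c in range(6)] for r in range(8)]  # 8 instances, 6 features
rows, cols = draw_patch(8, 6, 0.5, 0.5, rng)
patch = extract_patch(data, rows, cols)
print(rows, cols, patch)
```

Each model in the ensemble would be trained on its own patch, so no single model needs the full dataset in memory.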
A hyper-ensemble forecast of surface drift
The prediction of surface drift of water is an important task, with applications such as marine transport, pollutant dispersion, and search-and-rescue activities. However, it is also very challenging, because it depends on ocean models that (usually) do not completely accurately represent wind-induced current, that do not include wave-driven currents, etc. However, the real surface drift depends on all present physical phenomena, which moreover interact in complex ways. Furthermore, although each of these factors can be forecasted by deterministic models, the latter all suffer from limitations, resulting in imperfect predictions. In the present study, we try to predict the drift of buoys launched during the DART06 (Dynamics of the Adriatic sea in Real-Time 2006) and MREA07 (Maritime Rapid Environmental Assessment 2007) sea trials, using the so-called hyper-ensemble technique: different models are combined in order to minimize departure from independent observations during a training period. The obtained combination is then used in forecasting mode. We review and try out different hyper-ensemble techniques, such as the simple ensemble mean, least-squares weighted linear combinations, and techniques based on data assimilation, which dynamically update the model's weights in the combination when new observations become available. We show that the latter methods alleviate the need to fix the training length a priori. When the forecast period is relatively short, the discussed methods lead to much smaller forecasting errors compared with individual models (at least 3 times smaller), with the dynamic methods leading to the best results. When many models are available, errors can be further reduced by removing collinearities between them by performing a principal component analysis. At the same time, this reduces the number of weights to be determined. In complex environments, the skill of individual models may vary over time periods smaller than the desired
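The least-squares weighted linear combination mentioned in the abstract can be sketched as follows. This is an illustrative example, not the study's code: the two "models", their forecasts, and the observations are invented, and the weights are obtained from the normal equations with a small hand-written solver to keep the sketch dependency-free.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_weights(forecasts: List[List[float]], observations: List[float]) -> List[float]:
    """Weights minimizing the squared misfit over the training period.

    forecasts[k][t] is the forecast of model k at training time t.
    """
    k = len(forecasts)
    gram = [[sum(x * y for x, y in zip(forecasts[i], forecasts[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(x * o for x, o in zip(forecasts[i], observations)) for i in range(k)]
    return solve(gram, rhs)


def combine(weights: List[float], model_values: List[float]) -> float:
    """Apply the trained weights to new model forecasts (forecasting mode)."""
    return sum(w * v for w, v in zip(weights, model_values))


# Hypothetical training data: observations equal 0.7 * model 1 + 0.3 * model 2.
forecasts = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
observations = [1.3, 1.7, 3.3, 3.7]
weights = fit_weights(forecasts, observations)
print(weights, combine(weights, [2.0, 4.0]))
```

Because the normal equations become ill-conditioned when models are nearly collinear, a practical version would decorrelate the models first, for example with the principal component analysis the abstract mentions.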
Combination of structural reliability and interval analysis
In engineering applications, probabilistic reliability theory appears to be presently the most important method; however, in many cases precise probabilistic reliability theory cannot be considered an adequate and credible model of the real state of affairs. In this paper, we develop a hybrid of probabilistic and non-probabilistic reliability theory, which describes the uncertain structural parameters as interval variables when statistical data are found insufficient. By using interval analysis, a new method for calculating the interval of the structural reliability as well as the reliability index is introduced, and the traditional probabilistic theory is incorporated with the interval analysis. Moreover, the new method preserves the useful part of the traditional probabilistic reliability theory, but removes the restriction of its strict requirement on data acquisition. An example is presented to demonstrate the feasibility and validity of the proposed theory.
Combining risk analysis and security testing
A systematic integration of risk analysis and security testing allows for optimizing the test process as well as the risk assessment itself. The result of the risk assessment, i.e. the identified vulnerabilities, threat scenarios and unwanted incidents, can be used to guide the test identification and may complement requirements engineering results with systematic information concerning the threats and vulnerabilities of a system and their probabilities and consequences. This information can ...
Data assimilation with the weighted ensemble Kalman filter
In this paper, two data assimilation methods based on sequential Monte Carlo sampling are studied and compared: the ensemble Kalman filter and the particle filter. Each of these techniques has its own advantages and drawbacks. In this work, we try to get the best of each method by combining them. The proposed algorithm, called the weighted ensemble Kalman filter, relies on the ensemble Kalman filter updates of samples in order to define a proposal distribution for the particle filte...
The dynamics of exploitation in ensembles of source and sink
The ensemble is a new entity on a higher level of complexity composed of source and sink. When substrate is transferred from source to sink within the transfer space or the ensemble space non-linearity is observed. Saturating production functions of source and sink in combination with linear cost functions generate superadditivity and subadditivity in the productivity of the ensemble. In a reaction chain the source produces a product that will be used by the sink to produce a different pr...
Enhanced Sampling in the Well-Tempered Ensemble
We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann averages are computed on the ...
Triticeae resources in Ensembl Plants.
Recent developments in DNA sequencing have enabled the large and complex genomes of many crop species to be determined for the first time, even those previously intractable due to their polyploid nature. Indeed, over the course of the last 2 years, the genome sequences of several commercially important cereals, notably barley and bread wheat, have become available, as well as those of related wild species. While still incomplete, comparison with other, more completely assembled species suggests that coverage of genic regions is likely to be high. Ensembl Plants (http://plants.ensembl.org) is an integrative resource organizing, analyzing and visualizing genome-scale information for important crop and model plants. Available data include reference genome sequence, variant loci, gene models and functional annotation. For variant loci, individual and population genotypes, linkage information and, where available, phenotypic information are shown. Comparative analyses are performed on DNA and protein sequence alignments. The resulting genome alignments and gene trees, representing the implied evolutionary history of the gene family, are made available for visualization and analysis. Driven by the case of bread wheat, specific extensions to the analysis pipelines and web interface have recently been developed to support polyploid genomes. Data in Ensembl Plants is accessible through a genome browser incorporating various specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These interfaces are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests and pollinators, facilitating the study of the plant in its environment. PMID:25432969
Conductor gestures influence evaluations of ensemble performance.
Previous research has found that listener evaluations of ensemble performances vary depending on the expressivity of the conductor's gestures, even when performances are otherwise identical. It was the purpose of the present study to test whether this effect of visual information was evident in the evaluation of specific aspects of ensemble performance: articulation and dynamics. We constructed a set of 32 music performances that combined auditory and visual information and were designed to feature a high degree of contrast along one of two target characteristics: articulation and dynamics. We paired each of four music excerpts recorded by a chamber ensemble in both a high- and low-contrast condition with video of four conductors demonstrating high- and low-contrast gesture specifically appropriate to either articulation or dynamics. Using one of two equivalent test forms, college music majors and non-majors (N = 285) viewed sixteen 30 s performances and evaluated the quality of the ensemble's articulation, dynamics, technique, and tempo along with overall expressivity. Results showed significantly higher evaluations for performances featuring high rather than low conducting expressivity regardless of the ensemble's performance quality. Evaluations for both articulation and dynamics were strongly and positively correlated with evaluations of overall ensemble expressivity. PMID:25104944
A Localized Ensemble Kalman Smoother
Numerous geophysical inverse problems prove difficult because the available measurements are only indirectly related to the underlying unknown dynamic state, and the physics governing the system may involve imperfect models or unobserved parameters. Data assimilation addresses these difficulties by combining the measurements and physical knowledge. The main challenge in such problems is usually their high dimensionality, for which standard statistical methods prove computationally intractable. This paper develops a new high-dimensional Monte-Carlo approach called the localized ensemble Kalman smoother and addresses its theoretical convergence.
Measuring stride variability and dynamics in children is useful for the quantitative study of gait maturation and neuromotor development in childhood and adolescence. In this paper, we computed the sample entropy (SampEn) and average stride interval (ASI) parameters to quantify the stride series of 50 gender-matched child participants in three age groups. We also normalized the SampEn and ASI values by leg length and body mass for each participant, respectively. Results show that the original and normalized SampEn values decrease consistently and significantly (Mann-Whitney U test, p<0.01) in children aged 3–14 years, which indicates that stride irregularity is significantly ameliorated with body growth. The original and normalized ASI values also differ significantly between any two of the groups of young (aged 3–5 years), middle (aged 6–8 years), and elder (aged 10–14 years) children. Such results suggest that healthy children may better modulate their gait cadence rhythm with the development of their musculoskeletal and neurological systems. In addition, the AdaBoost.M2 and Bagging algorithms were used to effectively distinguish the children's gait patterns. These ensemble learning algorithms both provided excellent gait classification results in terms of overall accuracy (≥90%), recall (≥0.8), and precision (≥0.8077).
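As a rough illustration of the SampEn statistic mentioned in the abstract above, the sketch below computes sample entropy for a short series. It assumes the common textbook definition (Chebyshev distance, self-matches excluded, the same number of templates for lengths m and m+1); the function name, the defaults for `m` and `r`, and the use of an absolute tolerance are assumptions made for this example, not details taken from the paper.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D sequence (illustrative sketch).

    r is an absolute tolerance here; in practice it is often chosen as a
    fraction of the standard deviation of the series (an assumption, not
    something stated in the abstract).
    """
    n = len(series)

    def count_matches(length):
        # Use n - m templates for both lengths so the two counts are comparable.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # j > i excludes self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # template pairs matching over m points
    a = count_matches(m + 1)  # pairs that still match over m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # undefined for this series; reported as infinity
    return -math.log(a / b)
```

A perfectly periodic series gives a value of zero, and less regular series give larger values, which is the sense in which a decreasing SampEn indicates reduced stride irregularity.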
The semantic similarity ensemble
Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.
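The panel-of-experts idea can be sketched in a few lines. The example below assumes that the ensemble score for a term pair is the plain mean of the member scores and that agreement with human judgments is measured by Pearson correlation; both choices, the function names, and the toy numbers are hypothetical illustrations, since the abstract does not specify the aggregation or the evaluation statistic.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ensemble_scores(member_scores):
    """Average the members' scores term pair by term pair (assumed aggregation)."""
    return [mean(column) for column in zip(*member_scores)]


# Made-up similarity scores for four term pairs, for illustration only.
human = [0.9, 0.1, 0.5, 0.7]
members = [
    [0.8, 0.3, 0.4, 0.9],  # hypothetical measure A
    [1.0, 0.0, 0.7, 0.4],  # hypothetical measure B
]
sse = ensemble_scores(members)
plausibility = pearson(sse, human)  # agreement of the ensemble with human judgments
```

Comparing `plausibility` with the mean of `pearson(member, human)` over the members is one way to reproduce the kind of ensemble-versus-average-member comparison the abstract reports.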
The packet combining (PC) scheme is a well-defined, simple error correction scheme for detecting and correcting errors at the receiver. Although it permits a higher throughput than other basic ARQ protocols, the PC scheme fails to correct errors when they occur in the same bit locations of the copies. A previous work studied the Packet Reversed Packet Combining (PRPC) scheme, which corrects errors that occur at the same bit location of erroneous copies; however, PRPC does not handle the situation where a packet has more than one erroneous bit. The Modified Packet Combining (MPC) scheme, which can correct double or higher bit errors, was studied elsewhere. Previous studies suggest that both PRPC and MPC offer higher throughput, but neither adequate investigation nor exact analysis was done to substantiate this claim. In this work, an exact analysis of both PRPC and MPC is carried out and the results are reported. A combined protocol (PRPC and MPC) is proposed, and the analysis shows that it is capable of offering even higher throughput and better error correction capability at high bit error rate (BER) and larger packet size. (author)
Competitive Learning Neural Network Ensemble Weighted by Predicted Performance
Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…
Exergy Analysis of Combined Cycle Power Plant: NTPC Dadri, India
The aim of the present paper is an exergy analysis of the combined Brayton/Rankine power cycle of NTPC Dadri, India. A theoretical exergy analysis is carried out for the different components of the Dadri combined cycle power plant, which consists of a gas turbine unit, a heat recovery steam generator without extra fuel consumption, and a steam turbine unit. The results indicate that more exergy losses occurred in the gas turbine combustion chamber. These reached 35% of the total exergy losses while the exergy losse...
Estimating combining ability in popcorn lines using multivariate analysis
Aiming to estimate the combining ability in tropical and temperate popcorn (Zea mays L. var. everta Sturt.) lines using multivariate analysis, ten popcorn lines were crossed in a complete diallel without reciprocals and the lines and hybrids were tested in two randomized complete block experiments with three replicates. Data were subjected to univariate and multivariate ANOVA, principal component analysis, and univariate and multivariate diallel analysis. For multivariate diallel analysis, va...
Ensemble Forecasting of Major Solar Flares
We present the results from the first ensemble prediction model for major solar flares (M and X classes). Using the probabilistic forecasts from three models hosted at the Community Coordinated Modeling Center (NASA-GSFC) and the NOAA forecasts, we developed an ensemble forecast by linearly combining the flaring probabilities from all four methods. Performance-based combination weights were calculated using a Monte Carlo-type algorithm by applying a decision threshold $P_{th}$ to the combined probabilities and maximizing the Heidke Skill Score (HSS). Using the probabilities and events time series from 13 recent solar active regions (2012 - 2014), we found that a linear combination of probabilities can improve both probabilistic and categorical forecasts. Combination weights vary with the applied threshold and none of the tested individual forecasting models seem to provide more accurate predictions than the others for all values of $P_{th}$. According to the maximum values of HSS, a performance-based weights ...
Imprinting and recalling cortical ensembles.
Neuronal ensembles are coactive groups of neurons that may represent building blocks of cortical circuits. These ensembles could be formed by Hebbian plasticity, whereby synapses between coactive neurons are strengthened. Here we report that repetitive activation with two-photon optogenetics of neuronal populations from ensembles in the visual cortex of awake mice builds neuronal ensembles that recur spontaneously after being imprinted and do not disrupt preexisting ones. Moreover, imprinted ensembles can be recalled by single-cell stimulation and remain coactive on consecutive days. Our results demonstrate the persistent reconfiguration of cortical circuits by two-photon optogenetics into neuronal ensembles that can perform pattern completion. PMID:27516599
Disease-associated mutations that alter the RNA structural ensemble.
Genome-wide association studies (GWAS) often identify disease-associated mutations in intergenic and non-coding regions of the genome. Given the high percentage of the human genome that is transcribed, we postulate that for some observed associations the disease phenotype is caused by a structural rearrangement in a regulatory region of the RNA transcript. To identify such mutations, we have performed a genome-wide analysis of all known disease-associated Single Nucleotide Polymorphisms (SNPs) from the Human Gene Mutation Database (HGMD) that map to the untranslated regions (UTRs) of a gene. Rather than using minimum free energy approaches (e.g. mFold), we use a partition function calculation that takes into consideration the ensemble of possible RNA conformations for a given sequence. We identified in the human genome disease-associated SNPs that significantly alter the global conformation of the UTR to which they map. For six disease-states (Hyperferritinemia Cataract Syndrome, beta-Thalassemia, Cartilage-Hair Hypoplasia, Retinoblastoma, Chronic Obstructive Pulmonary Disease (COPD), and Hypertension), we identified multiple SNPs in UTRs that alter the mRNA structural ensemble of the associated genes. Using a Boltzmann sampling procedure for sub-optimal RNA structures, we are able to characterize and visualize the nature of the conformational changes induced by the disease-associated mutations in the structural ensemble. We observe in several cases (specifically the 5' UTRs of FTL and RB1) SNP-induced conformational changes analogous to those observed in bacterial regulatory Riboswitches when specific ligands bind. We propose that the UTR and SNP combinations we identify constitute a "RiboSNitch," that is a regulatory RNA in which a specific SNP has a structural consequence that results in a disease phenotype. Our SNPfold algorithm can help identify RiboSNitches by leveraging GWAS data and an analysis of the mRNA structural ensemble.
Embedded feature ranking for ensemble MLP classifiers
A feature ranking scheme for multilayer perceptron (MLP) ensembles is proposed, along with a stopping criterion based upon the out-of-bootstrap estimate. To solve multi-class problems feature ranking is combined with modified error-correcting output coding. Experimental results on benchmark data demonstrate the versatility of the MLP base classifier in removing irrelevant features.
A multisite seasonal ensemble streamflow forecasting technique
We present a technique for providing seasonal ensemble streamflow forecasts at several locations simultaneously on a river network. The framework is an integration of two recent approaches: the nonparametric multimodel ensemble forecast technique and the nonparametric space-time disaggregation technique. The four main components of the proposed framework are as follows: (1) an index gauge streamflow is constructed as the sum of flows at all the desired spatial locations; (2) potential predictors of the spring season (April-July) streamflow at this index gauge are identified from the large-scale ocean-atmosphere-land system, including snow water equivalent; (3) the multimodel ensemble forecast approach is used to generate the ensemble flow forecast at the index gauge; and (4) the ensembles are disaggregated using a nonparametric space-time disaggregation technique resulting in forecast ensembles at the desired locations and for all the months within the season. We demonstrate the utility of this technique in skillful forecast of spring seasonal streamflows at four locations in the Upper Colorado River Basin at different lead times. Where applicable, we compare the forecasts to the Colorado Basin River Forecast Center's Ensemble Streamflow Prediction (ESP) and the National Resource Conservation Service "coordinated" forecast, which is a combination of the ESP, Statistical Water Supply, a principal component regression technique, and modeler knowledge. We find that overall, the proposed method is equally skillful to existing operational models while tending to better predict wet years. The forecasts from this approach can be a valuable input for efficient planning and management of water resources in the basin.
We propose several means for improving the performance and training of neural networks for classification. We use crossvalidation as a tool for optimizing network parameters and architecture. We show further that the remaining generalization error can be reduced by invoking ensembles of similar ... networks ...
Ensemble approach for differentiation of malignant melanoma
Melanoma is the deadliest type of skin cancer, yet it is the most treatable kind depending on its early diagnosis. The early prognosis of melanoma is a challenging task for both clinicians and dermatologists. Due to the importance of early diagnosis and in order to assist dermatologists, we propose an automated framework based on ensemble learning methods and dermoscopy images to differentiate melanoma from dysplastic and benign lesions. The evaluation of our framework on the recent and public dermoscopy benchmark (PH2 dataset) indicates the potential of the proposed method. Our evaluation, using only global features, revealed that ensembles such as random forest perform better than a single learner. Using a random forest ensemble and a combination of color and texture features, our framework achieved the highest sensitivity of 94% and specificity of 92%.
We explore various models for the pattern forming instability in a laser-driven cloud of cold two-level atoms with a plane feedback mirror. Focus is on the combined treatment of nonlinear propagation in a diffractively thick medium and the boundary condition given by feedback. The combined presence of purely transverse transmission gratings and reflection gratings on wavelength scale is addressed. Different truncation levels of the Fourier expansion of the dielectric susceptibility in terms of these gratings are discussed and compared to literature. A formalism to calculate the exact solution for the homogeneous state in the presence of absorption is presented. The relationship between the counterpropagating beam instability and the feedback instability is discussed. Feedback reduces the threshold by a factor of two under optimal conditions. Envelope curves which bound all possible threshold curves for varying mirror distances are calculated. The results compare well with experimental results regarding the obs...
This study aims to explore the utility of the impact response surface (IRS) approach for investigating model ensemble crop yield responses under a large range of changes in climate. IRSs of spring and winter wheat (Triticum aestivum) yields were constructed from a 26-member ensemble of process-based crop simulation models for sites in Finland, Germany and Spain across a latitudinal transect in Europe. The sensitivity of modelled yield to systematic increments of changes in temperature (-2 to ...
Minimalist ensemble algorithms for genome-wide protein localization prediction
2012-07-01
Background: Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results: This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of the individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of the available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from an AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual
Kingston Soundpainting Ensemble
This performance is designed to introduce teachers and school musicians to this multidisciplinary live composing sign language. The Kingston Soundpainting Ensemble, led by Dr. Helen Julia Minors (soundpainter, trumpet, voice) at Kingston University, is represented by a varied set of performers, using woodwind, brass, voice and percussion, spanning popular, classical and world styles. This performance consists of: Philip Warda (electronic instruments,...
Ensemble forecasts aim at framing the uncertainties of the potential future development of the hydro-meteorological situation. A probabilistic evaluation can be used to communicate forecast uncertainty to decision makers. Here an operational system for ensemble based flood forecasting is presented, which combines forecasts from the European COSMO-LEPS, SRNWP-PEPS and COSMO-DE prediction systems. A multi-model lagged average super-ensemble is generated by recombining members from different runs of these meteorological forecast systems. A subset of the super-ensemble is selected based on a priori model weights, which are obtained from ensemble calibration. Flood forecasts are simulated by the conceptual rainfall-runoff-model ArcEGMO. Parameter uncertainty of the model is represented by a parameter ensemble, which is generated a priori from a comprehensive uncertainty analysis during model calibration. The use of a computationally efficient hydrological model within a flood management system allows us to compute the hydro-meteorological model chain for all members of the sub-ensemble. The model chain is not re-computed before new ensemble forecasts are available, but the probabilistic assessment of the output is updated when new information from deterministic short range forecasts or from assimilation of measured data becomes available. For hydraulic modelling, with the desired result of a probabilistic inundation map with high spatial resolution, a replacement model can help to overcome computational limitations. A prototype of the developed framework has been applied for a case study in the Mulde river basin. However, these techniques, in particular the probabilistic assessment and the derivation of decision rules, are still in their infancy. Further research is necessary and promising.
Optimizing matching and analysis combinations for estimating causal effects
Colson, K. Ellicott; Rudolph, Kara E.; Zimmerman, Scott C.; Goin, Dana E.; Stuart, Elizabeth A.; Laan, Mark Van Der; Ahern, Jennifer
2016-03-01
Matching methods are common in studies across many disciplines. However, there is limited evidence on how to optimally combine matching with subsequent analysis approaches to minimize bias and maximize efficiency for the quantity of interest. We conducted simulations to compare the performance of a wide variety of matching methods and analysis approaches in terms of bias, variance, and mean squared error (MSE). We then compared these approaches in an applied example of an employment training program. The results indicate that combining full matching with double robust analysis performed best in both the simulations and the applied example, particularly when combined with machine learning estimation methods. To reduce bias, current guidelines advise researchers to select the technique with the best post-matching covariate balance, but this work finds that such an approach does not always minimize mean squared error (MSE). These findings have important implications for future research utilizing matching. To minimize MSE, investigators should consider additional diagnostics, and use of simulations tailored to the study of interest to identify the optimal matching and analysis combination.
Monitoring of Orientation in Molecular Ensembles by Polarization Sensitive Nonlinear Microscopy
We present high resolution two-photon excitation microscopy studies combining two-photon fluorescence (TPF) and second harmonic generation (SHG) in order to probe orientational distributions of molecular ensembles at room temperature. A detailed polarization analysis of TPF and SHG signals is used in order to unravel the parameters of the molecular orientational statistical distribution, using a technique which can be extended and generalized to a broad variety of molecular arrangements. A po...
Predictability of Regional Climate: A Bayesian Approach to Analysing a WRF Model Ensemble
This study investigates aspects of climate predictability with a focus on climatic variables and different characteristics of extremes over nine North American climatic regions and two selected Atlantic sectors. An ensemble of state-of-the-art Weather Research and Forecasting Model (WRF) simulations is used for the analysis. The ensemble comprises a combination of various physics schemes, initial conditions, domain sizes, boundary conditions and breeding techniques. The main objectives of this research are: 1) to increase our understanding of the ability of WRF to capture regional climate information, both for individual ensemble members and for the ensemble collectively; 2) to investigate the role of different members and their synergy in reproducing regional climate; and 3) to estimate the associated uncertainty. In this study, we propose a Bayesian framework to study the predictability of extremes and associated uncertainties in order to provide a wealth of knowledge about WRF reliability and provide further clarity and understanding of the sensitivities and optimal combinations. The choice of the Bayesian model, as opposed to standard methods, is made because: a) this method has a mean square error that is less than standard statistics, which makes it a more robust method; b) it allows for the use of small sample sizes, which are typical in high-resolution modeling; c) it provides a probabilistic view of uncertainty, which is useful when making decisions concerning ensemble members.
A prediction increases significantly in value with knowledge of how certain it is, that is, the size of its error. In weather forecasts it is often difficult to determine this error, even after the time of validity of the prediction, since the precise true state of the atmosphere remains unknown. For Ensemble Kalman filter methods the forecast spread of the ensemble can be used to estimate the uncertainty. However, most operational weather prediction systems today use the technique of variational data assimilation, which lacks a straightforward way to estimate the uncertainty. Lately the variational data assimilation and the ensemble prediction technique have been combined in the so-called EDA (ensemble of data assimilations) technique, to improve the prediction that the variational analysis can provide, and at the same time give an estimate of the uncertainty. The EDA technique consists of an ensemble of standard 4D-Var data assimilations, where the ensemble members have been randomly perturbed. The uncertainty can then be determined from the size of the ensemble spread, provided that there is a linear relationship between the magnitude of the perturbation and the resulting EDA spread. We show that such a linear relationship indeed exists and that the EDA technique can be scaled to provide a practical alternative to the traditional observing system experiment (OSE) technique, both for estimating the uncertainty of a prediction and as a tool for assessing the impact of observations.
Vol. 65, No. 31 (2015), pp. 87-105. ISSN 0936-577X. R&D Projects: GA MZe QJ1310123; GA MŠk(CZ) LD13030. Other grants: German Federal Ministries of Education and Research, and Food and Agriculture (DE) 2812ERA115. Institutional support: RVO:67179843. Keywords: climate; crop model; impact response surface; IRS; sensitivity analysis; wheat; yield. Subject RIV: EH - Ecology, Behaviour. Impact factor: 2.496, year: 2014
intensive rises of surface air temperature and average temperature of the troposphere (the thickness of the 1000-500 hPa layer) were found in the investigated region, which, together with an increase of the moisture content of the atmosphere, led to a rise of the free convection level, and convectively unstable layers of the atmosphere reached almost to 100 hPa. The latter resulted in an essential increase (almost twofold) of Convective Available Potential Energy (CAPE) and, accordingly, of the speed of updrafts. An ensemble of seven runs of Regional Climate Models (RCM) driven by four Atmosphere and Ocean General Circulation Models (AOGCM) from the ENSEMBLES database was applied in order to obtain projected values of air temperature and precipitation changes for the 2021-2050 period within the Dniester basin on a monthly basis. To make the calculations more accurate, the Dniester basin was subdivided into 3 regions, each with 2 subregions, according to river geomorphology and topography. Verification of the RCMs on the 1971-2000 control period against E-Obs and station data made it possible to obtain optimum ensembles of RCMs for every subregion and climate characteristic. Note that just two regional climate models, REMO and RCA, both driven by ECHAM5, provided the best results either for all delineated regions or for the entire Dniester basin. Projections for the 2021-2050 period were calculated from the same optimum ensembles of RCMs as obtained for the control period. A more or less uniform air temperature rise of 0.7-1.7 °C is expected in all subregions and months. Projections for precipitation change are more dispersed: within a few per cent for annual sums, but almost 20% less for the middle and lower Dniester in August and October (drought risk) and over 15% more for the high flow of the river in September and December (flood risk). Indices of extremes recommended by ECA&D were calculated from daily data of REMO and RCA A1B runs for the control and projected periods. The analysis of precipitation extremes (SDII, RX1day, RX5day, etc.) has ...
Ensembles and their modules as objects of cartosemiotic inquiry
The structured set of signs in a map face -- here called map-face aggregate or MFA -- and the associated marginal notes make up an ensemble of modules or components (modular ensemble). Such ensembles are recognized where groups of entries are intuitively viewed as complex units, which includes the case that entries are consulted jointly and thus are involved in the same process of sign reception. Modular ensembles are amenable to semiotic study, just as are written or pictorial stories. Four kinds (one of them mentioned above) are discussed in detail, two involving single MFAs, the other two being assemblages of maps, such as atlases. In terms of their internal structure, two types are recognized: the combinate (or grouping), in which modules are directly linked by combinatorial relations (example above), and the cumulate (or collection (of documents)), in which modules are indirectly related through some conceptual commonality (example: series of geological maps). The discussion then turns to basic points concerning modular ensembles (identification of a module, internal organization of an ensemble, and characteristics which establish an ensemble as a unit) and further to a few general semiotic concepts as they relate to the present research. Since this paper originated as a reaction to several of A. Wolodtschenko’s recent publications, it concludes with comments on some of his arguments which pertain to modular ensembles.
Ensemble forecasting of major solar flares: First results
We present the results from the first ensemble prediction model for major solar flares (M and X classes). The primary aim of this investigation is to explore the construction of an ensemble for an initial prototyping of this new concept. Using the probabilistic forecasts from three models hosted at the Community Coordinated Modeling Center (NASA-GSFC) and the NOAA forecasts, we developed an ensemble forecast by linearly combining the flaring probabilities from all four methods. Performance-based combination weights were calculated using a Monte Carlo-type algorithm that applies a decision threshold Pth to the combined probabilities and maximizes the Heidke Skill Score (HSS). Using the data for 13 recent solar active regions between years 2012 and 2014, we found that linear combination methods can improve the overall probabilistic prediction and improve the categorical prediction for certain values of decision thresholds. Combination weights vary with the applied threshold, and none of the tested individual forecasting models seems to provide more accurate predictions than the others for all values of Pth. According to the maximum values of HSS, performance-based weights, calculated by averaging over the sample, performed similarly to an equally weighted model. The values of Pth for which the ensemble forecast performs best are 25% for M-class flares and 15% for X-class flares. When the human-adjusted probabilities from NOAA are excluded from the ensemble, the ensemble performance in terms of the Heidke score is reduced.
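The weighting step described above can be sketched as a random search. The code below is a minimal illustration under stated assumptions, not the authors' algorithm: it draws random weight vectors that sum to one, thresholds the linearly combined probability at a decision threshold, and keeps the weights with the highest Heidke Skill Score. The function names, the number of trials, and the use of equal weights as the starting point are all hypothetical; the HSS expression is the standard 2x2 contingency-table form.

```python
import random


def heidke_skill_score(forecast_yes, observed_yes):
    """HSS from a 2x2 contingency table of yes/no forecasts and events."""
    pairs = list(zip(forecast_yes, observed_yes))
    a = sum(1 for f, o in pairs if f and o)          # hits
    b = sum(1 for f, o in pairs if f and not o)      # false alarms
    c = sum(1 for f, o in pairs if not f and o)      # misses
    d = sum(1 for f, o in pairs if not f and not o)  # correct negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / denom if denom else 0.0


def combine(probs, weights):
    """Linear combination of the member probabilities for every case."""
    return [sum(w * p for w, p in zip(weights, row)) for row in probs]


def score(probs, events, weights, p_th):
    """HSS of the thresholded combined forecast."""
    yes = [p >= p_th for p in combine(probs, weights)]
    return heidke_skill_score(yes, events)


def search_weights(probs, events, p_th, trials=2000, seed=0):
    """Random search for weights (summing to 1) that maximize HSS at p_th."""
    rng = random.Random(seed)
    k = len(probs[0])
    best_w = [1.0 / k] * k  # start from the equally weighted ensemble
    best = score(probs, events, best_w, p_th)
    for _ in range(trials):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        w = [x / total for x in raw]
        s = score(probs, events, w, p_th)
        if s > best:
            best_w, best = w, s
    return best_w, best
```

Here `probs` is a list of rows, one per forecast case, holding each member's flaring probability, and `events` is the matching list of observed outcomes. Repeating the search for several values of `p_th` shows how the preferred weights change with the threshold, which is the behaviour the abstract reports.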
There are many hydrological data sources that are available from in-situ measurements, remote sensing, and atmospheric modelling, and that can be used to improve water management and understanding of hydrological processes. Each source comes with its own strengths and weaknesses, whether in accuracy, availability, measurement frequency, coverage or spatial resolution. By using multiple combinations of available data sources as input to a hydrological model, an ensemble prediction can be generated. Multi-model ensemble methods originate from the use of different numerical weather prediction (and climate) models. Most research on multi-model hydro-meteorological ensemble prediction has been done on the basis of different hydrological models. With the increase of reliable hydro-meteorological data sources, a revisit of the multi-model approach is warranted, focussing on multiple combinations of inputs and models. In this paper, multiple data sources are fed into a hydrological model, resulting in an ensemble of model outputs. The data sources used to generate ensemble members include 2 precipitation sources from in-situ stations and ground-based radar, 3 land use maps from local origin and from satellite estimates, and 2 evapotranspiration estimates from in-situ measured reference evaporation and from satellite estimates through surface energy balance analysis. The land use data were generated by spectral classification of SPOT satellite images, and the remotely sensed evapotranspiration by solving the surface energy equation using Terra MODIS satellite images. The spatially distributed hydrological modelling system SIMGRO is used. The model simulates the hydrological processes of the Rijnland area in the Netherlands. The ensemble output is analysed by comparing model outputs with observed discharge. The results will be presented and serve to discuss the advantages and disadvantages of applying the multi-input ensemble approach for hydrological prediction.
Total probabilities of ensemble runoff forecasts
Ensemble forecasting has for a long time been used as a method in meteorological modelling to indicate the uncertainty of the forecasts. However, as the ensembles often exhibit both bias and dispersion errors, it is necessary to calibrate and post-process them. Two of the most common methods for this are Bayesian Model Averaging (Raftery et al., 2005) and Ensemble Model Output Statistics (EMOS) (Gneiting et al., 2005). There are also methods for regionalizing these methods (Berrocal et al., 2007) and for incorporating the correlation between lead times (Hemri et al., 2013). Engeland and Steinsland (2014) developed a framework which can estimate post-processing parameters that differ in space and time but can still give a spatially and temporally consistent output. However, their method is computationally complex for our larger number of stations, and cannot directly be regionalized in the way we would like, so we suggest a different path below. The target of our work is to create a mean forecast with uncertainty bounds for a large number of locations in the framework of the European Flood Awareness System (EFAS - http://www.efas.eu). We are therefore more interested in improving the forecast skill for high flows than the forecast skill for lower runoff levels. EFAS uses a combination of ensemble forecasts and deterministic forecasts from different forecasters to force a distributed hydrologic model and to compute runoff ensembles for each river pixel within the model domain. Instead of showing the mean and the variability of each forecast ensemble individually, we will now post-process all model outputs to find a total probability, the post-processed mean and uncertainty of all ensembles. The post-processing parameters are first calibrated for each calibration location, while ensuring that they have some spatial correlation by adding a spatial penalty in the calibration process. This can in some cases have a slight negative
Selecting supplier combination based on fuzzy multicriteria analysis
Existing multicriteria analysis (MCA) methods are probably ineffective in selecting a supplier combination. Thus, an MCA-based fuzzy 0-1 programming method is introduced. The program relates to a simple MCA matrix that is used to select a single supplier. By solving the program, the most feasible combination of suppliers is selected. Importantly, this result differs from selecting suppliers one by one according to a single-selection order, which is used to rank sole suppliers in existing MCA methods. An example highlights this difference and illustrates the proposed method.
Multilevel ensemble Kalman filtering
This work embeds a multilevel Monte Carlo sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF) in the setting of finite dimensional signal evolution and noisy discrete-time observations. The signal dynamics is assumed to be governed by a stochastic differential equation (SDE), and a hierarchy of time grids is introduced for multilevel numerical integration of that SDE. The resulting multilevel EnKF is proved to asymptotically outperform EnKF in terms of computational cost versus approximation accuracy. The theoretical results are illustrated numerically.
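For orientation, the Monte Carlo step that the multilevel method modifies can be sketched for the simplest possible case. The code below is a plain single-level, perturbed-observation EnKF analysis step for a scalar state observed directly; it does not implement the multilevel hierarchy of the paper, and the ensemble size, variances and observation value are illustrative assumptions.

```python
import random
import statistics


def enkf_analysis(forecast, y_obs, obs_var, rng):
    """Perturbed-observation EnKF update for a scalar state with H = 1."""
    p_f = statistics.variance(forecast)  # sample forecast variance
    gain = p_f / (p_f + obs_var)         # Kalman gain for direct observation
    obs_std = obs_var ** 0.5
    # Each member assimilates its own noisy copy of the observation.
    return [x + gain * (y_obs + rng.gauss(0.0, obs_std) - x) for x in forecast]


rng = random.Random(42)
forecast = [rng.gauss(0.0, 2.0) for _ in range(2000)]  # assumed prior ensemble
analysis = enkf_analysis(forecast, y_obs=2.0, obs_var=1.0, rng=rng)

print("forecast mean/var:", statistics.mean(forecast), statistics.variance(forecast))
print("analysis mean/var:", statistics.mean(analysis), statistics.variance(analysis))
```

The analysis ensemble is pulled towards the observation and its spread shrinks; the cost of producing the forecast ensemble is what a multilevel scheme aims to reduce.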
Image Combination Analysis in SPECAN Algorithm of Spaceborne SAR
An analysis of image combination in the SPECAN algorithm is presented in detail in the time-frequency domain, and a new image combination method is proposed. For four-look processing, this method processes one sub-aperture in every three. Continuous sub-aperture processing in the SPECAN algorithm is thereby realized, and the processing efficiency can be increased dramatically. A new parameter is also put forward to measure the processing efficiency of SAR image processing. Finally, RADARSAT raw data are used to test the method; the results show that the method is feasible for the SPECAN algorithm of spaceborne SAR and can improve processing efficiency. The SPECAN algorithm with this method can be used in quick-look imaging.
Meta analysis a guide to calibrating and combining statistical evidence
Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence acts as a source of basic methods for scientists wanting to combine evidence from different experiments. The authors aim to promote a deeper understanding of the notion of statistical evidence. The book is comprised of two parts - The Handbook, and The Theory. The Handbook is a guide for combining and interpreting experimental evidence to solve standard statistical problems. This section allows someone with a rudimentary knowledge in general statistics to apply the methods. The Theory provides the motivation, theory and results of simulation experiments to justify the methodology. This is a coherent introduction to the statistical concepts required to understand the authors' thesis that evidence in a test statistic can often be calibrated when transformed to the right scale.
Conductor gestures influence evaluations of ensemble performance
Previous research has found that listener evaluations of ensemble performances vary depending on the expressivity of the conductor’s gestures, even when performances are otherwise identical. It was the purpose of the present study to test whether this effect of visual information was evident in the evaluation of specific aspects of ensemble performance, articulation and dynamics. We constructed a set of 32 music performances that combined auditory and visual information and were designed to feature a high degree of contrast along one of two target characteristics: articulation and dynamics. We paired each of four music excerpts recorded by a chamber ensemble in both a high- and low-contrast condition with video of four conductors demonstrating high- and low-contrast gesture specifically appropriate to either articulation or dynamics. Using one of two equivalent test forms, college music majors and nonmajors (N = 285) viewed sixteen 30-second performances and evaluated the quality of the ensemble’s articulation, dynamics, technique and tempo along with overall expressivity. Results showed significantly higher evaluations for performances featuring high rather than low conducting expressivity regardless of the ensemble’s performance quality. Evaluations for both articulation and dynamics were strongly and positively correlated with evaluations of overall ensemble expressivity.
Avian nest-site selection is an important research and management subject. The hooded crane (Grus monacha) is a vulnerable (VU) species according to the IUCN Red List. Here, we present the first long-term Chinese legacy nest data for this species (1993-2010) with publicly available metadata. Further, we provide the first study that reports findings on multivariate nest habitat preference using such long-term field data for this species. Our work was carried out in Northeastern China, where we found and measured 24 nests and 81 randomly selected control plots and their environmental parameters in a vast landscape. We used machine learning (stochastic boosted regression trees) to quantify nest selection. Our analysis further included varclus (R Hmisc) and TreeNet to address statistical correlations and two-way interactions. We found that from an initial list of 14 measured field variables, water area (+), water depth (+) and shrub coverage (-) were the main explanatory variables that contributed to hooded crane nest-site selection. Agricultural sites played a smaller role in the selection of these nests. Our results are important for the conservation management of cranes all over East Asia and constitute a defensible and quantitative basis for predictive models. PMID:25001914
Ensemble Data Assimilation: Algorithms and Software
Ensemble data assimilation is nowadays applied to various problems to estimate a model state and model parameters by combining the model predictions with observational data. At the Alfred Wegener Institute, the assimilation focuses on ocean-sea ice models and coupled ocean-biogeochemical models. The high dimension of realistic models requires particularly efficient algorithms that are also usable on supercomputers. For the application of such filters, the Parallel Data Assimilation Framework ...
Attenuation Analysis and Acoustic Pressure Levels for Combined Absorptive Mufflers
The paper describes pressure-wave propagation in a muffler for an internal combustion engine for a geometry of two combined mufflers. The approach is generally applicable to analyzing the damping of harmonic pressure-wave propagation. The purpose of the paper is to present a finite element analysis of both inductive and resistive damping in pressure acoustics. The main output is the attenuation and acoustic pressure levels for the frequency range 50 Hz–3000 Hz.
Performance analysis and modeling of energy from waste combined cycles
Municipal solid waste (MSW) is produced in a substantial amount with minimal fluctuations throughout the year. The analysis of carbon neutrality of MSW on a life cycle basis shows that MSW is about 67% carbon-neutral, suggesting that only 33% of the CO2 emissions from incinerating MSW are of fossil origin. The waste constitutes a 'renewable biofuel' energy resource and energy from waste (EfW) can result in a net reduction in CO2 emissions. In this paper, we explore an approach to extracting energy from MSW efficiently - EfW/gas turbine hybrid combined cycles. This approach innovates by delivering better performance with respect to energy efficiency and CO2 mitigation. In the combined cycles, the topping cycle consists of a gas turbine, while the bottoming cycle is a steam cycle where the low quality fuel - waste is utilized. This paper assesses the viability of the hybrid combined cycles and analyses their thermodynamic advantages with the help of computer simulations. It was shown that the combined cycles could offer significantly higher energy conversion efficiency and a practical solution to handling MSW. Also, the potential for a net reduction in CO2 emissions resulting from the hybrid combined cycles was evaluated.
Combined multi-criteria and cost-benefit analysis
The paper is an introduction to both theory and application of combined Cost-Benefit and Multi-Criteria Analysis. The first section is devoted to basic utility theory and its practical application in Cost-Benefit Analysis. Based on some of the problems encountered, arguments in favour of the application of utility-based Multi-Criteria Analysis methods as an extension and refinement of the traditional Cost-Benefit Analysis are provided. The theory presented in this paper is closely related to the methods used in the WARP software (Leleur & Jensen, 1989). The presentation is however wider in scope. The second section introduces the stated preference methodology used in WARP to create weight profiles for project pool sensitivity analysis. This section includes a simple example. The third section discusses how decision makers can get a priori aid to make their pair-wise comparisons based on project pool...
Estimating combining ability in popcorn lines using multivariate analysis
Aiming to estimate the combining ability in tropical and temperate popcorn (Zea mays L. var. everta Sturt.) lines using multivariate analysis, ten popcorn lines were crossed in a complete diallel without reciprocals and the lines and hybrids were tested in two randomized complete block experiments with three replicates. Data were subjected to univariate and multivariate ANOVA, principal component analysis, and univariate and multivariate diallel analysis. For multivariate diallel analysis, variables were divided into group I (grain yield, mean weight of ears with grains, popping expansion, mean number of ears per plant, and final stand) and group II (days to silking, plant height, first ear height, and lodged or broken plants). The P2 line had positive values for agronomic traits related to yield and popping expansion for group I, whereas the P4 line had fewer days to silking and lodged or broken plants for group II. Regarding the hybrids, P2 x P7 exhibited favorable values for most of the analyzed variables and had potential for recommendation. The multivariate diallel analysis can be useful in popcorn genetic improvement programs, particularly when directed toward the best cross combinations, where the objective is to simultaneously obtain genetic gains in multiple traits.
Meta-analysis for pathway enrichment analysis when combining multiple genomic studies
Motivation: Many pathway analysis (or gene set enrichment analysis) methods have been developed to identify enriched pathways under different biological states within a genomic study. As more and more microarray datasets accumulate, meta-analysis methods have also been developed to integrate information among multiple studies. Currently, most meta-analysis methods for combining genomic studies focus on biomarker detection and meta-analysis for pathway analysis has not been systematically purs...
Combined cardiotocographic and ST event analysis: A review.
ST-analysis of the fetal electrocardiogram (ECG) (STAN®) combined with cardiotocography (CTG) for intrapartum fetal monitoring has been developed following many years of animal research. Changes in the ST-segment of the fetal ECG correlated with fetal hypoxia occurring during labor. In 1993 the first randomized controlled trial (RCT), comparing CTG with CTG + ST-analysis, was published. STAN® was introduced for daily practice in 2000. To date, six RCTs have been performed, out of which five have been published. Furthermore, there are six published meta-analyses. The meta-analyses showed that CTG + ST-analysis reduced the risks of vaginal operative delivery by about 10% and fetal blood sampling by 40%. There are conflicting results regarding the effect on metabolic acidosis, largely because of controversies about which RCTs should be included in a meta-analysis, and because of differences in methodology, execution and quality of the meta-analyses. Several cohort studies have been published, some showing a significant decrease of metabolic acidosis after the introduction of ST-analysis. In this review, we discuss not only the scientific evidence from the RCTs and meta-analyses, but also the limitations of these studies. In conclusion, ST-analysis is effective in reducing operative vaginal deliveries and fetal blood sampling but the effect on neonatal metabolic acidosis is still under debate. Further research is needed to determine the place of ST-analysis in the labor ward for daily practice. PMID:26206514
Scalable Ensemble Learning and Computationally Efficient Variance Estimation
LeDell, Erin
Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning
Conventional ensemble learning combines students in the space domain. In this paper, however, we combine students in the time domain and call it time-domain ensemble learning. We analyze, compare, and discuss the generalization performances regarding time-domain ensemble learning of both a linear model and a nonlinear model. Analyzing in the framework of online learning using a statistical mechanical method, we show the qualitatively different behaviors between the two models. In a linear mod...
Low energy level spacing distribution in the atomic table ensemble
We have analysed the nearest neighbour spacing distributions for the atomic table ensemble. The analysis carried out indicates that the random matrix theory arguments extend even to the ground state domain of atoms. (orig.)
Representative Ensembles in Statistical Mechanics
The notion of representative statistical ensembles, correctly representing statistical systems, is strictly formulated. This notion allows for a proper description of statistical systems, avoiding inconsistencies in theory. As an illustration, a Bose-condensed system is considered. It is shown that a self-consistent treatment of the latter, using a representative ensemble, always yields a conserving and gapless theory.
We discuss the geometry of trees endowed with a causal structure using the conventional framework of equilibrium statistical mechanics. We show how this ensemble is related to popular growing network models. In particular we demonstrate that on a class of affine attachment kernels the two models are identical but they can differ substantially for other choices of weights. We show that causal trees exhibit condensation even for asymptotically linear kernels. We derive general formulae describing the degree distribution, the ancestor-descendant correlation and the probability that a randomly chosen node lives at a given geodesic distance from the root. It is shown that the Hausdorff dimension dH of the causal networks is generically infinite. (author)
Land Surface Models (LSM) coupled with River Routing schemes (RRM) are used in Global Climate Models (GCM) to simulate the continental part of the water cycle. They are a key component of GCM as they provide boundary conditions to atmospheric and oceanic models. However, at global scale, errors arise mainly from simplified physics, atmospheric forcing, and input parameters. More particularly, those used in RRM, such as river width, depth and friction coefficients, are difficult to calibrate and are mostly derived from geomorphologic relationships, which may not always be realistic. In situ measurements are then used to calibrate these relationships and validate the model, but global in situ data are very sparse. Additionally, due to the lack of an existing global river geomorphology database and accurate forcing, models are run at coarse resolution. This is typically the case of the ISBA-TRIP model used in this study. A complementary alternative to in-situ data is satellite observations. In this regard, the Surface Water and Ocean Topography (SWOT) satellite mission, jointly developed by NASA/CNES/CSA/UKSA and scheduled for launch around 2020, should be very valuable to calibrate RRM parameters. It will provide maps of water surface elevation for rivers wider than 100 meters over continental surfaces between 78°S and 78°N and also direct observation of river geomorphological parameters such as width and slope. Yet, before assimilating this kind of data, it is necessary to analyze the temporal sensitivity of the RRM to time-constant parameters. This study presents such an analysis over large river basins for the TRIP RRM. Model output uncertainty, represented by unconditional variance, is decomposed into ordered contributions from each parameter. A time-dependent analysis then makes it possible to identify the parameters to which modeled water level and discharge are most sensitive over a hydrological year. The results show that local parameters directly impact water levels, while
A Combined Metabolomic and Proteomic Analysis of Gestational Diabetes Mellitus
The aim of this pilot study was to apply a novel combined metabolomic and proteomic approach in the analysis of gestational diabetes mellitus. The investigation was performed with plasma samples derived from pregnant women with diagnosed gestational diabetes mellitus (n = 18) and a matched control group (n = 13). The mass spectrometry-based analyses made it possible to determine 42 free amino acids and low molecular-weight peptide profiles. Different expressions of several peptides and altered amino acid profiles were observed in the analyzed groups. The combination of proteomic and metabolomic data yielded a model with high discriminatory power, in which the amino acids ethanolamine, l-citrulline and l-asparagine and the peptide ions with m/z 1488.59, 4111.89 and 2913.15 had the highest contribution to the model. The sensitivity (94.44%) and specificity (84.62%), as well as the total group membership classification value (90.32%) calculated from the post hoc classification matrix of a joint model, were the highest when compared with a single analysis of either amino acid levels or peptide ion intensities. The obtained results indicated a high potential of integration of proteomic and metabolomic analysis regardless of the sample size. This promising approach, together with clinical evaluation of the subjects, can also be used in the study of other diseases.
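The reported percentages can be checked against the group sizes with a short calculation. The counts of correctly classified subjects (17 of 18 patients, 11 of 13 controls) are not stated in the abstract; they are inferred here as the only integer counts that reproduce the reported values, so treat them as an assumption.

```python
n_gdm, n_control = 18, 13  # group sizes from the abstract

# Assumption: counts inferred from the reported percentages, not stated.
true_pos = 17  # patients classified as patients
true_neg = 11  # controls classified as controls

sensitivity = true_pos / n_gdm
specificity = true_neg / n_control
overall = (true_pos + true_neg) / (n_gdm + n_control)

print(f"sensitivity {100 * sensitivity:.2f}%")
print(f"specificity {100 * specificity:.2f}%")
print(f"overall     {100 * overall:.2f}%")
```

With these counts the three values round to 94.44%, 84.62% and 90.32%, matching the abstract; a single misclassified subject changes sensitivity by about 5.6 percentage points, which shows how coarse the estimates are at this sample size.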
The bivariate combined model for spatial data analysis.
To describe the spatial distribution of diseases, a number of methods have been proposed to model relative risks within areas. Most models use Bayesian hierarchical methods, in which one models both spatially structured and unstructured extra-Poisson variance present in the data. For modelling a single disease, the conditional autoregressive (CAR) convolution model has been very popular. More recently, a combined model was proposed that 'combines' ideas from the CAR convolution model and the well-known Poisson-gamma model. The combined model was shown to be a good alternative to the CAR convolution model when there was a large amount of uncorrelated extra-variance in the data. Fewer solutions exist for modelling two diseases simultaneously or modelling a disease in two sub-populations simultaneously. Furthermore, existing models are typically based on the CAR convolution model. In this paper, a bivariate version of the combined model is proposed in which the unstructured heterogeneity term is split up into terms that are shared and terms that are specific to the disease or subpopulation, while spatial dependency is introduced via a univariate or multivariate Markov random field. The proposed method is illustrated by analysis of disease data in Georgia (USA) and Limburg (Belgium) and in a simulation study. We conclude that the bivariate combined model constitutes an interesting model when two diseases are possibly correlated. As the choice of the preferred model differs between data sets, we suggest using the new and existing modelling approaches together and choosing the best model via goodness-of-fit statistics. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26928309
Technical and financial analysis of combined cycle gas turbine
This paper presents technical and financial models developed to predict the overall performance of combined cycle gas turbine plants in line with the needs of independent power producers in the liberalized power-sector market. Three similarly sized combined cycle gas turbine power projects of up to 200 megawatts, owned by independent power producers in Pakistan, were selected in order to develop and derive the basic assumptions for the model inputs, in view of the two components of the Government of Pakistan’s prevailing electricity purchasing tariff: the energy purchase price and the capacity purchase price at the higher-voltage grid station terminal. The levelized electricity purchasing tariff over the life of the plant on gaseous fuel at a 60 percent plant load factor was 6.47 cents per kilowatt hour, with energy purchase and capacity purchase prices of 3.54 and 2.93 cents per kilowatt hour, respectively. The outputs of the technical models of the gas turbine, steam turbine and combined cycle gas turbine were found to be in close agreement with the projects under consideration, and the models provide an opportunity to evaluate the technical and financial aspects of a combined cycle power plant in a simplified manner with relatively accurate results. At a heat recovery steam generator flue gas exit temperature of 105 Celsius, the net efficiency of the combined cycle gas turbine was 48.8 percent, whereas at 125 Celsius it was 48.0 percent. Sensitivity analysis of selected influential components of the electricity tariff was also carried out.
Combining OLAP and data mining for analysis on trainee dataset
The aim of this thesis is to show the possibility of combining two data analysis techniques, OLAP and data mining, in a certain area. The principal method of achieving this aim is the continuous comparison and checking of the results acquired with the two techniques. A practice dataset on credits provided to natural persons is used for the practical application. The data analysis is performed using the Power Pivot add-in for MS Excel and the LISp-Miner system. For work with the LISp-Miner system the 4ft Miner procedure ...
Cost-benefit analysis for combined heat and power plant
The paper presents a methodology and practical application of Cost-Benefit Analysis for a Combined Heat and Power Plant (cogeneration facility). The methodology includes up-to-date, real data for a cogeneration plant in accordance with the trends in development of CHP technology. As a case study, a CHP plant that could be built in the Republic of Macedonia is analyzed. The main economic parameters for project evaluation, such as NPV and IRR, are calculated for a number of possible scenarios. The analysis presents the economic outputs that could be used as a basis for deciding whether to accept a CHP project for investment. (Author)
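The two evaluation parameters named in this abstract, NPV and IRR, can be sketched in a few lines of Python. The cash flows below are invented for illustration and have nothing to do with the plant studied in the paper; the bisection bounds are likewise an assumption that suits a conventional project (one outlay followed by positive returns).

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] is the flow at the end of year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """Internal rate of return by bisection.

    Assumes NPV changes sign exactly once between lo and hi.
    """
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid               # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Hypothetical scenario: invest 1000 now, receive 300 per year for 5 years.
scenario = [-1000.0, 300.0, 300.0, 300.0, 300.0, 300.0]
print("NPV at 8%:", round(npv(0.08, scenario), 2))
print("IRR:", round(irr(scenario), 4))
```

A project is usually considered acceptable when its NPV at the chosen discount rate is positive, or equivalently when its IRR exceeds that rate; repeating the calculation per scenario gives the kind of comparison the abstract describes.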
The Split-Apply-Combine Strategy for Data Analysis
Many data analysis problems involve the application of a split-apply-combine strategy, where you break up a big problem into manageable pieces, operate on each piece independently and then put all the pieces back together. This insight gives rise to a new R package that allows you to smoothly apply this strategy, without having to worry about the type of structure in which your data is stored. The paper includes two case studies showing how these insights make it easier to work with batting records for veteran baseball players and a large 3d array of spatio-temporal ozone measurements.
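The strategy itself is language-independent. The sketch below shows it in plain Python rather than with the R package the paper introduces, so the function name and the toy batting records are illustrative assumptions and do not reflect that package's interface.

```python
from collections import defaultdict


def split_apply_combine(records, key, func):
    """Group records by key(record), apply func to each group, collect results."""
    groups = defaultdict(list)
    for rec in records:  # split
        groups[key(rec)].append(rec)
    # apply func to each piece, then combine the results into one mapping
    return {k: func(group) for k, group in groups.items()}


# Toy batting records (invented for illustration).
records = [
    {"player": "a", "year": 1990, "hits": 10},
    {"player": "b", "year": 1990, "hits": 7},
    {"player": "a", "year": 1991, "hits": 15},
]

career_hits = split_apply_combine(
    records,
    key=lambda r: r["player"],
    func=lambda group: sum(r["hits"] for r in group),
)
print(career_hits)
```

Keeping the three stages separate means the per-group function never needs to know how the data were partitioned or how results are stored.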
The results of emanation thermal analysis (ETA) were interpreted by combining it with other thermoanalytical methods; a combination of ETA, EGA and DTA applied to samples of CaCO3 and Ca(COO)2.H2O is given as an example. The samples were labelled with 228Th, the parent nuclide of 220Rn, the release of which was measured. The parent nuclide was introduced into the CaCO3 samples by impregnation, using an alcoholic solution of 228Th and 224Ra in radioactive equilibrium. The samples of Ca(COO)2.H2O were labelled in the bulk by coprecipitation, 228Th and 224Ra being added to the initial calcium nitrate solution. (T.I.)
Ensemble annealing of complex physical systems
Algorithms for simulating complex physical systems or solving difficult optimization problems often resort to an annealing process. Rather than simulating the system at the temperature of interest, an annealing algorithm starts at a temperature that is high enough to ensure ergodicity and gradually decreases it until the destination temperature is reached. This idea is used in popular algorithms such as parallel tempering and simulated annealing. A general problem with annealing methods is that they require a temperature schedule. Choosing well-balanced temperature schedules can be tedious and time-consuming. Imbalanced schedules can have a negative impact on the convergence, runtime and success of annealing algorithms. This article outlines a unifying framework, ensemble annealing, that combines ideas from simulated annealing, histogram reweighting and nested sampling with concepts in thermodynamic control. Ensemble annealing simultaneously simulates a physical system and estimates its density of states. The...
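To support the application of ensemble forecasting methods in operational weather forecasting, an integrated analysis and display platform for ensemble forecast products, with independent intellectual property rights, has been developed. Starting from two main requirements, the large volume of ensemble model output and the high demands on the efficiency and quality of meteorological chart display, the platform adopts a client-server architecture. The server converts raw data into product data to improve execution efficiency on the client. The paper analyzes the platform's key technologies in detail. To address data latency, a polling-based data processing technique checks the state of the raw data in real time and updates the products, and a producer-consumer mutual exclusion method is used to resolve multithreaded deadlock. To improve the appearance of the charts, a dynamic page layout display technique classifies all graphical elements and gives an abstract description of their display attributes; combined with graphics rendering, this enables dynamic switching between a viewing mode and a plotting mode. The platform provides forecasters and decision makers with valuable uncertainty information and has played an important role in forecasting small- and meso-scale extreme weather and typhoon tracks.
In response to the pressing requirement for ensemble forecast applications in modern weather forecast operations, an ensemble forecast product analysis and display platform named NUMBERS (NUmerical Model Blending and Ensemble foRecast System) has been developed. The application background, requirement analysis, design of the system architecture and function implementation are discussed in detail. In addition, some key technologies, such as dynamic page layout rendering and data polling, are also described. First of all, the ensemble forecast platform is designed using the client-server architecture. On the server side, there is a data processing program that converts large amounts of ensemble numerical model output into product data to ensure the performance of the client data visualization program. On the client side, there is a data visualization program and a management console program. The data visualization program provides features including ensemble product data analysis, blending of multiple deterministic models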
Towards Advanced Data Analysis by Combining Soft Computing and Statistics
Gil, María; Sousa, João; Verleysen, Michel
2013-01-01
Soft computing, as an engineering science, and statistics, as a classical branch of mathematics, emphasize different aspects of data analysis. Soft computing focuses on obtaining working solutions quickly, accepting approximations and unconventional approaches. Its strength lies in its flexibility to create models that suit the needs arising in applications. In addition, it emphasizes the need for intuitive and interpretable models, which are tolerant to imprecision and uncertainty. Statistics is more rigorous and focuses on establishing objective conclusions based on experimental data by analyzing the possible situations and their (relative) likelihood. It emphasizes the need for mathematical methods and tools to assess solutions and guarantee performance. Combining the two fields enhances the robustness and generalizability of data analysis methods, while preserving the flexibility to solve real-world problems efficiently and intuitively.
Exergoeconomical analysis of coal gasification combined cycle power plants
This paper reports on combined cycle power plants with integrated coal gasification, which have gained increasing importance as a means of better utilizing primary energy sources. Established coal gasification technology offers various options, e.g. the TEXACO or the PRENFLO method. Proposed processes using these gasification methods are evaluated energetically and exergetically. A purely thermodynamic analysis has the considerable disadvantage that the economic consequences of process improvement measures are not investigated. The exergetic evaluation is therefore linked with the economic evaluation in what is called exergoeconomic analysis. Considering the mutual influence of exergy destruction and capital-dependent costs leads to an optimization of the process and a minimization of the product costs.
Construction of High-accuracy Ensemble of Classifiers
Several methods have been developed to construct ensembles. Some of these methods, such as Bagging and Boosting, are meta-learners, i.e. they can be applied to any base classifier. The combination of methods should be selected so that the classifiers cover each other's weaknesses. In an ensemble, the output of several classifiers is useful only when they disagree on some inputs. The degree of disagreement is called the diversity of the ensemble. Another factor that plays a significant role in the performance of an ensemble is the accuracy of the base classifiers. It can be said that all procedures for constructing ensembles seek a balance between these two parameters, and successful methods reach a better balance. The diversity of the members of an ensemble is known to be an important factor in determining its generalization error. In this paper, we present a new approach for generating ensembles. The proposed approach uses Bagging and Boosting as the generators of base classifiers. Subsequently, the classifiers are partitioned by means of a clustering algorithm. We introduce a selection phase for constructing the final ensemble, and three different selection methods are proposed for this phase. In the first proposed selection method, a classifier is selected randomly from each cluster. The second method selects the most accurate classifier from each cluster, and the third selects the classifier nearest to the center of each cluster to construct the final ensemble. The results of experiments on well-known datasets demonstrate the strength of the proposed approach, especially when selecting the most accurate classifiers from the clusters and employing the Bagging generator.
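The second selection method described above (keep the most accurate classifier from each cluster) can be sketched as follows. The clustering itself is assumed to have been done already; the classifier names, cluster labels and accuracies are invented for illustration and are not results from the paper.

```python
def select_most_accurate(cluster_of, accuracy):
    """Return, for each cluster, the name of its most accurate classifier."""
    best = {}
    for name, cluster in cluster_of.items():
        current = best.get(cluster)
        if current is None or accuracy[name] > accuracy[current]:
            best[cluster] = name
    return sorted(best.values())


# Hypothetical base classifiers with cluster labels and validation accuracy.
cluster_of = {"bag_1": 0, "bag_2": 0, "boost_1": 1, "boost_2": 1, "boost_3": 2}
accuracy = {"bag_1": 0.81, "bag_2": 0.84, "boost_1": 0.79,
            "boost_2": 0.77, "boost_3": 0.90}

final_ensemble = select_most_accurate(cluster_of, accuracy)
print(final_ensemble)
```

Picking one member per cluster preserves diversity between the selected classifiers, while the accuracy criterion keeps the strongest representative of each group; the random and nearest-to-center variants would differ only in the rule applied inside each cluster.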
2000-11-01
A working group of the International GPS Service (IGS) was created to look after Reference Frame (RF) issues and contribute to the densification and improvement of the International Terrestrial Reference Frame (ITRF). One important objective of the Reference Frame Working Group is to generate consistent IGS station coordinates and velocities, Earth Rotation Parameters (ERP) and geocenter estimates along with the appropriate covariance information. These parameters have a direct impact on other IGS products such as the estimation of GPS satellite ephemerides, as well as satellite and station clocks. The information required is available weekly from the Analysis Centers (AC) (cod, emr, esa, gfz, jpl, ngs, sio) and from the Global Network Associate Analysis Centers (GNAAC) (JPL, mit, ncl) using a "Software Independent Exchange Format" (SINEX). The AC are also contributing daily ERPs as part of their weekly submission. The procedure in place simultaneously combines the weekly station coordinates, geocenter and daily ERP estimates. A cumulative solution containing station coordinates and velocity is also updated with each weekly combination. This provides a convenient way to closely monitor the quality of the estimated station coordinates and to have an up to date cumulative solution available at all times. To provide some necessary redundancy, the weekly station coordinates solution is compared against the GNAAC solutions. Each of the 3 GNAAC uses its own software, allowing independent verification of the combination process. The RMS of the coordinate differences in the north, east and up components between the AC/GNAAC and the ITRF97 Reference Frame Stations are 4-10 mm, 5-20 mm and 6-25 mm. The station velocities within continental plates are compared to the NNR-NUVEL1A plate motion model (DeMets et al., 1994). The north, east and up velocity RMS are 2 mm/y, 3 mm/y and 8 mm/y. Note that NNR-NUVEL1A assumes a zero vertical velocity.
Thermoeconomic Analysis of Advanced Solar-Fossil Combined Power Plants
Hybrid solar thermal power plants (with parabolic trough type of solar collectors) featuring gas burners and Rankine steam cycles have been successfully demonstrated by California's Solar Electric Generating System (SEGS). This system has been proven to be one of the most efficient and economical schemes to convert solar energy into electricity. Recent technological progress opens interesting prospects for advanced cycle concepts: (a) the ISCCS (Integrated Solar Combined Cycle System), which integrates the parabolic trough into a fossil fired combined cycle and allows a larger exergy potential of the fuel to be converted; (b) the HSTS (Hybrid Solar Tower System), which uses high concentration optics (via a power tower generator) and high temperature air receivers to drive the combined cycle power plant. In the latter case, solar energy is used at a higher exergy level as a heat source of the topping cycle. This paper presents the results of a thermoeconomic investigation of an ISCCS envisaged in Tunisia. The study is realized in two phases. In the first phase, a mixed approach, based on pinch technology principles coupled with a mathematical optimization algorithm, is used to minimize the heat transfer exergy losses in the steam generators, respecting the off-design operating conditions of the steam turbine (cone law). In the second phase, an economic analysis based on the Levelized Electricity Cost (LEC) approach was carried out for the configurations that provided the best concepts during the first phase. A comparison of ISCCS with pure fossil fueled plants (CC+GT) is reported for the same electrical power load. A sensitivity analysis based on the relative size of the solar field is presented.
We analyze an ensemble of seven XCO2 retrieval algorithms for SCIAMACHY and GOSAT. The ensemble spread can be interpreted as regional uncertainty and can help to identify locations for new TCCON validation sites. Additionally, we introduce the ensemble median algorithm EMMA combining individual soundings of the seven algorithms into one new dataset. The ensemble takes advantage of the algorithms' independent developments. We find ensemble spreads being often
2016-01-01
Protein remote homology detection is one of the central problems in bioinformatics. Although some computational methods have been proposed, the problem is still far from being solved. In this paper, an ensemble classifier for protein remote homology detection, called SVM-Ensemble, is proposed with a weighted voting strategy. SVM-Ensemble combines three basic classifiers based on different feature spaces, including Kmer, ACC, and SC-PseAAC. These features consider the characteristics of proteins from various perspectives, incorporating both the sequence composition and the sequence-order information along the protein sequences. Experimental results on a widely used benchmark dataset showed that the proposed SVM-Ensemble can clearly improve the predictive performance for protein remote homology detection. Moreover, it achieved the best performance and outperformed other state-of-the-art methods. PMID:27294123
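As a rough illustration of a weighted voting strategy over three base classifiers, the following Python sketch combines per-classifier scores with fixed weights. The abstract does not state how the weights are chosen or what form the classifier outputs take, so the weights, the scores in [0, 1], the 0.5 threshold, and the function name are all assumptions made for illustration only.

```python
def weighted_vote(scores, weights, threshold=0.5):
    """Weighted average of per-classifier scores, plus a thresholded decision."""
    total = sum(weights.values())
    combined = sum(weights[name] * scores[name] for name in scores) / total
    return combined, combined >= threshold


# Hypothetical weights for the three feature spaces named in the abstract.
weights = {"Kmer": 0.5, "ACC": 0.25, "SC-PseAAC": 0.25}
# Hypothetical scores assigned by each base classifier to one query protein.
scores = {"Kmer": 0.75, "ACC": 0.5, "SC-PseAAC": 0.25}

combined, is_homolog = weighted_vote(scores, weights)
print(combined, is_homolog)
```

In practice the weights would more plausibly be tuned on held-out data than fixed by hand as they are here.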
Hybrid Intrusion Detection Using Ensemble of Classification Methods
One of the major developments in machine learning in the past decade is the ensemble method, which finds a highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed for homogeneous ensemble classifiers using bagging and heterogeneous ensemble classifiers using an arcing classifier, and their performance is analyzed in terms of accuracy. A classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by means of real and benchmark data sets of intrusion detection. The main originality of the proposed approach is based on three main parts: preprocessing phase, classification phase and combining phase. A wide range of comparative experiments are conducted for real and benchmark data sets of intrusion detection. The accuracy of base classifiers is compared with homogeneous and heterogeneous models for the data mining problem. The proposed ensemble methods provide a significant improvement in accuracy compared to individual classifiers, and heterogeneous models exhibit better results than homogeneous models for real and benchmark data sets of intrusion detection.
The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html. PMID:27337980
Combined Analysis and Validation of Earth Rotation Models and Observations
Global dynamic processes cause changes in the Earth's rotation, gravity field and geometry. Thus, they can be traced in geodetic observations of these quantities. However, the sensitivity of the various geodetic observation techniques to specific processes in the Earth system differs. More meaningful conclusions with respect to contributions from individual Earth subsystems can be drawn from the combined analysis of highly precise and consistent parameter time series from heterogeneous observation types which carry partially redundant and partially complementary information. For the sake of a coordinated research in this field, the Research Unit FOR 584 "Earth Rotation and Global Dynamic Processes" is funded at present by the German Research Foundation (DFG). It is concerned with the refined and consistent modeling and data analysis. One of the projects (P9) within this Research Unit addresses the combined analysis and validation of Earth rotation models and observations. In P9 three main topics are addressed: (1) the determination and mutual validation of reliable consistent time series for Earth rotation parameters and gravity field coefficients due to the consideration of their physical connection by the Earth's tensor of inertia, (2) the separation of individual Earth rotation excitation mechanisms by merging all available relevant data from recent satellite missions (GRACE, Jason-1, …) and geodetic space techniques (GNSS, SLR, VLBI, …) in a highly consistent way, (3) the estimation of fundamental physical Earth parameters (Love numbers, …) by an inverse model using the improved geodetic observation time series as constraints. Hence, this project provides significant and unique contributions to the field of Earth system science in general; it corresponds with the goals of the Global Geodetic Observing System (GGOS). In this paper project P9 is introduced, the goals are summarized and a status report including a presentation and discussion of intermediate
The use of ensemble streamflow forecasts is developing in the international flood forecasting services. Ensemble streamflow forecast systems can provide more accurate forecasts and useful information about the uncertainty of the forecasts, thus improving the assessment of risks. Nevertheless, these systems, like all hydrological forecasts, suffer from errors on initialization or on meteorological data, which lead to hydrological prediction errors. This article, which is the second part of a 2-part article, concerns the impacts of initial states, improved by a streamflow assimilation system, on an ensemble streamflow prediction system over France. An assimilation system was implemented to improve the streamflow analysis of the SAFRAN-ISBA-MODCOU (SIM) hydro-meteorological suite, which initializes the ensemble streamflow forecasts at Météo-France. This assimilation system, using the Best Linear Unbiased Estimator (BLUE) and modifying the initial soil moisture states, showed an improvement of the streamflow analysis with low soil moisture increments. The final states of this suite were used to initialize the ensemble streamflow forecasts of Météo-France, which are based on the SIM model and use the European Centre for Medium-range Weather Forecasts (ECMWF) 10-day Ensemble Prediction System (EPS). Two different configurations of the assimilation system were used in this study: the first with the classical SIM model and the second using improved soil physics in ISBA. The effects of the assimilation system on the ensemble streamflow forecasts were assessed for these two configurations, and a comparison was made with the original (i.e. without data assimilation and without the improved physics) ensemble streamflow forecasts. It is shown that the assimilation system improved most of the statistical scores usually computed for the validation of ensemble predictions (RMSE, Brier Skill Score and its decomposition, Ranked Probability Skill Score, False Alarm Rate, etc
The variation of topography in Colorado not only adds to the beauty of its landscape, but also tests our ability to predict warm season severe convection. Deficient radar coverage and limited observations make quantitative precipitation forecasting quite a challenge. Past studies have suggested that greater forecast skill of mesoscale convection initiation and precipitation characteristics are achievable considering an ensemble with explicitly predicted convection compared to one that has parameterized convection. The range of uncertainty and probabilities in these forecasts can help forecasters in their precipitation predictions and communication of weather information to emergency managers (EMs). EMs serve an integral role in informing and protecting communities in anticipation of hazardous weather. An example of such an event occurred on the evening of 6 June 2012, where areas to the lee of the Rocky Mountain Front Range were impacted by flash-flood-producing severe convection that included heavy rain and copious amounts of hail. Despite the discrepancy in the timing, location and evolution of convection, the convection-allowing ensemble forecasts generally outperformed those of the convection-parameterized ensemble in representing the mesoscale processes responsible for the 6-7 June severe convective event. Key features sufficiently reproduced by several of the convection-allowing ensemble members resembled the observations: 1) general location of a convergence boundary east of Denver, 2) convective initiation along the boundary, 3) general location of a weak cold front near the Wyoming/Nebraska border, and 4) cold pools and moist upslope characteristics that contributed to the backbuilding of convection. Members from the convection-parameterized ensemble that failed to reproduce these results displaced the convergence boundary, produced a cold front that moved southeast too quickly, and used the cold front for convective initiation. The convection
2010-05-01
In this paper, we study the ensemble clustering problem, where the input is in the form of multiple clustering solutions. The goal of ensemble clustering algorithms is to aggregate the solutions into one solution that maximizes the agreement in the input ensemble. We obtain several new results for this problem. Specifically, we show that the notion of agreement under such circumstances can be better captured using a 2D string encoding rather than a voting strategy, which is common among existing approaches. Our optimization proceeds by first constructing a non-linear objective function which is then transformed into a 0-1 Semidefinite program (SDP) using novel convexification techniques. This model can be subsequently relaxed to a polynomial time solvable SDP. In addition to the theoretical contributions, our experimental results on standard machine learning and synthetic datasets show that this approach leads to improvements not only in terms of the proposed agreement measure but also the existing agreement measures based on voting strategies. In addition, we identify several new application scenarios for this problem. These include combining multiple image segmentations and generating tissue maps from multiple-channel Diffusion Tensor brain images to identify the underlying structure of the brain. PMID:21927539
2016-01-01
We evaluate the possibility of application of combination of classifiers using fuzzy measures and integrals to Brain-Computer Interface (BCI) based on electroencephalography. In particular, we present an ensemble method that can be applied to a variety of systems and evaluate it in the context of a visual P300-based BCI. Offline analysis of data relative to 5 subjects lets us argue that the proposed classification strategy is suitable for BCI. Indeed, the achieved performance is significantly greater than the average of the base classifiers and, broadly speaking, similar to that of the best one. Thus the proposed methodology allows realizing systems that can be used by different subjects without the need for a preliminary configuration phase in which the best classifier for each user has to be identified. Moreover, the ensemble is often capable of detecting uncertain situations and turning them from misclassifications into abstentions, thereby improving the level of safety in BCI for environmental or device control. PMID:26819595
2015-01-01
We combine and extend the analyses of effective scalar, vector, Majorana and Dirac fermion Higgs portal models of Dark Matter (DM), in which DM couples to the Standard Model (SM) Higgs boson via an operator of the form $\mathcal{O}_{\textrm{DM}}\, H^\dagger H$. For the fermion models, we take an admixture of scalar $\overline{\psi} \psi$ and pseudoscalar $\overline{\psi} i\gamma_5 \psi$ interaction terms. For each model, we apply constraints on the parameter space based on the Planck measured DM relic density and the LHC limits on the Higgs invisible branching ratio. For the first time, we perform a consistent study of the indirect detection prospects for these models based on the WMAP7/Planck observations of the CMB, a combined analysis of 15 dwarf spheroidal galaxies by Fermi-LAT and the upcoming Cherenkov Telescope Array (CTA). We also perform a correct treatment of the momentum-dependent direct search cross-section that arises from the pseudoscalar interaction term in the fermionic DM theories. We find, i...
The paper presents a microflora analysis of a 5-year-old male child with severe combined immune deficiency who was delivered by Caesarean section and continuously maintained in an isolator. Despite precautions, it was found that the child had come in contact with at least 54 different microbial contaminants. While his skin autoflora was similar to that of a reference group of healthy male adults in numbers of different species and the number of viable cells present per square centimeter of surface area, the subject's autoflora differed from the reference group in that significantly fewer anaerobic species were recovered from the patient's mouth and feces. It is suggested that the child's remaining disease free shows that the reported bacteria are noninvasive or that the unaffected components of the child's immune defense mechanisms are important.
Mouse Karyotype Obtained by Combining DAPI Staining with Image Analysis
In this study, mitotic metaphase chromosomes in mouse were identified by a new chromosome fluorescence banding technique combining DAPI staining with image analysis. Clear 4',6-diamidino-2-phenylindole (DAPI) multiple bands resembling G-bands could be produced in mouse. The MetaMorph software was then used to generate linescans of pixel intensity for the banded chromosomes from short arm to long arm. These linescans were sufficient not only to identify each individual chromosome but also to analyze the physical positions of the bands on each chromosome. Based on the results, a clear and accurate karyotype of mouse metaphase chromosomes was established. The technique is therefore considered to be a new method for cytological studies of mouse.
Investigation of fish otoliths by combined ion beam analysis
This work was implemented within the framework of the Hungarian Ion beam Physics Platform (http://hipp.atomki.hu/). Otoliths are small structures, 'the ear stones' of a fish, and are used to detect acceleration and orientation. They are composed of a combination of protein matrix and calcium carbonate (CaCO3) forming aragonite micro crystals. They have an annually deposited layered conformation with a microstructure corresponding to the seasonal and daily increments. Trace elements, such as Sr, Zn, Fe etc., are also incorporated into the otolith from the environment and the nutrition. The elemental distribution of the otolith of fresh water fish burbot (Lota lota L.) collected in Hungary was measured with Elastic Recoil Detection Analysis (ERDA), Rutherford backscattering spectrometry (RBS) and Particle Induced X-ray Emission (PIXE) at the Nuclear Microprobe Facility of HAS ATOMKI. The spatial 3D structure of the otolith could be observed with a sub-micrometer resolution. It is confirmed that the aragonite micro-crystals are covered by an organic layer and there are some protein rich regions in the otolith, too. By applying the RBSMAST code developed for RBS on macroscopic structure, it was proven that the orientation of the needle shaped aragonite crystals is considerably different at adjacent locations in the otolith. The organic and inorganic component of the otolith could be set apart in the depth selective hydrogen and calcium maps derived by micro-ERDA and micro-RBS. Similar structural analysis could be done near the surface by combining the C, O and Ca elemental maps determined by micro-PIXE measurements. It was observed that the trace metal Zn is bound to the protein component. Acknowledgements: This work was partially supported by the Hungarian OTKA Grant No. T046238 and the EU cofunded Economic Competitiveness Operative Programme (GVOP-3.2.1.-2004-04-0402/3.0)