Single-ensemble nonequilibrium path-sampling estimates of free energy differences.
Ytreberg, F Marty; Zuckerman, Daniel M
2004-06-15
We introduce a straightforward, single-ensemble, path sampling approach to calculate free energy differences based on Jarzynski's relation. For a two-dimensional "toy" test system, the new (minimally optimized) method performs roughly one hundred times faster than either optimized "traditional" Jarzynski calculations or conventional thermodynamic integration. The simplicity of the underlying formalism suggests the approach will find broad applicability in molecular systems.
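The Jarzynski relation underlying this estimate, exp(-ΔF/kT) = ⟨exp(-W/kT)⟩ over nonequilibrium work values W, can be sketched as a small numerical routine. This is a generic illustration on synthetic Gaussian work data, not the authors' path-sampling scheme; the function name and the log-sum-exp stabilization are our choices.

```python
import numpy as np

def jarzynski_free_energy(work_values, kT=1.0):
    """Estimate the free energy difference from nonequilibrium work values
    via Jarzynski's relation, exp(-dF/kT) = <exp(-W/kT)>.
    A log-sum-exp formulation is used for numerical stability, since the
    exponential average is dominated by rare low-work trajectories."""
    w = np.asarray(work_values, dtype=float) / kT
    m = (-w).max()
    # log <exp(-w)> = m + log mean exp(-w - m)
    log_mean = m + np.log(np.mean(np.exp(-w - m)))
    return -kT * log_mean
```

For a Gaussian work distribution with mean mu and variance sigma^2, the relation gives the analytic result ΔF = mu - sigma^2/(2kT), which the estimator reproduces on synthetic data; it also always lies below the mean work, consistent with the second law.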
Directory of Open Access Journals (Sweden)
Rory M Donovan
2016-02-01
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles and the plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding that of standard parallel simulation, by orders of magnitude for some observables.
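The replicate-and-prune resampling step described above can be sketched as follows. This is a simplified, weight-proportional statistical resampling within bins, in the spirit of WE rather than a reproduction of the authors' implementation; the bin scheme, target walker count, and all names are illustrative assumptions.

```python
import numpy as np

def we_resample(positions, weights, bin_edges, target_per_bin=4, rng=None):
    """One weighted-ensemble-style resampling step (illustrative sketch):
    within each occupied bin, resample walkers proportionally to weight so
    the bin holds `target_per_bin` walkers, each carrying an equal share of
    the bin's total probability weight. Total weight is conserved, which is
    what keeps the ensemble statistically unbiased."""
    rng = rng or np.random.default_rng()
    positions = np.asarray(positions, float)
    weights = np.asarray(weights, float)
    bins = np.digitize(positions, bin_edges)
    new_pos, new_w = [], []
    for b in np.unique(bins):
        idx = np.where(bins == b)[0]
        w_tot = weights[idx].sum()
        # Split high-weight walkers / merge low-weight ones, here approximated
        # by weight-proportional resampling with replacement.
        chosen = rng.choice(idx, size=target_per_bin, p=weights[idx] / w_tot)
        new_pos.extend(positions[chosen])
        new_w.extend([w_tot / target_per_bin] * target_per_bin)
    return np.array(new_pos), np.array(new_w)
```

Between resampling steps, each walker would be propagated by the underlying dynamics engine (here, the kinetic Monte Carlo simulator); only the bookkeeping above is WE-specific.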
Path planning in uncertain flow fields using ensemble method
Wang, Tong; Le Maître, Olivier P.; Hoteit, Ibrahim; Knio, Omar M.
2016-10-01
An ensemble-based approach is developed to conduct optimal path planning in unsteady ocean currents under uncertainty. We focus our attention on two-dimensional steady and unsteady uncertain flows, and adopt a sampling methodology that is well suited to operational forecasts, where an ensemble of deterministic predictions is used to model and quantify uncertainty. In an operational setting, much about dynamics, topography, and forcing of the ocean environment is uncertain. To address this uncertainty, the flow field is parametrized using a finite number of independent canonical random variables with known densities, and the ensemble is generated by sampling these variables. For each of the resulting realizations of the uncertain current field, we predict the path that minimizes the travel time by solving a boundary value problem (BVP), based on the Pontryagin maximum principle. A family of backward-in-time trajectories starting at the end position is used to generate suitable initial values for the BVP solver. This allows us to examine and analyze the performance of the sampling strategy and to develop insight into extensions dealing with general circulation ocean models. In particular, the ensemble method enables us to perform a statistical analysis of travel times and consequently develop a path planning approach that accounts for these statistics. The proposed methodology is tested for a number of scenarios. We first validate our algorithms by reproducing simple canonical solutions, and then demonstrate our approach in more complex flow fields, including idealized, steady and unsteady double-gyre flows.
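The statistical analysis of travel times across the ensemble can be illustrated with a toy routine: each flow realization yields an effective travel time for a candidate path, and the planner ranks paths by statistics of that distribution. This is a stand-in for the per-realization BVP optimization described above; all names are illustrative.

```python
import numpy as np

def travel_time_stats(speeds, distance=1.0):
    """Ensemble statistics of travel time for one candidate path: each flow
    realization contributes an effective along-path speed. A risk-aware
    planner might rank paths by mean or 95th-percentile time rather than
    the single deterministic optimum."""
    t = distance / np.asarray(speeds, float)
    return {"mean": t.mean(), "std": t.std(), "p95": float(np.percentile(t, 95))}
```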
Girsanov reweighting for path ensembles and Markov state models
Donati, L.; Hartmann, C.; Keller, B. G.
2017-06-01
The sensitivity of molecular dynamics to changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of path probability measure and the Girsanov theorem, a result from stochastic analysis to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor "on the fly" during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process and an artificial many-body system to alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics to external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.
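For an overdamped Langevin dynamics, the Girsanov reweighting factor has a simple discrete form that can be computed from the trajectory and the perturbing force. The sketch below uses a plain Euler-Maruyama discretization on a one-dimensional Ornstein-Uhlenbeck reference process; it illustrates the change-of-measure idea only, not the paper's on-the-fly MD implementation, and all function names are ours.

```python
import numpy as np

def girsanov_weights(paths, dt, sigma, ref_force, delta_force):
    """Per-trajectory reweighting factors for a perturbed overdamped
    Langevin dynamics (Euler-Maruyama discretization of the Girsanov
    formula). `ref_force` is the reference drift; `delta_force` is the
    perturbing force added to it."""
    x = paths[:, :-1]                     # states x_k along each path
    dx = np.diff(paths, axis=1)           # increments x_{k+1} - x_k
    noise = dx - ref_force(x) * dt        # = sigma * dW_k under the reference
    df = delta_force(x)
    log_w = (df * noise).sum(axis=1) / sigma**2 \
        - 0.5 * (df ** 2).sum(axis=1) * dt / sigma**2
    return np.exp(log_w)

# Reference dynamics: Ornstein-Uhlenbeck, dX = -X dt + sigma dW
rng = np.random.default_rng(2)
n_paths, n_steps, dt, sigma = 5000, 100, 0.01, 1.0
paths = np.zeros((n_paths, n_steps + 1))
for k in range(n_steps):
    paths[:, k + 1] = paths[:, k] - paths[:, k] * dt \
        + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

# Constant tilt as the perturbing force; E[w] = 1 under the reference measure.
w = girsanov_weights(paths, dt, sigma,
                     lambda x: -x, lambda x: 0.3 * np.ones_like(x))
```

Averaging an observable with these weights over the reference path ensemble estimates its value under the perturbed dynamics; the fact that the weights average to one is a basic consistency check of the change of measure.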
Distinguishing Two Probability Ensembles with One Sample from each Ensemble
Antunes, L.; Buhrman, H.; Matos, A.; Souto, A.; Teixeira, A.
2016-01-01
We introduce a new method for distinguishing two probability ensembles, called the one-from-each method, in which the distinguisher receives as input two samples, one from each ensemble. We compare this new method with the multi-sample-from-the-same method already existing in the literature and prove that …
Ensemble Weight Enumerators for Protograph LDPC Codes
Divsalar, Dariush
2006-01-01
Recently, LDPC codes with projected-graph, or protograph, structures have been proposed. In this paper, finite-length ensemble weight enumerators for LDPC codes with protograph structures are obtained. Asymptotic results are derived as the block size goes to infinity. In particular, we are interested in obtaining ensemble average weight enumerators for protograph LDPC codes whose minimum distance grows linearly with block size. As with irregular ensembles, the linear-minimum-distance property is sensitive to the proportion of degree-2 variable nodes. The derived ensemble weight enumerators show that the linear-minimum-distance condition on the degree distribution of unstructured irregular LDPC codes is a sufficient but not a necessary condition for protograph LDPC codes.
Steered transition path sampling.
Guttenberg, Nicholas; Dinner, Aaron R; Weare, Jonathan
2012-06-21
We introduce a path sampling method for obtaining statistical properties of an arbitrary stochastic dynamics. The method works by decomposing a trajectory in time, estimating the probability of satisfying a progress constraint, modifying the dynamics based on that probability, and then reweighting to calculate averages. Because the progress constraint can be formulated in terms of occurrences of events within time intervals, the method is particularly well suited for controlling the sampling of currents of dynamic events. We demonstrate the method for calculating transition probabilities in barrier crossing problems and survival probabilities in strongly diffusive systems with absorbing states, which are difficult to treat by shooting. We discuss the relation of the algorithm to other methods.
Path Minima Queries in Dynamic Weighted Trees
DEFF Research Database (Denmark)
Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao
2011-01-01
In the path minima problem on a tree, each edge is assigned a weight and a query asks for the edge with minimum weight on the path between two nodes. For the dynamic version of the problem, where the edge weights can be updated, we give data structures that achieve optimal query time.
WESTPA: an interoperable, highly scalable software package for weighted ensemble simulation and analysis
Zwier, Matthew C; Adelman, Joshua L; Kaus, Joseph W; Pratt, Adam J; Wong, Kim F; Rego, Nicholas B; Suárez, Ernesto; Lettieri, Steven; Wang, David W; Grabe, Michael; Zuckerman, Daniel M; Chong, Lillian T
2015-02-10
The weighted ensemble (WE) path sampling approach orchestrates an ensemble of parallel calculations with intermittent communication to enhance the sampling of rare events, such as molecular associations or conformational changes in proteins or peptides. Trajectories are replicated and pruned in a way that focuses computational effort on underexplored regions of configuration space while maintaining rigorous kinetics. To enable the simulation of rare events at any scale (e.g., atomistic, cellular), we have developed an open-source, interoperable, and highly scalable software package for the execution and analysis of WE simulations: WESTPA (The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis). WESTPA scales to thousands of CPU cores and includes a suite of analysis tools that have been implemented in a massively parallel fashion. The software has been designed to interface conveniently with any dynamics engine and has already been used with a variety of molecular dynamics (e.g., GROMACS, NAMD, OpenMM, AMBER) and cell-modeling packages (e.g., BioNetGen, MCell). WESTPA has been in production use for over a year, and its utility has been demonstrated for a broad set of problems, ranging from atomically detailed host–guest associations to nonspatial chemical kinetics of cellular signaling networks. The following describes the design and features of WESTPA, including the facilities it provides for running WE simulations and storing and analyzing WE simulation data, as well as examples of input and output.
Time-optimal path planning in uncertain flow fields using ensemble method
Wang, Tong
2016-01-06
An ensemble-based approach is developed to conduct time-optimal path planning in unsteady ocean currents under uncertainty. We focus our attention on two-dimensional steady and unsteady uncertain flows, and adopt a sampling methodology that is well suited to operational forecasts, where a set of deterministic predictions is used to model and quantify uncertainty in the predictions. In the operational setting, much about the dynamics, topography and forcing of the ocean environment is uncertain, and as a result a single path produced by a model simulation has limited utility. To overcome this limitation, we rely on a finite-size ensemble of deterministic forecasts to quantify the impact of variability in the dynamics. The uncertainty of the flow field is parametrized using a finite number of independent canonical random variables with known densities, and the ensemble is generated by sampling these variables. For each of the resulting realizations of the uncertain current field, we predict the optimal path by solving a boundary value problem (BVP), based on the Pontryagin maximum principle. A family of backward-in-time trajectories starting at the end position is used to generate suitable initial values for the BVP solver. This allows us to examine and analyze the performance of the sampling strategy, and to develop insight into extensions dealing with regional or general circulation models. In particular, the ensemble method enables us to perform a statistical analysis of travel times, and consequently develop a path planning approach that accounts for these statistics. The proposed methodology is tested for a number of scenarios. We first validate our algorithms by reproducing simple canonical solutions, and then demonstrate our approach in more complex flow fields, including idealized, steady and unsteady double-gyre flows.
Path ensembles and a tradeoff between communication efficiency and resilience in the human connectome
Avena-Koenigsberger, Andrea; Mišić, Bratislav; Hawkins, Robert X D; Griffa, Alessandra; Hagmann, Patric; Goñi, Joaquín; Sporns, Olaf
2017-01-01
Computational analysis of communication efficiency of brain networks often relies on graph-theoretic measures based on the shortest paths between network nodes. Here, we explore a communication scheme that relaxes the assumption that information travels exclusively through optimally short paths. The scheme assumes that communication between a pair of brain regions may take place through a path ensemble comprising the k-shortest paths between those regions. To explore this approach, we map path ensembles in a set of anatomical brain networks derived from diffusion imaging and tractography. We show that while considering optimally short paths excludes a significant fraction of network connections from participating in communication, considering k-shortest path ensembles allows all connections in the network to contribute. Path ensembles enable us to assess the resilience of communication pathways between brain regions, by measuring the number of alternative, disjoint paths within the ensemble, and to compare generalized measures of path length and betweenness centrality to those that result when considering only the single shortest path between node pairs. Furthermore, we find a significant correlation, indicative of a trade-off, between communication efficiency and resilience of communication pathways in structural brain networks. Finally, we use k-shortest path ensembles to demonstrate hemispherical lateralization of efficiency and resilience.
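The k-shortest path ensemble between a pair of nodes can be computed with standard graph algorithms. The sketch below enumerates loopless paths by brute force, which is adequate for small illustrative graphs (production code would use Yen's algorithm); the adjacency format and names are our assumptions.

```python
def k_shortest_paths(adj, s, t, k):
    """Return up to k loopless shortest paths from s to t by brute-force
    enumeration of simple paths. `adj` maps node -> {neighbor: edge weight}.
    Each result is a (cost, path) pair, sorted by cost."""
    found = []

    def dfs(node, path, cost):
        if node == t:
            found.append((cost, list(path)))
            return
        for nb, w in adj[node].items():
            if nb not in path:          # keep paths loopless
                path.append(nb)
                dfs(nb, path, cost + w)
                path.pop()

    dfs(s, [s], 0.0)
    found.sort(key=lambda cw: cw[0])
    return found[:k]
```

On a small graph with two equal-cost routes, the ensemble exposes exactly the property the abstract exploits: edge-disjoint alternatives that a single-shortest-path analysis would hide.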
Enzymatic reaction paths as determined by transition path sampling
Masterson, Jean Emily
Enzymes are biological catalysts capable of enhancing the rates of chemical reactions by many orders of magnitude as compared to solution chemistry. Since the catalytic power of enzymes routinely exceeds that of the best artificial catalysts available, there is much interest in understanding the complete nature of chemical barrier crossing in enzymatic reactions. Two specific questions pertaining to the source of enzymatic rate enhancements are investigated in this work. The first is the issue of how fast protein motions of an enzyme contribute to chemical barrier crossing. Our group has previously identified sub-picosecond protein motions, termed promoting vibrations (PVs), that dynamically modulate chemical transformation in several enzymes. In the case of human heart lactate dehydrogenase (hhLDH), prior studies have shown that a specific axis of residues undergoes a compressional fluctuation towards the active site, decreasing a hydride and a proton donor-acceptor distance on a sub-picosecond timescale to promote particle transfer. To more thoroughly understand the contribution of this dynamic motion to the enzymatic reaction coordinate of hhLDH, we conducted transition path sampling (TPS) using four versions of the enzymatic system: a wild-type enzyme with natural isotopic abundance; a heavy enzyme in which all the carbons, nitrogens, and non-exchangeable hydrogens were replaced with heavy isotopes; and two versions of the enzyme with mutations in the axis of PV residues. We generated four separate ensembles of reaction paths and analyzed each in terms of the reaction mechanism, time of barrier crossing, dynamics of the PV, and residues involved in the enzymatic reaction coordinate. We found that heavy isotopic substitution of hhLDH altered the sub-picosecond dynamics of the PV, changed the favored reaction mechanism, and dramatically increased the time of barrier crossing, but did not have an effect on the specific residues involved in the PV. In the mutant systems …
Approximate Shortest Homotopic Paths in Weighted Regions
Cheng, Siu-Wing
2010-01-01
Let P be a path between two points s and t in a polygonal subdivision T with obstacles and weighted regions. Given a relative error tolerance ε ∈ (0, 1), we present the first algorithm to compute a path between s and t that can be deformed to P without passing over any obstacle, whose cost is within a factor 1 + ε of the optimum. The running time is O(h^3/ε^2 · kn · polylog(k, n, 1/ε)), where k is the number of segments in P, and h and n are the numbers of obstacles and vertices in T, respectively. The constant in the running time of our algorithm depends on some geometric parameters and on the ratio of the maximum region weight to the minimum region weight. © 2010 Springer-Verlag.
Approximate shortest homotopic paths in weighted regions
Cheng, Siuwing
2012-02-01
A path P between two points s and t in a polygonal subdivision T with obstacles and weighted regions defines a class of paths that can be deformed to P without passing over any obstacle. We present the first algorithm that, given P and a relative error tolerance ε ∈ (0, 1), computes a path from this class with cost at most 1 + ε times the optimum. The running time is O(h^3/ε^2 · kn · polylog(k, n, 1/ε)), where k is the number of segments in P, and h and n are the numbers of obstacles and vertices in T, respectively. The constant in the running time of our algorithm depends on some geometric parameters and on the ratio of the maximum region weight to the minimum region weight. © 2012 World Scientific Publishing Company.
Ensemble approach to the analysis of weighted networks
Ahnert, S. E.; Garlaschelli, D.; Fink, T. M. A.; Caldarelli, G.
2007-07-01
We present an approach to the analysis of weighted networks, by providing a straightforward generalization of any network measure defined on unweighted networks, such as the average degree of the nearest neighbors, the clustering coefficient, the "betweenness," the distance between two nodes, and the diameter of a network. All these measures are well established for unweighted networks but have hitherto proven difficult to define for weighted networks. Our approach is based on the translation of a weighted network into an ensemble of edges. Having introduced this approach, we demonstrate its advantages by applying the clustering coefficient constructed in this way to two real-world weighted networks.
Weighted ensemble transform Kalman filter for image assimilation
Directory of Open Access Journals (Sweden)
Sebastien Beyou
2013-01-01
This study proposes an extension of the Weighted Ensemble Kalman filter (WEnKF) proposed by Papadakis et al. (2010) for the assimilation of image observations. The main focus of this study is on a novel formulation of the Weighted filter with the Ensemble Transform Kalman filter (WETKF), incorporating directly as a measurement model a non-linear image reconstruction criterion. This technique has been compared to the original WEnKF on numerical and real-world data of 2-D turbulence observed through the transport of a passive scalar. In particular, it has been applied to the reconstruction of oceanic surface current vorticity fields from sea surface temperature (SST) satellite data. This latter technique enables a consistent recovery along time of oceanic surface currents and vorticity maps in the presence of large areas of missing data and strong noise.
Weighted Ensemble Simulation: Review of Methodology, Applications, and Software.
Zuckerman, Daniel M; Chong, Lillian T
2017-05-22
The weighted ensemble (WE) methodology orchestrates quasi-independent parallel simulations run with intermittent communication that can enhance sampling of rare events such as protein conformational changes, folding, and binding. The WE strategy can achieve superlinear scaling: the unbiased estimation of key observables such as rate constants and equilibrium state populations to greater precision than would be possible with ordinary parallel simulation. WE software can be used to control any dynamics engine, such as standard molecular dynamics and cell-modeling packages. This article reviews the theoretical basis of WE and goes on to describe successful applications to a number of complex biological processes: protein conformational transitions, (un)binding, and assembly processes, as well as cell-scale processes in systems biology. We furthermore discuss the challenges that need to be overcome in the next phase of WE methodological development. Overall, the combined advances in WE methodology and software have enabled the simulation of long-timescale processes that would otherwise not be practical on typical computing resources using standard simulation.
Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.
Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel
2017-06-01
Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.
Line transect sampling from a curving path.
Hiby, L; Krishna, M B
2001-09-01
Cutting straight line transects through dense forest is time consuming and expensive when large areas need to be surveyed for rare or highly clustered species. We argue that existing paths or game trails may be suitable as transects for line transect sampling even though they will not, in general, run straight. Formulas and software currently used to estimate local density using perpendicular distance data can be used with closest approach distances measured from curving transects. Suitable paths or trails are those for which the minimum radius of curvature is rarely less than the width of the shoulder in the detection probability function. The use of existing paths carries the risk of bias resulting from unrepresentative sampling of available habitats, and this must be weighed against the increase in coverage available.
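The curving-transect analogue of the perpendicular distance is the closest-approach distance from a sighting to the path. For a path recorded as a polyline, this is a point-to-segment minimization, sketched below (an illustrative helper, not the authors' software):

```python
import math

def closest_approach(path, point):
    """Minimum distance from a sighting `point` (x, y) to a piecewise-linear
    survey path given as a list of (x, y) vertices: project the point onto
    each segment, clamp to the segment, and take the smallest distance."""
    px, py = point
    best = math.inf
    for (ax, ay), (bx, by) in zip(path[:-1], path[1:]):
        dx, dy = bx - ax, by - ay
        t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))      # clamp projection to the segment
        best = min(best, math.hypot(px - (ax + t * dx), py - (ay + t * dy)))
    return best
```

These closest-approach distances can then be fed to standard perpendicular-distance estimators, as the abstract suggests, provided the path's radius of curvature is rarely smaller than the detection-function shoulder.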
Statistical Analysis of the First Passage Path Ensemble of Jump Processes
von Kleist, Max; Schütte, Christof; Zhang, Wei
2017-12-01
The transition mechanism of jump processes between two different subsets in state space reveals important dynamical information of the processes and therefore has attracted considerable attention in the past years. In this paper, we study the first passage path ensemble of both discrete-time and continuous-time jump processes on a finite state space. The main approach is to divide each first passage path into nonreactive and reactive segments and to study them separately. The analysis can be applied to jump processes which are non-ergodic, as well as continuous-time jump processes where the waiting time distributions are non-exponential. In the particular case that the jump processes are both Markovian and ergodic, our analysis elucidates the relations between the study of the first passage paths and the study of the transition paths in transition path theory. We provide algorithms to numerically compute statistics of the first passage path ensemble. The computational complexity of these algorithms scales with the complexity of solving a linear system, for which efficient methods are available. Several examples demonstrate the wide applicability of the derived results across research areas.
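As the abstract notes, the statistics reduce to linear systems. A minimal example of this reduction, for mean first-passage times of a discrete-time Markov chain (our own illustrative formulation, not the paper's algorithms):

```python
import numpy as np

def mean_first_passage_times(P, target):
    """Mean first-passage times to a target set for a discrete-time Markov
    chain with transition matrix P: solve (I - Q) t = 1 on the non-target
    states, where Q restricts P to those states. Target states get time 0."""
    n = P.shape[0]
    keep = [i for i in range(n) if i not in set(target)]
    Q = P[np.ix_(keep, keep)]
    t = np.linalg.solve(np.eye(len(keep)) - Q, np.ones(len(keep)))
    out = np.zeros(n)
    out[keep] = t
    return out
```

For a reflecting random walk on three states with an absorbing right end, the solution of the 2x2 system matches the hand calculation (4 steps from the far state, 3 from the middle), illustrating why the cost scales with that of the linear solve.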
Ensemble bayesian model averaging using markov chain Monte Carlo sampling
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Diks, Cees G H [NON LANL; Clark, Martyn P [NON LANL
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA, however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper (Raftery et al., Mon Weather Rev 133:1155-1174, 2005), the authors recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
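The BMA predictive density that both training algorithms parametrize is just a weighted mixture of kernels centred on the member forecasts. A sketch with Gaussian kernels (the common choice for temperature in Raftery et al.; variable names here are illustrative):

```python
import numpy as np

def bma_predictive_pdf(y, forecasts, weights, sigmas):
    """BMA predictive density evaluated at points y: a weights-mixture of
    Gaussian kernels, one per ensemble member, centred on that member's
    forecast with its own spread sigma. Weights are assumed to sum to 1."""
    y = np.atleast_1d(np.asarray(y, float))[:, None]
    f = np.asarray(forecasts, float)[None, :]
    s = np.asarray(sigmas, float)[None, :]
    kernels = np.exp(-0.5 * ((y - f) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    return kernels @ np.asarray(weights, float)
```

EM and DREAM differ only in how they fit the weights and sigmas of this mixture to training data; the density itself is the same object in both cases.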
Asymmetric similarity-weighted ensembles for image segmentation
DEFF Research Database (Denmark)
Cheplygina, V.; Van Opbroek, A.; Ikram, M. A.
2016-01-01
Supervised classification is widely used for image segmentation. To work effectively, these techniques need large amounts of labeled training data that are representative of the test data. Different patient groups, different scanners or different scanning protocols can lead to differences between the images, so representative data might not be available. Transfer learning techniques can be used to account for these differences, thus taking advantage of all the available data acquired with different protocols. We investigate the use of classifier ensembles, where each classifier is weighted according to the similarity between the data it is trained on and the data it needs to segment. We examine 3 asymmetric similarity measures that can be used in scenarios where no labeled data from a newly introduced scanner or scanning protocol is available. We show that the asymmetry is informative …
SAChES: Scalable Adaptive Chain-Ensemble Sampling.
Energy Technology Data Exchange (ETDEWEB)
Swiler, Laura Painton [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ray, Jaideep [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Ebeida, Mohamed Salah [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Huang, Maoyi [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Hou, Zhangshuan [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bao, Jie [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Ren, Huiying [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
2017-08-01
We present the development of a parallel Markov Chain Monte Carlo (MCMC) method called SAChES, Scalable Adaptive Chain-Ensemble Sampling. This capability is targeted to Bayesian calibration of computationally expensive simulation models. SAChES involves a hybrid of two methods: Differential Evolution Monte Carlo followed by Adaptive Metropolis. Both methods involve parallel chains. Differential evolution allows one to explore high-dimensional parameter spaces using loosely coupled (i.e., largely asynchronous) chains. Loose coupling allows the use of large chain ensembles, with far more chains than the number of parameters to explore. This reduces the per-chain sampling burden and enables high-dimensional inversions and the use of computationally expensive forward models. The large number of chains can also ameliorate the impact of silent errors, which may affect only a few chains. The chain ensemble can also be sampled to provide an initial condition when an aberrant chain is re-spawned. Adaptive Metropolis takes the best points from the differential evolution and efficiently homes in on the posterior density. The multitude of chains in SAChES is leveraged to (1) enable efficient exploration of the parameter space and (2) ensure robustness to silent errors, which may be unavoidable in the extreme-scale computational platforms of the future. This report outlines SAChES, describes four papers that are the result of the project, and discusses some additional results.
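The differential-evolution move at the heart of the first stage is simple to sketch: a chain proposes a jump along the difference of two other randomly chosen chains, plus a small jitter. This is the generic DE-MC proposal, not SAChES' exact variant (which adds adaptation and loose coupling); all names are illustrative.

```python
import numpy as np

def de_proposal(chains, i, gamma=None, eps=1e-6, rng=None):
    """Differential-evolution proposal for chain i: x_i + gamma*(x_a - x_b)
    + small noise, with a and b drawn from the other chains. The default
    gamma = 2.38/sqrt(2d) is the standard near-optimal DE-MC scaling."""
    rng = rng or np.random.default_rng()
    n, d = chains.shape
    if gamma is None:
        gamma = 2.38 / np.sqrt(2 * d)
    others = [j for j in range(n) if j != i]
    a, b = rng.choice(others, size=2, replace=False)
    return chains[i] + gamma * (chains[a] - chains[b]) \
        + eps * rng.standard_normal(d)
```

Because the proposal only reads the other chains' current states, the chains need to synchronize only occasionally, which is what permits the large, loosely coupled ensembles described above.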
Competitive Learning Neural Network Ensemble Weighted by Predicted Performance
Ye, Qiang
2010-01-01
Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…
Directory of Open Access Journals (Sweden)
Suyu Mei
Reconstruction of host-pathogen protein interaction networks is of great significance to reveal the underlying microbial pathogenesis. However, the current experimentally derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for host-pathogen protein interaction network reconstruction. In this work, we are motivated to address these three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where a support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that the homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability, with a less demanding data constraint. As regards negative data construction, experiments show that exclusiveness of subcellularly co-localized proteins is unbiased and more reliable than random sampling. Finally, we analyze the overlap between our model's predictions and those of existing models, and apply the model to novel host-pathogen PPI recognition for further biological research.
Constructing Better Classifier Ensemble Based on Weighted Accuracy and Diversity Measure
Directory of Open Access Journals (Sweden)
Xiaodong Zeng
2014-01-01
Full Text Available A weighted accuracy and diversity (WAD) method is presented: a novel measure for evaluating the quality of a classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis, namely that a robust classifier ensemble should not only be accurate but also different from every other member. In fact, accuracy and diversity are mutually constraining factors: an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble on unknown data. The quality assessment of an ensemble computes the final score as the harmonic mean of accuracy and diversity, with two weight parameters used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and to two threshold measures that consider only accuracy or diversity, using two heuristic search algorithms (a genetic algorithm and a forward hill-climbing algorithm) in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to the others in most cases.
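The WAD score itself reduces to a weighted harmonic mean of the two terms. A minimal sketch (illustrative only; the function name and the exact weighting convention are our assumptions, not the paper's):

```python
def wad_score(accuracy, diversity, w_acc=1.0, w_div=1.0):
    """Weighted harmonic mean of ensemble accuracy and diversity.

    Hypothetical sketch of the WAD idea: two weight parameters balance
    the two mutually constraining terms; both inputs assumed in (0, 1].
    """
    return (w_acc + w_div) / (w_acc / accuracy + w_div / diversity)
```

Because the harmonic mean is dominated by its smaller argument, an ensemble that is accurate but not diverse (or vice versa) scores poorly, which is the balance the measure is designed to reward.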
Finding the optimal-path maps for path planning across weighted regions
Energy Technology Data Exchange (ETDEWEB)
Rowe, N.C.; Alexander, R.S.
2000-02-01
Optimal-path maps tell robots or people the best way to reach a goal point from anywhere in a known terrain area, eliminating most of the need to plan during travel. The authors address the construction of optimal-path maps for two-dimensional polygonal weighted-region terrain, terrain partitioned into polygonal areas such that the cost per unit of distance traveled is homogeneous and isotropic within each area. This is useful for overland route planning across varied ground surfaces and vegetation. The authors propose a new algorithm that recursively partitions terrain into regions of similar optimal-path behavior, and defines corresponding path subspaces for these regions. This process constructs a piecewise-smooth function of terrain position whose gradient direction is everywhere the optimal-path direction, permitting quick path finding. The algorithm used is more complicated than the current path-caching and wavefront-propagation algorithms, but it gives more accurate maps requiring less space to represent. Experiments with an implementation confirm the practicality of the authors' algorithm.
The Total Acquisition Number Of The Randomly Weighted Path
Directory of Open Access Journals (Sweden)
Godbole Anant
2017-11-01
Full Text Available There exists a significant body of work on determining the acquisition number a_t(G) of various graphs when the vertices of those graphs are each initially assigned a unit weight. We determine properties of the acquisition number of the path, star, complete, complete bipartite, cycle, and wheel graphs for variations on this initial weighting scheme, with the majority of our work focusing on the expected acquisition number of randomly weighted graphs. In particular, when n distinguishable “units” of integral weight, or chips, are randomly distributed across the vertices of the n-path, we bound the expected acquisition number E(a_t(P_n)) between 0.242n and 0.375n. With computer support, we improve this by showing that E(a_t(P_n)) lies between 0.29523n and 0.29576n. We then use subadditivity to show that the limiting ratio lim E(a_t(P_n))/n exists, and simulations reveal more precisely what the limiting value equals. The Hoeffding-Azuma inequality is used to prove that the acquisition number is tightly concentrated around its expected value. Additionally, in a different context, we offer a non-optimal acquisition protocol algorithm for the randomly weighted path and exactly compute the expected size of the resultant residual set.
Gruber, Susan; Logan, Roger W; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A
2015-01-15
Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.
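Whichever learner produces the treatment probabilities (logistic regression, SL, or EL), the stabilized weights follow the same formula. A hedged numpy sketch (function name and array conventions are ours, not the paper's):

```python
import numpy as np

def stabilized_weights(treated, p_marginal, p_conditional):
    """Stabilized inverse probability of treatment weights.

    treated       : 0/1 treatment indicator array
    p_marginal    : marginal P(A=1), ignoring covariates
    p_conditional : P(A=1 | L) from any fitted model (e.g. logistic
                    regression or an ensemble learner)
    """
    # numerator/denominator use the probability of the treatment
    # actually received
    num = np.where(treated == 1, p_marginal, 1.0 - p_marginal)
    den = np.where(treated == 1, p_conditional, 1.0 - p_conditional)
    return num / den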
Weighted Ensemble Square Root Filters for Non-linear, Non-Gaussian, Data Assimilation
Livings, D. M.; van Leeuwen, P.
2012-12-01
In recent years the Ensemble Kalman Filter (EnKF) has become widely used in both operational and research data assimilation systems. The particle filter is an alternative ensemble-based algorithm that offers the possibility of improved performance in non-linear and non-Gaussian problems. Papadakis et al. (2010) introduced the Weighted Ensemble Kalman Filter (WEnKF) as a combination of the best features of the EnKF and the particle filter. Published work on the WEnKF has so far concentrated on the formulation of the EnKF in which observations are perturbed; no satisfactory general framework has been given for particle filters based on the alternative formulation of the EnKF known as the ensemble square root filter. This presentation will provide such a framework and show how several popular ensemble square root filters fit into it. No linear or Gaussian assumptions about the dynamical or observational models will be necessary. By examining the algorithms closely, shortcuts will be identified that increase both the simplicity and the efficiency of the resulting particle filter in comparison with a naive implementation. A procedure will be given for simply converting an existing ensemble square root filter into a particle filter. The procedure will not be limited to basic ensemble square root filters, but will be able to incorporate common variations such as covariance inflation without making any approximations.
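For concreteness, here is a minimal serial ensemble square-root update for a single scalar observation, in the style of the Whitaker-Hamill deterministic filter. This is a sketch under our own conventions, not the general framework of the presentation:

```python
import numpy as np

def ensrf_update(ens, y, H, r):
    """One serial ensemble square-root filter update, scalar observation.

    ens : (n_state, n_members) ensemble matrix
    y   : observed value
    H   : (n_state,) linear observation operator
    r   : observation-error variance
    """
    xm = ens.mean(axis=1)
    X = ens - xm[:, None]                  # ensemble perturbations
    hx = H @ ens
    hxm = hx.mean()
    hX = hx - hxm
    n = ens.shape[1]
    phb = X @ hX / (n - 1)                 # cov(x, Hx)
    hphr = hX @ hX / (n - 1) + r           # innovation variance
    K = phb / hphr                         # Kalman gain
    # deterministic perturbation update: no perturbed observations needed
    alpha = 1.0 / (1.0 + np.sqrt(r / hphr))
    xm_new = xm + K * (y - hxm)
    X_new = X - alpha * np.outer(K, hX)
    return xm_new[:, None] + X_new
```

The mean is updated with the usual Kalman gain, while the scalar factor `alpha` shrinks the perturbations so the analysis covariance matches the Kalman filter value exactly, without stochastic observation perturbations.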
Monte Carlo path sampling approach to modeling aeolian sediment transport
Hardin, E. J.; Mitasova, H.; Mitas, L.
2011-12-01
but evolve the system according to rules that are abstractions of the governing physics. This work presents the Green function solution to the continuity equations that govern sediment transport. The Green function solution is implemented using a path sampling approach whereby sand mass is represented as an ensemble of particles that evolve stochastically according to the Green function. In this approach, particle density is a particle representation that is equivalent to the field representation of elevation. Because aeolian transport is nonlinear, particles must be propagated according to their updated field representation with each iteration. This is achieved using a particle-in-cell technique. The path sampling approach offers a number of advantages. The integral form of the Green function solution makes it robust to discontinuities in complex terrains. Furthermore, this approach is spatially distributed, which can help elucidate the role of complex landscapes in aeolian transport. Finally, path sampling is highly parallelizable, making it ideal for execution on modern clusters and graphics processing units.
Weighted polynomial models and weighted sampling schemes for finite population
Chen, Sean X.
1998-01-01
This paper outlines a theoretical framework for finite population models with unequal sample probabilities, along with sampling schemes for drawing random samples from these models. We first present four exact weighted sampling schemes that can be used for any finite population model to satisfy such requirements as ordered/ unordered samples, with/without replacement, and fixed/nonfixed sample size. We then introduce a new class of finite population models called weighted po...
Broderick, Ciaran; Fealy, Rowan; Murphy, Conor
2013-04-01
Multimodel experiments have provided the data necessary for undertaking probabilistic assessments of the likely impacts which projected climate change may have on hydrological systems. The availability of ensemble data has also facilitated a more comprehensive exploration of uncertainty and a greater understanding of the implications it has for future resource management. In this study a probabilistic framework is used to examine changes in the flow regime of the Burrishoole catchment - characterised as a responsive peatland system typical of many upland catchments found along Ireland's Atlantic seaboard. For the study a sampling procedure is used to generate probability distributions which quantify the range of uncertainty in the projected hydrological response. The sampling scheme combines model projections by weighting; to this end a likelihood value is attached to each member of a multimodel ensemble. Model reliability is quantified based on performance at capturing different aspects of the observed system behaviour. The dynamically downscaled climate data used is obtained from the EU-FP6 ENSEMBLES project; to overcome some of the limitations associated with this dataset it is used alongside statistically downscaled climate scenarios. To address uncertainty in the hydrological simulations multiple realizations of the catchment system - obtained by altering both the model structure and parameter values in search of behavioural solutions - are employed. The overriding aim of the paper is to examine how ensemble data can be most effectively exploited when conducting impact assessments. The probabilistic framework outlined is used to explore whether the application of a weighting scheme produces a different outcome than if uniform probabilities are applied; also examined is whether the weighting enables the uncertainty space to be constrained in a methodologically rigorous way. In order to understand how we can more effectively manage uncertainty the study
Zhou, Shaohua Kevin; Chellappa, Rama
2006-06-01
This paper addresses the problem of characterizing ensemble similarity from sample similarity in a principled manner. Using reproducing kernel as a characterization of sample similarity, we suggest a probabilistic distance measure in the reproducing kernel Hilbert space (RKHS) as the ensemble similarity. Assuming normality in the RKHS, we derive analytic expressions for probabilistic distance measures that are commonly used in many applications, such as Chernoff distance (or the Bhattacharyya distance as its special case), Kullback-Leibler divergence, etc. Since the reproducing kernel implicitly embeds a nonlinear mapping, our approach presents a new way to study these distances whose feasibility and efficiency is demonstrated using experiments with synthetic and real examples. Further, we extend the ensemble similarity to the reproducing kernel for ensemble and study the ensemble similarity for more general data representations.
Adelman, Joshua L; Grabe, Michael
2015-04-14
Ion channels are responsible for a myriad of fundamental biological processes via their role in controlling the flow of ions through water-filled membrane-spanning pores in response to environmental cues. Molecular simulation has played an important role in elucidating the mechanism of ion conduction, but connecting atomistically detailed structural models of the protein to electrophysiological measurements remains a broad challenge due to the computational cost of reaching the necessary time scales. Here, we introduce an enhanced sampling method for simulating the conduction properties of narrow ion channels using the Weighted ensemble (WE) sampling approach. We demonstrate the application of this method to calculate the current–voltage relationship as well as the nonequilibrium ion distribution at steady-state of a simple model ion channel. By direct comparisons with long brute force simulations, we show that the WE simulations rigorously reproduce the correct long-time scale kinetics of the system and are capable of determining these quantities using significantly less aggregate simulation time under conditions where permeation events are rare.
A Bayesian posterior predictive framework for weighting ensemble regional climate models
Directory of Open Access Journals (Sweden)
Y. Fan
2017-06-01
Full Text Available We present a novel Bayesian statistical approach to computing model weights in climate change projection ensembles in order to create probabilistic projections. The weight of each climate model is obtained by weighting the current day observed data under the posterior distribution admitted under competing climate models. We use a linear model to describe the model output and observations. The approach accounts for uncertainty in model bias, trend and internal variability, including error in the observations used. Our framework is general, requires very little problem-specific input, and works well with default priors. We carry out cross-validation checks that confirm that the method produces the correct coverage.
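Generically, the core weighting step turns per-model log marginal likelihoods of the observed data into normalized posterior model weights. An illustrative sketch (this does not reproduce the paper's linear-model treatment of bias, trend and internal variability):

```python
import numpy as np

def posterior_model_weights(log_evidences, log_prior=None):
    """Normalized model weights from per-model log marginal likelihoods.

    A generic Bayesian model-weighting sketch: w_k is proportional to
    p(model k) * p(observations | model k).
    """
    log_ev = np.asarray(log_evidences, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(log_ev)  # uniform prior over models
    log_w = log_ev + log_prior
    log_w -= log_w.max()                   # log-sum-exp stabilization
    w = np.exp(log_w)
    return w / w.sum()
```

Working in log space avoids underflow when the evidences differ by many orders of magnitude, which is common when scoring climate models against long observational records.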
Ab initio sampling of transition paths by conditioned Langevin dynamics
Delarue, Marc; Koehl, Patrice; Orland, Henri
2017-10-01
We propose a novel stochastic method to generate Brownian paths conditioned to start at an initial point and end at a given final point at a fixed time t_f under a given potential U(x). These paths are sampled with the probability given by the overdamped Langevin dynamics. We show that these paths can be exactly generated by a local stochastic partial differential equation. This equation cannot be solved in general, but we present several approximations that are valid either in the low-temperature regime or in the presence of barrier crossing. We show that this method guarantees the generation of statistically independent transition paths and is computationally very efficient. We illustrate the method first on two simple potentials, the two-dimensional Mueller potential and the Mexican hat potential, and then on the multi-dimensional problem of conformational transitions in proteins, using the "Mixed Elastic Network Model" as a benchmark.
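In the zero-potential limit, such doubly conditioned paths reduce to a Brownian bridge, which can be sampled directly. A toy sketch of that special case only (U(x) = 0; this is not the paper's conditioned Langevin scheme):

```python
import numpy as np

def brownian_bridge(x0, xf, tf, n_steps, rng):
    """Sample a free Brownian path pinned to x0 at t=0 and xf at t=tf.

    Sketch of the U(x)=0 limit: generate an unconditioned Brownian
    motion, then subtract the linear drift of its endpoint error.
    """
    t = np.linspace(0.0, tf, n_steps + 1)
    dw = rng.normal(0.0, np.sqrt(tf / n_steps), size=n_steps)
    w = np.concatenate([[0.0], np.cumsum(dw)])   # free Brownian motion
    # conditioning: remove the endpoint mismatch linearly in time
    return x0 + w - (t / tf) * (w[-1] - (xf - x0))
```

Each call produces a statistically independent bridge, which is the property the conditioned-dynamics approach generalizes to nonzero potentials.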
Mielke, Steven L; Truhlar, Donald G
2016-01-21
Using Feynman path integrals, a molecular partition function can be written as a double integral with the inner integral involving all closed paths centered at a given molecular configuration, and the outer integral involving all possible molecular configurations. In previous work employing Monte Carlo methods to evaluate such partition functions, we presented schemes for importance sampling and stratification in the molecular configurations that constitute the path centroids, but we relied on free-particle paths for sampling the path integrals. At low temperatures, the path sampling is expensive because the paths can travel far from the centroid configuration. We now present a scheme for importance sampling of whole Feynman paths based on harmonic information from an instantaneous normal mode calculation at the centroid configuration, which we refer to as harmonically guided whole-path importance sampling (WPIS). We obtain paths conforming to our chosen importance function by rejection sampling from a distribution of free-particle paths. Sample calculations on CH4 demonstrate that at a temperature of 200 K, about 99.9% of the free-particle paths can be rejected without integration, and at 300 K, about 98% can be rejected. We also show that it is typically possible to reduce the overhead associated with the WPIS scheme by sampling the paths using a significantly lower-order path discretization than that which is needed to converge the partition function.
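The rejection step has a simple generic form: draw from the cheap proposal and keep each draw with the acceptance probability set by the importance function. A schematic sketch (the proposal and log acceptance functions below are placeholders, not the actual free-particle path distribution or the harmonic WPIS importance function):

```python
import numpy as np

def rejection_sample(propose, log_accept_prob, n, rng):
    """Generic rejection sampling.

    propose         : callable rng -> candidate draw from the proposal
    log_accept_prob : callable x -> log acceptance probability, <= 0
    Keeps drawing until n samples have been accepted.
    """
    kept = []
    while len(kept) < n:
        x = propose(rng)
        # accept with probability exp(log_accept_prob(x))
        if np.log(rng.random()) < log_accept_prob(x):
            kept.append(x)
    return np.array(kept)
```

The reported 98 to 99.9% rejection rates are affordable precisely because rejected candidates are discarded before any expensive integration is performed on them.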
Donovan, Rory M; Sedgewick, Andrew J; Faeder, James R; Zuckerman, Daniel M
2013-09-21
We apply the "weighted ensemble" (WE) simulation strategy, previously employed in the context of molecular dynamics simulations, to a series of systems-biology models that range in complexity from a one-dimensional system to a system with 354 species and 3680 reactions. WE is relatively easy to implement, does not require extensive hand-tuning of parameters, does not depend on the details of the simulation algorithm, and can facilitate the simulation of extremely rare events. For the coupled stochastic reaction systems we study, WE is able to produce accurate and efficient approximations of the joint probability distribution for all chemical species for all time t. WE is also able to efficiently extract mean first passage times for the systems, via the construction of a steady-state condition with feedback. In all cases studied here, WE results agree with independent "brute-force" calculations, but significantly enhance the precision with which rare or slow processes can be characterized. Speedups over "brute-force" in sampling rare events via the Gillespie direct Stochastic Simulation Algorithm range from ~10^12 to ~10^18 for characterizing rare states in a distribution, and ~10^2 to ~10^4 for finding mean first passage times.
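The resampling step at the heart of WE can be sketched as binning walkers along a progress coordinate and splitting/merging within each bin while conserving total weight. A toy illustration only (not the authors' implementation; the merge rule here is simplified to weight-proportional survivor selection with equal weight sharing):

```python
import numpy as np

def we_resample(coords, weights, bin_edges, target_per_bin=2, rng=None):
    """One weighted-ensemble resampling step (hypothetical minimal sketch).

    coords, weights : per-walker progress coordinates and weights (arrays)
    bin_edges       : edges partitioning the progress coordinate
    Within each bin, walkers are pruned/replicated toward
    `target_per_bin` survivors; total weight is conserved exactly.
    """
    rng = np.random.default_rng() if rng is None else rng
    new_c, new_w = [], []
    idx = np.digitize(coords, bin_edges)
    for b in np.unique(idx):
        sel = np.where(idx == b)[0]
        w_bin = weights[sel].sum()
        k = min(target_per_bin, len(sel))
        # prune: survivors chosen with probability proportional to weight
        survivors = rng.choice(sel, size=k, replace=False,
                               p=weights[sel] / w_bin)
        # replicate/share: bin weight divided equally among survivors
        for s in survivors:
            new_c.append(coords[s])
            new_w.append(w_bin / k)
    return np.array(new_c), np.array(new_w)
```

Because total weight is conserved and occupied bins keep a fixed walker count, computational effort is spread over hard-to-reach regions instead of piling up where trajectories are common.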
Weighted log-rank statistic to compare shared-path adaptive treatment strategies.
Kidwell, Kelley M; Wahed, Abdus S
2013-04-01
Adaptive treatment strategies (ATSs) more closely mimic the reality of a physician's prescription process where the physician prescribes a medication to his/her patient, and based on that patient's response to the medication, modifies the treatment. Two-stage randomization designs, more generally, sequential multiple assignment randomization trial designs, are useful to assess ATSs where the interest is in comparing the entire sequence of treatments, including the patient's intermediate response. In this paper, we introduce the notion of shared-path and separate-path ATSs and propose a weighted log-rank statistic to compare overall survival distributions of multiple two-stage ATSs, some of which may be shared-path. Large sample properties of the statistic are derived and the type I error rate and power of the test are compared with the standard log-rank test through simulation.
Transition path sampling of rare events by shooting from the top.
Jung, Hendrik; Okazaki, Kei-Ichi; Hummer, Gerhard
2017-10-21
Transition path sampling is a powerful tool in the study of rare events. Shooting trial trajectories from configurations along existing transition paths proved particularly efficient in the sampling of reactive trajectories. However, most shooting attempts tend not to result in transition paths, in particular in cases where the transition dynamics has diffusive character. To overcome the resulting efficiency problem, we developed an algorithm for "shooting from the top." We first define a shooting range through which all paths have to pass and then shoot off trial trajectories only from within this range. For a well chosen shooting range, nearly every shot is successful, resulting in an accepted transition path. To deal with multiple mechanisms, weighted shooting ranges can be used. To cope with the problem of unsuitably placed shooting ranges, we developed an algorithm that iteratively improves the location of the shooting range. The transition path sampling procedure is illustrated for models of diffusive and Langevin dynamics. The method should be particularly useful in cases where the transition paths are long so that only relatively few shots are possible, yet reasonable order parameters are known.
Sun, Xiaogong; Yin, Jinfang; Zhao, Yan
2017-06-01
The inverse of expected error variance is utilized to determine weights of individual ensemble members based on the THORPEX (The Observing System Research and Predictability Experiment) Interactive Grand Global Ensemble (TIGGE) forecast datasets. The weights of all ensemble members are thus calculated for summer 2012, with the NCEP final operational global analysis (FNL) data as the truth. Based on the weights of all ensemble members, the variable weighted ensemble mean (VWEM) of temperature of summer 2013 is derived and compared with that from the simple equally weighted ensemble mean. The results show that VWEM has lower root-mean-square error (RMSE) as well as absolute error, and has improved the temperature prediction accuracy. The improvements are quite notable over the Tibetan Plateau and its surrounding areas; specifically, a relative improvement rate of RMSE of more than 24% in 2-m temperature is demonstrated. Moreover, the improvement rates vary slightly with the prediction lead-time (24-96 h). It is suggested that the VWEM approach be employed in operational ensemble prediction to provide guidance for weather forecasting and climate prediction.
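The weighting rule itself is elementary: member weights proportional to the inverse of each member's expected squared error, normalized to sum to one. An illustrative sketch (array shapes and function names are our assumptions):

```python
import numpy as np

def inverse_variance_weights(errors):
    """Member weights from historical forecast errors.

    errors : (n_members, n_cases) forecast-minus-analysis errors;
    a member with smaller expected squared error gets a larger weight.
    """
    mse = np.mean(np.asarray(errors) ** 2, axis=1)
    w = 1.0 / mse
    return w / w.sum()

def weighted_ensemble_mean(forecasts, w):
    """Contract the member axis: forecasts is (n_members, ...)."""
    return np.tensordot(w, forecasts, axes=1)
```

Setting all weights to 1/n recovers the simple equally weighted ensemble mean, which is the baseline the VWEM improvements are measured against.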
Rowley, Christopher N; Woo, Tom K
2009-12-21
Transition path sampling has been established as a powerful tool for studying the dynamics of rare events. The trajectory generation moves of this Monte Carlo procedure, shooting moves and shifting moves, were developed primarily for rate constant calculations, although the method has been more extensively used to study the dynamics of reactive processes. We have devised and implemented three alternative trajectory generation moves for use with transition path sampling. The centering-shooting move incorporates a shifting move into a shooting move, which centers the transition period in the middle of the trajectory, eliminating the need for shifting moves and generating an ensemble where the transition event consistently occurs near the middle of the trajectory. We have also developed varied-perturbation-size shooting moves, wherein smaller perturbations are made if the shooting point is far from the transition event. The trajectories generated using these moves decorrelate significantly faster than with conventional, constant-sized perturbations. This results in an increase in the statistical efficiency by a factor of 2.5-5 when compared to the conventional shooting algorithm. On the other hand, the new algorithm breaks detailed balance and introduces a small bias in the transition time distribution. We have developed a modification of this varied-perturbation-size shooting algorithm that preserves detailed balance, albeit at the cost of decreased sampling efficiency. Both varied-perturbation-size shooting algorithms are found to have improved sampling efficiency when compared to the original constant-perturbation-size shooting algorithm.
Yang, Pengyi; Yoo, Paul D; Fernando, Juanita; Zhou, Bing B; Zhang, Zili; Zomaya, Albert Y
2014-03-01
Data sampling is a widely used technique in a broad range of machine learning problems. Traditional sampling approaches generally rely on random resampling from a given dataset. However, these approaches do not take into consideration additional information, such as sample quality and usefulness. We recently proposed a data sampling technique, called sample subset optimization (SSO). The SSO technique relies on a cross-validation procedure for identifying and selecting the most useful samples as subsets. In this paper, we describe the application of SSO techniques to imbalanced and ensemble learning problems, respectively. For imbalanced learning, the SSO technique is employed as an under-sampling technique for identifying a subset of highly discriminative samples in the majority class. In ensemble learning, the SSO technique is utilized as a generic ensemble technique where multiple optimized subsets of samples from each class are selected for building an ensemble classifier. We demonstrate the utilities and advantages of the proposed techniques on a variety of bioinformatics applications where class imbalance, small sample size, and noisy data are prevalent.
Fang, Keyan; Wilmking, Martin; Davi, Nicole; Zhou, Feifei; Liu, Changzhi
2014-01-01
Traditional detrending methods assign equal mean value to all tree-ring series for chronology development, even though mean annual growth changes across different time periods. We find that the strength of a tree-ring model can be improved by giving more weight to tree-ring series that have a stronger climate signal and less weight to series that have a weaker signal. We thus present an ensemble weighting method to mitigate these potential biases and to more accurately extract the climate signals in dendroclimatology studies. This new method has been used to develop the first annual precipitation reconstruction (previous August to current July) at Songmingyan Mountain and to recalculate the tree-ring chronology from the Shenge site in the Dulan area of the northeastern Tibetan Plateau (TP), a marginal area of the Asian summer monsoon. The ensemble weighting method explains 31.7% of the instrumental variance for the reconstruction at Songmingyan Mountain and 57.3% of the instrumental variance in the Dulan area, both higher than those obtained using traditional methods. We focus on the newly introduced reconstruction at Songmingyan Mountain, which shows extremely dry (wet) epochs from 1862–1874, 1914–1933 and 1991–1999 (1882–1905). These dry/wet epochs were also found in the marginal areas of the summer monsoon and the Indian subcontinent, indicating linkages between regional hydroclimate changes and the Indian summer monsoon. PMID:24497967
Efficient Unbiased Rendering using Enlightened Local Path Sampling
DEFF Research Database (Denmark)
Kristensen, Anders Wang
, such as the location of the light sources or cameras, or the reflection models at each point. In this work we explore new methods of importance sampling paths. Our idea is to analyze the scene before rendering and compute various statistics that we use to improve importance sampling. The first of these are adjoint...... measurements, which are the solution to the adjoint light transport problem. The second is a representation of the distribution of radiance and importance in the scene. We also derive a new method of particle sampling, which is advantageous compared to existing methods. Together we call the resulting algorithm......
Robust Estimation of Diffusion-Optimized Ensembles for Enhanced Sampling
DEFF Research Database (Denmark)
Tian, Pengfei; Jónsson, Sigurdur Æ.; Ferkinghoff-Borg, Jesper
2014-01-01
The multicanonical, or flat-histogram, method is a common technique to improve the sampling efficiency of molecular simulations. The idea is that free-energy barriers in a simulation can be removed by simulating from a distribution where all values of a reaction coordinate are equally likely...... accurate estimates of the diffusion coefficient. Here, we present a simple, yet robust solution to this problem. Compared to current state-of-the-art procedures, the new estimation method requires an order of magnitude fewer data to obtain reliable estimates, thus broadening the potential scope in which...
Deur, Killian; Fromager, Emmanuel
2016-01-01
Ensemble density functional theory (eDFT) is an exact time-independent alternative to time-dependent DFT (TD-DFT) for the calculation of excitation energies. Despite its formal simplicity and advantages in contrast to TD-DFT (multiple excitations, for example, can be easily taken into account in an ensemble), eDFT is not standard, which is essentially due to the lack of reliable approximate exchange-correlation (xc) functionals for ensembles. Following Burke and coworkers [Phys. Rev. B 93, 245131 (2016)], we propose in this work to construct an exact eDFT for the nontrivial asymmetric Hubbard dimer, thus providing more insight into the weight dependence of the ensemble xc energy in various correlation regimes. For that purpose, an exact analytical expression for the weight-dependent ensemble exchange energy has been derived. The complementary exact ensemble correlation energy has been computed by means of Legendre-Fenchel transforms. Interesting features like discontinuities in the ensemble xc potential in the...
Path Integral Coarse-Graining Replica Exchange Method for Enhanced Sampling.
Peng, Yuxing; Cao, Zhen; Zhou, Ruhong; Voth, Gregory A
2014-09-09
An enhanced conformational space sampling method is developed that utilizes replica exchange molecular dynamics between a set of imaginary time Feynman path integral replicas, each having an increasing degree of contraction (or coarse-graining) of the quasi-particle or "polymer beads" in the evaluation of the isomorphic ring-polymer potential energy terms. However, there is no contraction of beads in the effectively harmonic kinetic energy terms. The final replica in this procedure is the fully contracted one, in which the potential energy is evaluated only at the centroid of the beads (and hence it is the classical distribution in the centroid variable), while the initial replica has the full degree (or even a heightened degree, if desired) of quantum delocalization and tunneling in the physical potential by the polymer necklace beads. The exchange between the different ring-polymer ensembles is governed by the Metropolis criterion to guarantee detailed balance. The method is applied successfully to several model systems, ranging from one-dimensional prototype rough energy landscape models having analytical solutions to the more realistic alanine dipeptide. A detailed comparison with the classical temperature-based replica exchange method shows an improved efficiency of this new method in classical conformational space sampling due to coupling with the fictitious path integral (quantum) replicas.
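The exchange step uses the standard Metropolis criterion for swapping configurations between two ensembles. A generic sketch written for the familiar temperature-based case (the paper applies the analogous criterion between ring-polymer contraction levels):

```python
import numpy as np

def metropolis_exchange(beta_i, beta_j, e_i, e_j, rng):
    """Metropolis acceptance for a replica-exchange swap.

    Accepts the swap with probability
    min(1, exp((beta_i - beta_j) * (e_i - e_j))), which enforces
    detailed balance between the two ensembles.
    """
    delta = (beta_i - beta_j) * (e_i - e_j)
    # log-space comparison: always accept when delta >= 0
    return np.log(rng.random()) < delta
```

Accepting with exactly this probability is what guarantees that each replica continues to sample its own equilibrium distribution after arbitrarily many swaps.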
Rauscher, Sarah; Neale, Chris; Pomès, Régis
2009-10-13
Generalized-ensemble algorithms in temperature space have become popular tools to enhance conformational sampling in biomolecular simulations. A random walk in temperature leads to a corresponding random walk in potential energy, which can be used to cross over energetic barriers and overcome the problem of quasi-nonergodicity. In this paper, we introduce two novel methods: simulated tempering distributed replica sampling (STDR) and virtual replica exchange (VREX). These methods are designed to address the practical issues inherent in the replica exchange (RE), simulated tempering (ST), and serial replica exchange (SREM) algorithms. RE requires a large, dedicated, and homogeneous cluster of CPUs to function efficiently when applied to complex systems. ST and SREM both have the drawback of requiring extensive initial simulations, possibly adaptive, for the calculation of weight factors or potential energy distribution functions. STDR and VREX alleviate the need for lengthy initial simulations, and for synchronization and extensive communication between replicas. Both methods are therefore suitable for distributed or heterogeneous computing platforms. We perform an objective comparison of all five algorithms in terms of both implementation issues and sampling efficiency. We use disordered peptides in explicit water as test systems, for a total simulation time of over 42 μs. Efficiency is defined in terms of both structural convergence and temperature diffusion, and we show that these definitions of efficiency are in fact correlated. Importantly, we find that ST-based methods exhibit faster temperature diffusion and correspondingly faster convergence of structural properties compared to RE-based methods. Within the RE-based methods, VREX is superior to both SREM and RE. On the basis of our observations, we conclude that ST is ideal for simple systems, while STDR is well-suited for complex systems.
The effect of sampling noise in ensemble-based Kalman filters
Sacher, William
Ensemble-based Kalman filters have drawn a lot of attention in the atmospheric and ocean scientific community because of their potential to be used as a data assimilation tool for numerical prediction in a strongly nonlinear context at an affordable cost. However, many studies have noted practical problems in their implementation. Indeed, being Monte Carlo methods, they estimate the useful parameters from a limited-size sample of independent realizations of the process. As a consequence, the unavoidable sampling noise impacts the quality of the analysis. An idealized perfect-model context is considered in which the analytical expression for the analysis accuracy and reliability as a function of the ensemble size is established from a second-order moment perspective. It is proved that one can analytically explain the general tendency of ensemble-based Kalman filters to underestimate, on average, the analysis variance, and therefore the likelihood that these filters diverge. The performance of alternative methods designed to reduce or eliminate sampling error effects, such as the double ensemble Kalman filter or covariance inflation, is also analytically explored. For methods using perturbed observations, it is shown that covariance inflation is the easiest and least expensive method to obtain the most accurate and reliable analysis. These analytical results agree well with means over a large number of experiments using a perfect, low-resolution, quasi-geostrophic barotropic model, in a series of observation system simulation experiments of single analysis cycles as well as in a simulated forecast system. In one-analysis-cycle experiments with rank histograms, non-perturbed-observation methods show a lack of reliability regardless of the number of members. For small ensemble sizes, sampling error effects are dominant but have a smaller impact than in the perturbed-observation method, making non-perturbed-observation method filters much less subject to
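Covariance inflation, identified above as the cheapest remedy for the underestimated analysis variance, can be sketched in a few lines. This is a generic multiplicative-inflation illustration (not the paper's implementation): each member's deviation from the ensemble mean is scaled by a factor λ, which multiplies the sample covariance by λ² while leaving the mean unchanged.

```python
import numpy as np

def inflate_ensemble(ensemble, lam):
    """Multiplicative covariance inflation: scale each member's deviation
    from the ensemble mean by lam. This inflates the sample covariance by
    lam**2 and leaves the ensemble mean unchanged.

    ensemble: array of shape (n_members, n_state)
    lam: inflation factor, typically slightly larger than 1
    """
    mean = ensemble.mean(axis=0)
    return mean + lam * (ensemble - mean)
```

In practice λ is tuned (or estimated adaptively) so that the inflated spread matches the filter's actual error statistics.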
Evaluation of effective factors on low birth weight neonates' mortality using path analysis
Directory of Open Access Journals (Sweden)
Babaee Gh
2008-06-01
Full Text Available Background: This study was conducted to determine the direct and indirect factors affecting the mortality of neonates with low birth weight, using path analysis. Methods: In this cohort study, 445 pairs of mothers and their neonates in Tehran participated. The data were gathered through a record sheet containing mother's age, gestational age, Apgar score, pregnancy-induced hypertension (PIH), and birth weight. Sampling was by convenience, and the study included neonates of women referred to 15 government and private hospitals in Tehran. Survival status of the neonates was determined up to 24 hours after delivery. Results: The largest change in mortality rate is related to birth weight; its negative coefficient means that increasing weight increases the chance of survival. The second-largest coefficient is related to Apgar score; its negative sign means that a higher Apgar score decreases the chance of neonatal death. The third is gestational age; its negative coefficient means that increasing gestational age increases the chance of survival. The smallest change in mortality rate is due to hypertensive disorders in pregnancy. Conclusion: The methodology used here could be adopted in other investigations to distinguish and measure the effects of predictive factors on the risk of an outcome.
The MEXSAS2 Sample and the Ensemble X-ray Variability of Quasars
Directory of Open Access Journals (Sweden)
Roberto Serafinelli
2017-10-01
Full Text Available We present the second Multi-Epoch X-ray Serendipitous AGN Sample (MEXSAS2), extracted from the 6th release of the XMM Serendipitous Source Catalog (XMMSSC-DR6), cross-matched with the Sloan Digital Sky Survey quasar catalogs DR7Q and DR12Q. Our sample also includes the available measurements of masses, bolometric luminosities, and Eddington ratios. Analyses of the ensemble structure function and of the spectral variability are presented, together with their dependences on such parameters. We confirm a decrease of the structure function with X-ray luminosity and find a weak dependence on black hole mass. We introduce a new spectral variability estimator, taking errors on both fluxes and spectral indices into account. We confirm an ensemble softer-when-brighter trend, with no dependence of this estimator on black hole mass, Eddington ratio, redshift, X-ray luminosity, or bolometric luminosity.
Predictor-weighting strategies for probabilistic wind power forecasting with an analog ensemble
Directory of Open Access Journals (Sweden)
Constantin Junk
2015-04-01
Full Text Available Unlike deterministic forecasts, probabilistic predictions provide estimates of uncertainty, which is an additional value for decision-making. Previous studies have proposed the analog ensemble (AnEn), a technique to generate uncertainty information from a purely deterministic forecast. The objective of this study is to improve the AnEn performance for wind power forecasts by developing static and dynamic weighting strategies, which optimize the predictor combination with a brute-force continuous ranked probability score (CRPS) minimization and a principal component analysis (PCA) of the predictors. Predictors are taken from the high-resolution deterministic forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF), including forecasts of wind at several heights, geopotential height, pressure, and temperature, among others. The weighting strategies are compared at five wind farms in Europe and the U.S. situated in regions with different terrain complexity, both onshore and offshore, and significantly improve the deterministic and probabilistic AnEn forecast performance compared to the AnEn with 10‑m wind speed and direction as predictors and compared to PCA-based approaches. The AnEn methodology also provides a reliable estimation of the forecast uncertainty. The optimized predictor combinations are strongly dependent on terrain complexity, local wind regimes, and atmospheric stratification. Since the proposed predictor-weighting strategies accomplish both the selection of relevant predictors and the finding of their optimal weights, the AnEn performance is improved by up to 20% at onshore and offshore sites.
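The CRPS minimized by the weighting strategies above can be estimated directly from an ensemble and an observation. A standard sample estimator (illustrative, not the authors' code) is the mean absolute member error minus half the mean absolute pairwise member spread; lower is better.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Sample estimator of the continuous ranked probability score for a
    scalar ensemble forecast:
        CRPS ~ mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|
    where x_i are ensemble members and y is the verifying observation."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2
```

Averaged over many forecast-observation pairs, this score is what a brute-force predictor-weight search would minimize.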
Iachimciuc, Igor
The dissertation is in two parts, a theoretical study and a musical composition. In Part I the music of György Kurtág is analyzed from the point of view of sound color. A brief description of what is understood by the term sound color, and various ways of achieving specific coloristic effects, are presented in the Introduction. An examination of Kurtág's approaches to the domain of sound color occupies the chapters that follow. The musical examples that are analyzed are selected from Kurtág's different compositional periods, showing a certain consistency in sound color techniques, the most important of which are already present in the String Quartet, Op. 1. The compositions selected for analysis are written for different ensembles, but regardless of the instrumentation, certain principles of the formation and organization of sound color remain the same. Rather than relying on extended instrumental techniques, Kurtág creates a large variety of sound colors using traditional means such as pitch material, register, density, rhythm, timbral combinations, dynamics, texture, spatial displacement of the instruments, and the overall musical context. Each sound color unit in Kurtág's music is a separate entity, conceived as a complete microcosm. Sound color units can either be juxtaposed as contrasting elements, forming sound color variations, or superimposed, often resulting in a Klangfarbenmelodie effect. Some of the same gestural figures (objets trouvés) appear in different compositions, but with significant coloristic modifications. Thus, the principle of sound color variations is not only a strong organizational tool, but also a characteristic stylistic feature of the music of György Kurtág. Part II, Leopard's Path (2010), for flute, clarinet, violin, cello, cimbalom, and piano, is an original composition inspired by the painting of Jesse Allen, a San Francisco-based artist. The composition is conceived as a cycle of thirteen short movements. Ten of these movements are
Imaouchen, Yacine; Kedadouche, Mourad; Alkama, Rezak; Thomas, Marc
2017-01-01
Signal processing techniques for non-stationary and noisy signals have recently attracted considerable attention. Among them is the empirical mode decomposition (EMD), an adaptive and efficient method for decomposing signals from high to low frequencies into intrinsic mode functions (IMFs). Ensemble EMD (EEMD) was proposed to overcome the mode mixing problem of the EMD. In the present paper, the Complementary EEMD (CEEMD) is used for bearing fault detection. As a noise-improved method, the CEEMD not only overcomes the mode mixing, but also eliminates the residual of added white noise persisting in the IMFs and enhances the computational efficiency of the EEMD method. Afterward, a selection method is developed to choose the relevant IMFs containing information about defects. Subsequently, a signal is reconstructed from the sum of the relevant IMFs, and a Frequency-Weighted Energy Operator is tailored to extract both the amplitude and frequency modulations from the selected IMFs. This operator outperforms the conventional energy operator and the enveloping methods, especially in the presence of strong noise and multiple vibration interferences. Furthermore, simulation and experimental results show that the proposed method improves performance in detecting bearing faults. The method also has high computational efficiency and is able to detect faults at an early stage of degradation.
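The energy-operator demodulation mentioned above builds on the discrete Teager-Kaiser energy operator (the frequency-weighted variant used in the paper adds further processing). A minimal sketch: for a pure cosine A·cos(Ωn + φ), the operator returns the constant A²·sin²Ω exactly, which is why it tracks both amplitude and frequency modulation.

```python
import numpy as np

def teager_kaiser(x):
    """Discrete Teager-Kaiser energy operator,
        psi[n] = x[n]**2 - x[n-1] * x[n+1],
    evaluated at the interior samples of x (output is 2 shorter)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

Applied to the reconstructed signal (the sum of the selected IMFs), transient impacts from a bearing defect show up as sharp peaks in this energy track.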
Computing Sampling Weights in Large-scale Assessments in Education
Meinck, Sabine
2015-01-01
Sampling weights are a reflection of sampling design; they allow us to draw valid conclusions about population features from sample data. This paper explains the fundamentals of computing sampling weights for large-scale assessments in educational research. The relationship between the nature of complex samples and best practices in developing a set of weights to enable computation of unbiased population estimates is described. Effects of sampling weights on estimates are shown...
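In the simplest two-stage design described by such assessments (e.g., schools sampled first, then students within schools), the base weight of a sampled unit is the inverse of its overall inclusion probability. A toy sketch, with all names and numbers illustrative rather than taken from the paper:

```python
def design_weight(p_stage1, p_stage2):
    """Base sampling weight of a unit in a two-stage design: the inverse
    of its overall inclusion probability, i.e. the number of population
    units this sampled unit represents."""
    return 1.0 / (p_stage1 * p_stage2)

def weighted_mean(values, weights):
    """Design-weighted estimate of a population mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

For example, a student whose school was selected with probability 0.1 and who was then selected within the school with probability 0.5 carries a base weight of 20, i.e. represents 20 students in the population; population estimates are then computed with these weights rather than unweighted means.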
Zilly, Julian; Buhmann, Joachim M; Mahapatra, Dwarikanath
2017-01-01
We present a novel method to segment retinal images using ensemble learning based convolutional neural network (CNN) architectures. An entropy sampling technique is used to select informative points thus reducing computational complexity while performing superior to uniform sampling. The sampled points are used to design a novel learning framework for convolutional filters based on boosting. Filters are learned in several layers with the output of previous layers serving as the input to the next layer. A softmax logistic classifier is subsequently trained on the output of all learned filters and applied on test images. The output of the classifier is subject to an unsupervised graph cut algorithm followed by a convex hull transformation to obtain the final segmentation. Our proposed algorithm for optic cup and disc segmentation outperforms existing methods on the public DRISHTI-GS data set on several metrics. Copyright © 2016 Elsevier Ltd. All rights reserved.
Multilevel ensemble Kalman filtering
Hoel, Haakon
2016-01-08
The ensemble Kalman filter (EnKF) is a sequential filtering method that uses an ensemble of particle paths to estimate the means and covariances required by the Kalman filter by the use of sample moments, i.e., the Monte Carlo method. EnKF is often both robust and efficient, but its performance may suffer in settings where the computational cost of accurate simulations of particles is high. The multilevel Monte Carlo method (MLMC) is an extension of classical Monte Carlo methods which by sampling stochastic realizations on a hierarchy of resolutions may reduce the computational cost of moment approximations by orders of magnitude. In this work we have combined the ideas of MLMC and EnKF to construct the multilevel ensemble Kalman filter (MLEnKF) for the setting of finite dimensional state and observation spaces. The main idea of this method is to compute particle paths on a hierarchy of resolutions and to apply multilevel estimators on the ensemble hierarchy of particles to compute Kalman filter means and covariances. Theoretical results and a numerical study of the performance gains of MLEnKF over EnKF will be presented. Some ideas on the extension of MLEnKF to settings with infinite dimensional state spaces will also be presented.
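The MLMC telescoping idea underlying MLEnKF can be sketched generically. This is plain multilevel Monte Carlo for a scalar quantity, not the full filter: the expectation of the finest-level approximation is written as E[P_0] plus level corrections E[P_l − P_{l−1}], each estimated from coupled samples on its own level.

```python
import random

def mlmc_estimate(sampler, levels, samples_per_level, rng=random):
    """Multilevel Monte Carlo telescoping estimator:
        E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}].
    `sampler(l, u)` evaluates the level-l approximation on the random
    input u; for l > 0 the same u drives the fine and coarse levels, so
    the correction has small variance and needs few samples."""
    total = 0.0
    for l in range(levels + 1):
        n = samples_per_level[l]
        acc = 0.0
        for _ in range(n):
            u = rng.random()
            fine = sampler(l, u)
            coarse = sampler(l - 1, u) if l > 0 else 0.0
            acc += fine - coarse
        total += acc / n
    return total
```

The cost saving comes from spending most samples on the cheap coarse level while the expensive fine levels only estimate small corrections; MLEnKF applies the same decomposition to the sample means and covariances inside the Kalman update.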
Performance Assessment of Multi-Source Weighted-Ensemble Precipitation (MSWEP Product over India
Directory of Open Access Journals (Sweden)
Akhilesh S. Nair
2017-01-01
Full Text Available Error characterization is vital for the advancement of precipitation algorithms, the evaluation of numerical model outputs, and their integration in various hydro-meteorological applications. The Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) has been a benchmark for successive Global Precipitation Measurement (GPM)-based products and has given rise to many multi-satellite precipitation products. This study evaluates the performance of the newly released Multi-Source Weighted-Ensemble Precipitation (MSWEP) product, whose temporal variability was determined based on several data products including TMPA 3B42 RT. The evaluation was conducted over India with respect to IMD gauge-based rainfall for the pre-monsoon, monsoon, and post-monsoon seasons at daily scale over a 35-year (1979–2013) period. The rainfall climatology is examined over India as a whole and over four geographical extents within India known to receive uniform rainfall. A performance evaluation of the rainfall time series was carried out. In addition, the performance of the product over different rainfall classes was evaluated, along with the contribution of each class to the total rainfall. Further, the seasonal evaluation of the MSWEP product was based on categorical and volumetric indices from the contingency table. The evaluation showed that the MSWEP product has large errors in detecting the higher quantiles of rainfall (>75th and >95th quantiles). The MSWEP precipitation product, available at a 0.25° × 0.25° spatial resolution and daily temporal resolution, matched well with the daily IMD rainfall over India. Overall, the results suggest that a suitable region- and season-dependent bias correction is essential before its integration in hydrological applications. While the MSWEP was observed to perform well for daily rainfall, it suffered from poor detection capabilities for higher quantiles, making
Probability-weighted ensembles of U.S. county-level climate projections for climate risk analysis
Rasmussen, D J; Kopp, Robert E
2015-01-01
Quantitative assessment of climate change risk requires a method for constructing probabilistic time series of changes in physical climate parameters. Here, we develop two such methods, Surrogate/Model Mixed Ensemble (SMME) and Monte Carlo Pattern/Residual (MCPR), and apply them to construct joint probability density functions (PDFs) of temperature and precipitation change over the 21st century for every county in the United States. Both methods produce likely (67% probability) temperature and precipitation projections consistent with the Intergovernmental Panel on Climate Change's interpretation of an equal-weighted Coupled Model Intercomparison Project 5 (CMIP5) ensemble, but also provide full PDFs that include tail estimates. For example, both methods indicate that, under representative concentration pathway (RCP) 8.5, there is a 5% chance that the contiguous United States could warm by at least 8 °C. Variance decomposition of SMME and MCPR projections indicates that background variability dominates...
Ren, Fulong; Cao, Peng; Li, Wei; Zhao, Dazhe; Zaiane, Osmar
2017-01-01
Diabetic retinopathy (DR) is a progressive disease, and its detection at an early stage is crucial for saving a patient's vision. An automated screening system for DR can help reduce the chance of complete blindness due to DR while lowering the workload on ophthalmologists. Among the earliest signs of DR are microaneurysms (MAs). However, current schemes for MA detection tend to report many false positives because the detection algorithms have high sensitivity; inevitably, some non-MA structures are labeled as MAs in the initial MA-identification step. This is a typical "class imbalance problem". Class-imbalanced data have detrimental effects on the performance of conventional classifiers. In this work, we propose an ensemble-based adaptive over-sampling algorithm for overcoming the class imbalance problem in false positive reduction, and we use Boosting, Bagging, and Random Subspace as the ensemble frameworks to improve microaneurysm detection. The ensemble-based over-sampling methods we propose combine the strengths of adaptive over-sampling and ensembles. The objective of the amalgamation of ensembles and adaptive over-sampling is to reduce the induction biases introduced by imbalanced data and to enhance the generalization classification performance of extreme learning machines (ELM). Experimental results show that our ASOBoost method has higher area under the ROC curve (AUC) and G-mean values than many existing class imbalance learning methods. Copyright © 2016 Elsevier Ltd. All rights reserved.
Calderon, Christopher P
2007-02-28
We use a constant velocity steered molecular dynamics (SMD) simulation of the stretching of deca-alanine in vacuum to demonstrate a technique that can be used to create a surrogate processes approximation (SPA) using the time series that come out of SMD simulations. In this article, the surrogate processes are constructed by first estimating a sequence of local parametric diffusion models along a SMD trajectory and then a single global model is constructed by piecing the local models together through smoothing splines (estimation is made computationally feasible by likelihood function approximations). The SPAs are then "bootstrapped" in order to obtain a plausible range of work values associated with a particular SMD realization. This information is then used to assist in estimating a potential of mean force constructed by appealing to the Jarzynski equality. When this procedure is repeated for a small number of SMD paths, it is shown that the global models appear to come from a single family of closely related diffusion processes. Possible techniques for exploiting this observation are also briefly discussed. The findings of this paper have potential relevance to computationally expensive computer simulations and experimental works involving optical tweezers where it is difficult to collect a large number of samples, but possible to sample accurately and frequently in time.
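The Jarzynski equality invoked above recovers an equilibrium free energy difference from nonequilibrium work values, ΔF = −kT·ln⟨exp(−W/kT)⟩. A minimal, numerically stabilized estimator (illustrative only; the paper's surrogate-process machinery sits on top of this):

```python
import math

def jarzynski_free_energy(work_values, kT=1.0):
    """Jarzynski estimator DeltaF = -kT * ln(<exp(-W/kT)>), averaged over
    nonequilibrium work values W from repeated pulls. Shifting by the
    minimum work avoids overflow/underflow in the exponentials."""
    shift = min(work_values)
    avg = sum(math.exp(-(w - shift) / kT)
              for w in work_values) / len(work_values)
    return shift - kT * math.log(avg)
```

By Jensen's inequality the estimate never exceeds the mean work; in practice the exponential average is dominated by rare low-work trajectories, which is exactly why the paper bootstraps surrogate processes to assess the spread of plausible work values.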
Improved sub-seasonal meteorological forecast skill using weighted multi-model ensemble simulations
Wanders, Niko; Wood, Eric F.
2016-01-01
Sub-seasonal to seasonal weather and hydrological forecasts have the potential to provide vital information for a variety of water-related decision makers. Here, we investigate the skill of four sub-seasonal forecast models from phase-2 of the North American Multi-Model Ensemble using reforecasts
Monte Carlo sampling for stochastic weight functions.
Frenkel, Daan; Schrenk, K Julian; Martiniani, Stefano
2017-07-03
Conventional Monte Carlo simulations are stochastic in the sense that the acceptance of a trial move is decided by comparing a computed acceptance probability with a random number, uniformly distributed between 0 and 1. Here, we consider the case that the weight determining the acceptance probability itself is fluctuating. This situation is common in many numerical studies. We show that it is possible to construct a rigorous Monte Carlo algorithm that visits points in state space with a probability proportional to their average weight. The same approach may have applications for certain classes of high-throughput experiments and the analysis of noisy datasets.
Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi
2014-12-08
Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine whether a given individual has already appeared over the camera network. Individual recognition often relies on faces and requires a large number of samples during the training phase. This is difficult to fulfill due to the limitations of the camera hardware system and the unconstrained image-capturing conditions. Conventional face recognition algorithms often encounter the "small sample size" (SSS) problem, arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still leave two questions open: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic-learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0-1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for the ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art system.
Life course path analysis of birth weight, childhood growth, and adult systolic blood pressure
DEFF Research Database (Denmark)
Gamborg, Michael; Andersen, Per Kragh; Baker, Jennifer L
2009-01-01
The inverse associations between birth weight and later adverse health outcomes and the positive associations between adult body size and poor health imply that increases in relative body size between birth and adulthood may be undesirable. In this paper, the authors describe life course path ... regression methods. Path analysis produced easily interpretable results, and compared with standard regression methods it produced a noteworthy gain in statistical power. The effect of change in relative body size on adult blood pressure was more pronounced after age 11 years than in earlier childhood. These results suggest that increases in body size prior to age 11 years are less harmful to adult blood pressure than increases occurring after this age.
Schwichtenberg, Fabian; Callies, Ulrich; Groll, Nikolaus; Maßmann, Silvia
2017-04-01
Oil dispersed in the water column remains sheltered from wind forcing, so that an altered drift path is a key consequence of using chemical dispersants. In this study, ensemble simulations were conducted based on 7 years of simulated atmospheric and marine conditions, evaluating 2,190 hypothetical spills from each of 636 cells of a regular grid covering the inner German Bight (SE North Sea). Each simulation compares two idealized setups assuming either undispersed or fully dispersed oil. Differences are summarized in a spatial map of probabilities that chemical dispersant applications would help prevent oil pollution from entering intertidal coastal areas of the Wadden Sea. High probabilities of success overlap strongly with coastal regions between 10 m and 20 m water depth, where the use of chemical dispersants for oil spill response is a particularly contentious topic. The present study prepares the ground for a more detailed net environmental benefit analysis (NEBA) accounting also for toxic effects.
Influence of sampling strategy on detecting preferential flow paths in water-repellent sand
Ritsema, C.J.; Dekker, L.W.
1996-01-01
A sample spacing up to 22 cm over a distance of several metres is just sufficient to collect information about preferential flow paths in a water-repellent sandy soil. When larger sample spacings were used, the water content distributions became more horizontally stratified. Increasing the sample
Computing the Stretch Factor of Paths, Trees, and Cycles in Weighted Fixed Orientation Metrics
DEFF Research Database (Denmark)
Wulff-Nilsen, Christian
2008-01-01
Let G be a graph embedded in the L_1-plane. The stretch factor of G is the maximum over all pairs of distinct vertices p and q of G of the ratio L_1^G(p,q)/L_1(p,q), where L_1^G(p,q) is the L_1-distance in G between p and q. We show how to compute the stretch factor of an n-vertex path in O(n*(log n)^2) worst-case time and O(n) space, and we mention generalizations to trees and cycles, to general weighted fixed orientation metrics, and to higher dimensions.
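For intuition, the stretch factor of a path can be computed by a brute-force O(n²) check over all vertex pairs; the paper's contribution is achieving O(n·(log n)²), which this sketch does not attempt.

```python
def l1(p, q):
    """L1 (Manhattan) distance between two points in the plane."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def path_stretch_factor(vertices):
    """Brute-force stretch factor of a polygonal path embedded in the
    L1-plane: the maximum over vertex pairs of (L1 length along the
    path) / (direct L1 distance). Vertices are assumed distinct."""
    n = len(vertices)
    # prefix[i] = L1 length of the path from vertices[0] to vertices[i]
    prefix = [0.0]
    for i in range(1, n):
        prefix.append(prefix[-1] + l1(vertices[i - 1], vertices[i]))
    best = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            best = max(best,
                       (prefix[j] - prefix[i]) / l1(vertices[i], vertices[j]))
    return best
```

For example, the path (0,0) → (2,0) → (1,0) doubles back on itself, so the pair ((0,0), (1,0)) is reached by a path of L1 length 3 but lies at direct distance 1, giving stretch factor 3.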
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech-based data in the classification of Parkinson's disease (PD) has been shown to provide an effective, non-invasive mode of classification in recent years. Thus, there has been an increased interest in speech pattern analysis methods applicable to parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classifications is to reduce noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effective, the ability to invoke instance selection has seldom been examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied to select optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is trained on the selected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. The proposed method was examined using recently deposited public datasets and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the largest improvement in classification accuracy (29.44%) among the algorithms examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method can improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
Van Wart, Adam T; Durrant, Jacob; Votapka, Lane; Amaro, Rommie E
2014-02-11
Allostery can occur by way of subtle cooperation among protein residues (e.g., amino acids) even in the absence of large conformational shifts. Dynamical network analysis has been used to model this cooperation, helping to computationally explain how binding to an allosteric site can impact the behavior of a primary site many ångstroms away. Traditionally, computational efforts have focused on the most optimal path of correlated motions leading from the allosteric to the primary active site. We present a program called Weighted Implementation of Suboptimal Paths (WISP) capable of rapidly identifying additional suboptimal pathways that may also play important roles in the transmission of allosteric signals. Aside from providing signal redundancy, suboptimal paths traverse residues that, if disrupted through pharmacological or mutational means, could modulate the allosteric regulation of important drug targets. To demonstrate the utility of our program, we present a case study describing the allostery of HisH-HisF, an amidotransferase from Thermotoga maritima. WISP and its VMD-based graphical user interface (GUI) can be downloaded from http://nbcr.ucsd.edu/wisp.
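The optimal path that such analyses start from is a shortest path in a weighted residue-interaction network. A generic Dijkstra sketch (not WISP itself, which additionally enumerates suboptimal paths): in this style of analysis, edge weights are typically derived from correlations, e.g. −log|correlation|, so strongly correlated residue pairs become short edges.

```python
import heapq

def dijkstra_path(graph, source, target):
    """Shortest path in a weighted network given as
    {node: {neighbor: weight, ...}} with nonnegative weights.
    Assumes target is reachable from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return path[::-1], dist[target]
```

Suboptimal-path enumeration then relaxes the optimality requirement, keeping every path whose total weight lies within a user-chosen cutoff of this optimum.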
Muzdalo, Anja; Saalfrank, Peter; Vreede, Jocelyne; Santer, Mark
2018-02-21
Azobenzene-based molecular photoswitches are becoming increasingly important for the development of photoresponsive, functional soft-matter material systems. Upon illumination with light, fast interconversion between a more stable trans- and a metastable cis configuration can be established resulting in pronounced changes in conformation, dipole moment or hydrophobicity. A rational design of functional photosensitive molecules with embedded azo moieties requires a thorough understanding of isomerization mechanisms and rates, especially the thermally activated relaxation. For small azo derivatives considered in the gas phase or simple solvents, Eyring's classical Transition State Theory (TST) approach yields useful predictions for trends in activation energies or corresponding half-life times of the cis isomer. However, TST or improved theories cannot easily be applied when the azo moiety is part of a larger molecular complex or embedded into a heterogeneous environment, where a multitude of possible reaction pathways may exist. In these cases, only the sampling of an ensemble of dynamic reactive trajectories (transition path sampling, TPS) with explicit models of the environment may reveal the nature of the processes involved. In the present work we show how a TPS approach can conveniently be implemented for the phenomenon of relaxation-isomerization of azobenzenes starting with the simple examples of pure azobenzene and a push-pull derivative immersed in a polar (DMSO) and apolar (toluene) solvent. The latter are represented explicitly at a molecular mechanical (MM) and the azo moiety at a quantum mechanical (QM) level. We demonstrate for the push-pull azobenzene that path sampling in combination with the chosen QM/MM scheme produces the expected change in isomerization pathway from inversion to rotation in going from a low- to a high-permittivity (explicit) solvent model. We discuss the potential of the simulation procedure presented for comparative calculation of
A one-way shooting algorithm for transition path sampling of asymmetric barriers
Brotzakis, Z.F.; Bolhuis, P.G.
2016-01-01
We present a novel transition path sampling shooting algorithm for the efficient sampling of complex (biomolecular) activated processes with asymmetric free energy barriers. The method employs a fictitious potential that biases the shooting point toward the transition state. The method is similar in
Fernandes, Adji Achmad Rinaldo; Solimun, Arisoesilaningsih, Endang
2017-12-01
The aim of this research is to estimate the spline in path analysis-based nonparametric regression using a Penalized Weighted Least Squares (PWLS) approach. The approach used is a Reproducing Kernel Hilbert Space on a Sobolev space. The nonparametric path analysis model is

$y_{1i} = f_{1.1}(x_{1i}) + \varepsilon_{1i}$;  $y_{2i} = f_{1.2}(x_{1i}) + f_{2.2}(y_{1i}) + \varepsilon_{2i}$;  $i = 1, 2, \ldots, n$.

The nonparametric path analysis estimate that minimizes the PWLS criterion

$\min_{f_{w.k} \in W_2^m[a_{w.k}, b_{w.k}],\, k = 1, 2} \left\{ (2n)^{-1} (\tilde{y} - \tilde{f})^{T} \Sigma^{-1} (\tilde{y} - \tilde{f}) + \sum_{k=1}^{2} \sum_{w=1}^{2} \lambda_{w.k} \int_{a_{w.k}}^{b_{w.k}} \left[ f_{w.k}^{(m)}(x_i) \right]^{2} dx_i \right\}$

is $\hat{\tilde{f}} = A\tilde{y}$, with

$A = T_1 (T_1^{T} U_1^{-1} \Sigma^{-1} T_1)^{-1} T_1^{T} U_1^{-1} \Sigma^{-1} + V_1 U_1^{-1} \Sigma^{-1} \left[ I - T_1 (T_1^{T} U_1^{-1} \Sigma^{-1} T_1)^{-1} T_1^{T} U_1^{-1} \Sigma^{-1} \right] + T_2 (T_2^{T} U_2^{-1} \Sigma^{-1} T_2)^{-1} T_2^{T} U_2^{-1} \Sigma^{-1} + V_2 U_2^{-1} \Sigma^{-1} \left[ I - T_2 (T_2^{T} U_2^{-1} \Sigma^{-1} T_2)^{-1} T_2^{T} U_2^{-1} \Sigma^{-1} \right]$.
Yin, Dong-shan; Gao, Yu-ping; Zhao, Shu-hong
2017-07-01
Millisecond pulsars can generate a time scale that is completely independent of the atomic time scale, because the physical mechanisms behind the pulsar time scale and the atomic time scale are quite different. Pulsar timing observations are usually not evenly sampled, with intervals between data points ranging from several hours to more than half a month; furthermore, these data sets are sparse. All this makes it difficult to generate an ensemble pulsar time scale. Hence, a new algorithm to calculate the ensemble pulsar time scale is proposed. First, cubic spline interpolation is used to densify the data set and make the intervals between data points uniform. Then, the Vondrak filter is employed to smooth the data set and remove high-frequency noise, and finally the weighted average method is adopted to generate the ensemble pulsar time scale. The newly released NANOGrav (North American Nanohertz Observatory for Gravitational Waves) 9-year data set is used to generate the ensemble pulsar time scale. This data set includes 9 years of observations of 37 millisecond pulsars made with the 100-meter Green Bank telescope and the 305-meter Arecibo telescope. It is found that the algorithm used in this paper effectively reduces the influence of noise in the pulsar timing residuals and improves the long-term stability of the ensemble pulsar time scale. Results indicate that the long-term (> 1 yr) stability of the ensemble pulsar time scale is better than 3.4 × 10⁻¹⁵.
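The three-step pipeline described above (spline densification, low-pass filtering, weighted averaging) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the Vondrak filter is approximated by a simple moving average, the pulsar weights are taken as inverse-variance, and the two toy "pulsars" and all parameters are invented for the demonstration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def smooth(r, window=5):
    """Moving-average stand-in for the Vondrak low-pass filter."""
    return np.convolve(r, np.ones(window) / window, mode="same")

def ensemble_average(residuals):
    """Inverse-variance weighted average over pulsars (one per row)."""
    w = 1.0 / np.var(residuals, axis=1, keepdims=True)
    return (w * residuals).sum(axis=0) / w.sum()

# two toy 'pulsars' with uneven sampling over ~100 days
rng = np.random.default_rng(0)
t1 = np.sort(rng.uniform(0, 100, 40))
t2 = np.sort(rng.uniform(0, 100, 60))
r1 = 1e-6 * np.sin(t1 / 20) + 1e-7 * rng.standard_normal(40)
r2 = 1e-6 * np.sin(t2 / 20) + 2e-7 * rng.standard_normal(60)

# step 1: cubic-spline interpolation onto a common uniform grid
grid = np.arange(max(t1[0], t2[0]), min(t1[-1], t2[-1]), 1.0)
d1, d2 = CubicSpline(t1, r1)(grid), CubicSpline(t2, r2)(grid)

# steps 2-3: filter each series, then form the weighted ensemble
s1, s2 = smooth(d1), smooth(d2)
ens = ensemble_average(np.vstack([s1, s2]))
```

Because the weights are positive and normalized, the ensemble series lies pointwise between the two input series, and the noisier pulsar contributes less.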
MODEL SELECTION OF ENSEMBLE FORECASTING USING WEIGHTED SIMILARITY OF TIME SERIES
Directory of Open Access Journals (Sweden)
Agus Widodo
2012-07-01
Full Text Available Several methods have been proposed to combine forecasting results into a single forecast, namely simple averaging, weighted averaging based on validation performance, and non-parametric combination schemes. These methods use a fixed combination of individual forecasts to obtain the final forecast. In this paper, a quite different approach is employed to select the forecasting methods, in which each point to forecast is computed using the methods that performed best on similar training data. Thus, the selected methods may differ at each forecast point. The similarity measures used to compare the time series for testing and validation are the Euclidean distance and Dynamic Time Warping (DTW), where each compared point is weighted according to its recentness. The data sets used in the experiment are the time series designated for the NN3 Competition and time series generated from the frequency of USPTO's patents and PubMed's scientific publications in the field of health, namely on apnea, arrhythmia, and sleep stages. The experimental results show that the weighted combination of methods selected based on the similarity between training and testing data may perform better than either the unweighted combination of methods selected based on the similarity measure or the fixed combination of the best individual forecasts.
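A minimal sketch of the similarity computation: a dynamic-time-warping distance in which the cost of matching each point of the test series is scaled by a recentness weight, so that recent behavior dominates the choice of forecasting method. The decay factor, the series, and the candidate names are illustrative, not taken from the paper.

```python
import numpy as np

def recentness_weights(n, decay=0.9):
    """Weights that grow toward the most recent point (index n-1)."""
    return decay ** np.arange(n - 1, -1, -1)

def weighted_dtw(a, b, w):
    """DTW distance where matching a[i] costs w[i] * |a[i] - b[j]|."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = w[i - 1] * abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# pick the candidate series most similar to the test window
a = np.array([1.0, 2.0, 3.0, 2.0])
candidates = {"m1": np.array([1.0, 2.0, 3.0, 2.0]),
              "m2": np.array([5.0, 5.0, 5.0, 5.0])}
w = recentness_weights(len(a))
best = min(candidates, key=lambda k: weighted_dtw(a, candidates[k], w))
```

In the selection scheme described above, the model that performed best on the winning candidate's training window would then be used for this forecast point.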
Directory of Open Access Journals (Sweden)
Indah Anita Sari
2013-12-01
Full Text Available Path coefficient analysis is frequently used to develop selection criteria for various types of plants. Path analysis in this research was conducted to find selection criteria among yield components that directly affect bean weight. In addition to the path coefficients, the genetic variation coefficient, heritability, and genetic progress were also studied. The study was conducted at the Indonesian Coffee and Cocoa Research Institute, using a randomized complete block design with 14 accession numbers and three replications. Pod girth, pod length, pod weight, wet bean weight per pod, number of normal beans per pod, number of abnormal beans per pod, dry weight per normal bean, and shell content were observed. The results showed that the pod weight character had an important role in determining the dry weight of normal beans. The character had a high and significant positive genotype correlation coefficient (r = 0.46) with dry weight per normal bean, a considerable direct influence (P = 0.479), a moderate genotype variation coefficient (9.6%), and high genetic progress (95.23). The character of wet bean weight per pod could also be used indirectly as a selection criterion for dry weight per normal bean, based on its genetic variation coefficient (11.88%), genetic progress (82.48), and positive direct effect on dry weight per normal bean (P = 0.006). Key words: selection criteria, dry weight per bean, path analysis, Theobroma cacao L.
The enzymatic reaction catalyzed by lactate dehydrogenase exhibits one dominant reaction path
Masterson, Jean E.; Schwartz, Steven D.
2014-10-01
Enzymes are the most efficient chemical catalysts known, but the exact nature of chemical barrier crossing in enzymes is not fully understood. Application of transition state theory to enzymatic reactions indicates that the rates of all possible reaction paths, weighted by their relative probabilities, must be considered in order to achieve an accurate calculation of the overall rate. Previous studies in our group have shown a single mechanism for enzymatic barrier passage in human heart lactate dehydrogenase (LDH). To ensure that this result was not due to our methodology insufficiently sampling reactive phase space, we implement high-perturbation transition path sampling in both microcanonical and canonical regimes for the reaction catalyzed by human heart LDH. We find that, although multiple, distinct paths through reactive phase space are possible for this enzymatic reaction, one specific reaction path is dominant. Since the frequency of these paths in a canonical ensemble is inversely proportional to the free energy barriers separating them from other regions of phase space, we conclude that the rarer reaction paths are likely to have a negligible contribution. Furthermore, the non-dominant reaction paths correspond to altered reactive conformations and only occur after multiple steps of high perturbation, suggesting that these paths may be the result of non-biologically significant changes to the structure of the enzymatic active site.
Lu, Jianfeng; Zhou, Zhennan
2017-04-21
In this work, a novel ring polymer representation for a multi-level quantum system is proposed for thermal average calculations. The proposed representation keeps the discreteness of the electronic states: besides position and momentum, each bead in the ring polymer is also characterized by a surface index indicating the electronic energy surface. A path integral molecular dynamics with surface hopping (PIMD-SH) dynamics is also developed to sample the equilibrium distribution of the ring polymer configurational space. The PIMD-SH sampling method is validated theoretically and by numerical examples.
Weighted statistical parameters for irregularly sampled time series
Rimoldini, Lorenzo
2014-01-01
Unevenly spaced time series are common in astronomy because of the day-night cycle, weather conditions, dependence on the source position in the sky, allocated telescope time and corrupt measurements, for example, or inherent to the scanning law of satellites like Hipparcos and the forthcoming Gaia. Irregular sampling often causes clumps of measurements and gaps with no data which can severely disrupt the values of estimators. This paper aims at improving the accuracy of common statistical parameters when linear interpolation (in time or phase) can be considered an acceptable approximation of a deterministic signal. A pragmatic solution is formulated in terms of a simple weighting scheme, adapting to the sampling density and noise level, applicable to large data volumes at minimal computational cost. Tests on time series from the Hipparcos periodic catalogue led to significant improvements in the overall accuracy and precision of the estimators with respect to the unweighted counterparts and those weighted by inverse-squared uncertainties. Automated classification procedures employing statistical parameters weighted by the suggested scheme confirmed the benefits of the improved input attributes. The classification of eclipsing binaries, Mira, RR Lyrae, Delta Cephei and Alpha2 Canum Venaticorum stars employing exclusively weighted descriptive statistics achieved an overall accuracy of 92 per cent, about 6 per cent higher than with unweighted estimators.
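The weighting idea can be illustrated with a simplified version of the scheme: each measurement is weighted by half the time span to its neighbours (the trapezoidal rule underlying linear interpolation), so clumped points are down-weighted and isolated points up-weighted. The full scheme in the paper also adapts to the noise level; that refinement is omitted in this sketch.

```python
import numpy as np

def interp_weights(t):
    """Trapezoidal weights: each point 'represents' half the gap to each
    neighbour (simplified; the paper's scheme also adapts to noise)."""
    t = np.asarray(t, float)
    w = np.empty_like(t)
    w[1:-1] = (t[2:] - t[:-2]) / 2.0
    w[0], w[-1] = (t[1] - t[0]) / 2.0, (t[-1] - t[-2]) / 2.0
    return w / w.sum()

def weighted_mean(t, y):
    """Time-average of y under the interpolation-based weights."""
    return float(np.sum(interp_weights(t) * np.asarray(y, float)))

# a linear signal y(t) = t sampled very unevenly: a clump near 0, one
# isolated point at t = 10; the true time-average over [0, 10] is 5
t = np.array([0.0, 0.1, 0.2, 0.3, 10.0])
y = t.copy()
wm = weighted_mean(t, y)   # close to the true time-average 5.0
um = float(y.mean())       # 2.12, badly biased toward the clump
```

For a deterministic signal that is well approximated by linear interpolation, the weighted estimator recovers the time-average that the unweighted mean misses.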
Bao, Xiaoqi; Badescu, Mircea; Sherrit, Stewart; Bar-Cohen, Yoseph; Campos, Sergio
2017-04-01
The potential return of Mars sample material is of great interest to the planetary science community, as it would enable extensive analysis of samples with highly sensitive laboratory instruments. It is important to make sure such a mission concept would not bring any living microbes, which may possibly exist on Mars, back to Earth's environment. In order to ensure the isolation of Mars microbes from Earth's atmosphere, a brazing sealing and sterilizing technique was proposed to break the Mars-to-Earth contamination path. Effectively heating the brazing zone in high-vacuum space and controlling the sample temperature to preserve sample integrity are key challenges to the implementation of this technique. The break-the-chain procedures for the container configurations under consideration were simulated by multiphysics finite element models. Different heating methods, including induction and resistive/radiation heating, were evaluated. The temperature profiles of Martian samples in a proposed container structure were predicted. The results show that the sealing and sterilizing process can be controlled such that the sample temperature is maintained below the level that may cause damage, and that the brazing technique is a feasible approach to breaking the contamination path.
Peter, Emanuel K.
2017-12-01
In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path variable L based on the unbiased momenta p and displacements dq for the definition of the bias s applied to the system, and derive three algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method which combines the first two methodologies. Through the analysis of the correlations between the bias and the unbiased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and acceleration in the sampling of the phase space. We apply our method to SPC/E water, where we find a conservation of the average water structure. We then use our method to sample dialanine and the folding of TrpCage, where we find good agreement with simulation data reported in the literature. Finally, we apply our methodologies to the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers, while we speculate that the conformational entropy in particular plays a major role in the formation of the fibril as a rate-limiting factor.
Easy transition path sampling methods: flexible-length aimless shooting and permutation shooting.
Mullen, Ryan Gotchy; Shea, Joan-Emma; Peters, Baron
2015-06-09
We present new algorithms for conducting transition path sampling (TPS). Permutation shooting rigorously preserves the total energy and momentum of the initial trajectory and is simple to implement even for rigid water molecules. Versions of aimless shooting and permutation shooting that use flexible-length trajectories have simple acceptance criteria and are more computationally efficient than fixed-length versions. Flexible-length permutation shooting and inertial likelihood maximization are used to identify the reaction coordinate for vacancy migration in a two-dimensional trigonal crystal of Lennard-Jones particles. The optimized reaction coordinate eliminates nearly all recrossing of the transition state dividing surface.
Levien, Ethan; Bressloff, Paul C.
2017-10-01
Many biochemical systems appearing in applications have a multiscale structure so that they converge to piecewise deterministic Markov processes in a thermodynamic limit. The statistics of the piecewise deterministic process can be obtained much more efficiently than those of the exact process. We explore the possibility of coupling sample paths of the exact model to the piecewise deterministic process in order to reduce the variance of their difference. We then apply this coupling to reduce the computational complexity of a Monte Carlo estimator. Motivated by the rigorous results in [1], we show how this method can be applied to realistic biological models with nontrivial scalings.
A one-way shooting algorithm for transition path sampling of asymmetric barriers.
Brotzakis, Z Faidon; Bolhuis, Peter G
2016-10-28
We present a novel transition path sampling shooting algorithm for the efficient sampling of complex (biomolecular) activated processes with asymmetric free energy barriers. The method employs a fictitious potential that biases the shooting point toward the transition state. The method is similar in spirit to the aimless shooting technique by Peters and Trout [J. Chem. Phys. 125, 054108 (2006)], but is targeted for use with the one-way shooting approach, which has been shown to be more effective than two-way shooting algorithms in systems dominated by diffusive dynamics. We illustrate the method on a 2D Langevin toy model, the association of two peptides, and the initial step in the dissociation of a β-lactoglobulin dimer. In all cases we show a significant increase in efficiency.
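The flavor of a one-way shooting move can be conveyed with a short sketch on a 2D double-well toy model with overdamped Langevin dynamics. This is a stand-in for the paper's 2D Langevin system: all parameters are invented, only forward shots are made, and the shooting point is chosen uniformly, whereas the algorithm above also shoots backward and biases the shooting point toward the transition state.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, kT, nmax = 0.01, 1.0, 2000

def grad(p):                                   # V(x, y) = (x^2 - 1)^2 + y^2
    return np.array([4 * p[0] * (p[0] ** 2 - 1), 2 * p[1]])

def in_A(p): return p[0] < -0.8                # reactant basin
def in_B(p): return p[0] > 0.8                 # product basin

def shoot(p):
    """Overdamped Langevin from p until a stable basin is entered."""
    seg = [p]
    for _ in range(nmax):
        p = p - grad(p) * dt + np.sqrt(2 * kT * dt) * rng.standard_normal(2)
        seg.append(p)
        if in_A(p) or in_B(p):
            break
    return seg

# initial reactive path: shoot from the A boundary until one segment ends in B
while True:
    seg = shoot(np.array([-0.8, 0.0]))
    if in_B(seg[-1]):
        path = [np.array([-0.85, 0.0])] + seg
        break

# forward one-way shooting: regenerate the trajectory tail from a random
# point of the current path; accept iff the new path is still reactive
accepted = 0
for _ in range(30):
    i = rng.integers(1, len(path) - 1)
    seg = shoot(path[i])
    if in_B(seg[-1]):
        path = path[:i] + seg
        accepted += 1
```

For stochastic dynamics, regrown segments are automatically distributed according to the path ensemble, so the acceptance test reduces to checking that the path still connects A to B.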
Automated Prediction of Catalytic Mechanism and Rate Law Using Graph-Based Reaction Path Sampling.
Habershon, Scott
2016-04-12
In a recent article [J. Chem. Phys. 2015, 143, 094106], we introduced a novel graph-based sampling scheme which can be used to generate chemical reaction paths in many-atom systems in an efficient and highly automated manner. The main goal of this work is to demonstrate how this approach, when combined with direct kinetic modeling, can be used to determine the mechanism and phenomenological rate law of a complex catalytic cycle, namely cobalt-catalyzed hydroformylation of ethene. Our graph-based sampling scheme generates 31 unique chemical products and 32 unique chemical reaction pathways; these sampled structures and reaction paths enable automated construction of a kinetic network model of the catalytic system when combined with density functional theory (DFT) calculations of free energies and resultant transition-state theory rate constants. Direct simulations of this kinetic network across a range of initial reactant concentrations enable determination of both the reaction mechanism and the associated rate law in an automated fashion, without the need for either presupposing a mechanism or making steady-state approximations in kinetic analysis. Most importantly, we find that the reaction mechanism which emerges from these simulations is exactly that originally proposed by Heck and Breslow; furthermore, the simulated rate law is also consistent with previous experimental and computational studies, exhibiting a complex dependence on carbon monoxide pressure. While the inherent errors of using DFT simulations to model chemical reactivity limit the quantitative accuracy of our calculated rates, this work confirms that our automated simulation strategy enables direct analysis of catalytic mechanisms from first principles.
Bolhuis, Peter
Important reaction-diffusion processes, such as biochemical networks in living cells or self-assembling soft matter, span many orders of magnitude in length and time scales. In these systems, the reactants' spatial dynamics at mesoscopic length and time scales of microns and seconds is coupled to the reactions between the molecules at microscopic length and time scales of nanometers and milliseconds. This wide range of length and time scales makes these systems notoriously difficult to simulate. While mean-field rate equations cannot describe such processes, the mesoscopic Green's Function Reaction Dynamics (GFRD) method enables efficient simulation at the particle level provided the microscopic dynamics can be integrated out. Yet, many processes exhibit non-trivial microscopic dynamics that can qualitatively change the macroscopic behavior, calling for an atomistic, microscopic description. The recently developed multiscale Molecular Dynamics Green's Function Reaction Dynamics (MD-GFRD) approach combines GFRD, for simulating the system at the mesoscopic scale where particles are far apart, with microscopic Molecular (or Brownian) Dynamics, for simulating the system at the microscopic scale where reactants are in close proximity. The association and dissociation of particles are treated with rare-event path sampling techniques. I will illustrate the efficiency of this method for patchy particle systems. Replacing the microscopic regime with a Markov State Model (MSM) avoids the microscopic regime completely; the MSM is then pre-computed using advanced path-sampling techniques such as multistate transition interface sampling. I illustrate this approach on patchy particle systems that show multiple modes of binding. MD-GFRD is generic, and can be used to efficiently simulate reaction-diffusion systems at the particle level, including the orientational dynamics, opening up the possibility for large-scale simulations of e.g. protein signaling networks.
Directory of Open Access Journals (Sweden)
Takahashi Susumu
2009-09-01
Full Text Available Abstract Background The matrix-like organization of the hippocampus, with its several inputs and outputs, has given rise to several theories related to hippocampal information processing. Single-cell electrophysiological studies and studies of lesions or genetically altered animals using recognition memory tasks such as delayed non-matching-to-sample (DNMS) tasks support the theories. However, a complete understanding of hippocampal function necessitates knowledge of the encoding of information by multiple neurons in a single trial. The role of neuronal ensembles in the hippocampal CA1 for a DNMS task was assessed quantitatively in this study using multi-neuronal recordings and an artificial neural network classifier as a decoder. Results The activity of small neuronal ensembles (6-18 cells) over brief time intervals (2-50 ms) contains accurate information specifically related to the matching/non-matching of continuously presented stimuli (stimulus comparison). The accuracy of the combination of neurons pooled over all the ensembles was markedly lower than those of the ensembles over all examined time intervals. Conclusion The results show that the spatiotemporal patterns of spiking activity among cells in the small neuronal ensemble contain much information that is specifically useful for the stimulus comparison. Small neuronal networks in the hippocampal CA1 might therefore act as a comparator during recognition memory tasks.
Path-sampling strategies for simulating rare events in biomolecular systems.
Chong, Lillian T; Saglam, Ali S; Zuckerman, Daniel M
2017-04-01
Despite more than three decades of effort with molecular dynamics simulations, long-timescale (ms and beyond) biologically relevant phenomena remain out of reach in most systems of interest. This is largely because important transitions, such as conformational changes and (un)binding events, tend to be rare in conventional simulations of rugged biomolecular energy landscapes. In contrast, path sampling approaches focus computing effort specifically on transitions of interest. Such approaches have been in use for nearly 20 years in biomolecular systems and have enabled the generation of pathways and calculation of rate constants for ms processes, including large protein conformational changes, protein folding, and protein (un)binding. Copyright © 2016 Elsevier Ltd. All rights reserved.
Sample-path stability conditions for multiserver input-output processes
Directory of Open Access Journals (Sweden)
Muhammad El-Taha
1994-01-01
Full Text Available We extend our studies of sample-path stability to multiserver input-output processes with conditional output rates that may depend on the state of the system and other auxiliary processes. Our results include processes with countable as well as uncountable state spaces. We establish rate stability conditions for busy period durations as well as the input during busy periods. In addition, stability conditions for multiserver queues with possibly heterogeneous servers are given for the workload, attained service, and queue length processes. The stability conditions can be checked from parameters of primary processes, and thus can be verified a priori. Under the rate stability conditions, we provide stable versions of Little's formula for single server as well as multiserver queues. Our approach leads to extensions of previously known results. Since our results are valid pathwise, non-stationary as well as stationary processes are covered.
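The pathwise character of the stability results above, including the sample-path version of Little's formula, can be checked directly on a simulated path: with the averages taken over a horizon at which the system empties, L = λ̂·W holds as an identity, with no stationarity assumption. A minimal single-server FIFO sketch (the M/M/1 inputs and all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n, lam, mu = 2000, 0.5, 1.0

arrivals = np.cumsum(rng.exponential(1 / lam, n))   # Poisson arrival times
services = rng.exponential(1 / mu, n)               # i.i.d. service times

# Lindley-type recursion for departure times in a single-server FIFO queue
dep = np.empty(n)
dep[0] = arrivals[0] + services[0]
for i in range(1, n):
    dep[i] = max(arrivals[i], dep[i - 1]) + services[i]

sojourn = dep - arrivals
horizon = dep[-1]                   # horizon at which the system is empty
L = sojourn.sum() / horizon         # time-average number in system
lam_hat = n / horizon               # observed arrival rate
W = sojourn.mean()                  # average sojourn time
```

Since the area under the queue-length path equals the sum of sojourn times, L = λ̂·W holds exactly on this path, while W itself fluctuates around the stationary M/M/1 value 1/(μ − λ) = 2.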
Dzierlenga, Michael W; Antoniou, Dimitri; Schwartz, Steven D
2015-04-02
The mechanisms involved in enzymatic hydride transfer have been studied for years, but questions remain due, in part, to the difficulty of probing the effects of protein motion and hydrogen tunneling. In this study, we use transition path sampling (TPS) with normal mode centroid molecular dynamics (CMD) to calculate the barrier to hydride transfer in yeast alcohol dehydrogenase (YADH) and human heart lactate dehydrogenase (LDH). Calculation of the work applied to the hydride allowed for observation of the change in barrier height upon inclusion of quantum dynamics. Similar calculations were performed using deuterium as the transferring particle in order to approximate kinetic isotope effects (KIEs). The change in barrier height in YADH is indicative of a zero-point energy (ZPE) contribution and is evidence that catalysis occurs via a protein compression that mediates a near-barrierless hydride transfer. Calculation of the KIE using the difference in barrier height between the hydride and deuteride agreed well with experimental results.
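The logic of the final step, turning a hydride/deuteride barrier-height difference into a kinetic isotope effect, is simple transition state theory. A worked sketch with hypothetical barrier values (the paper's actual numbers are not reproduced here):

```python
import math

k_B_T = 0.596  # kcal/mol at 300 K (0.001987 kcal/mol/K * 300 K)

def kie_from_barriers(dG_H, dG_D):
    """TST estimate: k_H / k_D = exp((dG_D - dG_H) / k_B T),
    where dG_H and dG_D are the H and D activation free energies."""
    return math.exp((dG_D - dG_H) / k_B_T)

# hypothetical barriers (kcal/mol): a ~0.8 kcal/mol difference, roughly
# the C-H vs C-D zero-point-energy gap, gives a KIE in the typical
# semiclassical range of 2-7
kie = kie_from_barriers(10.0, 10.8)
```

Barrier differences beyond what zero-point energy alone can supply are the usual signature of tunneling contributions.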
Multilevel ensemble Kalman filter
Chernov, Alexey
2016-01-06
This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF). In terms of computational cost versus approximation error, the asymptotic performance of the multilevel ensemble Kalman filter (MLEnKF) is superior to that of the EnKF.
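The MLMC idea being embedded can be sketched in isolation: a telescoping estimator in which the coarse level gets many cheap samples while level differences, coupled through shared Brownian increments, need only a few. A two-level toy example for E[S_T] of geometric Brownian motion (all parameters illustrative; the EnKF embedding itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(3)
S0, r, sigma, T = 1.0, 0.05, 0.2, 1.0

def euler_paths(n_paths, n_steps, dW=None):
    """Euler-Maruyama for dS = r S dt + sigma S dW; dW may be supplied
    so that two discretizations can share the same Brownian path."""
    dt = T / n_steps
    if dW is None:
        dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    S = np.full(n_paths, S0)
    for k in range(n_steps):
        S = S + r * S * dt + sigma * S * dW[:, k]
    return S, dW

# level 0: many cheap samples on the coarse grid
P0, _ = euler_paths(200_000, 4)

# level 1: few coupled samples of (fine - coarse); the coarse path reuses
# the fine path's Brownian increments, so the difference has tiny variance
dW_f = rng.standard_normal((20_000, 8)) * np.sqrt(T / 8)
fine, _ = euler_paths(20_000, 8, dW_f)
dW_c = dW_f[:, 0::2] + dW_f[:, 1::2]      # aggregate increments pairwise
coarse, _ = euler_paths(20_000, 4, dW_c)

estimate = P0.mean() + (fine - coarse).mean()   # telescoping MLMC sum
```

The estimate targets the fine-level expectation, close to the exact value S0·exp(rT), at a fraction of the cost of running all samples on the fine grid.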
DEFF Research Database (Denmark)
Rebolj, Matejka; Preisler, Sarah; Ejegod, Ditte M
2013-01-01
The APTIMA Human Papillomavirus (HPV) Assay detects E6/E7 mRNA from 14 human papillomavirus genotypes. Horizon was a population-based split-sample study among well-screened women, with an aim to compare APTIMA, Hybrid Capture 2 (HC2), and liquid-based cytology (LBC) using SurePath samples. APTIMA … agreement between APTIMA and HC2. This is the first APTIMA study using SurePath samples on the PANTHER platform. The trends in positivity rates on SurePath samples for APTIMA, HC2, and LBC were consistent with studies based on PreservCyt samples, and the agreement between the two HPV assays was substantial. The high proportions of women testing positive suggest that in countries with a high HPV prevalence, caution will be needed if HPV tests, including mRNA-based tests, are to replace LBC.
A Stochastic Approach to Path Planning in the Weighted-Region Problem
1991-03-01
Transition path sampling with quantum/classical mechanics for reaction rates.
Gräter, Frauke; Li, Wenjin
2015-01-01
Predicting rates of biochemical reactions through molecular simulations poses a particular challenge for two reasons. First, the process involves bond formation and/or cleavage and thus requires a quantum mechanical (QM) treatment of the reaction center, which can be combined with a more efficient molecular mechanical (MM) description for the remainder of the system, resulting in a QM/MM approach. Second, reaction time scales are typically many orders of magnitude larger than the (sub-)nanosecond scale accessible to QM/MM simulations. Transition path sampling (TPS) makes it possible to efficiently sample the space of dynamic trajectories from the reactant to the product state without an additional biasing potential. We outline here the application of TPS and QM/MM to calculate rates for biochemical reactions, by means of a simple toy system. In a step-by-step protocol, we specifically refer to our implementation within the MD suite Gromacs, which we have made available to the research community, and include practical advice on the choice of parameters.
Shi, Y.; Long, Y.; Wi, X. L.
2014-04-01
When tourists visit multiple scenic spots, the actual travel line is usually the most effective route through the road network, and it may differ from the planned travel line. In the field of navigation, a proposed travel line is normally generated automatically by a path-planning algorithm that considers the scenic spots' positions and the road network. But when a scenic spot covers a certain area and has multiple entrances or exits, the traditional mechanism of describing it by a single point coordinate cannot reflect these structural features. To solve this problem, this paper focuses on the influence of scenic spots' structural features, such as multiple entrances or exits, on path planning, and proposes a double-weighted graph model in which the weights of both vertices and edges can be selected dynamically. It then discusses the model-building method and an optimal path-planning algorithm based on the Dijkstra and Prim algorithms. Experimental results show that the optimal travel line derived from the proposed model and algorithm is more reasonable, and the visiting order and distance are further optimized.
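A minimal sketch of shortest-path search on such a double-weighted graph, where entering a vertex adds its own weight (e.g. the transit cost of an entrance) on top of the edge weight. The tiny network, the names, and all weights are invented for illustration: a scenic spot S is modeled as two entrance vertices S1 and S2 rather than one point.

```python
import heapq

def dijkstra(edges, node_cost, src, dst):
    """Shortest path where total cost = sum of edge weights plus the
    weight of every vertex entered (the double-weighted graph)."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    best = {src: node_cost.get(src, 0)}
    heap = [(best[src], src, [src])]
    while heap:
        d, u, path = heapq.heappop(heap)
        if u == dst:
            return d, path
        if d > best.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w + node_cost.get(v, 0)
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v, path + [v]))
    return float("inf"), []

# scenic spot S has two entrances S1 (slow gate) and S2 (fast gate)
edges = [("hotel", "S1", 4), ("hotel", "S2", 7), ("S1", "S2", 1),
         ("S2", "lake", 2), ("S1", "lake", 8)]
node_cost = {"S1": 3, "S2": 1, "lake": 0}
cost, route = dijkstra(edges, node_cost, "hotel", "lake")
```

Here the longer road to S2 wins because its entrance weight is lower, which is exactly the entrance-choice effect a single-point scenic-spot model cannot express.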
Sørbye, Sveinung Wergeland; Pedersen, Mette Kristin; Ekeberg, Bente; Williams, Merete E Johansen; Sauer, Torill; Chen, Ying
2017-01-01
The Norwegian Cervical Cancer Screening Program recommends screening every 3 years for women between 25 and 69 years of age. There is a large difference in the percentage of unsatisfactory samples between laboratories that use different brands of liquid-based cytology. We wished to examine if inadequate ThinPrep samples could be satisfactory by processing them with the SurePath protocol. A total of 187 inadequate ThinPrep specimens from the Department of Clinical Pathology at University Hospital of North Norway were sent to Akershus University Hospital for conversion to SurePath medium. Ninety-one (48.7%) were processed through the automated "gynecologic" application for cervix cytology samples, and 96 (51.3%) were processed with the "nongynecological" automatic program. Out of 187 samples that had been unsatisfactory by ThinPrep, 93 (49.7%) were satisfactory after being converted to SurePath. The rate of satisfactory cytology was 36.6% and 62.5% for samples run through the "gynecology" program and "nongynecology" program, respectively. Of the 93 samples that became satisfactory after conversion from ThinPrep to SurePath, 80 (86.0%) were screened as normal while 13 samples (14.0%) were given an abnormal diagnosis, which included 5 atypical squamous cells of undetermined significance, 5 low-grade squamous intraepithelial lesion, 2 atypical glandular cells not otherwise specified, and 1 atypical squamous cells cannot exclude high-grade squamous intraepithelial lesion. A total of 2.1% (4/187) of the women got a diagnosis of cervical intraepithelial neoplasia 2 or higher at a later follow-up. Converting cytology samples from ThinPrep to SurePath processing can reduce the number of unsatisfactory samples. The samples should be run through the "nongynecology" program to ensure an adequate number of cells.
Directory of Open Access Journals (Sweden)
Sveinung Wergeland Sorbye
2017-01-01
Full Text Available Background: The Norwegian Cervical Cancer Screening Program recommends screening every 3 years for women between 25 and 69 years of age. There is a large difference in the percentage of unsatisfactory samples between laboratories that use different brands of liquid-based cytology. We wished to examine if inadequate ThinPrep samples could be satisfactory by processing them with the SurePath protocol. Materials and Methods: A total of 187 inadequate ThinPrep specimens from the Department of Clinical Pathology at University Hospital of North Norway were sent to Akershus University Hospital for conversion to SurePath medium. Ninety-one (48.7%) were processed through the automated “gynecologic” application for cervix cytology samples, and 96 (51.3%) were processed with the “nongynecological” automatic program. Results: Out of 187 samples that had been unsatisfactory by ThinPrep, 93 (49.7%) were satisfactory after being converted to SurePath. The rate of satisfactory cytology was 36.6% and 62.5% for samples run through the “gynecology” program and “nongynecology” program, respectively. Of the 93 samples that became satisfactory after conversion from ThinPrep to SurePath, 80 (86.0%) were screened as normal while 13 samples (14.0%) were given an abnormal diagnosis, which included 5 atypical squamous cells of undetermined significance, 5 low-grade squamous intraepithelial lesion, 2 atypical glandular cells not otherwise specified, and 1 atypical squamous cells cannot exclude high-grade squamous intraepithelial lesion. A total of 2.1% (4/187) of the women got a diagnosis of cervical intraepithelial neoplasia 2 or higher at a later follow-up. Conclusions: Converting cytology samples from ThinPrep to SurePath processing can reduce the number of unsatisfactory samples. The samples should be run through the “nongynecology” program to ensure an adequate number of cells.
Varga, Matthew; Schwartz, Steven
2015-03-01
The experimental determination of kinetic isotope effects in enzymatic systems can be a difficult, time-consuming, and expensive process. In this study, we use the Chandler-Bolhuis method for the determination of reaction rates within transition path sampling (rTPS) to determine the primary kinetic isotope effect in yeast alcohol dehydrogenase (YADH). Normal mode centroid molecular dynamics (CMD) was applied to the transferring hydride/deuteride in order to correctly incorporate quantum effects into the molecular simulations. Though previous studies have used rTPS to calculate reaction rate constants in various model and real systems, it has not been applied to a system as large as YADH. Because particle transfer is not wholly indicative of the chemical step, this method cannot be used to determine reaction rate constants in YADH. However, it is possible to determine the transition rate constant of the particle transfer, and the kinetic isotope effect of that step. This method provides a set of tools to determine kinetic isotope effects with the atomistic detail of molecular simulations.
Dzierlenga, Michael; Antoniou, Dimitri; Schwartz, Steven
2015-03-01
The mechanisms involved in enzymatic hydride transfer have been studied for years but questions remain, due to the difficulty in determining the participation of protein dynamics and quantum effects, especially hydrogen tunneling. In this study, we use transition path sampling (TPS) with normal mode centroid molecular dynamics (CMD) to calculate the barrier to hydride transfer in yeast alcohol dehydrogenase (YADH) and lactate dehydrogenase (LDH). Calculation of the work applied to the hydride during the reaction allows observation of the change in barrier height due to the inclusion of quantum effects. Additionally, the same calculations were performed using deuterium as the transferring particle to validate our methods against experimentally measured kinetic isotope effects. The change in barrier height in YADH upon inclusion of quantum effects is indicative of a zero-point energy contribution, and is evidence that the protein mediates a near-barrierless transfer of the rate-limiting hydride. Kinetic isotope effects calculated from the average difference in barrier between hydride and deuteride agreed well with experimental results. The authors acknowledge the support of National Institutes of Health Grants GM068036 and GM102226.
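The connection between a barrier-height difference and a kinetic isotope effect can be illustrated with a short semiclassical calculation. The barrier values below are hypothetical placeholders, not results from this study:

```python
import math

def kie_from_barriers(dG_H, dG_D, T=300.0):
    """Semiclassical kinetic isotope effect from the difference in
    activation free energy (kcal/mol) between hydride and deuteride
    transfer: KIE = exp((dG_D - dG_H) / RT)."""
    R = 1.987204e-3  # gas constant in kcal/(mol*K)
    return math.exp((dG_D - dG_H) / (R * T))

# Hypothetical barriers: deuteride barrier 0.8 kcal/mol higher at 300 K
print(round(kie_from_barriers(12.0, 12.8), 2))  # → 3.83
```

Only the barrier difference enters the ratio, which is why the abstract's "average difference in barrier between hydride and deuteride" suffices to estimate the KIE.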
Weight references for burned human skeletal remains from Portuguese samples.
Gonçalves, David; Cunha, Eugénia; Thompson, Tim J U
2013-09-01
Weight is often one of the few recoverable data when analyzing human cremains, but references are still rare, especially for European populations. Mean weights for skeletal remains were thus documented for Portuguese modern cremations of both recently deceased individuals and dry skeletons, and the effect of age, sex, and the intensity of combustion was investigated using both multivariate and univariate statistics. The cremains from fresh cadavers were significantly heavier than the ones from dry skeletons regardless of sex and age cohort (p skeletal weight. The effect of the intensity of combustion on cremains weight was unclear. These weight references may, in some cases, help estimate the minimum number of individuals, the completeness of the skeletal assemblage, and the sex of an unknown individual. © 2013 American Academy of Forensic Sciences.
Directory of Open Access Journals (Sweden)
Aimei Shao
2013-02-01
Full Text Available In the ensemble-based four-dimensional variational assimilation method (SVD-En4DVar), a singular value decomposition (SVD) technique is used to select the leading eigenvectors, and the analysis variables are expressed as an orthogonal-basis expansion of those eigenvectors. Experiments with a two-dimensional shallow-water equation model and simulated observations show that the truncation error and the rejection of observed signals due to the reduced-dimensional reconstruction of the analysis variable are the major factors that damage the analysis when the ensemble size is not large enough. However, a larger ensemble imposes a daunting computational burden. Experiments with the shallow-water equation model also show that the forecast error covariances remain relatively constant over time. For that reason, we propose an approach that increases the number of members in the forecast ensemble while reducing the update frequency of the forecast error covariance, in order to increase analysis accuracy and reduce computational cost. A series of experiments were conducted with the shallow-water equation model to test the efficiency of this approach. The experimental results indicate that the approach is promising. Further experiments with the WRF model show that it is also suitable for the real atmospheric data assimilation problem, but the update frequency of the forecast error covariances should not be too low.
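The dimension-reduction step described here — keeping only the leading SVD modes of the ensemble perturbation matrix — can be sketched as follows. The function name and toy data are ours, not part of SVD-En4DVar:

```python
import numpy as np

def leading_modes(ensemble, k):
    """ensemble: (n_state, n_members) array. Return the k leading
    orthonormal modes of the perturbation matrix via SVD, plus the
    fraction of perturbation variance they retain (the complement is
    the truncation error discussed in the abstract)."""
    mean = ensemble.mean(axis=1, keepdims=True)
    X = ensemble - mean                          # perturbations about the mean
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    retained = (s[:k] ** 2).sum() / (s ** 2).sum()
    return U[:, :k], retained

rng = np.random.default_rng(0)
ens = rng.standard_normal((100, 20))             # toy 100-dim state, 20 members
modes, frac = leading_modes(ens, 5)
print(modes.shape, round(float(frac), 2))
```

Analysis increments are then sought in the span of `modes`, which is why too small an ensemble (small `k`) rejects part of the observed signal.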
DEFF Research Database (Denmark)
Fornari, D; Rebolj, M; Bjerregaard, B
2016-01-01
OBJECTIVE: In two laboratories (Departments of Pathology, Copenhagen University Hospitals of Herlev and Hvidovre), we compared cobas and Hybrid Capture 2 (HC2) human papillomavirus (HPV) assays using SurePath® samples from women with atypical squamous cells of undetermined significance (ASCUS...
DEFF Research Database (Denmark)
Fratini, Gerardo; Ibrom, Andreas; Arriga, Nicola
2012-01-01
It has been formerly recognised that increasing relative humidity in the sampling line of closed-path eddy-covariance systems leads to increasing attenuation of water vapour turbulent fluctuations, resulting in strong latent heat flux losses. This occurrence has been analyzed for very long (50 m)...
2002-01-01
On the NYYD Ensemble duo Traksmann–Lukk and their performance of E.-S. Tüür's work "Symbiosis", which also appears on the recently released NYYD Ensemble CD. Concerts on 2 March in the small hall of the Rakvere Theatre and on 3 March in the Rotermann Salt Storage; the programme includes Tüür, Kaumann, Berio, Reich, Yun, Hauta-aho, and Buckinx.
Lessons from Climate Modeling on the Design and Use of Ensembles for Crop Modeling
Wallach, Daniel; Mearns, Linda O.; Ruane, Alexander C.; Roetter, Reimund P.; Asseng, Senthold
2016-01-01
Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions include: defining criteria for acceptance of models into a crop multi-model ensemble (MME); exploring criteria for evaluating the degree of relatedness of models in an MME; studying the effect of the number of models in the ensemble; developing a statistical model of model sampling; creating a repository for MME results; studying possible differential weighting of models in an ensemble; creating single-model ensembles, based on sampling from the uncertainty distribution of parameter values or inputs, specifically oriented toward uncertainty estimation; creating super ensembles that sample more than one source of uncertainty; analyzing super ensemble results to obtain information on total uncertainty and the separate contributions of different sources of uncertainty; and further investigating the use of the multi-model mean or median as a predictor.
Directory of Open Access Journals (Sweden)
Kwang-Il Ahn
2013-01-01
Full Text Available The Hurst exponent and variance are two quantities that often characterize real-life, high-frequency observations. Such real-life signals are generally measured under noisy conditions. We develop a multiscale statistical method for simultaneous estimation of a time-changing Hurst exponent H(t) and a variance parameter C in a multifractional Brownian motion model in the presence of white noise. The method is based on the asymptotic behavior of the local variation of the sample paths, which applies at coarse scales of the sample paths. This work provides stable and simultaneous estimators of both parameters when independent white noise is present. We also discuss the accuracy of the simultaneous estimators compared with a few selected methods, and the stability of the computations with regard to adapted wavelet filters.
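The scaling idea behind such estimators can be sketched in a much simplified form: for a self-similar path, the standard deviation of increments grows like lag**H, so a log-log fit recovers H. This toy version ignores the time-varying exponent, the noise correction, and the wavelet filters the paper develops:

```python
import numpy as np

def hurst_from_increments(path, lags=(1, 2, 4, 8, 16, 32)):
    """Crude global Hurst estimate from increment scaling:
    std(X[t+k] - X[t]) ~ k**H. Fits log-std against log-lag."""
    lags = np.asarray(lags)
    sds = np.array([np.std(path[k:] - path[:-k]) for k in lags])
    H, _ = np.polyfit(np.log(lags), np.log(sds), 1)  # slope is the estimate
    return H

rng = np.random.default_rng(1)
bm = np.cumsum(rng.standard_normal(2 ** 15))  # Brownian motion: true H = 0.5
print(round(float(hurst_from_increments(bm)), 2))
```

For ordinary Brownian motion the estimate should land near 0.5; added white noise would bias this naive estimator, which is the problem the paper's multiscale method addresses.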
DEFF Research Database (Denmark)
Rozemeijer, Kirsten; Naber, Steffie K; Penning, Corine
2017-01-01
Objective To compare the cumulative incidence of cervical cancer diagnosed within 72 months after a normal screening sample between conventional cytology and liquid based cytology tests SurePath and ThinPrep.Design Retrospective population based cohort study.Setting Nationwide network and registry...... of histo- and cytopathology in the Netherlands (PALGA), January 2000 to March 2013.Population Women with 5 924 474 normal screening samples (23 833 123 person years).Exposure Use of SurePath or ThinPrep versus conventional cytology as screening test.Main outcome measure 72 month cumulative incidence...... was 58.5 (95% confidence interval 54.6 to 62.7) per 100 000 normal conventional cytology samples, compared with 66.8 (56.7 to 78.7) for ThinPrep and 44.6 (37.8 to 52.6) for SurePath. Compared with conventional cytology, the hazard of invasive cancer was 19% lower (hazard ratio 0.81, 95% confidence...
Directory of Open Access Journals (Sweden)
Marin-Garcia Pablo
2010-05-01
Full Text Available Abstract Background The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.
Lavender, Jason M.; Shaw, Jena A.; Crosby, Ross D.; Feig, Emily H.; Mitchell, James E.; Crow, Scott J.; Hill, Laura; Grange, Daniel Le; Powers, Pauline; Lowe, Michael R.
2015-01-01
Evidence suggests that weight suppression, the difference between an individual’s highest historical body weight and current body weight, may play a role in the etiology and/or maintenance of eating disorders (EDs), and may also impact ED treatment. However, there are limited findings regarding the association between weight suppression and dimensions of ED psychopathology, particularly in multi-diagnostic ED samples. Participants were 1748 adults (94% female) from five sites with a variety o...
Cachelin, F M; Striegel-Moore, R H; Elder, K A
1998-01-01
Recently, a shift in obesity treatment away from emphasizing ideal weight loss goals to establishing realistic weight loss goals has been proposed; yet, what constitutes "realistic" weight loss for different populations is not clear. This study examined notions of realistic shape and weight as well as body size assessment in a large community-based sample of African-American, Asian, Hispanic, and white men and women. Participants were 1893 survey respondents who were all dieters and primarily overweight. Groups were compared on various variables of body image assessment using silhouette ratings. No significant race differences were found in silhouette ratings, nor in perceptions of realistic shape or reasonable weight loss. Realistic shape and weight ratings by both women and men were smaller than current shape and weight but larger than ideal shape and weight ratings. Compared with male dieters, female dieters considered greater weight loss to be realistic. Implications of the findings for the treatment of obesity are discussed.
Garfield, Joan; Le, Laura; Zieffler, Andrew; Ben-Zvi, Dani
2015-01-01
This paper describes the importance of developing students' reasoning about samples and sampling variability as a foundation for statistical thinking. Research on expert-novice thinking as well as statistical thinking is reviewed and compared. A case is made that statistical thinking is a type of expert thinking, and as such, research…
Hubbard, T J P; Aken, B L; Beal, K; Ballester, B; Caccamo, M; Chen, Y; Clarke, L; Coates, G; Cunningham, F; Cutts, T; Down, T; Dyer, S C; Fitzgerald, S; Fernandez-Banet, J; Graf, S; Haider, S; Hammond, M; Herrero, J; Holland, R; Howe, K; Howe, K; Johnson, N; Kahari, A; Keefe, D; Kokocinski, F; Kulesha, E; Lawson, D; Longden, I; Melsopp, C; Megy, K; Meidl, P; Ouverdin, B; Parker, A; Prlic, A; Rice, S; Rios, D; Schuster, M; Sealy, I; Severin, J; Slater, G; Smedley, D; Spudich, G; Trevanion, S; Vilella, A; Vogel, J; White, S; Wood, M; Cox, T; Curwen, V; Durbin, R; Fernandez-Suarez, X M; Flicek, P; Kasprzyk, A; Proctor, G; Searle, S; Smith, J; Ureta-Vidal, A; Birney, E
2007-01-01
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and european hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.
Lessons from the Climate Modeling Community on the Design and Use of Ensembles for Crop Modeling
Mearns, L.
2016-12-01
Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions concern: 1) the creation of crop multi-model ensembles, 2) differential weighting of models in an ensemble, 3) the creation of single model ensembles based on sampling from the uncertainty distribution of parameter values or inputs, 4) the creation of super ensembles that sample more than one source of uncertainty, 5) the analysis of results to obtain information on uncertainty and the sources of uncertainty and 6) the use of the multi-model mean or median as a predictor.
The relationship between weight and smoking in a national sample of adolescents: Role of gender.
Lange, Krista; Thamotharan, Sneha; Racine, Madeline; Hirko, Caroline; Fields, Sherecce
2015-12-01
This study sought to investigate the role of weight status and body mass index percentile in risky smoking behaviors in male and female adolescents. Analyses of the data obtained in the 2011 Youth Risk Behavior Surveillance System were conducted. The national sample size included 15,425 adolescents. Questions addressing weight status and smoking behaviors were used in analyses. Significant effects of perceived weight status, weight change status, and body mass index percentile on smoking behaviors were found for both genders. The current findings indicate the importance of accounting for both gender and weight status when developing prevention and cessation programs targeting smoking behaviors. © The Author(s) 2014.
Selecting a climate model subset to optimise key ensemble properties
Directory of Open Access Journals (Sweden)
N. Herger
2018-02-01
Full Text Available End users studying impacts and risks caused by human-induced climate change are often presented with large multi-model ensembles of climate projections whose composition and size are arbitrarily determined. An efficient and versatile method that finds a subset which maintains certain key properties from the full ensemble is needed, but very little work has been done in this area. Therefore, users typically make their own somewhat subjective subset choices and commonly use the equally weighted model mean as a best estimate. However, different climate model simulations cannot necessarily be regarded as independent estimates due to the presence of duplicated code and shared development history. Here, we present an efficient and flexible tool that makes better use of the ensemble as a whole by finding a subset with improved mean performance compared to the multi-model mean while at the same time maintaining the spread and addressing the problem of model interdependence. Out-of-sample skill and reliability are demonstrated using model-as-truth experiments. This approach is illustrated with one set of optimisation criteria but we also highlight the flexibility of cost functions, depending on the focus of different users. The technique is useful for a range of applications that, for example, minimise present-day bias to obtain an accurate ensemble mean, reduce dependence in ensemble spread, maximise future spread, ensure good performance of individual models in an ensemble, reduce the ensemble size while maintaining important ensemble characteristics, or optimise several of these at the same time. As in any calibration exercise, the final ensemble is sensitive to the metric, observational product, and pre-processing steps used.
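One way to picture this kind of cost-function-driven subset search is a brute-force toy: score every subset of a fixed size on mean bias plus a spread-preservation penalty. The cost terms and numbers below are illustrative assumptions, not the paper's method:

```python
import itertools
import numpy as np

def best_subset(model_means, obs, size, spread_weight=1.0):
    """Exhaustive toy subset search: pick `size` models whose ensemble
    mean is closest to `obs`, penalising loss of spread relative to the
    full ensemble (one of many possible cost functions)."""
    full_spread = np.std(model_means)
    best, best_cost = None, np.inf
    for subset in itertools.combinations(range(len(model_means)), size):
        vals = model_means[list(subset)]
        cost = (abs(vals.mean() - obs)
                + spread_weight * abs(np.std(vals) - full_spread))
        if cost < best_cost:
            best, best_cost = subset, cost
    return best, best_cost

models = np.array([14.2, 13.1, 15.8, 12.4, 14.9, 16.3])  # toy model means
subset, cost = best_subset(models, obs=14.0, size=3)
print(subset, round(float(cost), 3))
```

Real ensembles make exhaustive search infeasible, and the paper's tool also addresses model interdependence; swapping in other cost terms mirrors the flexibility of cost functions highlighted in the abstract.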
Weight loss expectations and goals in a population sample of overweight and obese US adults.
Fabricatore, Anthony N; Wadden, Thomas A; Rohay, Jeffrey M; Pillitteri, Janine L; Shiffman, Saul; Harkins, Andrea M; Burton, Steven L
2008-11-01
The purpose of this study was to investigate weight loss expectations and goals in a population sample of US adults who planned to make a weight loss attempt, and to examine predictors of those expectations and goals. Participants were 658 overweight and obese adults (55% women, mean age = 47.9 years, BMI = 31.8 kg/m(2)) who responded to a telephone survey about weight loss. Respondents reported weight loss expectations (i.e., reductions they realistically expected) and goals (i.e., reductions they ideally desired) for an upcoming "serious and deliberate" weight loss attempt. They also reported the expectations they had, and the reductions they actually achieved, in a previous attempt. Respondents' weight loss expectations for their upcoming attempt (8.0% reduction in initial weight) were significantly more modest than their goals for that attempt (16.8%), and smaller than the losses that they expected (12.0%), and achieved (8.9%) in their most recent past attempt (Ps goals. After controlling for BMI, age, and gender, previous weight loss was unrelated to expectations (but was inversely related to goals) for the upcoming weight loss attempt. Results suggest that overweight and obese individuals can select realistic weight loss expectations that are more modest than their ideal goals. BMI and gender appear to be more important than previous weight loss experiences in determining expectations among persons planning a weight loss attempt.
Directory of Open Access Journals (Sweden)
Fumio Ishizaki
2006-01-01
Full Text Available In previous work, the authors considered a discrete-time queueing system and established that, under some assumptions, the stationary queue length distribution for the system with capacity K1 is completely expressed in terms of the stationary distribution for the system with capacity K0 (>K1). In this paper, we study a sample-path version of this problem in a more general setting, where neither stationarity nor ergodicity is assumed. We establish that, under some assumptions, the empirical queue length distribution (along a sample path) for the system with capacity K1 is completely expressed only in terms of the quantities concerning the corresponding system with capacity K0 (>K1). Further, we consider a probabilistic setting where the assumptions are satisfied with probability one, and under this setting we obtain a stochastic version of our main result. The stochastic version is a generalization of the authors' previous result, because the probabilistic assumptions are less restrictive.
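The central object here, an empirical queue-length distribution along a sample path, can be illustrated with a toy discrete-time finite-capacity queue. The conventions (service before arrival, blocking when full) and parameters are our assumptions, not the paper's model:

```python
import numpy as np

def empirical_queue_dist(arrivals, services, K):
    """Simulate a discrete-time single-server queue with capacity K
    (serve first, then admit; arrivals blocked when the buffer is full)
    and return the empirical queue-length distribution along the path."""
    q, counts = 0, np.zeros(K + 1)
    for a, s in zip(arrivals, services):
        if q > 0 and s:      # at most one departure per slot
            q -= 1
        if a and q < K:      # arrival admitted unless buffer is full
            q += 1
        counts[q] += 1
    return counts / counts.sum()

rng = np.random.default_rng(2)
n = 100_000
arr = rng.random(n) < 0.3    # Bernoulli arrivals, rate 0.3
srv = rng.random(n) < 0.5    # service completion probability 0.5
dist = empirical_queue_dist(arr, srv, K=5)
print(dist.round(3))
```

The paper's result relates such an empirical distribution for capacity K1 to quantities from the corresponding larger-capacity (K0) system, without assuming the path comes from a stationary or ergodic process.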
K. Rozemeijer (Kirsten); S.K. Naber (Steffie); C. Penning (Corine); L.I.H. Overbeek (Lucy); C.W.N. Looman (Caspar); I.M.C.M. de Kok (Inge); S.M. Matthijsse (Suzette); M. Rebolj (Matejka); F.J. van Kemenade (Folkert); M. van Ballegooijen (Marjolein)
2017-01-01
Objective: To compare the cumulative incidence of cervical cancer diagnosed within 72 months after a normal screening sample between conventional cytology and the liquid based cytology tests SurePath and ThinPrep. Design: Retrospective population based cohort
Lavender, Jason M; Shaw, Jena A; Crosby, Ross D; Feig, Emily H; Mitchell, James E; Crow, Scott J; Hill, Laura; Le Grange, Daniel; Powers, Pauline; Lowe, Michael R
2015-10-01
Evidence suggests that weight suppression, the difference between an individual's highest historical body weight and current body weight, may play a role in the etiology and/or maintenance of eating disorders (EDs), and may also impact ED treatment. However, there are limited findings regarding the association between weight suppression and dimensions of ED psychopathology, particularly in multi-diagnostic ED samples. Participants were 1748 adults (94% female) from five sites with a variety of DSM-IV ED diagnoses who completed the Eating Disorder Questionnaire, a self-report measure of various attitudinal, behavioral, and medical features of EDs. Four factor analytically derived dimensions of ED psychopathology were examined: (a) weight/shape concerns, (b) binge eating/vomiting, (c) exercise/restrictive eating behaviors, and (d) weight control medication use. Hierarchical regression analyses were conducted to examine the unique association of weight suppression with each dimension (controlling for ED diagnosis and BMI), as well as the independent unique associations of three interactions: (a) weight suppression×BMI, (b) weight suppression×ED diagnosis, and (c) BMI×ED diagnosis. Results revealed that weight suppression was uniquely associated with all of the ED psychopathology dimensions except binge eating/vomiting. The weight suppression × BMI interaction was significant only for weight/shape concerns, whereas the weight suppression×ED diagnosis was not significant for any of the dimensions. Significant BMI×ED diagnosis interactions were found for all dimensions except weight/shape concerns. Overall, the current results support the salience of weight suppression across multiple dimensions of ED psychopathology, with the exception of binge eating/vomiting. Copyright © 2015 Elsevier Ltd. All rights reserved.
Structural Fire Fighting Ensembles: Accumulation and Off-gassing of Combustion Products.
Kirk, Katherine M; Logan, Michael B
2015-01-01
Firefighters may be exposed to toxic combustion products not only during fire fighting operations and training, but also afterwards as a result of contact with contaminated structural fire fighting ensembles. This study characterized the deposition of polycyclic aromatic hydrocarbons (PAHs) onto structural fire fighting ensembles and off-gassing of combustion products from ensembles after multiple exposures to hostile structural attack fire environments. A variety of PAHs were deposited onto the outer layer of structural fire fighting ensembles, with no variation in deposition flux between new ensembles and already contaminated ensembles. Contaminants released from ensembles after use included volatile organic compounds, carbonyl compounds, low molecular weight PAHs, and hydrogen cyanide. Air samples collected in a similar manner after laundering of ensembles according to manufacturer specifications indicated that laundering returns off-gassing concentrations of most of the investigated compounds to pre-exposure levels. These findings suggest that contamination of firefighter protective clothing increases with use, and that storage of unlaundered structural fire fighting ensembles in small, unventilated spaces immediately after use may create a source of future exposure to toxic combustion products for fire fighting personnel.
Probability Maps for the Visualization of Assimilation Ensemble Flow Data
Hollt, Thomas
2015-05-25
Ocean forecasts nowadays are created by running ensemble simulations in combination with data assimilation techniques. Most of these techniques resample the ensemble members after each assimilation cycle. This means that in a time series, after resampling, every member can follow up on any of the members before resampling. Tracking behavior over time, such as all possible paths of a particle in an ensemble vector field, becomes very difficult, as the number of combinations rises exponentially with the number of assimilation cycles. In general a single possible path is not of interest, but only the probabilities that any point in space might be reached by a particle at some point in time. In this work we present an approach using probability-weighted piecewise particle trajectories to allow such a mapping interactively, instead of tracing quadrillions of individual particles. We achieve interactive rates by binning the domain and splitting up the tracing process into the individual assimilation cycles, so that particles that fall into the same bin after a cycle can be treated as a single particle with a larger probability as input for the next time step. As a result we lose the ability to track individual particles, but can create probability maps for any desired seed at interactive rates.
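The bin-merging idea can be sketched in one dimension: instead of tracking individual particles, each cycle moves the probability mass of every bin and deposits it into a destination bin, keeping the work per cycle proportional to the number of bins. The grid, velocity field, and units here are hypothetical:

```python
import numpy as np

def propagate_probability(prob, velocity, dt, dx):
    """One assimilation cycle of a 1-D probability map: advect the mass
    in each bin by the local velocity and deposit it into the nearest
    destination bin. Merging mass per bin avoids tracing an
    exponentially growing set of individual paths."""
    n = len(prob)
    out = np.zeros(n)
    for i, p in enumerate(prob):
        if p == 0.0:
            continue
        j = int(round(i + velocity[i] * dt / dx))
        out[min(max(j, 0), n - 1)] += p      # clamp at domain boundaries
    return out

prob = np.zeros(50)
prob[10] = 1.0                               # point seed
vel = np.full(50, 1.0)                       # uniform flow, hypothetical units
for _ in range(10):                          # ten assimilation cycles
    prob = propagate_probability(prob, vel, dt=1.0, dx=1.0)
print(prob.argmax())                         # → 20
```

With an ensemble of velocity fields, each member would contribute mass weighted by its probability, giving the probability maps described in the abstract.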
Particle filtering with path sampling and an application to a bimodal ocean current model
Weare, Jonathan
2009-07-01
This paper introduces a recursive particle filtering algorithm designed to filter high dimensional systems with complicated non-linear and non-Gaussian effects. The method incorporates a parallel marginalization (PMMC) step in conjunction with the hybrid Monte Carlo (HMC) scheme to improve samples generated by standard particle filters. Parallel marginalization is an efficient Markov chain Monte Carlo (MCMC) strategy that uses lower dimensional approximate marginal distributions of the target distribution to accelerate equilibration. As a validation the algorithm is tested on a 2516 dimensional, bimodal, stochastic model motivated by the Kuroshio current that runs along the Japanese coast. The results of this test indicate that the method is an attractive alternative for problems that require the generality of a particle filter but have been inaccessible due to the limitations of standard particle filtering strategies.
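A minimal bootstrap particle filter shows the propagate-weight-resample loop that the paper's PMMC and HMC steps refine. The 1-D random-walk model below is a toy stand-in, not the Kuroshio model:

```python
import numpy as np

def bootstrap_filter(obs, n_particles, rng):
    """Minimal bootstrap particle filter for a 1-D random-walk state
    observed with unit-variance Gaussian noise. Returns the filtered
    mean at each time step."""
    x = rng.standard_normal(n_particles)
    means = []
    for y in obs:
        x = x + rng.standard_normal(n_particles)        # propagate
        w = np.exp(-0.5 * (y - x) ** 2)                 # Gaussian likelihood
        w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]  # resample
        means.append(x.mean())
    return np.array(means)

rng = np.random.default_rng(3)
truth = np.cumsum(rng.standard_normal(50))
obs = truth + rng.standard_normal(50)
est = bootstrap_filter(obs, 500, rng)
print(round(float(np.abs(est - truth).mean()), 2))
```

In high-dimensional or multimodal settings this plain scheme degenerates (weights collapse onto few particles), which is the limitation the parallel-marginalization and hybrid Monte Carlo steps are designed to overcome.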
Examination of weight control practices in a non-clinical sample of college women.
Hayes, S; Napolitano, M A
2012-09-01
The current study examined healthy weight control practices among a sample of college women enrolled at an urban university (N=715; age=19.87±1.16; 77.2% Caucasian; 13.4% African American, 7.2% Asian, 2.2% other races). Participants completed measures as part of an on-line study about health habits, behaviors, and attitudes. Items from the Three Factor Eating Questionnaire were selected and evaluated with exploratory factor analysis to create a healthy weight control practices scale. Results revealed that college women, regardless of weight status, used a comparable number (four of eight) of practices. Examination of racial differences between Caucasian and African American women revealed that normal weight African American women used significantly fewer strategies than Caucasian women. Of note, greater use of healthy weight control practices was associated with higher cognitive restraint, drive for thinness, minutes of physical activity, and more frequent use of compensatory strategies. Higher scores on measures of binge and disinhibited eating, body dissatisfaction, negative affect, and depressive symptoms were associated with greater use of healthy weight control practices by underweight/normal weight but not by overweight/obese college women. Results suggest that among a sample of college females, a combination of healthy and potentially unhealthy weight control practices occurs. Implications of the findings suggest the need for effective weight management and eating disorder prevention programs for this critical developmental life stage. Such programs should be designed to help students learn how to appropriately use healthy weight control practices, as motivations for use may vary by weight status.
Han, Mengzhi; Xu, Ji; Ren, Ying
2017-03-01
Intrinsically disordered proteins (IDPs) are a class of proteins that are expected to be largely unstructured under physiological conditions. Due to their heterogeneous nature, experimental characterization of IDPs is challenging. Temperature replica exchange molecular dynamics (T-REMD) is a widely used enhanced sampling method for probing the structural characteristics of these proteins. However, its application has been hindered by its tremendous computational cost, especially when simulating large systems in explicit solvent. Two methods, parallel tempering well-tempered ensemble (PT-WTE) and replica exchange with solute tempering (REST), have been proposed to alleviate the computational expense of T-REMD. In this work, we select three different IDP systems to compare the sampling characteristics and efficiencies of the two methods. Both methods could efficiently sample the conformational space of the IDPs and yielded highly consistent results for all three systems. Their efficiencies are comparable, about 5-6 times better than that of plain T-REMD. The advantages and disadvantages of each method are also discussed: the PT-WTE method provides temperature-dependent data of the system, which cannot be achieved by REST, while the REST method can readily be applied to a part of the system, which is quite efficient for simulating some biological processes. Copyright © 2016 Elsevier Inc. All rights reserved.
[Sampling plan, weighting process and design effects of the Brazilian Oral Health Survey].
Silva, Nilza Nunes da; Roncalli, Angelo Giuseppe
2013-12-01
To present aspects of the sampling plan of the Brazilian Oral Health Survey (SBBrasil Project), along with theoretical and operational issues that should be taken into account in the primary data analyses. The studied population was composed of five demographic groups from urban areas of Brazil in 2010. Two- and three-stage cluster sampling was used, adopting different primary units. Sample weighting and design effects (deff) were used to evaluate sample consistency. In total, 37,519 individuals were reached. Although the majority of deff estimates were acceptable, some domains showed distortions. The majority (90%) of the samples showed results in concordance with the precision proposed in the sampling plan. The measures to prevent losses and the effects of the cluster sampling process on the minimum sample sizes proved to be effective for the deff, which did not exceed 2, even for results derived from weighting. The samples achieved in the SBBrasil 2010 survey were close to the main proposals for accuracy of the design. Some selection probabilities proved to be unequal among the primary units of the same domain. Users of this database should bear this in mind, introducing sample weighting in calculations of point estimates, standard errors, confidence intervals and design effects.
2018-01-01
NIST-Traceable NMR Method to Determine Quantitative Weight Percentage Purity of Mustard (HD) Feedstock Samples (ECBC-TR-1506). McGarvey, David J. (Research and Technology Directorate); Creasy, William R. (Leidos, Inc., Abingdon, MD 21009-1261); Connell, Theresa R. (Excet, Inc.). Period covered: Jan 2012–May 2012.
Efendiev, Y.
2009-11-01
The Markov chain Monte Carlo (MCMC) is a rigorous sampling method to quantify uncertainty in subsurface characterization. However, the MCMC usually requires many flow and transport simulations in evaluating the posterior distribution and can be computationally expensive for fine-scale geological models. We propose a methodology that combines coarse- and fine-scale information to improve the efficiency of MCMC methods. The proposed method employs off-line computations for modeling the relation between coarse- and fine-scale error responses. This relation is modeled using nonlinear functions with prescribed error precisions which are used in efficient sampling within the MCMC framework. We propose a two-stage MCMC where inexpensive coarse-scale simulations are performed to determine whether or not to run the fine-scale (resolved) simulations. The latter is determined on the basis of a statistical model developed off line. The proposed method is an extension of the approaches considered earlier where linear relations are used for modeling the response between coarse-scale and fine-scale models. The approach considered here does not rely on the proximity of approximate and resolved models and can employ much coarser and more inexpensive models to guide the fine-scale simulations. Numerical results for three-phase flow and transport demonstrate the advantages, efficiency, and utility of the method for uncertainty assessment in the history matching. Copyright 2009 by the American Geophysical Union.
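The two-stage screening idea can be sketched in a few lines: a proposal must first pass a cheap coarse-scale Metropolis test before the expensive fine-scale posterior is evaluated at all. The sketch below is a hedged toy, not the authors' code; the function names are assumptions, and the off-line statistical model relating coarse- and fine-scale error responses is omitted for brevity.

```python
import math
import random

def two_stage_mcmc(coarse_logp, fine_logp, proposal, x0, n_steps, rng):
    """Two-stage Metropolis-Hastings: a cheap coarse-scale test screens
    proposals before the expensive fine-scale density is evaluated."""
    x = x0
    cx, fx = coarse_logp(x0), fine_logp(x0)
    chain, fine_evals = [x0], 1
    for _ in range(n_steps):
        y = proposal(x, rng)
        cy = coarse_logp(y)
        # Stage 1: screen with the coarse model only (no fine-scale run).
        if cy - cx < 0 and rng.random() >= math.exp(cy - cx):
            chain.append(x)
            continue
        # Stage 2: fine-scale correction, preserving the fine posterior.
        fy = fine_logp(y)
        fine_evals += 1
        d = (fy - fx) - (cy - cx)
        if d >= 0 or rng.random() < math.exp(d):
            x, cx, fx = y, cy, fy
        chain.append(x)
    return chain, fine_evals
```

With a symmetric random-walk proposal, the combined acceptance rule leaves the fine-scale posterior invariant, while rejected proposals cost only a coarse evaluation.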
Directory of Open Access Journals (Sweden)
D. Slobinsky
2015-03-01
Full Text Available We implement the Wang-Landau algorithm to sample with equal probabilities the static configurations of a model granular system. The "non-interacting rigid arch model" used is based on the description of static configurations by means of splitting the assembly of grains into sets of stable arches. This technique allows us to build the entropy as a function of the volume of the packing for large systems. We make a special note of the details that have to be considered when defining the microstates and proposing the moves for the correct sampling in these unusual models. We compare our results with previous exact calculations of the model made at moderate system sizes. The technique opens a new opportunity to calculate the entropy of more complex granular models. Received: 19 January 2015; Accepted: 25 February 2015; Reviewed by: M. Pica Ciamarra, Nanyang Technological University, Singapore; Edited by: C. S. O'Hern; DOI: http://dx.doi.org/10.4279/PIP.070001. Cite as: D. Slobinsky, L. A. Pugnaloni, Papers in Physics 7, 070001 (2015).
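The flat-histogram mechanics behind such equal-probability sampling can be illustrated on a much simpler system than the rigid arch model. The sketch below runs Wang-Landau on a toy system of independent bits (energy = number of set bits) and reduces the modification factor on a fixed schedule instead of checking histogram flatness; the function name, the toy system, and the schedule are all illustrative assumptions, not the paper's implementation.

```python
import math
import random

def wang_landau_log_g(n_bits, iters_per_stage=20000, n_stages=12, seed=0):
    """Wang-Landau sketch on independent bits (energy = number of set
    bits).  States are visited with probability ~ 1/g(E), so log_g(E)
    builds up to (a constant plus) the log density of states."""
    rng = random.Random(seed)
    state = [0] * n_bits
    e = 0
    log_g = [0.0] * (n_bits + 1)
    log_f = 1.0
    for _ in range(n_stages):
        for _ in range(iters_per_stage):
            i = rng.randrange(n_bits)
            e_new = e + (1 if state[i] == 0 else -1)
            # Accept a single-bit flip with min(1, g(E)/g(E_new)).
            d = log_g[e] - log_g[e_new]
            if d >= 0 or rng.random() < math.exp(d):
                state[i] ^= 1
                e = e_new
            log_g[e] += log_f
        log_f /= 2.0  # fixed schedule in place of a flatness check
    return [lg - log_g[0] for lg in log_g]  # normalize so g(0) = 1
```

For this toy system the exact density of states is the binomial coefficient C(n_bits, E), which gives a direct check of the estimate.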
On Ensemble Nonlinear Kalman Filtering with Symmetric Analysis Ensembles
Luo, Xiaodong
2010-09-19
The ensemble square root filter (EnSRF) [1, 2, 3, 4] is a popular method for data assimilation in high dimensional systems (e.g., geophysics models). Essentially, the EnSRF is a Monte Carlo implementation of the conventional Kalman filter (KF) [5, 6]. It differs from the KF mainly at the prediction step, where ensembles of the system state, rather than its mean and covariance matrix, are propagated forward. In doing so, the EnSRF is computationally more efficient than the KF, since propagating a covariance matrix forward in high dimensional systems is prohibitively expensive. In addition, the EnSRF is also very convenient to implement. By propagating the ensembles of the system state, the EnSRF can be applied directly to nonlinear systems without any change relative to the assimilation procedure for linear systems. However, by adopting the Monte Carlo method, the EnSRF also incurs certain sampling errors. One way to alleviate this problem is to introduce certain symmetry into the ensembles, which can reduce the sampling errors and spurious modes in evaluating the means and covariances of the ensembles [7]. In this contribution, we present two methods to produce symmetric ensembles. One is based on the unscented transform [8, 9], which leads to the unscented Kalman filter (UKF) [8, 9] and its variant, the ensemble unscented Kalman filter (EnUKF) [7]. The other is based on Stirling's interpolation formula (SIF), which results in the divided difference filter (DDF) [10]. Here we propose a simplified divided difference filter (sDDF) in the context of ensemble filtering. The similarities and differences between the sDDF and the EnUKF will be discussed. Numerical experiments will also be conducted to investigate the performance of the sDDF and the EnUKF, and to compare them with a well-established EnSRF, the ensemble transform Kalman filter (ETKF) [2].
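The symmetric-ensemble idea behind the EnUKF can be made concrete with the unscented transform's sigma points: 2n+1 weighted states placed symmetrically about the mean so that the weighted ensemble reproduces the mean and covariance exactly, eliminating the random sampling error of a plain Monte Carlo ensemble of the same size. This is the generic textbook construction, not the paper's implementation; the `kappa` scaling parameter and function name are assumptions.

```python
import numpy as np

def unscented_ensemble(mean, cov, kappa=1.0):
    """Symmetric sigma-point ensemble: 2n+1 states whose weighted sample
    mean and covariance match `mean` and `cov` exactly."""
    n = len(mean)
    L = np.linalg.cholesky((n + kappa) * cov)  # columns: scaled "square roots"
    pts = [np.array(mean, dtype=float)]
    wts = [kappa / (n + kappa)]
    for i in range(n):
        pts.append(mean + L[:, i])  # symmetric pair about the mean
        pts.append(mean - L[:, i])
        wts += [0.5 / (n + kappa)] * 2
    return np.array(pts), np.array(wts)
```

Because each perturbation appears with both signs, odd sample moments vanish identically, which is exactly the kind of spurious-mode suppression the symmetric ensembles are meant to provide.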
Incorporating Complex Sample Design Effects When Only Final Survey Weights are Available.
West, Brady T; McCabe, Sean Esteban
2012-10-01
This article considers the situation that arises when a survey data producer has collected data from a sample with a complex design (possibly featuring stratification of the population, cluster sampling, and/or unequal probabilities of selection), and for various reasons only provides secondary analysts of those survey data with a final survey weight for each respondent and "average" design effects for survey estimates computed from the data. In general, these "average" design effects, presumably computed by the data producer in a way that fully accounts for all of the complex sampling features, already incorporate possible increases in sampling variance due to the use of the survey weights in estimation. The secondary analyst who then (1) uses the provided information to compute weighted estimates, (2) computes design-based standard errors reflecting variance in the weights (using Taylor series linearization, for example), and (3) inflates the estimated variances using the provided "average" design effects is applying a "double" adjustment to the standard errors for the effect of weighting on the variance estimates, leading to overly conservative inferences. We propose a simple method for preventing this problem, and provide a Stata program for applying appropriate adjustments to variance estimates in this situation. We illustrate two applications of the method to survey data from the Monitoring the Future (MTF) study, and conclude with suggested directions for future research in this area.
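A minimal numerical sketch of the issue: if the producer's "average" design effect already contains the weighting component, one rough remedy is to divide it out using Kish's approximation for the unequal-weighting design effect, 1 + CV² of the weights. This illustrates the arithmetic only, under the assumption that Kish's formula is an adequate stand-in for the weighting component; the paper's actual adjustment (and its Stata program) may differ.

```python
def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weighting:
    deff_w = 1 + CV^2 of the weights."""
    n = len(weights)
    mean_w = sum(weights) / n
    var_w = sum((w - mean_w) ** 2 for w in weights) / n
    return 1.0 + var_w / (mean_w * mean_w)

def deff_without_weighting(average_deff, weights):
    """Strip the weighting component from a producer-supplied 'average'
    deff, so that inflating weighted, linearization-based variance
    estimates by the result does not count the weights twice."""
    return average_deff / kish_weighting_deff(weights)
```

For equal weights the correction is a no-op (deff_w = 1), so only genuinely unequal weights trigger an adjustment.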
Directory of Open Access Journals (Sweden)
Khader Samer
2010-01-01
Full Text Available Background: Liquid-based cytology (LBC) cervical samples are increasingly being used to test for pathogens, including HPV, Chlamydia trachomatis (CT) and Neisseria gonorrhoeae (GC), using nucleic acid amplification tests. Several reports have shown the accuracy of such testing on ThinPrep (TP) LBC samples. Fewer studies have evaluated SurePath (SP) LBC samples, which utilize a different specimen preservative. This study was undertaken to assess the performance of the Aptima Combo 2 Assay (AC2) for CT and GC on SP versus endocervical swab samples in our laboratory. Materials and Methods: The live pathology database of Montefiore Medical Center was searched for patients with AC2 endocervical swab specimens and SP Paps taken the same day. SP samples from CT- and/or GC-positive endocervical swab patients and randomly selected negative patients were studied. In each case, 1.5 ml of the residual SP vial sample, which was in SP preservative and stored at room temperature, was transferred within seven days of collection to APTIMA specimen transfer tubes without any sample or patient identifiers. Blind testing with the AC2 assay was performed on the Tigris DTS System (Gen-Probe, San Diego, CA). Finalized SP results were compared with the previously reported endocervical swab results for the entire group and separately for patients 25 years and younger and patients over 25 years. Results: SP specimens from 300 patients were tested. This included 181 swab CT-positive, 12 swab GC-positive, 7 CT- and GC-positive and 100 randomly selected swab CT- and GC-negative patients. Using the endocervical swab results as the patient's infection status, AC2 assay of the SP samples showed: CT sensitivity 89.3%, CT specificity 100.0%; GC sensitivity and specificity 100.0%. CT sensitivity for patients 25 years or younger was 93.1%, versus 80.7% for patients over 25 years, a statistically significant difference (P = 0.02). Conclusions: Our results show that AC2 assay of 1.5 ml SP
Shaul, Katherine R S; Schultz, Andrew J; Kofke, David A
2012-11-14
We present Mayer-sampling Monte Carlo calculations of the quantum Boltzmann contribution to the virial coefficients B(n), as defined by path integrals, for n = 2 to 4 and for temperatures from 2.6 K to 1000 K, using state-of-the-art ab initio potentials for interactions within pairs and triplets of helium-4 atoms. Effects of exchange are not included. The vapor-liquid critical temperature of the resulting fourth-order virial equation of state is 5.033(16) K, a value only 3% less than the critical temperature of helium-4: 5.19 K. We describe an approach for parsing the Boltzmann contribution into components that reduce the number of Mayer-sampling Monte Carlo steps required for components with large per-step time requirements. We estimate that in this manner the calculation of the Boltzmann contribution to B(3) at 2.6 K is completed at least 100 times faster than the previously reported approach.
Yılmaz Isıkhan, Selen; Karabulut, Erdem; Alpar, Celal Reha
2016-01-01
Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided using a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified common genes found in our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was to identify the sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.
Perception of weight and psychological variables in a sample of Spanish adolescents.
Jáuregui-Lobera, Ignacio; Bolaños-Ríos, Patricia; Santiago-Fernández, María José; Garrido-Casals, Olivia; Sánchez, Elsa
2011-01-01
This study explored the relationship between body mass index (BMI) and weight perception, self-esteem, positive body image, food beliefs, and mental health status, along with any gender differences in weight perception, in a sample of adolescents in Spain. The sample comprised 85 students (53 females and 32 males, mean age 17.4 ± 5.5 years) with no psychiatric history who were recruited from a high school in Écija, Seville. Weight and height were recorded for all participants, who were then classified according to whether they perceived themselves as slightly overweight, very overweight, very underweight, slightly underweight, or about the right weight, using the question "How do you think of yourself in terms of weight?". Finally, a series of questionnaires were administered, including the Irrational Food Beliefs Scale, Body Appreciation Scale, Self Esteem Scale, and General Health Questionnaire. Overall, 23.5% of participants misperceived their weight. Taking into account only those with a normal BMI (percentile 5-85), there was a significant gender difference with respect to those who perceived themselves as overweight (slightly overweight and very overweight); 13.9% of females and 7.9% of males perceived themselves as overweight (χ² = 3.957, P < 0.05). There was a significant difference for age, with participants who perceived their weight adequately being of mean age 16.34 ± 3.17 years and those who misperceived their weight being of mean age 18.50 ± 4.02 years (F = 3.112, P < 0.05). Misperception of overweight seems to be more frequent in female adolescents, and mainly among older ones. Misperception of being overweight is associated with a less positive body image, and the perception of being very underweight is associated with higher scores for general psychopathology.
word2vec Skip-Gram with Negative Sampling is a Weighted Logistic PCA
Landgraf, Andrew J.; Bellay, Jeremy
2017-01-01
We show that the skip-gram formulation of word2vec trained with negative sampling is equivalent to a weighted logistic PCA. This connection allows us to better understand the objective, compare it to other word embedding methods, and extend it to higher dimensional models.
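The claimed equivalence can be illustrated with a tiny weighted logistic PCA fitted by gradient descent: a low-rank matrix Θ = UVᵀ minimizes a weighted logistic loss over a 0/1 co-occurrence indicator, with the weights playing the role SGNS's positive and negative sample counts play. This is a schematic sketch under those assumptions, not the authors' derivation or code.

```python
import numpy as np

def weighted_logistic_pca(X, W, rank, steps=300, lr=0.1, seed=0):
    """Fit Theta = U V^T (rank k) to 0/1 data X by gradient descent on the
    weighted logistic loss  sum_ij W_ij * log(1 + exp(-S_ij * Theta_ij)),
    where S = 2X - 1 maps {0,1} to {-1,+1}."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    S = 2.0 * X - 1.0
    for _ in range(steps):
        G = -W * S / (1.0 + np.exp(S * (U @ V.T)))  # dLoss/dTheta
        U, V = U - lr * (G @ V), V - lr * (G.T @ U)
    return U, V
```

In the word2vec analogy, the rows of U and V would correspond to word and context embeddings, and the sign of Θ to whether a pair co-occurs more than chance.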
Fornari, D; Rebolj, M; Bjerregård, B; Lidang, M; Christensen, I; Høgdall, E; Bonde, J
2016-08-01
In two laboratories (Departments of Pathology, Copenhagen University Hospitals of Herlev and Hvidovre), we compared cobas and Hybrid Capture 2 (HC2) human papillomavirus (HPV) assays using SurePath® samples from women with atypical squamous cells of undetermined significance (ASCUS) at ≥30 years and women after treatment of cervical intraepithelial neoplasia (CIN). Samples from 566 women with ASCUS and 411 women after treatment were routinely tested with HC2 and, thereafter, with cobas. Histological outcomes were retrieved from the Danish Pathology Data Base. We calculated the overall agreement between the assays, and compared their sensitivity and specificity for ≥CIN2. In women with ASCUS, HC2 and cobas testing results were similar in the two laboratories. The overall agreement was 91% (95% CI, 88-93). After CIN treatment, the overall agreement was 87% (95% CI, 82-91) at Herlev and 88% (95% CI, 82-92) at Hvidovre. There were no significant differences in the sensitivity for ≥CIN2 between the two tests [Herlev, 98% (95% CI, 89-100) for HC2 versus 94% (95% CI, 82-99) for cobas; Hvidovre, 97% (95% CI, 83-100) for HC2 versus 100% (95% CI, 88-100) for cobas]. The differences were also not significant for specificity. In women with the studied well-defined clinical indications for HPV testing, cobas and HC2 performed similarly in terms of the detection of HPV and ≥CIN2. © 2016 The Authors. Cytopathology Published by John Wiley & Sons Ltd.
DEFF Research Database (Denmark)
Ejegod, Ditte; Bottari, Fabio; Pedersen, Helle
2016-01-01
This study describes a validation of the BD Onclarity HPV (Onclarity) assay using the international guidelines for HPV test requirements for cervical cancer screening of women 30 years and above, using Danish SurePath screening samples. The clinical specificity (0.90, 95% CI: 0.......93). The inter-laboratory agreement was 97% with a lower confidence bound of 95% (kappa value: 0.92). The BD Onclarity HPV assay fulfills all the international guidelines for a new HPV test to be used in primary screening. This is the first clinical validation of a new HPV assay using SurePath screening samples...... and thus the Onclarity HPV assay is the first HPV assay to hold an international validation for both SurePath and ThinPrep....
Ensemble estimators for multivariate entropy estimation.
Sricharan, Kumar; Wei, Dennis; Hero, Alfred O
2013-07-01
The problem of estimation of density functionals like entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffer from the curse of dimensionality, wherein the mean squared error (MSE) decays increasingly slowly as a function of the sample size T as the dimension d of the samples increases. In particular, the rate is often glacially slow, of order O(T^(-γ/d)), where γ > 0 is a rate parameter. Examples of such estimators include kernel density estimators, k-nearest neighbor (k-NN) density estimators, k-NN entropy estimators, and intrinsic dimension estimators, among others. In this paper, we propose a weighted affine combination of an ensemble of such estimators, where optimal weights can be chosen such that the weighted estimator converges at the much faster dimension-invariant rate of O(T^(-1)). Furthermore, we show that these optimal weights can be determined by solving a convex optimization problem which can be performed offline and does not require training data. We illustrate the superior performance of our weighted estimator for two important applications: (i) estimating the Panter-Dite distortion-rate factor and (ii) estimating the Shannon entropy for testing the probability distribution of a random sample.
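The weighted-affine-combination idea is easy to demonstrate in one dimension with Vasicek-style m-spacing entropy estimators as the base ensemble. The sketch below fixes the weights by hand (summing to one); the paper instead obtains them from an offline convex program chosen to cancel the leading bias terms, which is the part omitted here, so this is only an illustration of the combination step.

```python
import math
import random

def spacing_entropy(sample, m):
    """Vasicek m-spacing estimator of differential entropy (1-D sample):
    average of log((n/2m) * (x_(i+m) - x_(i-m))), with boundary clipping."""
    x = sorted(sample)
    n = len(x)
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]
        hi = x[min(i + m, n - 1)]
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n

def ensemble_entropy(sample, ms, weights):
    """Weighted affine combination of base estimators; weights sum to 1.
    (Hand-picked weights here; the paper chooses them by convex
    optimization to accelerate the MSE convergence rate.)"""
    assert abs(sum(weights) - 1.0) < 1e-12
    return sum(w * spacing_entropy(sample, m) for w, m in zip(weights, ms))
```

Each choice of m gives a differently biased base estimator, so the combination has a richer bias structure for the weights to cancel.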
Efficient Kernel-Based Ensemble Gaussian Mixture Filtering
Liu, Bo
2015-11-11
We consider the Bayesian filtering problem for data assimilation following the kernel-based ensemble Gaussian-mixture filtering (EnGMF) approach introduced by Anderson and Anderson (1999). In this approach, the posterior distribution of the system state is propagated with the model using the ensemble Monte Carlo method, providing a forecast ensemble that is then used to construct a prior Gaussian-mixture (GM) based on the kernel density estimator. This results in two update steps: a Kalman filter (KF)-like update of the ensemble members and a particle filter (PF)-like update of the weights, followed by a resampling step to start a new forecast cycle. After formulating EnGMF for any observational operator, we analyze the influence of the bandwidth parameter of the kernel function on the covariance of the posterior distribution. We then focus on two aspects: i) the efficient implementation of EnGMF with (relatively) small ensembles, where we propose a new deterministic resampling strategy preserving the first two moments of the posterior GM to limit the sampling error; and ii) the analysis of the effect of the bandwidth parameter on contributions of KF and PF updates and on the weights variance. Numerical results using the Lorenz-96 model are presented to assess the behavior of EnGMF with deterministic resampling, study its sensitivity to different parameters and settings, and evaluate its performance against ensemble KFs. The proposed EnGMF approach with deterministic resampling suggests improved estimates in all tested scenarios, and is shown to require less localization and to be less sensitive to the choice of filtering parameters.
Perception of weight and psychological variables in a sample of Spanish adolescents
Directory of Open Access Journals (Sweden)
Jáuregui-Lobera I
2011-06-01
Full Text Available Ignacio Jáuregui-Lobera¹,², Patricia Bolaños-Ríos², María José Santiago-Fernández², Olivia Garrido-Casals², Elsa Sánchez³. ¹Department of Nutrition and Bromatology, Pablo de Olavide University, Seville, Spain; ²Behavioral Sciences Institute, Seville, Spain; ³Professional Schools Sagrada Familia, Écija, Seville, Spain. Background: This study explored the relationship between body mass index (BMI) and weight perception, self-esteem, positive body image, food beliefs, and mental health status, along with any gender differences in weight perception, in a sample of adolescents in Spain. Methods: The sample comprised 85 students (53 females and 32 males, mean age 17.4 ± 5.5 years) with no psychiatric history who were recruited from a high school in Écija, Seville. Weight and height were recorded for all participants, who were then classified according to whether they perceived themselves as slightly overweight, very overweight, very underweight, slightly underweight, or about the right weight, using the question "How do you think of yourself in terms of weight?". Finally, a series of questionnaires were administered, including the Irrational Food Beliefs Scale, Body Appreciation Scale, Self Esteem Scale, and General Health Questionnaire. Results: Overall, 23.5% of participants misperceived their weight. Taking into account only those with a normal BMI (percentile 5–85), there was a significant gender difference with respect to those who perceived themselves as overweight (slightly overweight and very overweight); 13.9% of females and 7.9% of males perceived themselves as overweight (χ² = 3.957, P < 0.05). There was a significant difference for age, with participants who perceived their weight adequately being of mean age 16.34 ± 3.17 years and those who misperceived their weight being of mean age 18.50 ± 4.02 years (F = 3.112, P < 0.05). Conclusion: Misperception of overweight seems to be more frequent in female adolescents, and mainly among
Rozemeijer, K.; Naber, S.K.; Penning, C.; Overbeek, L.I.H.; Looman, C.W.; Kok, I.M. de; Matthijsse, S.M.; Rebolj, M.; Kemenade, F.J. van; Ballegooijen, M. van
2017-01-01
Objective: To compare the cumulative incidence of cervical cancer diagnosed within 72 months after a normal screening sample between conventional cytology and the liquid-based cytology tests SurePath and ThinPrep. Design: Retrospective population-based cohort study. Setting: Nationwide network and registry
Experimental real-time multi-model ensemble (MME) prediction of ...
Indian Academy of Sciences (India)
NCMRWF), India. Simple ensemble mean (EMN), giving equal weight to member models, bias-corrected ensemble mean (BCEMn), and the MME forecast, in which different weights are given to member models, are the products of the algorithm tested here.
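The products described above reduce to weighted averages of member-model forecasts. A generic sketch (the function name is an assumption, and the actual NCMRWF weighting and bias-correction schemes are not reproduced here):

```python
def mme_forecast(member_forecasts, weights=None):
    """Multi-model ensemble product: with no weights this is the simple
    ensemble mean (EMN, equal weight per member model); with skill-based
    weights it is the weighted MME forecast.  A bias-corrected mean
    (BCEMn) would subtract each member's estimated bias before averaging."""
    if weights is None:
        weights = [1.0] * len(member_forecasts)
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, member_forecasts)) / total
```

Normalizing by the weight total means the weights only need to reflect relative member skill, not sum to one.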
Sampling of systematic errors to estimate likelihood weights in nuclear data uncertainty propagation
Helgesson, P.; Sjöstrand, H.; Koning, A. J.; Rydén, J.; Rochman, D.; Alhassan, E.; Pomp, S.
2016-01-01
In methodologies for nuclear data (ND) uncertainty assessment and propagation based on random sampling, likelihood weights can be used to infer experimental information into the distributions for the ND. As the included number of correlated experimental points grows large, the computational time for the matrix inversion involved in obtaining the likelihood can become a practical problem. There are also other problems related to the conventional computation of the likelihood, e.g., the assumption that all experimental uncertainties are Gaussian. In this study, a way to estimate the likelihood which avoids matrix inversion is investigated; instead, the experimental correlations are included by sampling of systematic errors. It is shown that the model underlying the sampling methodology (using univariate normal distributions for random and systematic errors) implies a multivariate Gaussian for the experimental points (i.e., the conventional model). It is also shown that the likelihood estimates obtained through sampling of systematic errors approach the likelihood obtained with matrix inversion as the sample size for the systematic errors grows large. In studied practical cases, it is seen that the estimates for the likelihood weights converge impractically slowly with the sample size, compared to matrix inversion. The computational time is estimated to be greater than for matrix inversion in cases with more experimental points, too. Hence, the sampling of systematic errors has little potential to compete with matrix inversion in cases where the latter is applicable. Nevertheless, the underlying model and the likelihood estimates can be easier to intuitively interpret than the conventional model and the likelihood function involving the inverted covariance matrix. Therefore, this work can both have pedagogical value and be used to help motivating the conventional assumption of a multivariate Gaussian for experimental data. The sampling of systematic errors could also
Sampling of systematic errors to estimate likelihood weights in nuclear data uncertainty propagation
Energy Technology Data Exchange (ETDEWEB)
Helgesson, P., E-mail: petter.helgesson@physics.uu.se [Department of Physics and Astronomy, Uppsala University, Box 516, 751 20 Uppsala (Sweden); Nuclear Research and Consultancy Group NRG, Petten (Netherlands); Sjöstrand, H. [Department of Physics and Astronomy, Uppsala University, Box 516, 751 20 Uppsala (Sweden); Koning, A.J. [Nuclear Research and Consultancy Group NRG, Petten (Netherlands); Department of Physics and Astronomy, Uppsala University, Box 516, 751 20 Uppsala (Sweden); Rydén, J. [Department of Mathematics, Uppsala University, Uppsala (Sweden); Rochman, D. [Paul Scherrer Institute PSI, Villigen (Switzerland); Alhassan, E.; Pomp, S. [Department of Physics and Astronomy, Uppsala University, Box 516, 751 20 Uppsala (Sweden)
2016-01-21
In methodologies for nuclear data (ND) uncertainty assessment and propagation based on random sampling, likelihood weights can be used to infer experimental information into the distributions for the ND. As the included number of correlated experimental points grows large, the computational time for the matrix inversion involved in obtaining the likelihood can become a practical problem. There are also other problems related to the conventional computation of the likelihood, e.g., the assumption that all experimental uncertainties are Gaussian. In this study, a way to estimate the likelihood which avoids matrix inversion is investigated; instead, the experimental correlations are included by sampling of systematic errors. It is shown that the model underlying the sampling methodology (using univariate normal distributions for random and systematic errors) implies a multivariate Gaussian for the experimental points (i.e., the conventional model). It is also shown that the likelihood estimates obtained through sampling of systematic errors approach the likelihood obtained with matrix inversion as the sample size for the systematic errors grows large. In studied practical cases, it is seen that the estimates for the likelihood weights converge impractically slowly with the sample size, compared to matrix inversion. The computational time is estimated to be greater than for matrix inversion in cases with more experimental points, too. Hence, the sampling of systematic errors has little potential to compete with matrix inversion in cases where the latter is applicable. Nevertheless, the underlying model and the likelihood estimates can be easier to intuitively interpret than the conventional model and the likelihood function involving the inverted covariance matrix. Therefore, this work can both have pedagogical value and be used to help motivating the conventional assumption of a multivariate Gaussian for experimental data. The sampling of systematic errors could also
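For a single fully correlated systematic error the scheme can be sketched directly: instead of inverting the experimental covariance matrix σ_r²I + σ_s²11ᵀ, draw the shared systematic shift and average the resulting diagonal-covariance likelihoods. This is a hedged toy version; the one-systematic-error simplification and the names are assumptions, and the paper's setting involves general correlated experimental points.

```python
import math
import random

def likelihood_by_sampling(y, t, sig_rand, sig_sys, n_samples, rng):
    """Monte Carlo estimate of the likelihood of experimental points y
    given model values t, with independent N(0, sig_rand^2) random errors
    and one shared N(0, sig_sys^2) systematic error.  Averaging over
    sampled systematic shifts avoids inverting the full experimental
    covariance matrix."""
    norm = math.log(sig_rand * math.sqrt(2.0 * math.pi))
    total = 0.0
    for _ in range(n_samples):
        s = rng.gauss(0.0, sig_sys)
        logp = sum(-0.5 * ((yi - ti - s) / sig_rand) ** 2 - norm
                   for yi, ti in zip(y, t))
        total += math.exp(logp)
    return total / n_samples
```

As n_samples grows, this converges to the multivariate Gaussian likelihood with covariance sig_rand²·I + sig_sys²·11ᵀ, consistent with the abstract's point that the sampling model implies the conventional multivariate Gaussian.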
Reeuwijk, Noortje M; Venhuis, Bastiaan J; de Kaste, Dries; Hoogenboom, Ron L A P; Rietjens, Ivonne M C M; Martena, Martijn J
2014-01-01
Herbal food supplements claiming to reduce weight may contain active pharmacological ingredients (APIs) that can be used for the treatment of overweight and obesity. The aim of this study was to determine whether herbal food supplements for weight loss on the Dutch market contain APIs with weight loss properties. Herbal food supplements intended for weight loss (n = 50) were sampled from August 2004 to May 2013. An HPLC-DAD-MS/MS method was used to screen for the presence of the APIs in herbal supplements. In 24 samples the APIs sibutramine, desmethylsibutramine (DMS), didesmethylsibutramine (DDMS), rimonabant, sildenafil and/or the laxative phenolphthalein were identified 41 times. The presence of these APIs was, however, not stated on the label. The potential pharmacological effects of the detected APIs were estimated using data from reported effective doses of approved drugs. Use of 20 of the 24 herbal food supplements may result in potential pharmacological effects. Furthermore, risk assessment of phenolphthalein, a suspected carcinogen and found to be present in 10 supplements, based on the margin of exposure (MOE) approach, resulted in MOE values of 96-30,000. MOE values lower than 10,000 (96-220) were calculated for the daily intake levels of four out of these 10 supplements in which phenolphthalein was found. However, taking into account that weight loss preparations may be used for only a few weeks or months rather than during a lifetime, MOE values may be two to three orders of magnitude higher. The current study shows that the use of food supplements with sibutramine, DMS, DDMS and/or phenolphthalein could result in pharmacological effects.
Province, M A; Rao, D C
1985-01-01
A multifactorial model incorporating temporal trends in its parameters is discussed. The model is a generalization of the tau model of Rice et al. in which the parameters are assumed to be specific functions of time. A special case of this model is fit to data on height, weight, and Quetelet index in 1,067 nuclear families to demonstrate the utility of the approach. The results indicate that there is considerable temporal variation in family resemblance over time for all three traits. For height and Quetelet index, both the transmissibility, comparable to heritability, and residual sibling environmental correlation show temporal changes, while for weight, only the latter exhibits significant trends. Trends were not found in the marital correlation for any of the traits, and only limited evidence was found for trends in the maternal transmission parameter for height. This provides an objective method for evaluating the nature and sources of temporal trends in family resemblance, which can easily be incorporated into the framework of any model-based approach.
Simone, Gabriele; Cordone, Roberto; Serapioni, Raul Paolo; Lecca, Michela
2017-05-01
Retinex theory estimates the human color sensation at any observed point by correcting its color based on the spatial arrangement of the colors in proximate regions. We revisit two recent path-based, edge-aware Retinex implementations: Termite Retinex (TR) and Energy-driven Termite Retinex (ETR). Like the original Retinex implementation, TR and ETR scan the neighborhood of any image pixel by paths and rescale their chromatic intensities by intensity levels computed by reworking the colors of the pixels on the paths. Our interest in TR and ETR is due to their unique, content-based scanning scheme, which uses the image edges to define the paths and exploits a swarm intelligence model for guiding the spatial exploration of the image. The exploration scheme of ETR has been shown to be particularly effective: its paths are local minima of an energy functional designed to favor the sampling of image pixels highly relevant to color sensation. Nevertheless, since its computational complexity makes ETR impractical, here we present a light version of it, named Light Energy-driven TR, obtained from ETR by implementing a modified, optimized minimization procedure and by exploiting parallel computing.
Multilevel ensemble Kalman filtering
Hoel, Hakon
2016-06-14
This work embeds a multilevel Monte Carlo sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF) in the setting of finite dimensional signal evolution and noisy discrete-time observations. The signal dynamics is assumed to be governed by a stochastic differential equation (SDE), and a hierarchy of time grids is introduced for multilevel numerical integration of that SDE. The resulting multilevel EnKF is proved to asymptotically outperform EnKF in terms of computational cost versus approximation accuracy. The theoretical results are illustrated numerically.
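The multilevel idea can be sketched outside the filtering context: estimate E[X_T] as a telescoping sum of coupled coarse/fine corrections, spending most samples on the cheap coarse levels. The sketch below is illustrative only; it uses an Euler-Maruyama discretization of geometric Brownian motion, and the parameters and sample counts are arbitrary choices, not taken from the paper:

```python
import numpy as np

def euler_gbm(x0, a, b, T, n_steps, dW):
    """Euler-Maruyama endpoint of dX = a*X dt + b*X dW for given increments."""
    x, dt = x0, T / n_steps
    for k in range(n_steps):
        x += a * x * dt + b * x * dW[k]
    return x

def mlmc_mean(x0=1.0, a=0.05, b=0.2, T=1.0, levels=4, n0=20000, rng=None):
    """Telescoping multilevel estimate of E[X_T]; coupled paths share increments."""
    rng = rng or np.random.default_rng(0)
    est = 0.0
    for lev in range(levels + 1):
        nf = 2 ** lev                          # fine-grid steps at this level
        n = max(n0 // 4 ** lev, 100)           # fewer samples on costlier levels
        dW = rng.normal(0.0, np.sqrt(T / nf), size=(n, nf))
        fine = np.array([euler_gbm(x0, a, b, T, nf, w) for w in dW])
        if lev == 0:
            est += fine.mean()
        else:
            # coarse path reuses the same Brownian increments, summed in pairs
            dWc = dW[:, 0::2] + dW[:, 1::2]
            coarse = np.array([euler_gbm(x0, a, b, T, nf // 2, w) for w in dWc])
            est += (fine - coarse).mean()
    return est

# For geometric Brownian motion the true value is E[X_T] = x0 * exp(a*T) ~ 1.0513
```

The key design point is that each correction term has small variance, so the fine levels need far fewer samples than a single-level estimator at the finest grid.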
Directory of Open Access Journals (Sweden)
X. Yang
2009-07-01
Full Text Available A new class of ensemble filters, called the Diffuse Ensemble Filter (DEnF), is proposed in this paper. The DEnF assumes that the forecast errors orthogonal to the first guess ensemble are uncorrelated with the latter ensemble and have infinite variance. The assumption of infinite variance corresponds to the limit of "complete lack of knowledge" and differs dramatically from the implicit assumption made in most other ensemble filters, which is that the forecast errors orthogonal to the first guess ensemble have vanishing variance. The DEnF is independent of the detailed covariances assumed in the space orthogonal to the ensemble space, and reduces to conventional ensemble square root filters when the ensemble size exceeds the model dimension. The DEnF is well defined only in data-rich regimes and involves the inversion of relatively large matrices, although this barrier might be circumvented by variational methods. Two algorithms for solving the DEnF, namely the Diffuse Ensemble Kalman Filter (DEnKF) and the Diffuse Ensemble Transform Kalman Filter (DETKF), are proposed and found to give comparable results. These filters generally converge to the traditional EnKF and ETKF, respectively, when the ensemble size exceeds the model dimension. Numerical experiments demonstrate that the DEnF eliminates filter collapse, which occurs in ensemble Kalman filters for small ensemble sizes. Also, the use of the DEnF to initialize a conventional square root filter dramatically accelerates the spin-up time for convergence. However, in a perfect model scenario, the DEnF produces larger errors than ensemble square root filters that have covariance localization and inflation. For imperfect forecast models, the DEnF produces smaller errors than the ensemble square root filter with inflation. These experiments suggest that the DEnF has some advantages relative to the ensemble square root filters in the regime of small ensemble size, imperfect model, and copious
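For reference, the conventional stochastic (perturbed-observation) EnKF analysis step that the DEnF generalizes can be sketched as follows; the state dimension, observation operator H, and noise levels are illustrative assumptions, not values from the paper:

```python
import numpy as np

def enkf_analysis(X, y, H, R, rng):
    """Stochastic (perturbed-observation) EnKF analysis step.
    X: (n, N) forecast ensemble, y: (m,) observation, H: (m, n), R: (m, m)."""
    N = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
    Pf = A @ A.T / (N - 1)                           # sample forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, N).T
    return X + K @ (Y - H @ X)                       # analysis ensemble

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(2, 50))   # prior ensemble: mean ~0, spread ~1
H = np.array([[1.0, 0.0]])               # observe the first state component only
R = np.array([[0.01]])                   # small observation-error variance
Xa = enkf_analysis(X, np.array([1.0]), H, R, rng)
```

With an accurate observation, the analysis ensemble mean of the observed component is pulled close to the observed value.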
Directory of Open Access Journals (Sweden)
Hyung-Il Eum
2016-01-01
Full Text Available This study examined the impact of model biases on climate change signals for daily precipitation and for minimum and maximum temperatures. Through the use of multiple climate scenarios from 12 regional climate model simulations, the ensemble mean, and three synthetic simulations generated by a weighting procedure, we investigated intermodel seasonal climate change signals between current and future periods, for both median and extreme precipitation/temperature values. A significant dependence of seasonal climate change signals on the model biases over southern Québec in Canada was detected for temperatures, but not for precipitation. This suggests that the regional temperature change signal is affected by local processes. Seasonally, model bias affects future mean and extreme values in winter and summer. In addition, potentially large increases in future extremes of temperature and precipitation values were projected. For three synthetic scenarios, systematically less bias and a narrow range of mean change for all variables were projected compared to those of climate model simulations. In addition, synthetic scenarios were found to better capture the spatial variability of extreme cold temperatures than the ensemble mean scenario. These results indicate that the synthetic scenarios have greater potential to reduce the uncertainty of future climate projections and capture the spatial variability of extreme climate events.
A Combined Weighting Method Based on Hybrid of Interval Evidence Fusion and Random Sampling
Directory of Open Access Journals (Sweden)
Ying Yan
2017-01-01
Full Text Available Due to the complexity of the system and lack of expertise, epistemic uncertainties may be present in the experts' judgment on the importance of certain indices during group decision-making. A novel combination weighting method is proposed to solve the index weighting problem when various uncertainties are present in expert comments. Based on the idea of evidence theory, various types of uncertain evaluation information are uniformly expressed through interval evidence structures. A similarity matrix between interval evidences is constructed, and the experts' information is fused. Comment grades are quantified using interval numbers, and a cumulative probability function for evaluating the importance of indices is constructed based on the fused information. Finally, index weights are obtained by Monte Carlo random sampling. The method can process expert information with varying degrees of uncertainty and offers good compatibility. It avoids both the difficulty of effectively fusing highly conflicting group decision-making information and the large information loss that can follow fusion. Original expert judgments are retained objectively throughout the processing procedure. The construction of the cumulative probability function and the random sampling process do not require any human intervention or judgment. The method can be implemented easily by computer programs and thus has an apparent advantage in evaluation practice for large index systems.
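A minimal sketch of the final random-sampling step, assuming the interval evidences for each index have already been fused into a single importance interval (the intervals below are hypothetical, and uniform sampling is a simplifying assumption):

```python
import numpy as np

def monte_carlo_weights(intervals, n_draws=50000, rng=None):
    """Index weights from fused importance intervals: draw each index's
    importance uniformly from its interval, normalize per draw, average."""
    rng = rng or np.random.default_rng(0)
    lo, hi = np.array(intervals, dtype=float).T
    draws = rng.uniform(lo, hi, size=(n_draws, len(lo)))
    w = draws / draws.sum(axis=1, keepdims=True)  # each draw is a valid weight vector
    return w.mean(axis=0)

# Hypothetical fused judgments on three indices, as importance intervals
weights = monte_carlo_weights([(0.6, 0.9), (0.3, 0.5), (0.1, 0.2)])
```

Because each draw is normalized before averaging, the resulting weights sum to one and respect the ordering of the interval midpoints.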
Shih, H C; Tsai, S W; Kuo, C H
2012-01-01
A solid-phase microextraction (SPME) device was used as a diffusive sampler for airborne propylene glycol ethers (PGEs), including propylene glycol monomethyl ether (PGME), propylene glycol monomethyl ether acetate (PGMEA), and dipropylene glycol monomethyl ether (DPGME). Carboxen-polydimethylsiloxane (CAR/PDMS) SPME fiber was selected for this study. A polytetrafluoroethylene (PTFE) tubing was used as the holder, and the SPME fiber assembly was inserted into the tubing as a diffusive sampler. The diffusion path length and area of the sampler were 0.3 cm and 0.00086 cm², respectively. The theoretical sampling constants at 30°C and 1 atm for PGME, PGMEA, and DPGME were 1.50 × 10⁻², 1.23 × 10⁻², and 1.14 × 10⁻² cm³ min⁻¹, respectively. For evaluations, known concentrations of PGEs around the threshold limit values/time-weighted average with specific relative humidities (10% and 80%) were generated both by the air bag method and the dynamic generation system, while 15, 30, 60, 120, and 240 min were selected as the time periods for vapor exposures. Comparisons of the SPME diffusive sampling method to Occupational Safety and Health Administration (OSHA) organic Method 99 were performed side-by-side in an exposure chamber at 30°C for PGME. A gas chromatography/flame ionization detector (GC/FID) was used for sample analysis. The experimental sampling constants of the sampler at 30°C were (6.93 ± 0.12) × 10⁻¹, (4.72 ± 0.03) × 10⁻¹, and (3.29 ± 0.20) × 10⁻¹ cm³ min⁻¹ for PGME, PGMEA, and DPGME, respectively. The adsorption of chemicals on the stainless steel needle of the SPME fiber was suspected to be one of the reasons why significant differences between theoretical and experimental sampling rates were observed. Correlations between the results for PGME from both the SPME device and OSHA organic Method 99 were linear (r = 0.9984) and consistent (slope = 0.97 ± 0.03). Face velocity (0-0.18 m/s) also proved to have no effects on the sampler
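The theoretical sampling constant follows from Fick's first law as SR = D·A/L, with D the analyte's diffusion coefficient in air, A the diffusion cross-section, and L the path length. A minimal sketch using the sampler geometry reported above; the diffusion coefficient is an assumed illustrative value, not taken from the paper:

```python
def sampling_constant(D_cm2_min, area_cm2=0.00086, path_cm=0.3):
    """Theoretical diffusive sampling constant from Fick's first law: SR = D*A/L."""
    return D_cm2_min * area_cm2 / path_cm

# With an assumed PGME diffusion coefficient of ~5.23 cm^2/min (illustrative),
# the reported geometry reproduces a sampling constant of ~1.5e-2 cm^3/min.
sr_pgme = sampling_constant(5.23)
```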
Assche, van K.; Beunen, R.; Duineveld, M.
2014-01-01
In this chapter we discuss the concept of governance paths and the forms of dependency that mark such paths. These forms of dependency constitute rigidities in governance evolution, but leave space for flexibility and for path creation.
Xu, Kaixuan; Wang, Jun
2017-02-01
In this paper, the recently introduced permutation entropy and sample entropy are extended to the fractional case, yielding weighted fractional permutation entropy (WFPE) and fractional sample entropy (FSE). The fractional-order generalization of information entropy is utilized in the two complexity approaches to detect the statistical characteristics of fractional-order information in complex systems. The effectiveness analysis of the proposed methods on synthetic data and real-world data reveals that tuning the fractional order allows higher sensitivity and a more accurate characterization of the signal evolution, which is useful in describing the dynamics of complex systems. Moreover, the numerical research on nonlinear complexity behaviors compares the returns series of the Potts financial model with actual stock markets. The empirical results confirm the feasibility of the proposed model.
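The integer-order weighted permutation entropy that WFPE generalizes can be sketched as follows; the variance-based window weighting is one common choice and an assumption here, and the fractional extension would replace the Shannon term with a fractional-order one:

```python
import math
from collections import defaultdict

def weighted_permutation_entropy(x, m=3):
    """Weighted permutation entropy: ordinal patterns of order m, each
    window weighted by its variance (one common weighting choice)."""
    probs, total = defaultdict(float), 0.0
    for i in range(len(x) - m + 1):
        w = x[i:i + m]
        pattern = tuple(sorted(range(m), key=lambda k: w[k]))  # ordinal pattern
        mean = sum(w) / m
        var = sum((v - mean) ** 2 for v in w) / m  # window weight (0 if flat)
        probs[pattern] += var
        total += var
    return -sum((p / total) * math.log(p / total) for p in probs.values() if p > 0)
```

A strictly monotonic series produces a single ordinal pattern and hence zero entropy, while an oscillating series yields positive entropy.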
Fuglestad, Paul T; Jeffery, Robert W; Sherwood, Nancy E
2012-06-26
Research suggests that the interaction between biological susceptibility and environmental risk is complex and that further study of behavioral typologies related to obesity and associated behaviors is important to further elucidate the nature of obesity risk and how to approach it for intervention. The current investigation aims to identify phenotypical lifestyle patterns that might begin to unify our understanding of obesity and obesity related behaviors. Individuals who had recently lost substantial weight of their own initiative completed measures of intentional weight control behaviors and lifestyle behaviors associated with eating. These behaviors were factor analyzed and the resulting factors were examined in relation to BMI, recent weight loss, diet, and physical activity. Four meaningful lifestyle and weight control behavioral factors were identified: regularity of meals, TV-related viewing and eating, intentional strategies for weight control, and eating away from home. Greater meal regularity was associated with greater recent weight loss and greater fruit and vegetable intake. Greater TV-related viewing and eating was associated with greater BMI and greater fat and sugar intake. More eating away from home was related to greater fat and sugar intake, lower fruit and vegetable intake, and less physical activity. Greater use of weight control strategies was most consistently related to better weight, diet, and physical activity outcomes. Compared to the individual behavior variables, the identified lifestyle patterns appeared to be more reliably related to diet, physical activity, and weight (both BMI and recent weight loss). These findings add to the growing body of literature identifying behavioral patterns related to obesity and the overall weight control strategy of eating less and exercising more. In future research it will be important to replicate these behavioral factors (over time and in other samples) and to examine how changes in these factors
World Music Ensemble: Kulintang
Beegle, Amy C.
2012-01-01
As instrumental world music ensembles such as steel pan, mariachi, gamelan and West African drums are becoming more the norm than the exception in North American school music programs, there are other world music ensembles just starting to gain popularity in particular parts of the United States. The kulintang ensemble, a drum and gong ensemble…
Development and application of a needle trap device for time-weighted average diffusive sampling.
Gong, Ying; Eom, In-Yong; Lou, Da-Wei; Hein, Dietmar; Pawliszyn, Janusz
2008-10-01
A simple, cost-effective analysis combining solventless extraction, thermal desorption, and determination of volatile organic compounds (VOCs) was developed and validated. A needle trap device (NTD) packed with the sorbent Carboxen 1000 was used as a time-weighted average (TWA) diffusive sampler to collect target compounds by molecular diffusion and adsorption to the packed sorbent. This process can be described with derivations of Fick's first law of diffusion, which expresses the relation between the TWA concentrations to which the passive sampler is exposed and the mass of analytes adsorbed to the packed sorbent in the sampler. The effects of experimental factors such as temperature, pressure, humidity, and face velocity were taken into account in applying diffusive sampling under nonideal conditions. This study demonstrates that NTD is effective for air analysis of benzene, toluene, ethylbenzene, and o-xylene (BTEX), due to the good adsorption/desorption quality of Carboxen 1000 and to the special geometric shape of the needle with a small cross section avoiding the need for calibration. Storage tests showed good storage stability for BTEX. Verification of the theoretical model showed good agreement between theoretical and experimental sampling rates. Method validation done against NIOSH method 1501, SPME, and NTD active sampling revealed good agreement between those methods. Automated NTD sample introduction to a gas chromatograph facilitates the use of this technology for industrial hygiene applications.
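Inverting the same Fick's-law uptake model gives the TWA concentration from the desorbed analyte mass: C = m/(SR·t), where SR·t acts as an effective sampled air volume. A minimal sketch with hypothetical numbers, not the paper's data:

```python
def twa_concentration(mass_ug, sr_cm3_min, minutes):
    """TWA concentration from adsorbed mass, C = m / (SR * t),
    converted from ug/cm^3 to ug/m^3 (1 m^3 = 1e6 cm^3)."""
    return mass_ug / (sr_cm3_min * minutes) * 1e6

# Hypothetical 8-h sample: 0.5 ug desorbed, sampling constant 0.015 cm^3/min
c_twa = twa_concentration(0.5, 0.015, 480)
```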
Oh, Seok-Geun; Suh, Myoung-Seok
2017-07-01
The projection skills of five ensemble methods were analyzed according to simulation skills, training period, and ensemble members, using 198 sets of pseudo-simulation data (PSD) produced by random number generation assuming the simulated temperature of regional climate models. The PSD sets were classified into 18 categories according to the relative magnitude of bias, variance ratio, and correlation coefficient, where each category had 11 sets (including 1 truth set) with 50 samples. The ensemble methods used were as follows: equal weighted averaging without bias correction (EWA_NBC), EWA with bias correction (EWA_WBC), weighted ensemble averaging based on root mean square errors and correlation (WEA_RAC), WEA based on the Taylor score (WEA_Tay), and multivariate linear regression (Mul_Reg). The projection skills of the ensemble methods generally improved compared with the best member of each category. However, their projection skills were significantly affected by the simulation skills of the ensemble members. The weighted ensemble methods showed better projection skills than the non-weighted methods, in particular for the PSD categories having systematic biases and various correlation coefficients. The EWA_NBC showed considerably lower projection skills than the other methods, in particular for the PSD categories with systematic biases. Although Mul_Reg showed relatively good skills, it showed strong sensitivity to the PSD categories, training periods, and number of members. On the other hand, WEA_Tay and WEA_RAC showed relatively superior skills in both accuracy and reliability for all the sensitivity experiments. This indicates that WEA_Tay and WEA_RAC are applicable even for simulation data with systematic biases, a short training period, and a small number of ensemble members.
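Skill-based member weighting can be sketched with inverse-MSE weights over a training period; this is a simplified stand-in for the WEA_RAC/WEA_Tay schemes above, and the toy members are invented:

```python
import numpy as np

def inverse_mse_weights(members, truth):
    """Member weights proportional to inverse MSE over a training period."""
    mse = np.mean((members - truth) ** 2, axis=1)
    w = 1.0 / (mse + 1e-12)   # epsilon guards against a perfect member
    return w / w.sum()

t = np.linspace(0.0, 2 * np.pi, 100)
truth = np.sin(t)
members = np.stack([truth + 0.05 * np.sin(3 * t),  # skilful member
                    truth + 2.0])                  # heavily biased member
w = inverse_mse_weights(members, truth)
wmean = w @ members                                # weighted ensemble mean
```

The weighted mean discounts the biased member, so it beats the equal-weight mean whenever member skills differ substantially, which is exactly the regime the abstract describes.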
Ensemble global ocean forecasting
Brassington, G. B.
2016-02-01
A novel time-lagged ensemble system based on multiple independent cycles has been run operationally at the Australian Bureau of Meteorology for the past 3 years. Despite the use of only four cycles, the ensemble mean provided robustly higher skill, and the ensemble variance was a reliable predictor of forecast errors. A spectral analysis comparing the ensemble mean with the members demonstrated the gradual increase in power of random errors with wavenumber up to a saturation length scale imposed by the resolution of the observing system. This system has been upgraded to a near-global 0.1 degree system in a new hybrid six-member ensemble configuration including a new data assimilation system, cycling pattern, and initialisation. The hybrid system consists of two ensemble members per day, each with a 3 day cycle. We will outline the performance of both the deterministic and ensemble ocean forecast systems.
Mark Setterfield
2015-01-01
Path dependency is defined, and three different specific concepts of path dependency – cumulative causation, lock-in, and hysteresis – are analyzed. The relationships between path dependency and equilibrium, and between path dependency and fundamental uncertainty, are also discussed. Finally, a typology of dynamical systems is developed to clarify these relationships.
Directory of Open Access Journals (Sweden)
Scott Jane A
2012-02-01
Full Text Available Abstract Background Breastfeeding has been shown consistently in observational studies to be protective of overweight and obesity in later life. This study aimed to investigate the association between breastfeeding duration and weight status in a national sample of Australian children and adolescents. Methods A secondary analysis of the 2007 Australian National Children's Nutrition and Physical Activity Survey data involving 2066 males and females aged 9 to 16 years from all Australian states and territories. The effect of breastfeeding duration on weight status was estimated using multivariate logistic regression analysis. Results Compared to those who were never breastfed, children breastfed for ≥6 months were significantly less likely to be overweight (adjusted odds ratio: 0.64, 95%CI: 0.45, 0.91) or obese (adjusted odds ratio: 0.51, 95%CI: 0.29, 0.90) in later childhood, after adjustment for maternal characteristics (age, education and ethnicity) and children's age, gender, mean energy intake, level of moderate and vigorous physical activity, screen time and sleep duration. Conclusions Breastfeeding for 6 or more months appears to be protective against later overweight and obesity in this population of Australian children. The beneficial short-term health outcomes of breastfeeding for the infant are well recognised and this study provides further observational evidence of a potential long-term health outcome and additional justification for the continued support and promotion of breastfeeding to six months and beyond.
Liu, Li; Xu, Yue-Ping
2017-04-01
Ensemble flood forecasting driven by numerical weather prediction products is becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system based on the Variable Infiltration Capacity (VIC) model and quantitative precipitation forecasts from the TIGGE dataset is constructed for the Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by the parallel-programmed ɛ-NSGAII multi-objective algorithm, and two separately parameterized models are determined to simulate daily flows and peak flows, coupled with a modular approach. The results indicate that the ɛ-NSGAII algorithm permits more efficient optimization and a rational determination of parameter settings. It is demonstrated that the multimodel ensemble streamflow mean has better skill than the best single-model ensemble mean (ECMWF), and that the multimodel ensembles weighted on members and skill scores outperform other multimodel ensembles. For a typical flood event, the flood can be predicted 3-4 days in advance, but the flows in the rising limb can be captured only 1-2 days ahead due to their flashy nature. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from either a single model or multimodels are generally underestimated, as the extreme values are smoothed out by the ensemble process.
Directory of Open Access Journals (Sweden)
Zohreh Mahmoodi
2017-07-01
Full Text Available Objectives: Low birth weight (LBW) is one of the major health problems worldwide. It is important to identify the factors that play a role in the incidence of this adverse pregnancy outcome. This study aimed to develop a tool to measure mothers' lifestyles during pregnancy with a view to the effects of social determinants on health, and to develop a correlation model of mothers' lifestyles with LBW. Methods: This study was conducted using methodological and case-control designs in four stages by selecting 750 mothers with infants weighing less than 4000 g using multistage sampling. The questionnaire contained 160 items. Face, content, criterion, and construct validity were used to study the psychometrics of the instrument. Results: After psychometric evaluation, 132 items were approved in six domains. Test results indicated the utility and high fitness of the model and reasonable relationships adjusted for variables based on conceptual models. Based on the correlation model of lifestyle, occupation (-0.263) and social relationships (0.248) had the greatest overall effect on birth weight. Conclusions: The review of lifestyle dimensions showed that all of the dimensions directly, indirectly, or both affected birth weight. Thus, given the importance and the role of lifestyle as a determinant affecting birth weight, attention and training interventions are important to promote healthy lifestyles.
imDC: an ensemble learning method for imbalanced classification with miRNA data.
Wang, C Y; Hu, L L; Guo, M Z; Liu, X Y; Zou, Q
2015-01-15
Imbalances typically exist in bioinformatics and are also common in other areas. A drawback of traditional machine learning methods is that relatively little attention is given to small-sample classification. Thus, we developed imDC, which uses an ensemble learning concept in combination with weights and sample misclassification information to effectively classify imbalanced data. Our method showed better results when compared to other algorithms on UCI machine learning datasets and microRNA data.
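The core idea behind imbalance ensembles of this kind, training each weak learner on a class-balanced bootstrap sample and voting, can be sketched as follows; the decision-stump learner and toy data are illustrative assumptions, not the imDC algorithm itself:

```python
import numpy as np

def stump_fit(X, y):
    """Best single-feature threshold rule (a tiny weak learner)."""
    best, best_acc = (0, 0.0, 1), -1.0
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                acc = np.mean(s * np.sign(X[:, j] - thr + 1e-12) == y)
                if acc > best_acc:
                    best, best_acc = (j, thr, s), acc
    return best

def balanced_bagging(X, y, n_estimators=11, rng=None):
    """Train each stump on a class-balanced bootstrap, classify by majority vote."""
    rng = rng or np.random.default_rng(0)
    pos, neg = np.where(y == 1)[0], np.where(y == -1)[0]
    n = min(len(pos), len(neg))
    stumps = []
    for _ in range(n_estimators):
        idx = np.concatenate([rng.choice(pos, n), rng.choice(neg, n)])
        stumps.append(stump_fit(X[idx], y[idx]))
    def predict(Xq):
        votes = sum(s * np.sign(Xq[:, j] - thr + 1e-12) for j, thr, s in stumps)
        return np.where(votes >= 0, 1, -1)
    return predict

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.5, size=(200, 1)),   # majority class
               rng.normal(3.0, 0.5, size=(10, 1))])   # rare minority class
y = np.array([-1] * 200 + [1] * 10)
clf = balanced_bagging(X, y)
```

Balancing each bootstrap keeps the rare class from being ignored, which is what a learner trained on the raw 200:10 split would tend to do.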
Deng, Bai-chuan; Yun, Yong-huan; Liang, Yi-zeng; Yi, Lun-zhao
2014-10-07
In this study, a new optimization algorithm called the Variable Iterative Space Shrinkage Approach (VISSA) that is based on the idea of model population analysis (MPA) is proposed for variable selection. Unlike most of the existing optimization methods for variable selection, VISSA statistically evaluates the performance of variable space in each step of optimization. Weighted binary matrix sampling (WBMS) is proposed to generate sub-models that span the variable subspace. Two rules are highlighted during the optimization procedure. First, the variable space shrinks in each step. Second, the new variable space outperforms the previous one. The second rule, which is rarely satisfied in most of the existing methods, is the core of the VISSA strategy. Compared with some promising variable selection methods such as competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variable elimination (MCUVE) and iteratively retaining informative variables (IRIV), VISSA showed better prediction ability for the calibration of NIR data. In addition, VISSA is user-friendly; only a few insensitive parameters are needed, and the program terminates automatically without any additional conditions. The Matlab codes for implementing VISSA are freely available on the website: https://sourceforge.net/projects/multivariateanalysis/files/VISSA/.
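The WBMS sampling step can be sketched as drawing a binary inclusion matrix whose column probabilities are the current variable weights; in VISSA those weights would then be updated toward better-performing sub-models, an update step omitted here:

```python
import numpy as np

def wbms(weights, n_submodels, rng=None):
    """Weighted binary matrix sampling: row i encodes a candidate variable
    subset; variable j is included with probability weights[j]."""
    rng = rng or np.random.default_rng(0)
    w = np.asarray(weights, dtype=float)
    return (rng.random((n_submodels, len(w))) < w).astype(int)

B = wbms([0.9, 0.5, 0.1], 10000)   # 10000 sub-models over three variables
```

Averaged over many sub-models, each column's inclusion frequency matches its weight, so the weights directly control how often a variable is explored.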
Morgen, C S; Ängquist, L; Baker, J L; Andersen, A M N; Michaelsen, K F; Sørensen, T I A
2017-09-08
Prenatal risk factors for childhood overweight may operate indirectly through development in body size in early life and/or directly independent hereof. We quantified the effects of maternal and paternal body mass index (BMI), maternal age, socioeconomic position (SEP), parity, gestational weight gain, maternal smoking during pregnancy, caesarean section, birth weight, and BMI at 5 and 12 months on BMI and overweight at 7 and 11 years. Family triads with information on maternal, paternal and child BMI at ages 7 (n=29 374) and 11 years (n=18 044) were selected from the Danish National Birth Cohort. Information originated from maternal interviews and medical health examinations. Path analysis was used to estimate the direct and indirect effects of prenatal risk factors on childhood BMI z-scores (BMIz per unit score of the risk factor). Logistic regression was used to examine associations with overweight. The strongest direct effects on BMIz at age 7 were found for maternal and paternal BMI (0.19 BMIz and 0.14 BMIz per parental BMIz), low SEP (0.08 BMIz), maternal smoking (0.12 BMIz) and higher BMIz at 5 and 12 months (up to 0.19 BMIz per infant BMIz). For BMIz at age 11 with BMIz at age 7 included in the model, similar effects were found, but the direct effects of BMIz at age 5 and 12 months were mediated through BMI at age 7 (0.62 BMIz per BMIz). The same results were found for overweight. The sum of the direct effects can be translated to approximate absolute measures: 2.4 kg at 7 years, 5.7 kg at 11 years, in a child with average height and BMI. Parental BMI, low SEP and smoking during pregnancy have persisting, strong and direct effects on child BMI and overweight independent of birth weight and infancy BMI. International Journal of Obesity advance online publication, 31 October 2017; doi:10.1038/ijo.2017.217.
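In path analysis, an indirect effect is the product of the coefficients along the path. Using two of the reported coefficients purely as an illustration (the full model contains many more paths and adjustments):

```python
# Path coefficients reported above (BMI z-score units):
infancy_to_age7 = 0.19   # infancy BMIz -> BMIz at 7 years (upper bound reported)
age7_to_age11 = 0.62     # BMIz at 7 years -> BMIz at 11 years

# An indirect effect is the product of the coefficients along the path:
indirect_infancy_to_age11 = infancy_to_age7 * age7_to_age11
```

So a one-z-score increase in infancy BMI would propagate to roughly 0.12 z-scores at age 11 via the age-7 mediator, consistent with the abstract's statement that the infancy effects are mediated through BMI at age 7.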
Fast Constrained Spectral Clustering and Cluster Ensemble with Random Projection
Directory of Open Access Journals (Sweden)
Wenfen Liu
2017-01-01
Full Text Available Constrained spectral clustering (CSC) methods can greatly improve clustering accuracy by incorporating constraint information into spectral clustering, and have therefore received wide academic attention. In this paper, we propose a fast CSC algorithm that encodes landmark-based graph construction into a new CSC model and applies random sampling to decrease the data size after spectral embedding. Compared with the original model, the new algorithm yields asymptotically similar results as its model size increases; compared with the most efficient CSC algorithm known, the new algorithm runs faster and suits a wider range of data sets. Meanwhile, a scalable semisupervised cluster ensemble algorithm is also proposed by combining our fast CSC algorithm with dimensionality reduction via random projection in the process of spectral ensemble clustering. We demonstrate, through theoretical analysis and empirical results, that the new cluster ensemble algorithm has advantages in terms of efficiency and effectiveness. Furthermore, the approximate preservation of clustering accuracy under random projection, proved for the consensus clustering stage, also holds for weighted k-means clustering, and thus gives a theoretical guarantee for this special kind of k-means clustering in which each point has its own weight.
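The random-projection step rests on the Johnson-Lindenstrauss property: a Gaussian projection to k dimensions approximately preserves pairwise distances, so clustering structure survives the reduction. A minimal sketch with arbitrary dimensions:

```python
import numpy as np

def random_projection(X, k, rng=None):
    """Gaussian random projection to k dimensions; pairwise distances are
    approximately preserved (Johnson-Lindenstrauss)."""
    rng = rng or np.random.default_rng(0)
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(X.shape[1], k))
    return X @ R

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 1000))       # 20 points in 1000 dimensions
Z = random_projection(X, 200)         # reduced to 200 dimensions
ratio = np.linalg.norm(Z[0] - Z[1]) / np.linalg.norm(X[0] - X[1])
```

The distance ratio between any fixed pair of points concentrates around 1, with deviation shrinking as k grows.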
Online Learning with Ensembles
Urbanczik, R.
1999-01-01
Supervised online learning with an ensemble of students randomized by the choice of initial conditions is analyzed. For the case of the perceptron learning rule, asymptotically the same improvement in the generalization error of the ensemble compared to the performance of a single student is found as in Gibbs learning. For more optimized learning rules, however, using an ensemble yields no improvement. This is explained by showing that for any learning rule $f$ a transform $\\tilde{f}$ exists,...
Various multistage ensembles for prediction of heating energy consumption
Directory of Open Access Journals (Sweden)
Radisa Jovanovic
2015-04-01
Full Text Available Feedforward neural network models are created for the prediction of daily heating energy consumption of the NTNU university campus Gloshaugen, using actual measured data for training and testing. An improvement of prediction accuracy is proposed by using a neural network ensemble. Previously trained feedforward neural networks are first separated into clusters using the k-means algorithm, and then the best network of each cluster is chosen as a member of the ensemble. Two conventional averaging methods for obtaining the ensemble output are applied: simple and weighted. In order to achieve better prediction results, a multistage ensemble is investigated. At the second level, adaptive neuro-fuzzy inference systems with various clustering and membership functions are used to aggregate the selected ensemble members. A feedforward neural network as the second stage is also analyzed. It is shown that an ensemble of neural networks can predict heating energy consumption with better accuracy than the best trained single neural network, while the best results are achieved with the multistage ensemble.
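A linear stand-in for the second aggregation stage is stacked regression: fit least-squares combination weights over the members' predictions, which on the training data can do no worse than the simple average. The members below are synthetic, not the paper's networks:

```python
import numpy as np

def stacked_weights(P, y):
    """Second-stage combiner: least-squares weights over member predictions.
    P: (n_samples, n_members) prediction matrix, y: (n_samples,) target."""
    w, *_ = np.linalg.lstsq(P, y, rcond=None)
    return w

rng = np.random.default_rng(4)
y = rng.normal(size=200)                     # target series (e.g. daily demand)
P = np.stack([y + rng.normal(0.0, 0.3, 200), # three imperfect ensemble members
              y + 0.5,                       # biased member
              0.8 * y], axis=1)              # under-scaled member
w = stacked_weights(P, y)
```

Because the equal-weight average is itself one admissible weight vector, the fitted combination's training RMSE is bounded above by the simple average's.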
Reeuwijk, N.M.; Venhuis, B.J.; Kaste, de D.; Hoogenboom, L.A.P.; Rietjens, I.; Martena, M.J.
2014-01-01
Herbal food supplements claiming to reduce weight may contain active pharmacological ingredients (APIs) that can be used for the treatment of overweight and obesity. The aim of this study was to determine whether herbal food supplements for weight loss on the Dutch market contain APIs with weight
DEFF Research Database (Denmark)
Ejegod, Ditte Møller; Junge, Jette; Franzmann, Maria
2016-01-01
using adjudicated histological outcomes from Danish women referred for colposcopy. METHODS: 276 women from Copenhagen, Denmark were referred for colposcopy with abnormal cytology and/or a positive HPV test. Two samples for HPV analysis were taken in BD SurePath™ and in the BD cervical brush diluent (CBD) ... %, 17%, and 22%, for HC2, Onclarity and LA, respectively. CONCLUSION: Overall, the Onclarity HPV assay performed well on SurePath LBC and CBD media, with clinical sensitivity and specificity matching those of HC2 and LA.
On Extending Neural Networks with Loss Ensembles for Text Classification
Hajiabadi, Hamideh; Molla-Aliod, Diego; Monsefi, Reza
2017-01-01
Ensemble techniques are powerful approaches that combine several weak learners to build a stronger one. As a meta-learning framework, ensemble techniques can easily be applied to many machine learning methods. In this paper we propose a neural network extended with an ensemble loss function for text classification. The weight of each weak loss function is tuned within the training phase through the gradient propagation optimization method of the neural network. The approach is evaluated on...
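The idea of a loss ensemble can be sketched as a softmax-weighted combination of weak losses, with the weights (via their logits) trainable alongside the network; the two member losses here are illustrative choices, not necessarily those used in the paper:

```python
import numpy as np

def ensemble_loss(z, alpha):
    """Softmax-weighted combination of weak losses at margins z = y*f(x);
    alpha are the logits of the loss weights, trainable with the network."""
    w = np.exp(alpha) / np.exp(alpha).sum()      # normalized loss weights
    hinge = np.maximum(0.0, 1.0 - z).mean()      # weak loss 1
    logistic = np.log1p(np.exp(-z)).mean()       # weak loss 2
    return float(w @ np.array([hinge, logistic])), w

z = np.array([2.0, -0.5, 1.2])                   # example classification margins
total, w = ensemble_loss(z, alpha=np.zeros(2))   # equal logits -> equal weights
```

Since the combination is convex, the ensemble loss always lies between the smallest and largest member losses, and gradient descent on alpha shifts weight toward whichever member loss trains better.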
Directory of Open Access Journals (Sweden)
Bowring Anna L
2012-11-01
Full Text Available Abstract Background Self-reported anthropometric data are commonly used to estimate prevalence of obesity in population and community-based studies. We aim to: 1 Determine whether survey participants are able and willing to self-report height and weight; 2 Assess the accuracy of self-reported compared to measured anthropometric data in a community-based sample of young people. Methods Participants (16–29 years of a behaviour survey, recruited at a Melbourne music festival (January 2011, were asked to self-report height and weight; researchers independently weighed and measured a sub-sample. Body Mass Index was calculated and overweight/obesity classified as ≥25kg/m2. Differences between measured and self-reported values were assessed using paired t-test/Wilcoxon signed ranks test. Accurate report of height and weight were defined as Results Of 1405 survey participants, 82% of males and 72% of females self-reported their height and weight. Among 67 participants who were also independently measured, self-reported height and weight were significantly less than measured height (p=0.01 and weight (p Conclusions Self-reported measurements may underestimate weight but accurately identified overweight/obesity in the majority of this sample of young people.
The Ensembl REST API: Ensembl Data for Any Language.
Yates, Andrew; Beal, Kathryn; Keenan, Stephen; McLaren, William; Pignatelli, Miguel; Ritchie, Graham R S; Ruffier, Magali; Taylor, Kieron; Vullo, Alessandro; Flicek, Paul
2015-01-01
We present a Web service to access Ensembl data using Representational State Transfer (REST). The Ensembl REST server enables the easy retrieval of a wide range of Ensembl data by most programming languages, using standard formats such as JSON and FASTA while minimizing client work. We also introduce bindings to the popular Ensembl Variant Effect Predictor tool permitting large-scale programmatic variant analysis independent of any specific programming language. The Ensembl REST API can be accessed at http://rest.ensembl.org and source code is freely available under an Apache 2.0 license from http://github.com/Ensembl/ensembl-rest. © The Author 2014. Published by Oxford University Press.
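A minimal Python client for the service's lookup endpoint might look like this; the stable ID below is the human BRAF gene, used here only as an example:

```python
import json
from urllib.request import urlopen

SERVER = "https://rest.ensembl.org"

def lookup_url(stable_id):
    """Request URL for the lookup endpoint, asking for JSON output."""
    return f"{SERVER}/lookup/id/{stable_id}?content-type=application/json"

def lookup(stable_id):
    """Fetch metadata for an Ensembl stable ID (performs a network call)."""
    with urlopen(lookup_url(stable_id)) as resp:
        return json.load(resp)

# e.g. lookup("ENSG00000157764") returns metadata for the human BRAF gene
```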
Bowring, Anna L; Peeters, Anna; Freak-Poli, Rosanne; Lim, Megan Sc; Gouillou, Maelenn; Hellard, Margaret
2012-11-21
Self-reported anthropometric data are commonly used to estimate prevalence of obesity in population and community-based studies. We aim to: 1) Determine whether survey participants are able and willing to self-report height and weight; 2) Assess the accuracy of self-reported compared to measured anthropometric data in a community-based sample of young people. Participants (16-29 years) of a behaviour survey, recruited at a Melbourne music festival (January 2011), were asked to self-report height and weight; researchers independently weighed and measured a sub-sample. Body Mass Index was calculated and overweight/obesity classified as ≥25 kg/m². Differences between measured and self-reported values were assessed using paired t-test/Wilcoxon signed ranks test. Accurate report of height and weight were defined as <2 cm and <2 kg difference between self-report and measured values, respectively. Agreement between classification of overweight/obesity by self-report and measured values was assessed using McNemar's test. Of 1405 survey participants, 82% of males and 72% of females self-reported their height and weight. Among 67 participants who were also independently measured, self-reported height and weight were significantly less than measured height (p=0.01) and weight (p<0.01) among females, but no differences were detected among males. Overall, 52% accurately self-reported height, 30% under-reported, and 18% over-reported; 34% accurately self-reported weight, 52% under-reported and 13% over-reported. More females (70%) than males (35%) under-reported weight (p=0.01). Prevalence of overweight/obesity was 33% based on self-report data and 39% based on measured data (p=0.16). Self-reported measurements may underestimate weight but accurately identified overweight/obesity in the majority of this sample of young people.
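The classification rules stated in the abstract (overweight/obesity as BMI ≥ 25 kg/m²; accurate self-report as <2 cm height and <2 kg weight difference) are simple enough to express directly. A minimal sketch, with function names of my own choosing:

```python
def bmi(weight_kg, height_cm):
    """Body Mass Index: weight in kg divided by squared height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)

def overweight_or_obese(weight_kg, height_cm):
    """The study's cutoff: BMI >= 25 kg/m^2."""
    return bmi(weight_kg, height_cm) >= 25.0

def accurate_self_report(self_h_cm, self_w_kg, measured_h_cm, measured_w_kg):
    """The study's accuracy rule: <2 cm height and <2 kg weight difference."""
    return (abs(self_h_cm - measured_h_cm) < 2.0
            and abs(self_w_kg - measured_w_kg) < 2.0)
```

Under these rules a participant who under-reports weight by 2.5 kg counts as inaccurate even if their height report is exact, which is how the paper's 34% weight-accuracy figure should be read.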
The Polyanalytic Ginibre Ensembles
Haimi, Antti; Hedenmalm, Haakan
2013-10-01
For integers n, q = 1, 2, 3, …, let Pol(n, q) denote the ℂ-linear space of polynomials in z and z̄, of degree ≤ n-1 in z and of degree ≤ q-1 in z̄. We supply Pol(n, q) with an inner product structure; the resulting Hilbert space is denoted by Pol(m, n, q). Here, it is assumed that m is a positive real. We let K(m, n, q) denote the reproducing kernel of Pol(m, n, q), and study the associated determinantal process in the limit as m, n → +∞ while n = m + O(1); the number q, the degree of polyanalyticity, is kept fixed. We call these processes polyanalytic Ginibre ensembles, because they generalize the Ginibre ensemble, the eigenvalue process of random (normal) matrices with Gaussian weight. There is a physical interpretation in terms of a system of free fermions in a uniform magnetic field such that a fixed number of the lowest Landau levels have been filled. We consider local blow-ups of the polyanalytic Ginibre ensembles around points in the spectral droplet, which is here the closed unit disk. We obtain asymptotics for the blow-up process, using a blow-up to characteristic distance m^(-1/2); the typical distance is the same both for interior and for boundary points of the droplet. This amounts to obtaining the asymptotic behavior of the generating kernel K(m, n, q). Following Ameur et al. (Commun. Pure Appl. Math. 63(12):1533-1584, 2010), the asymptotics of K(m, n, q) are rather conveniently expressed in terms of the Berezin measure (and density) [Equation not available: see fulltext.] For interior points |z| < 1, the Berezin measure converges to the unit point mass at z. In contrast, for exterior points |z| > 1, the limit is instead expressed in terms of the harmonic measure at z with respect to the exterior disk. For boundary points |z| = 1, the Berezin measure also converges to the unit point mass at z, as with interior points, but the blow-up to the scale m^(-1/2) exhibits quite different behavior at boundary points compared with interior points. We also obtain the asymptotic boundary behavior of the 1-point function at a coarser local scale.
Martin, Ryan J; Chaney, Beth H; Vail-Smith, Karen; Gallucci, Andrew R
2016-01-01
"Weight-conscious drinking" refers to behaviors to restrict calories in conjunction with consuming alcohol and is associated with numerous negative consequences. This behavior has been observed in the college student population but has not been examined among college student athletes. This cross-sectional study assessed drinking, hazardous drinking levels (Alcohol Use Disorders Identification Test-Consumption [AUDIT-C] sum score), and weight-conscious drinking behaviors (for weight loss purposes and for intoxication purposes) using a paper-and-pencil survey that was completed by students at a large, private university in the Southwest United States. The sample for this study included college student nonathletes (n = 482; 212 males and 270 females) who completed the survey in 1 of 34 classes and college student athletes (n = 201; 79 males and 122 females) who completed the survey during practice. These analyses examined whether hazardous drinking level and other personal covariates (gender, race, and athlete status) predicted the 2 weight-conscious drinking behaviors of interest. Among the subsample of students who drank, the same proportion of participants indicated weight-conscious drinking behavior for weight loss and weight-conscious drinking behavior for intoxication (both 24.9%; n = 122). In the multivariate analyses, students with higher hazardous drinking scores and females were significantly more likely to report engaging in both weight-conscious drinking behaviors. In those analyses, neither weight-conscious drinking behavior varied by athlete status. In this sample of college students, hazardous drinking most predicted weight-conscious drinking behavior and superseded gender and athlete status. In response, college health professionals should consider evidenced-based approaches to address hazardous drinking.
A single weighting approach to analyze respondent-driven sampling data
Directory of Open Access Journals (Sweden)
Vadivoo Selvaraj
2016-01-01
Interpretation & conclusions: The proposed weight was comparable to the different weights generated by RDSAT, and the resulting estimates were comparable to those from the RDS-II approach. RDS-MOD provided an efficient and easy-to-use method of estimation and regression that accounts for dependence among individual recruits.
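For context, the RDS-II estimator that the abstract uses as its benchmark weights each respondent by the inverse of their reported network degree. The sketch below is the generic textbook form of RDS-II, not the specific single weight proposed in the paper:

```python
def rds_ii_proportion(outcomes, degrees):
    """RDS-II estimate of a trait proportion.

    outcomes: 0/1 trait indicators per respondent.
    degrees:  self-reported network sizes per respondent (all > 0).
    Each respondent is weighted by 1/degree, so high-degree people,
    who are over-sampled by chain referral, are down-weighted.
    """
    numerator = sum(y / d for y, d in zip(outcomes, degrees))
    denominator = sum(1.0 / d for d in degrees)
    return numerator / denominator
```

With equal degrees this reduces to the ordinary sample proportion; unequal degrees shift the estimate toward the low-degree respondents.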
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeastern United States. The weighted regression increased model fit marginally, but did not substantially improve model performance. In all cases, the unweighted regression models performed as well as the...
Parker, Jennifer; Branum, Amy; Axelrad, Daniel; Cohen, Jonathan
2013-05-01
Maternal risk factors have been tabulated for women of childbearing age using defined age ranges. However, statistics for factors strongly related to age may be overly influenced by values for the youngest and oldest women in a range, because pregnancies are most likely for ages 20-35. This report evaluates adjustment methods, based on the probability of pregnancy, for calculating estimates of risk factors for women of childbearing age. Adjusted and unadjusted estimates for environmental and nutritional variables were calculated from the 1999-2008 National Health and Nutrition Examination Survey (NHANES) for women aged 16-49. U.S. births were used to determine the probability of pregnancy. Adjusted and unadjusted estimates differed for some, but not all, examined variables. More marked differences were observed for the environmental variables compared with the nutritional variables. Adjusted estimates were within about 5% of the unadjusted estimates for the nutritional variables. Adjusted geometric means for lead and mercury were about 7%-10% lower, and for polychlorinated biphenyl (or PCB) about 25% lower, than their respective unadjusted geometric means. With few exceptions, different adjustment methods led to similar estimates. When calculating statistics for women of childbearing age, the decision to adjust for age or not to adjust appears to be more important than the choice of adjustment method. Although the results suggest only small differences among adjustment methods, approaches based on the NHANES design and sample weighting methodology may be the most robust for other applications. All material appearing in this report is in the public domain and may be reproduced or copied without permission; citation as to source, however, is appreciated.
Sampling strategies for the analysis of reactive low-molecular weight compounds in air
Henneken, H.
2006-01-01
Within this thesis, new sampling and analysis strategies for the determination of airborne workplace contaminants have been developed. Special focus has been directed towards the development of air sampling methods that involve diffusive sampling. In an introductory overview, the current
Ensemble learning incorporating uncertain registration.
Simpson, Ivor J A; Woolrich, Mark W; Andersson, Jesper L R; Groves, Adrian R; Schnabel, Julia A
2013-04-01
This paper proposes a novel approach for improving the accuracy of statistical prediction methods in spatially normalized analysis. This is achieved by incorporating registration uncertainty into an ensemble learning scheme. A probabilistic registration method is used to estimate a distribution of probable mappings between subject and atlas space. This allows the estimation of the distribution of spatially normalized feature data, e.g., grey matter probability maps. From this distribution, samples are drawn for use as training examples. This allows the creation of multiple predictors, which are subsequently combined using an ensemble learning approach. Furthermore, extra testing samples can be generated to measure the uncertainty of prediction. This is applied to separating subjects with Alzheimer's disease from normal controls using a linear support vector machine on a region of interest in magnetic resonance images of the brain. We show that our proposed method leads to an improvement in discrimination using voxel-based morphometry and deformation tensor-based morphometry over bootstrap aggregating, a common ensemble learning framework. The proposed approach also generates more reasonable soft-classification predictions than bootstrap aggregating. We expect that this approach could be applied to other statistical prediction tasks where registration is important.
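The resample-train-combine step can be illustrated with a toy stand-in: each sample drawn from the registration distribution yields one perturbed training set and one predictor, and the members' votes are averaged into a soft classification. This is only a minimal sketch of the ensemble mechanics; a one-dimensional threshold classifier stands in for the paper's linear support vector machine, and Gaussian jitter stands in for the registration uncertainty:

```python
import random
import statistics

def train_threshold(xs, ys):
    """Fit a 1-D threshold classifier: split at the midpoint of class means."""
    m0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    threshold = (m0 + m1) / 2.0
    return lambda x: 1 if x >= threshold else 0

def build_ensemble(xs, ys, n_members, jitter, seed=0):
    """One member per perturbed copy of the training data; the jitter
    stands in for a sample drawn from the registration distribution."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        perturbed = [x + rng.gauss(0.0, jitter) for x in xs]
        members.append(train_threshold(perturbed, ys))
    return members

def soft_predict(members, x):
    """Average the members' hard votes into a soft classification in [0, 1]."""
    return sum(m(x) for m in members) / len(members)
```

The spread of `soft_predict` away from 0 or 1 near the decision boundary is the kind of prediction uncertainty the paper exploits, here driven by input perturbation rather than true registration samples.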
Kühnreich, B.; Wagner, S.; Habig, J. C.; Möhler, O.; Saathoff, H.; Ebert, V.
2015-04-01
An advanced in situ diode laser hygrometer for simultaneous, sampling-free detection of interstitial H₂¹⁶O and H₂¹⁸O vapor was developed and tested in the Aerosol Interaction and Dynamics in the Atmosphere (AIDA) cloud chamber during dynamic cloud formation processes. The spectrometer to measure isotope-resolved water vapor concentrations comprises two rapidly time-multiplexed DFB lasers near 1.4 and 2.7 µm and an open-path White cell with 227-m absorption path length and 4-m mirror separation. A dynamic water concentration range from 2.6 ppb to 87 ppm for H₂¹⁶O and 87 ppt to 3.6 ppm for H₂¹⁸O could be achieved and was used to enable fast and direct detection of dynamic isotope ratio changes during ice cloud formation in the AIDA chamber at temperatures between 190 and 230 K. Relative changes in the H₂¹⁸O/H₂¹⁶O isotope ratio of 1% could be detected and resolved with a signal-to-noise ratio of 7. This converts to an isotope ratio resolution limit of 0.15% at 1-s time resolution.
Directory of Open Access Journals (Sweden)
Sabrina Bertini
2017-07-01
Full Text Available In a collaborative study involving six laboratories in the USA, Europe, and India the molecular weight distributions of a panel of heparin sodium samples were determined, in order to compare heparin sodium of bovine intestinal origin with that of bovine lung and porcine intestinal origin. Porcine samples met the current criteria as laid out in the USP Heparin Sodium monograph. Bovine lung heparin samples had consistently lower average molecular weights. Bovine intestinal heparin was variable in molecular weight; some samples fell below the USP limits, some fell within these limits and others fell above the upper limits. These data will inform the establishment of pharmacopeial acceptance criteria for heparin sodium derived from bovine intestinal mucosa. The method for MW determination as described in the USP monograph uses a single, broad standard calibrant to characterize the chromatographic profile of heparin sodium on high-resolution silica-based GPC columns. These columns may be short-lived in some laboratories. Using the panel of samples described above, methods based on the use of robust polymer-based columns have been developed. In addition to the use of the USP’s broad standard calibrant for heparin sodium with these columns, a set of conditions have been devised that allow light-scattering detected molecular weight characterization of heparin sodium, giving results that agree well with the monograph method. These findings may facilitate the validation of variant chromatographic methods with some practical advantages over the USP monograph method.
Directory of Open Access Journals (Sweden)
Sarah Preisler
Full Text Available New commercially available Human Papillomavirus (HPV) assays need to be evaluated in a variety of cervical screening settings. Cobas HPV Test (cobas) is a real-time PCR-based assay allowing for separate detection of HPV genotypes 16 and 18 and a bulk of 12 other high-risk genotypes. The aim of the present study, Horizon, was to assess the prevalence of high-risk HPV infections in an area with a high background risk of cervical cancer, where women aged 23-65 years are targeted for cervical screening. We collected 6,258 consecutive cervical samples from the largest cervical screening laboratory in Denmark, serving the whole of Copenhagen. All samples were stored in SurePath media. In total, 5,072 samples were tested with cobas, Hybrid Capture 2 High Risk HPV DNA test (HC2) and liquid-based cytology. Of these, 27% tested positive on cobas. This proportion decreased by age, being 43% in women aged 23-29 years and 10% in women aged 60-65 years. The HC2 assay was positive in 20% of samples, and cytology was abnormal (≥ atypical squamous cells of undetermined significance) for 7% of samples. When only samples without recent abnormalities were taken into account, 24% tested positive on cobas, 19% on HC2, and 5% had abnormal cytology. The proportion of positive cobas samples was higher than in the ATHENA trial. The age-standardized ratio of cobas positivity vs. cytology abnormality was 3.9 in our study and 1.7 in ATHENA. If in Copenhagen the presently used cytology were replaced by cobas in women above age 30 years, an extra 11% of women would, based on historical data, be expected to have a positive cobas test without an underlying cervical intraepithelial lesion grade 3 or worse. Countries with a high prevalence of HPV infections should therefore proceed to primary HPV-based cervical screening with caution.
National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...
Jiménez-Murcia, Susana; Fernández-Aranda, Fernando; Mestre-Bach, Gemma; Granero, Roser; Tárrega, Salomé; Torrubia, Rafael; Aymamí, Neus; Gómez-Peña, Mónica; Soriano-Mas, Carles; Steward, Trevor; Moragas, Laura; Baño, Marta; Del Pino-Gutiérrez, Amparo; Menchón, José M
2017-06-01
Most individuals will gamble during their lifetime, yet only a select few will develop gambling disorder. Gray's Reinforcement Sensitivity Theory holds promise for providing insight into gambling disorder etiology and symptomatology as it ascertains that neurobiological differences in reward and punishment sensitivity play a crucial role in determining an individual's affect and motives. The aim of the study was to assess a mediational pathway, which included patients' sex, personality traits, reward and punishment sensitivity, and gambling-severity variables. The Sensitivity to Punishment and Sensitivity to Reward Questionnaire, the South Oaks Gambling Screen, the Symptom Checklist-Revised, and the Temperament and Character Inventory-Revised were administered to a sample of gambling disorder outpatients (N = 831), diagnosed according to DSM-5 criteria, attending a specialized outpatient unit. Sociodemographic variables were also recorded. A structural equation model found that both reward and punishment sensitivity were positively and directly associated with increased gambling severity, sociodemographic variables, and certain personality traits while also revealing a complex mediational role for these dimensions. To this end, our findings suggest that the Sensitivity to Punishment and Sensitivity to Reward Questionnaire could be a useful tool for gaining a better understanding of different gambling disorder phenotypes and developing tailored interventions.
Viney, N.R.; Bormann, H.; Breuer, L.; Bronstert, A.; Croke, B.F.W.; Frede, H.; Graff, T.; Hubrechts, L.; Huisman, J.A.; Jakeman, A.J.; Kite, G.W.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Willems, P.
2009-01-01
This paper reports on a project to compare predictions from a range of catchment models applied to a mesoscale river basin in central Germany and to assess various ensemble predictions of catchment streamflow. The models encompass a large range in inherent complexity and input requirements. In approximate order of decreasing complexity, they are DHSVM, MIKE-SHE, TOPLATS, WASIM-ETH, SWAT, PRMS, SLURP, HBV, LASCAM and IHACRES. The models are calibrated twice using different sets of input data. The two predictions from each model are then combined by simple averaging to produce a single-model ensemble. The 10 resulting single-model ensembles are combined in various ways to produce multi-model ensemble predictions. Both the single-model ensembles and the multi-model ensembles are shown to give predictions that are generally superior to those of their respective constituent models, both during a 7-year calibration period and a 9-year validation period. This occurs despite a considerable disparity in performance of the individual models. Even the weakest of models is shown to contribute useful information to the ensembles they are part of. The best model combination methods are a trimmed mean (constructed using the central four or six predictions each day) and a weighted mean ensemble (with weights calculated from calibration performance) that places relatively large weights on the better performing models. Conditional ensembles, in which separate model weights are used in different system states (e.g. summer and winter, high and low flows) generally yield little improvement over the weighted mean ensemble. However a conditional ensemble that discriminates between rising and receding flows shows moderate improvement. An analysis of ensemble predictions shows that the best ensembles are not necessarily those containing the best individual models. Conversely, it appears that some models that predict well individually do not necessarily combine well with other models in
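The two best-performing combination rules reported above, a trimmed mean over the central few predictions and a weighted mean with weights derived from calibration skill, can be sketched directly. A minimal illustration (the prediction values and weights are placeholders, not the study's streamflow data):

```python
def trimmed_mean(predictions, keep=4):
    """Trimmed-mean ensemble: average the central `keep` predictions,
    discarding the rest symmetrically from the low and high ends."""
    ordered = sorted(predictions)
    drop = (len(ordered) - keep) // 2
    central = ordered[drop:drop + keep]
    return sum(central) / keep

def weighted_mean(predictions, weights):
    """Weighted-mean ensemble: weight each model's prediction,
    e.g. by its performance over the calibration period."""
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total
```

With ten member models, `trimmed_mean(preds, keep=4)` reproduces the paper's "central four predictions each day" rule, and the trimming makes the combination robust to an individual model's bad day, which is consistent with the finding that even weak members contribute useful information.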
DEFF Research Database (Denmark)
Preisler, Sarah; Rebolj, Matejka; Untermann, Anette
2013-01-01
New commercially available Human Papillomavirus (HPV) assays need to be evaluated in a variety of cervical screening settings. Cobas HPV Test (cobas) is a real-time PCR-based assay allowing for separate detection of HPV genotypes 16 and 18 and a bulk of 12 other high-risk genotypes. The aim...... of the present study, Horizon, was to assess the prevalence of high-risk HPV infections in an area with a high background risk of cervical cancer, where women aged 23-65 years are targeted for cervical screening. We collected 6,258 consecutive cervical samples from the largest cervical screening laboratory...... in Denmark serving the whole of Copenhagen. All samples were stored in SurePath media. In total, 5,072 samples were tested with cobas, Hybrid Capture 2 High Risk HPV DNA test (HC2) and liquid-based cytology. Of these, 27% tested positive on cobas. This proportion decreased by age, being 43% in women aged 23...
Spittal, Matthew J; Carlin, John B; Currier, Dianne; Downes, Marnie; English, Dallas R; Gordon, Ian; Pirkis, Jane; Gurrin, Lyle
2016-10-31
The Australian Longitudinal Study on Male Health (Ten to Men) used a complex sampling scheme to identify potential participants for the baseline survey. This raises important questions about when and how to adjust for the sampling design when analyzing data from the baseline survey. We describe the sampling scheme used in Ten to Men focusing on four important elements: stratification, multi-stage sampling, clustering and sample weights. We discuss how these elements fit together when using baseline data to estimate a population parameter (e.g., population mean or prevalence) or to estimate the association between an exposure and an outcome (e.g., an odds ratio). We illustrate this with examples using a continuous outcome (weight in kilograms) and a binary outcome (smoking status). Estimates of a population mean or disease prevalence using Ten to Men baseline data are influenced by the extent to which the sampling design is addressed in an analysis. Estimates of mean weight and smoking prevalence are larger in unweighted analyses than weighted analyses (e.g., mean = 83.9 kg vs. 81.4 kg; prevalence = 18.0 % vs. 16.7 %, for unweighted and weighted analyses respectively) and the standard error of the mean is 1.03 times larger in an analysis that acknowledges the hierarchical (clustered) structure of the data compared with one that does not. For smoking prevalence, the corresponding standard error is 1.07 times larger. Measures of association (mean group differences, odds ratios) are generally similar in unweighted or weighted analyses and whether or not adjustment is made for clustering. The extent to which the Ten to Men sampling design is accounted for in any analysis of the baseline data will depend on the research question. When the goals of the analysis are to estimate the prevalence of a disease or risk factor in the population or the magnitude of a population-level exposure-outcome association, our advice is to adopt an analysis that respects the
Directory of Open Access Journals (Sweden)
Matthew J. Spittal
2016-10-01
Full Text Available Abstract Background The Australian Longitudinal Study on Male Health (Ten to Men) used a complex sampling scheme to identify potential participants for the baseline survey. This raises important questions about when and how to adjust for the sampling design when analyzing data from the baseline survey. Methods We describe the sampling scheme used in Ten to Men focusing on four important elements: stratification, multi-stage sampling, clustering and sample weights. We discuss how these elements fit together when using baseline data to estimate a population parameter (e.g., population mean or prevalence) or to estimate the association between an exposure and an outcome (e.g., an odds ratio). We illustrate this with examples using a continuous outcome (weight in kilograms) and a binary outcome (smoking status). Results Estimates of a population mean or disease prevalence using Ten to Men baseline data are influenced by the extent to which the sampling design is addressed in an analysis. Estimates of mean weight and smoking prevalence are larger in unweighted analyses than weighted analyses (e.g., mean = 83.9 kg vs. 81.4 kg; prevalence = 18.0% vs. 16.7%, for unweighted and weighted analyses respectively), and the standard error of the mean is 1.03 times larger in an analysis that acknowledges the hierarchical (clustered) structure of the data compared with one that does not. For smoking prevalence, the corresponding standard error is 1.07 times larger. Measures of association (mean group differences, odds ratios) are generally similar in unweighted or weighted analyses, whether or not adjustment is made for clustering. Conclusions The extent to which the Ten to Men sampling design is accounted for in any analysis of the baseline data will depend on the research question. When the goals of the analysis are to estimate the prevalence of a disease or risk factor in the population or the magnitude of a population-level exposure-outcome association, our advice is to adopt an analysis that respects the
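The contrast between unweighted and design-weighted estimates described above comes down to whether each respondent contributes equally or in proportion to a sampling weight. A minimal sketch (the values and weights are illustrative, not Ten to Men data):

```python
def mean_unweighted(values):
    """Every respondent contributes equally."""
    return sum(values) / len(values)

def mean_weighted(values, weights):
    """Each respondent contributes in proportion to a design weight,
    typically the inverse of their probability of selection."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def prevalence_weighted(flags, weights):
    """Design-weighted prevalence of a binary indicator (e.g. smoking)."""
    return mean_weighted([1.0 if f else 0.0 for f in flags], weights)
```

When over-sampled groups differ systematically on the outcome, the two means diverge, which is exactly the 83.9 kg vs. 81.4 kg pattern the abstract reports.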
Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways.
Seyler, Sean L; Kumar, Avishek; Thorpe, M F; Beckstein, Oliver
2015-10-01
Diverse classes of proteins function through large-scale conformational changes and various sophisticated computational algorithms have been proposed to enhance sampling of these macromolecular transition paths. Because such paths are curves in a high-dimensional space, it has been difficult to quantitatively compare multiple paths, a necessary prerequisite to, for instance, assess the quality of different algorithms. We introduce a method named Path Similarity Analysis (PSA) that enables us to quantify the similarity between two arbitrary paths and extract the atomic-scale determinants responsible for their differences. PSA utilizes the full information available in 3N-dimensional configuration space trajectories by employing the Hausdorff or Fréchet metrics (adopted from computational geometry) to quantify the degree of similarity between piecewise-linear curves. It thus completely avoids relying on projections into low dimensional spaces, as used in traditional approaches. To elucidate the principles of PSA, we quantified the effect of path roughness induced by thermal fluctuations using a toy model system. Using, as an example, the closed-to-open transitions of the enzyme adenylate kinase (AdK) in its substrate-free form, we compared a range of protein transition path-generating algorithms. Molecular dynamics-based dynamic importance sampling (DIMS) MD and targeted MD (TMD) and the purely geometric FRODA (Framework Rigidity Optimized Dynamics Algorithm) were tested along with seven other methods publicly available on servers, including several based on the popular elastic network model (ENM). PSA with clustering revealed that paths produced by a given method are more similar to each other than to those from another method and, for instance, that the ENM-based methods produced relatively similar paths. PSA applied to ensembles of DIMS MD and FRODA trajectories of the conformational transition of diphtheria toxin, a particularly challenging example, showed that
Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways.
Directory of Open Access Journals (Sweden)
Sean L Seyler
2015-10-01
Full Text Available Diverse classes of proteins function through large-scale conformational changes, and various sophisticated computational algorithms have been proposed to enhance sampling of these macromolecular transition paths. Because such paths are curves in a high-dimensional space, it has been difficult to quantitatively compare multiple paths, a necessary prerequisite to, for instance, assess the quality of different algorithms. We introduce a method named Path Similarity Analysis (PSA) that enables us to quantify the similarity between two arbitrary paths and extract the atomic-scale determinants responsible for their differences. PSA utilizes the full information available in 3N-dimensional configuration space trajectories by employing the Hausdorff or Fréchet metrics (adopted from computational geometry) to quantify the degree of similarity between piecewise-linear curves. It thus completely avoids relying on projections into low-dimensional spaces, as used in traditional approaches. To elucidate the principles of PSA, we quantified the effect of path roughness induced by thermal fluctuations using a toy model system. Using, as an example, the closed-to-open transitions of the enzyme adenylate kinase (AdK) in its substrate-free form, we compared a range of protein transition path-generating algorithms. Molecular dynamics-based dynamic importance sampling (DIMS) MD and targeted MD (TMD) and the purely geometric FRODA (Framework Rigidity Optimized Dynamics Algorithm) were tested along with seven other methods publicly available on servers, including several based on the popular elastic network model (ENM). PSA with clustering revealed that paths produced by a given method are more similar to each other than to those from another method and, for instance, that the ENM-based methods produced relatively similar paths. PSA applied to ensembles of DIMS MD and FRODA trajectories of the conformational transition of diphtheria toxin, a particularly challenging example, showed that
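The Hausdorff metric that PSA adopts from computational geometry has a simple discrete form for piecewise-linear paths: the largest distance from any vertex of one path to its nearest vertex on the other, symmetrized over both directions. A minimal sketch for paths given as coordinate tuples (PSA itself operates on 3N-dimensional trajectories and also supports the Fréchet metric, which this sketch does not cover):

```python
import math

def hausdorff(path_a, path_b):
    """Symmetric discrete Hausdorff distance between two paths,
    each given as a non-empty list of coordinate tuples."""
    def directed(p, q):
        # Worst-case nearest-neighbor distance from p's points to q.
        return max(min(math.dist(a, b) for b in q) for a in p)
    return max(directed(path_a, path_b), directed(path_b, path_a))
```

Because the metric works on the curves themselves, no projection onto a low-dimensional reaction coordinate is needed, which is the property the paper emphasizes.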
A census-weighted, spatially-stratified household sampling strategy for urban malaria epidemiology
Directory of Open Access Journals (Sweden)
Slutsker Laurence
2008-02-01
Full Text Available Abstract Background Urban malaria is likely to become increasingly important as a consequence of the growing proportion of Africans living in cities. A novel sampling strategy was developed for urban areas to generate a sample simultaneously representative of population and inhabited environments. Such a strategy should facilitate analysis of important epidemiological relationships in this ecological context. Methods Census maps and summary data for Kisumu, Kenya, were used to create a pseudo-sampling frame using the geographic coordinates of census-sampled structures. For every enumeration area (EA) designated as urban by the census (n = 535), a sample of structures equal to one-tenth the number of households was selected. In EAs designated as rural (n = 32), a geographically random sample totalling one-tenth the number of households was selected from a grid of points at 100 m intervals. The selected samples were cross-referenced to a geographic information system, and coordinates transferred to handheld global positioning units. Interviewers found the closest eligible household to the sampling point and interviewed the caregiver of a child aged Results 4,336 interviews were completed in 473 of the 567 study area EAs from June 2002 through February 2003. EAs without completed interviews were randomly distributed, and non-response was approximately 2%. Mean distance from the assigned sampling point to the completed interview was 74.6 m, and was significantly less in urban than rural EAs, even when controlling for number of households. The selected sample had significantly more children and females of childbearing age than the general population, and fewer older individuals. Conclusion This method selected a sample that was simultaneously population-representative and inclusive of important environmental variation. The use of a pseudo-sampling frame and pre-programmed handheld GPS units is more efficient and may yield a more complete sample than
Directory of Open Access Journals (Sweden)
Don-Roger Parkinson
2016-02-01
Full Text Available Water samples were collected and analyzed for conductivity, pH, temperature and trihalomethanes (THMs) during the fall of 2014 at two monitored municipal drinking water source ponds. Both spot (or grab) and time-weighted average (TWA) sampling methods were assessed over the same two-day sampling time period. For spot sampling, replicate samples were taken at each site and analyzed within 12 h of sampling by both headspace (HS-) and direct immersion (DI-) solid-phase microextraction (SPME) sampling/extraction methods followed by gas chromatography/mass spectrometry (GC/MS). For TWA, a two-day passive on-site TWA sampling was carried out at the same sampling points in the ponds. All SPME sampling methods undertaken used a 65-µm PDMS/DVB SPME fiber, which was found optimal for THM sampling. Sampling conditions were optimized in the laboratory using calibration standards of chloroform, bromoform, bromodichloromethane, dibromochloromethane, 1,2-dibromoethane and 1,2-dichloroethane, prepared in aqueous solutions from analytical grade samples. Calibration curves for all methods, with R² values ranging from 0.985–0.998 (N = 5) over the quantitation linear range of 3–800 ppb, were achieved. The different sampling methods were compared for quantification of the water samples, and results showed that DI- and TWA-sampling methods gave better data and analytical metrics. Addition of 10% wt./vol. of (NH₄)₂SO₄ salt to the sampling vial was found to aid extraction of THMs by increasing GC peak areas by about 10%, which resulted in lower detection limits for all techniques studied. However, for on-site TWA analysis of THMs in natural waters, the calibration standards' ionic strength conditions must be carefully matched to natural water conditions to properly quantitate THM concentrations. The data obtained from the TWA method may better reflect actual natural water conditions.
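The calibration-curve step described above amounts to an ordinary least-squares line through the standards, with R² reporting goodness of fit and quantitation by inverting the fitted line. A self-contained sketch (the numbers in the doctest-style assertions are illustrative, not the paper's data):

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope*x + intercept, with R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared

def quantitate(signal, slope, intercept):
    """Invert the calibration line to recover a concentration (ppb)."""
    return (signal - intercept) / slope
```

Quantitation is only trustworthy for signals falling inside the calibrated linear range (3–800 ppb here), and, as the abstract notes for TWA sampling, only when the standards' matrix matches the sample matrix.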
Gaskin, Pamela S.; Lai, Pamela; Guy, Devon; Knight, JaDon; Jackson, Maria; Nielsen, Anders L.
2012-01-01
Objective. Barbados, a small developing state at the end of the nutrition transition, faces an obesity epidemic. Although there is hope of stemming the epidemic in childhood, no descriptions of children's dietary and physical activity (PA) patterns are available for planning purposes. We describe the food and activity preferences and adult encouragement of active and sedentary behaviors for children 9–11 years in relation to weight status and the cultural context. Design. We used data from a ...
The relationship between sleep and weight in a sample of adolescents.
Lytle, Leslie A; Pasch, Keryn E; Farbakhsh, Kian
2011-02-01
Research to date in young children and adults shows a strong, inverse relationship between sleep duration and risk for overweight and obesity. Fewer studies examining this relationship have been conducted in adolescents. The purpose of the article is to describe the relationship between sleep and weight in a population of adolescents, controlling for demographics, energy intake, energy expenditure, and depression. This is a cross-sectional study of 723 adolescents participating in population-based studies of the etiologic factors related to obesity. We examined the relationship between three weight-related dependent variables obtained through a clinical assessment and three sleep variables obtained through self-report. Average caloric intake from dietary recalls, average activity counts based on accelerometers, and depression were included as covariates and the analysis was stratified by gender and grade level. Our results show that the relationship between sleep duration and BMI is evident in middle-school boys (β = -0.32, s.e. = 0.06). In high school, sleep patterns have little association with weight in males, but in high-school girls, waking up late on weekends as compared to weekdays is associated with lower body fat (β = -0.80, s.e. = 0.40; P = 0.05) and a healthy weight status (β = -0.28, s.e. = 0.14; P = 0.05). This study adds to the evidence that, particularly for middle-school boys and girls, inadequate sleep is a risk factor for early adolescent obesity. Future research needs to examine the relationship longitudinally and to study potential mediators of the relationship.
A census-weighted, spatially-stratified household sampling strategy for urban malaria epidemiology.
Siri, Jose G; Lindblade, Kim A; Rosen, Daniel H; Onyango, Bernard; Vulule, John M; Slutsker, Laurence; Wilson, Mark L
2008-02-29
Urban malaria is likely to become increasingly important as a consequence of the growing proportion of Africans living in cities. A novel sampling strategy was developed for urban areas to generate a sample simultaneously representative of population and inhabited environments. Such a strategy should facilitate analysis of important epidemiological relationships in this ecological context. Census maps and summary data for Kisumu, Kenya, were used to create a pseudo-sampling frame using the geographic coordinates of census-sampled structures. For every enumeration area (EA) designated as urban by the census (n = 535), a sample of structures equal to one-tenth the number of households was selected. In EAs designated as rural (n = 32), a geographically random sample totalling one-tenth the number of households was selected from a grid of points at 100 m intervals. The selected samples were cross-referenced to a geographic information system, and coordinates transferred to handheld global positioning units. Interviewers found the closest eligible household to the sampling point and interviewed the caregiver of a child aged < 10 years. The demographics of the selected sample were compared with results from the Kenya Demographic and Health Survey to assess sample validity. Results were also compared among urban and rural EAs. 4,336 interviews were completed in 473 of the 567 study area EAs from June 2002 through February 2003. EAs without completed interviews were randomly distributed, and non-response was approximately 2%. Mean distance from the assigned sampling point to the completed interview was 74.6 m, and was significantly less in urban than rural EAs, even when controlling for number of households. The selected sample had significantly more children and females of childbearing age than the general population, and fewer older individuals. This method selected a sample that was simultaneously population-representative and inclusive of important environmental
Ytreberg, F Marty; Borcherds, Wade; Wu, Hongwei; Daughdrill, Gary W
2015-01-01
A short segment of the disordered p53 transactivation domain (p53TAD) forms an amphipathic helix when bound to the E3 ubiquitin ligase, MDM2. In the unbound p53TAD, this short segment has transient helical secondary structure. Using a method that combines broad sampling of conformational space with re-weighting, it is shown that it is possible to generate multiple, independent structural ensembles that have highly similar secondary structure distributions for both p53TAD and a P27A mutant. Fractional amounts of transient helical secondary structure were found at the MDM2 binding site that are very similar to estimates based directly on experimental observations. Structures were identified in these ensembles containing segments that are highly similar to short p53 peptides bound to MDM2, even though the ensembles were re-weighted using unbound experimental data. Ensembles were generated using chemical shift data (alpha carbon only, or in combination with other chemical shifts) and cross-validated by predicting residual dipolar couplings. We think this ensemble generator could be used to predict the bound state structure of protein interaction sites in IDPs if there are detectable amounts of matching transient secondary structure in the unbound state.
Ensemble Forecasting of Major Solar Flares
Guerra, J A; Uritsky, V M
2015-01-01
We present the results from the first ensemble prediction model for major solar flares (M and X classes). Using the probabilistic forecasts from three models hosted at the Community Coordinated Modeling Center (NASA-GSFC) and the NOAA forecasts, we developed an ensemble forecast by linearly combining the flaring probabilities from all four methods. Performance-based combination weights were calculated using a Monte Carlo-type algorithm by applying a decision threshold $P_{th}$ to the combined probabilities and maximizing the Heidke Skill Score (HSS). Using the probabilities and events time series from 13 recent solar active regions (2012 - 2014), we found that a linear combination of probabilities can improve both probabilistic and categorical forecasts. Combination weights vary with the applied threshold and none of the tested individual forecasting models seem to provide more accurate predictions than the others for all values of $P_{th}$. According to the maximum values of HSS, a performance-based weights ...
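The combination scheme this abstract describes, linearly weighting member probabilities, thresholding at $P_{th}$, and scoring with HSS, can be sketched as follows. This is a minimal illustration with my own function names; the paper's Monte Carlo search over performance-based weights is omitted.

```python
import numpy as np

def heidke_skill_score(probs, events, p_th):
    """HSS of the categorical forecast obtained by thresholding probs at p_th."""
    pred = np.asarray(probs) >= p_th
    obs = np.asarray(events).astype(bool)
    a = np.sum(pred & obs)        # hits
    b = np.sum(pred & ~obs)       # false alarms
    c = np.sum(~pred & obs)       # misses
    d = np.sum(~pred & ~obs)      # correct rejections
    den = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / den if den else 0.0

def combine(member_probs, weights):
    """Linear combination of per-model flaring probabilities (rows = models)."""
    w = np.asarray(weights, dtype=float)
    return np.asarray(member_probs).T @ (w / w.sum())
```

In use, one would sweep `p_th` over a grid and pick the weights maximizing `heidke_skill_score` on past active-region time series.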
Shih, Weichung Joe; Li, Gang; Wang, Yining
2016-03-01
Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted value and other constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one. Copyright © 2015 Elsevier Inc. All rights reserved.
The semantic similarity ensemble
Directory of Open Access Journals (Sweden)
Andrea Ballatore
2013-12-01
Full Text Available Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.
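The panel-of-experts idea can be illustrated with a toy sketch; the two concrete measures below are hypothetical stand-ins of my own, not the geo-semantic measures evaluated in the paper, and simple averaging stands in for the paper's composition.

```python
def jaccard(a, b):
    """Character-set overlap: a crude stand-in for one similarity expert."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def length_ratio(a, b):
    """Another stand-in expert: similarity as the ratio of term lengths."""
    return min(len(a), len(b)) / max(len(a), len(b))

def sse(a, b, experts=(jaccard, length_ratio)):
    """Semantic similarity ensemble: each measure acts as one expert on the
    panel; the ensemble judgment is the mean of their individual scores."""
    return sum(m(a, b) for m in experts) / len(experts)
```

Any callable mapping a term pair to [0, 1] can join the panel, which is what lets the ensemble hedge against any single measure's epistemic bias.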
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
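The weighted-majority-vote fusion step for the custom ensemble can be sketched as below (a minimal sketch with my own naming, not the authors' implementation): each base classifier casts its predicted class, and votes are tallied with per-classifier weights.

```python
from collections import defaultdict

def weighted_majority_vote(votes, weights):
    """Fuse single-classifier decisions: each classifier's predicted class
    receives that classifier's weight; the class with the largest total wins."""
    scores = defaultdict(float)
    for cls, w in zip(votes, weights):
        scores[cls] += w
    return max(scores, key=scores.get)
```

Here `votes` would hold the decisions of, e.g., the decision tree, kNN, SVM, and neural network, with weights reflecting each classifier's validation accuracy.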
Alhusen, Jeanne L; Geller, Ruth; Dreisbach, Caitlin; Constantoulakis, Leeza; Siega-Riz, Anna Maria
To examine the effects of intimate partner violence (IPV) at varied time points in the perinatal period on inadequate and excessive gestational weight gain. Retrospective cohort using population-based secondary data. Pregnancy Risk Assessment Monitoring System and birth certificate data from New York City and 35 states. Data were obtained for 251,342 U.S. mothers who gave birth from 2004 through 2011 and completed the Pregnancy Risk Assessment Monitoring System survey 2 to 9 months after birth. The exposure was perinatal IPV, defined as experiencing physical abuse by a current or ex-partner in the year before or during pregnancy. Adequacy of gestational weight gain (GWG) was categorized using 2009 Institute of Medicine guidelines. Weighted descriptive statistics and multivariate logistic regression models were used. Approximately 6% of participants reported perinatal IPV, 2.7% reported IPV in the year before pregnancy, 1.1% reported IPV during pregnancy only, and the remaining 2.5% reported IPV before and during pregnancy. Inadequate GWG was more prevalent among participants who experienced IPV during pregnancy and those who experienced IPV before and during pregnancy (23.3% and 23.5%, respectively) than in participants who reported no IPV (20.2%). In adjusted modeling, only participants who experienced IPV before pregnancy had weakly significant odds of excessive GWG (adjusted odds ratio = 1.14, 95% CI [1.02, 1.26]). The association between perinatal IPV and inadequate GWG was explained by confounding variables; however, women who reported perinatal IPV had greater rates of GWG outside the optimal range. Future studies are needed to determine how relevant confounding variables may affect a woman's GWG. Copyright © 2017 AWHONN, the Association of Women's Health, Obstetric and Neonatal Nurses. Published by Elsevier Inc. All rights reserved.
Precipitation Ensembles from Single-Value Forecasts for Hydrological Ensemble Forecasting
Demargne, J.; Schaake, J.; Wu, L.; Welles, E.; Herr, H.; Seo, D.
2005-05-01
An ensemble pre-processor was developed to produce short-term precipitation ensembles using operational single-value forecasts. The methodology attempts to quantify the uncertainty in the single-value forecast and to capture the skill therein. These precipitation ensemble forecasts could be then ingested in the NOAA/National Weather Service (NWS) Ensemble Streamflow Prediction (ESP) system to produce probabilistic hydrological forecasts that reflect the uncertainty in forecast precipitation. The procedure constructs the joint distribution of forecast and observed precipitation from historical pairs of forecast and observed values. The probability distribution function of the future events that may occur given a particular single-value forecast is then the conditional distribution of observed precipitation given the forecast. To generate individual ensemble members for each lead time and each location, the historical observed values are replaced with values sampled from the conditional distribution given the single-value forecast. The replacement procedure matches the ranks of historical and rescaled values to preserve the space-time properties of observed precipitation in the ensemble traces. Currently, the ensemble pre-processor is being tested and evaluated at four NOAA/NWS River Forecast Centers (RFCs) in the U.S.A. In this contribution, we present the results thus far from the field and retrospective evaluations, and key science issues that must be addressed toward national operational implementation.
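The rank-matching replacement step, reordering values sampled from the conditional distribution so their ranks reproduce those of a historical observed trace, can be sketched as follows (`rank_match` is my own name for an assumed one-site, one-lead-time slice of the procedure).

```python
import numpy as np

def rank_match(historical, sampled):
    """Reorder sampled values so their ranks reproduce the ranks of the
    historical observed trace, preserving its space-time rank structure."""
    ranks = np.argsort(np.argsort(historical))   # rank of each historical value
    return np.sort(sampled)[ranks]
```

Applying this slot-by-slot across locations and lead times is what lets the rescaled ensemble traces inherit the spatial and temporal correlation of observed precipitation.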
Bridging the ensemble Kalman filter and particle filters
Energy Technology Data Exchange (ETDEWEB)
Stordal, Andreas Stoerksen; Karlsen, Hans A.; Naevdal, Geir; Skaug, Hans J.; Valles, Brice
2009-12-15
The nonlinear filtering problem occurs in many scientific areas. Sequential Monte Carlo solutions with the correct asymptotic behavior, such as particle filters, exist but are computationally too expensive when working with high-dimensional systems. The ensemble Kalman filter is a more robust method that has shown promising results with a small sample size, but the samples are not guaranteed to come from the true posterior distribution. By approximating the model error with Gaussian kernels, the resulting Gaussian mixture filter gains the advantages of both a local Kalman-type correction and the weighting/resampling step of a particle filter. The Gaussian mixture approximation relies on a tunable bandwidth parameter which often has to be kept quite large in order to avoid weight collapse in high dimensions. As a result, the Kalman correction is too large to capture highly non-Gaussian posterior distributions. In this paper we have extended the Gaussian mixture filter (Hoteit et al., 2008b) and also made the connection to particle filters more transparent. In particular we introduce a tuning parameter for the importance weights. In the last part of the paper we have performed a simulation experiment with the Lorenz40 model where our method has been compared to the EnKF and a full implementation of a particle filter. The results clearly indicate that the new method has advantages compared to the standard EnKF. (Author)
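The role of an importance-weight tuning parameter can be illustrated schematically (a sketch under my own naming, not the paper's formulation): with α = 1 one recovers a full particle-filter likelihood weighting, with α = 0 all particles stay equally weighted as in an EnKF-style update, and intermediate α damps weight collapse in high dimensions.

```python
import numpy as np

def tempered_weights(log_like, alpha):
    """Normalized importance weights w_i ∝ exp(alpha * log p(y | x_i)).
    alpha in [0, 1] interpolates between uniform weights (alpha = 0)
    and full particle-filter weighting (alpha = 1)."""
    lw = alpha * np.asarray(log_like, dtype=float)
    lw -= lw.max()                # stabilize before exponentiation
    w = np.exp(lw)
    return w / w.sum()
```

A weighting step like this would follow the Kalman-type correction, with resampling applied to the tempered weights.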
Directory of Open Access Journals (Sweden)
Pamela S. Gaskin
2012-01-01
Full Text Available Objective. Barbados, a small developing state at the end of the nutrition transition, faces an obesity epidemic. Although there is hope of stemming the epidemic in childhood, no descriptions of children's dietary and physical activity (PA) patterns are available for planning purposes. We describe the food and activity preferences and adult encouragement of active and sedentary behaviors for children 9–11 years in relation to weight status and the cultural context. Design. We used data from a pilot study preceding a large-scale ongoing study on the local drivers of the obesity epidemic among preadolescent children. PA, sedentary activity, and dietary intakes were assessed from recalls. Weight and height were measured. Setting. Barbados. Subjects. Sixty-two (62) 9–11-year-old school children. Results. Sugar-sweetened beverages provided 21% of energy consumed. Energy intake significantly explained BMI. Parents selected significantly more of children's sedentary activities and encouraged mostly homework and chores (59%). Children's self-selected school-based activity was significantly related to BMI. Conclusions. Childhood obesity prevention recommendations and research should focus on culture-specific practices that promote acquired taste for excess sugar and parent-child interactions regarding PA. Child-influenced school-based activity may be an important area for preventive intervention research.
Woo, Kang-Lyung
2005-01-01
Low molecular weight alcohols including fusel oil were determined using diethyl ether extraction and capillary gas chromatography. Twelve kinds of alcohols were successfully resolved on the HP-FFAP (polyethylene glycol) capillary column. The diethyl ether extraction method was very useful for the analysis of alcohols in alcoholic beverages and biological samples with excellent cleanliness of the resulting chromatograms and high sensitivity compared to the direct injection method. Calibration graphs for all standard alcohols showed good linearity in the concentration range used, 0.001-2% (w/v) for all alcohols. Salting out effects were significant (p < 0.01) for the low molecular weight alcohols methanol, isopropanol, propanol, 2-butanol, n-butanol and ethanol, but not for the relatively high molecular weight alcohols amyl alcohol, isoamyl alcohol, and heptanol. The coefficients of variation of the relative molar responses were less than 5% for all of the alcohols. The limits of detection and quantitation were 1-5 and 10-60 microg/L for the diethyl ether extraction method, and 10-50 and 100-350 microg/L for the direct injection method, respectively. The retention times and relative retention times of standard alcohols were significantly shifted in the direct injection method when the injection volumes were changed, even with the same analysis conditions, but they were not influenced in the diethyl ether extraction method. The recoveries by the diethyl ether extraction method were greater than 95% for all samples and greater than 97% for biological samples.
Race, education, and weight change in a biracial sample of women at midlife.
Lewis, Tené T; Everson-Rose, Susan A; Sternfeld, Barbara; Karavolos, Kelly; Wesley, Deidre; Powell, Lynda H
2005-03-14
Overall rates of obesity have increased dramatically in the United States, yet African American women remain disproportionately represented among the overweight and obese. The excess weight observed in African American women is primarily considered a result of low socioeconomic status, but recent cross-sectional findings suggest otherwise. We examined the interactive effects of race and 3 levels of education (low [high school or less]; moderate [some college]; and high [college degree or more]) on body mass index (BMI) (calculated as weight in kilograms divided by the square of height in meters) and changes in BMI over 4 years in 2019 middle-aged African American and white women from the Study of Women's Health Across the Nation (SWAN). Data were analyzed with mixed effects regression models. At baseline, we observed a significant race x education interaction (estimate, -3.7; 95% confidence interval, -5.3 to -2.1), indicating that racial differences in BMI varied by level of education. At the lowest level of education, African American and white women were similar in BMI (means, 31.1 [African American] and 31.2 [white]). Body mass index increased significantly for all women over follow-up (estimate, 0.22; 95% confidence interval, 0.17 to 0.26), but the increase did not differ significantly by race, education, or race x education. Results were unchanged after adjustment for potential confounding variables. For middle-aged women, racial disparities in BMI are largely patterned by education, with the greatest disparities observed at higher levels of education. The absence of significant longitudinal effects suggests that these race-education patterns are set in place and well established before midlife.
Disentangling the Weight of School Dropout Predictors: A Test on Two Longitudinal Samples.
Janosz, Michel; LeBlanc, Marc; Boulerice, Bernard; Tremblay, Richard E.
1997-01-01
Predictors of school dropout and their stability over time were studied in 791 white French-speaking Canadian students aged 12 to 16 years in 1974 and 791 similar students in 1985 through self-administered questionnaires. School, family, behavioral, social, and personality variables predicted dropping out in both samples, and these predictors were…
Thiryayi, Sakinah A; Marshall, Janet; Rana, Durgesh N
2009-05-01
A recent audit at our institution revealed a higher number of cases diagnosed as endocervical glandular neoplasia on ThinPrep (TP) cervical cytology samples (9 cases) as opposed to SurePath (SP) (1 case), which on histology showed only high-grade cervical intraepithelial neoplasia (CIN) with endocervical crypt involvement (CI). We attempted to ascertain the reasons for this finding by reviewing the available slides of these cases, as well as slides of cases diagnosed as glandular neoplasia on cytology and histology; cases diagnosed as high-grade squamous intraepithelial lesions (HSIL) on cytology which had CIN with CI on histology and cases with mixed glandular and squamous abnormalities diagnosed both cytologically and histologically. Single neoplastic glandular cells and short pseudostratified strips were more prevalent in SP than TP with the cell clusters in glandular neoplasia 3-4 cells thick, in contrast to the dense crowded centre of cell groups in HSIL with CI. The cells at the periphery of groups can be misleading. Cases with HSIL and glandular neoplasia have a combination of the features of each entity in isolation. The diagnosis of glandular neoplasia remains challenging and conversion from conventional to liquid based cervical cytology requires a period of learning and adaptation, which can be facilitated by local audit and review of the cytology slides in cases with a cytology-histology mismatch. (c) 2009 Wiley-Liss, Inc.
The impact of obesity and weight gain on development of sleep problems in a population-based sample.
Palm, Andreas; Janson, Christer; Lindberg, Eva
2015-05-01
The objective of this study was to investigate the role of obesity and weight gain in the development of sleep problems in a population-based cohort. A population-based sample of men (n = 1896, aged 40-79 years) and women (n = 5116, age ≥20 years) responded to questionnaires at baseline and follow-up after 10-13 years. Sleep problems were assessed through questions about difficulties initiating sleep (DIS), difficulties maintaining sleep (DMS), excessive daytime sleepiness (EDS), and insomnia. Body mass index (BMI) was calculated from self-reported weight and height at both baseline and follow-up, while confounding factors (physical activity, tobacco and alcohol use, somatic disease, and snoring) were based on responses at baseline. Although overweight and obese subjects reported more sleep problems at baseline, there was no independent association between BMI level at baseline and development of new sleep problems. Subjects in the quartile with the highest rise in BMI, with a weight gain exceeding 2.06 kg/m2, had a higher risk of developing DMS [adjusted odds ratio (OR) 1.58; 95% confidence interval (CI) 1.25-2.01], EDS (2.25; 1.65-3.06), and insomnia (2.78; 1.60-4.82). Weight gain was not associated with the development of DIS. Weight gain is an independent risk factor for developing several sleep problems and daytime sleepiness. The presence of overweight and weight gain should be considered when treating patients with sleep problems. Copyright © 2015 Elsevier B.V. All rights reserved.
Ensemble Learning for Multi-Source Neural Machine Translation
Garmash, E.; Monz, C.
2016-01-01
In this paper we describe and evaluate methods to perform ensemble prediction in neural machine translation (NMT). We compare two methods of ensemble set induction: sampling parameter initializations for an NMT system, which is a relatively established method in NMT (Sutskever et al., 2014), and NMT
DEFF Research Database (Denmark)
Hansen, Lars Kai; Salamon, Peter
1990-01-01
We propose several means for improving the performance and training of neural networks for classification. We use crossvalidation as a tool for optimizing network parameters and architecture. We show further that the remaining generalization error can be reduced by invoking ensembles of similar networks.
ESPC Coupled Global Ensemble Design
2014-09-30
root mean square error of the ensemble mean (RMSE), Continuous Ranked Probability Skill score (CRPS), the bias of the ensemble mean, and a measure of the...boundary conditions use a multi-year history of global ocean model output. RESULTS Atmosphere component: 1) Milestone: Develop ensemble...significant improvements, mainly in the ensemble 10m wind speed RMSE and CRPS metrics. An example of these improvements is shown for the case of the...
DEFF Research Database (Denmark)
Madsen, Mogens Ove
The concept of Path Dependence was originally developed within New Institutional Economics by, among others, David, Arthur and North. The concept has spread widely across the social sciences and undergone a development of its own. This paper argues that the concept has developed so extensively that one can now speak of a first and a second generation of the Path Dependence concept. The most recent development of the concept is relevant to the methodology discussions relating to Keynes...
SSAGES: Software Suite for Advanced General Ensemble Simulations.
Energy Technology Data Exchange (ETDEWEB)
Sidky, Hythem; Colon, Yamil J.; Helfferich, Julian; Sikora, Benjamin J.; Bezik, Cody; Chu, Weiwei; Giberti, Federico; Guo, Ashley Z.; Jiang, Xikai; Lequieu, Joshua P.; Webb, Michael; de Pablo, Juan J.
2018-01-28
Molecular simulation has emerged as an essential tool for modern-day research, but obtaining proper results and making reliable conclusions from simulations requires adequate sampling of the system under consideration. To this end, a variety of methods exist in the literature that can enhance sampling considerably, and increasingly sophisticated, effective algorithms continue to be developed at a rapid pace. Implementation of these techniques, however, can be challenging for experts and non-experts alike. There is a clear need for software that provides rapid, reliable, and easy access to a wide range of advanced sampling methods, and that facilitates implementation of new techniques as they emerge. Here we present SSAGES, a publicly available Software Suite for Advanced General Ensemble Simulations designed to interface with multiple widely used molecular dynamics simulations packages. SSAGES allows facile application of a variety of enhanced sampling techniques—including adaptive biasing force, string methods, and forward flux sampling—that extract meaningful free energy and transition path data from all-atom and coarse grained simulations. A noteworthy feature of SSAGES is a user-friendly framework that facilitates further development and implementation of new methods and collective variables. In this work, the use of SSAGES is illustrated in the context of simple representative applications involving distinct methods and different collective variables that are available in the current release of the suite.
NMR studies of dynamic biomolecular conformational ensembles.
Torchia, Dennis A
2015-02-01
Multidimensional heteronuclear NMR approaches can provide nearly complete sequential signal assignments of isotopically enriched biomolecules. The availability of assignments together with measurements of spin relaxation rates, residual spin interactions, J-couplings and chemical shifts provides information at atomic resolution about internal dynamics on timescales ranging from ps to ms, both in solution and in the solid state. However, due to the complexity of biomolecules, it is not possible to extract a unique atomic-resolution description of biomolecular motions even from extensive NMR data when many conformations are sampled on multiple timescales. For this reason, powerful computational approaches are increasingly applied to large NMR data sets to elucidate conformational ensembles sampled by biomolecules. In the past decade, considerable attention has been directed at an important class of biomolecules that function by binding to a wide variety of target molecules. Questions of current interest are: "Does the free biomolecule sample a conformational ensemble that encompasses the conformations found when it binds to various targets; and if so, on what time scale is the ensemble sampled?" This article reviews recent efforts to answer these questions, with a focus on comparing ensembles obtained for the same biomolecules by different investigators. A detailed comparison of results obtained is provided for three biomolecules: ubiquitin, calmodulin and the HIV-1 trans-activation response RNA. Published by Elsevier B.V.
Kernel Factory: An Ensemble of Kernel Machines
M. BALLINGS; D. VAN DEN POEL
2012-01-01
We propose an ensemble method for kernel machines. The training data is randomly split into a number of mutually exclusive partitions defined by a row and column parameter. Each partition forms an input space and is transformed by a kernel function into a kernel matrix K. Subsequently, each K is used as training data for a base binary classifier (Random Forest). This results in a number of predictions equal to the number of partitions. A weighted average combines the predictions into one final prediction.
DEFF Research Database (Denmark)
Fratini, Gerardo; Ibrom, Andreas; Arriga, Nicola
2012-01-01
It has been formerly recognised that increasing relative humidity in the sampling line of closed-path eddy-covariance systems leads to increasing attenuation of water vapour turbulent fluctuations, resulting in strong latent heat flux losses. This occurrence has been analyzed for very long (50 m)...
Bondas, Terese
2006-07-01
The aim was to explore why nurses enter nursing leadership and apply for a management position in health care. The study is part of a research programme in nursing leadership and evidence-based care. Nursing has not invested enough in the development of nursing leadership for the development of patient care. There is scarce research on nurses' motives and reasons for committing themselves to a career in nursing leadership. A strategic sample of 68 Finnish nurse leaders completed a semistructured questionnaire. Analytic induction was applied in an attempt to generate a theory. A theory, Paths to Nursing Leadership, is proposed for further research. Four different paths were found according to variations between the nurse leaders' education, primary commitment and situational factors. They are called the Path of Ideals, the Path of Chance, the Career Path and the Temporary Path. Situational factors and role models of good but also bad nursing leadership besides motivational and educational factors have played a significant role when Finnish nurses have entered nursing leadership. The educational requirements for nurse leaders and recruitment to nursing management positions need serious attention in order to develop a competent nursing leadership.
Tailored Random Graph Ensembles
Roberts, E. S.; Annibale, A.; Coolen, A. C. C.
2013-02-01
Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.
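A minimal sketch of the degree-conserving edge-swap move underlying such Markov chains is below. This simple chain merely rejects swaps that would create self-loops or parallel edges; the chains in the abstract additionally use non-trivial acceptance probabilities to converge to a strictly uniform (or otherwise targeted) measure, which this sketch omits.

```python
import random
from collections import Counter

def edge_swap_chain(edges, n_steps=2000, seed=1):
    # Degree-preserving double edge swaps: (a,b),(c,d) -> (a,d),(c,b).
    # Moves that would create a self-loop or a parallel edge are rejected,
    # so the chain stays inside simple graphs with the original degrees.
    rng = random.Random(seed)
    E = [tuple(sorted(e)) for e in edges]
    present = set(E)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(E)), 2)
        (a, b), (c, d) = E[i], E[j]
        if rng.random() < 0.5:      # randomize which endpoints are exchanged
            c, d = d, c
        e1, e2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if a == d or c == b or e1 in present or e2 in present:
            continue                # rejected move
        present -= {E[i], E[j]}
        present |= {e1, e2}
        E[i], E[j] = e1, e2
    return E

ring = [(i, (i + 1) % 6) for i in range(6)]   # 6-cycle, every degree = 2
new = edge_swap_chain(ring)
deg = Counter(v for e in new for v in e)
print(sorted(deg.values()))   # degrees conserved: [2, 2, 2, 2, 2, 2]
```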
Kingston Soundpainting Ensemble
Minors, Helen Julia; Kingston Soundpainting Ensemble
2012-01-01
This performance is designed to introduce teachers and school musicians to this live, multidisciplinary composing sign language. Led by Dr. Helen Julia Minors (soundpainter, trumpet, voice) at Kingston University, the Kingston Soundpainting Ensemble comprises a varied set of performers using woodwind, brass, voice and percussion, spanning popular, classical and world styles. This performance consists of: Philip Warda (electronic instru...

A review of issues in ensemble-based Kalman filtering
Energy Technology Data Exchange (ETDEWEB)
Ehrendorfer, M. [Dept. of Meteorology and Geophysics, The Univ. of Reading (United Kingdom)
2007-12-15
Ensemble-based data assimilation methods related to the fundamental theory of Kalman filtering have been explored in a variety of mostly non-operational data assimilation contexts over the past decade with increasing intensity. While promising properties have been reported, a number of issues that arise in the development and application of ensemble-based data assimilation techniques, such as in the basic form of the ensemble Kalman filter (EnKF), still deserve particular attention. The necessity of employing an ensemble of small size represents a fundamental issue which in turn leads to several related points that must be carefully considered. In particular, the need to correct for sampling noise in the covariance structure estimated from the finite ensemble must be mentioned. Covariance inflation, localization through a Schur/Hadamard product, preventing the occurrence of filter divergence and inbreeding, as well as the loss of dynamical balances, are all issues directly related to the use of small ensemble sizes. Attempts to reduce effectively the sampling error due to small ensembles and at the same time maintaining an ensemble spread that realistically describes error structures have given rise to the development of variants of the basic form of the EnKF. These include, for example, the Ensemble Adjustment Kalman Filter (EAKF), the Ensemble Transform Kalman Filter (ETKF), the Ensemble Square-Root Filter (EnSRF), and the Local Ensemble Kalman Filter (LEKF). Further important considerations within ensemble-based Kalman filtering concern issues such as the treatment of model error, stochastic versus deterministic updating algorithms, the ease of implementation and computational cost, serial processing of observations, avoiding the appearance of undesired dynamic imbalances, and the treatment of non-Gaussianity and nonlinearity. The discussion of the above issues within ensemble-based Kalman filtering forms the central topic of this article, that starts out with a
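Of the small-ensemble remedies listed in this abstract, multiplicative covariance inflation is the simplest to state; a minimal sketch follows (the inflation factor r is a hypothetical tuning value):

```python
import numpy as np

def inflate(ensemble, r=1.05):
    # Multiplicative covariance inflation: scale every member's deviation
    # from the ensemble mean by r, which multiplies the sample covariance
    # by r**2 while leaving the ensemble mean unchanged.
    mean = ensemble.mean(axis=0)
    return mean + r * (ensemble - mean)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))    # 20 members, 3 state variables
Xi = inflate(X, r=1.1)
print(np.allclose(np.cov(Xi.T), 1.1 ** 2 * np.cov(X.T)))   # True
```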
Gorman, Mark J.; Sogg, Stephanie; Lamont, Evan M.; Eddy, Kamryn T.; Becker, Anne E.; Thomas, Jennifer J.
2014-01-01
Objective: To evaluate the effectiveness of specific self-report questionnaires in detecting DSM-5 eating disorders identified via structured clinical interview in a weight-loss treatment–seeking obese sample, to improve eating disorder recognition in general clinical settings. Method: Individuals were recruited over a 3-month period (November 2, 2011, to January 10, 2012) when initially presenting to a hospital-based weight-management center in the northeastern United States, which offers evaluation and treatment for outpatients who are overweight or obese. Participants (N = 100) completed the Structured Clinical Interview for DSM-IV eating disorder module, a DSM-5 feeding and eating disorders interview, and a battery of self-report questionnaires. Results: Self-reports and interviews agreed substantially in the identification of bulimia nervosa (DSM-IV and DSM-5: tau-b = 0.71, P eating disorder (DSM-IV and DSM-5: tau-b = 0.60, P eating disorder (tau-b = 0.44, P eating syndrome: tau-b = –0.04, P = .72, r = 0.06 [DSM-5]). Discussion: Current self-report assessments are likely to identify full syndrome DSM-5 eating disorders in treatment-seeking obese samples, but unlikely to detect DSM-5 other specified feeding or eating disorders. We propose specific content changes that might enhance clinical utility as suggestions for future evaluation. PMID:25667810
Climate Model Ensemble Methodology: Rationale and Challenges
Vezer, M. A.; Myrvold, W.
2012-12-01
A tractable model of the Earth's atmosphere, or, indeed, any large, complex system, is inevitably unrealistic in a variety of ways. This will have an effect on the model's output. Nonetheless, we want to be able to rely on certain features of the model's output in studies aiming to detect, attribute, and project climate change. For this, we need assurance that these features reflect the target system, and are not artifacts of the unrealistic assumptions that go into the model. One technique for overcoming these limitations is to study ensembles of models which employ different simplifying assumptions and different methods of modelling. One then either takes as reliable certain outputs on which models in the ensemble agree, or takes the average of these outputs as the best estimate. Since the Intergovernmental Panel on Climate Change's Fourth Assessment Report (IPCC AR4), modellers have aimed to improve ensemble analysis by developing techniques to account for dependencies among models, and to ascribe unequal weights to models according to their performance. The goal of this paper is to present as clearly and cogently as possible the rationale for climate model ensemble methodology, the motivation of modellers to account for model dependencies, and their efforts to ascribe unequal weights to models. The method of our analysis is as follows. We will consider a simpler, well-understood case of taking the mean of a number of measurements of some quantity. Contrary to what is sometimes said, it is not a requirement of this practice that the errors of the component measurements be independent; one must, however, compensate for any lack of independence. We will also extend the usual accounts to include cases of unknown systematic error. We draw parallels between this simpler illustration and the more complex example of climate model ensembles, detailing how ensembles can provide more useful information than any of their constituent models. This account emphasizes the
Quinn, Elizabeth A; Largado, Fe; Borja, Judith B; Kuzawa, Christopher W
2015-05-01
Human milk contains many metabolic hormones that may influence infant growth. Milk leptin is positively associated with maternal adiposity and inversely associated with infant growth. Most research has been conducted in populations with higher leptin levels; it is not well understood how milk leptin may vary in lean populations or the associations that reduced leptin may have with infant size for age. It is also largely unknown if associations between maternal body composition and milk leptin persist past 1 year of age. We investigated the association between maternal body composition and milk leptin content in a sample of lean Filipino women and the association between milk leptin content and infant size for age. Milk samples were collected at in-home visits from 113 mothers from Cebu, Philippines. Milk leptin content was measured using EIA techniques; anthropometric data, dietary recalls, and household information were also collected. Mean ± standard deviation (SD) milk leptin in this sample was 300.7 ± 293.6 pg/mL, among the lowest previously reported. Mean ± SD maternal percentage body fat was 24.8% ± 3.5%. Mean ± SD infant age was 9.9 ± 7.0 months, and mean ± SD weight for age z-score was -0.98 ± 1.06. Maternal percentage body fat was a significant, positive predictor of milk leptin content. Milk leptin was a significant, inverse predictor of infant weight and body mass index z-scores in infants 1 year old or younger. The association between maternal body composition, milk leptin, and infant growth persists in mothers with lean body composition. Milk leptin is not associated with growth in older infants. © The Author(s) 2014.
Energy Technology Data Exchange (ETDEWEB)
Man, Jun [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Zhang, Jiangjiang [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Li, Weixuan [Pacific Northwest National Laboratory, Richland Washington USA; Zeng, Lingzao [Zhejiang Provincial Key Laboratory of Agricultural Resources and Environment, Institute of Soil and Water Resources and Environmental Science, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou China; Wu, Laosheng [Department of Environmental Sciences, University of California, Riverside California USA
2016-10-01
The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, which couples the EnKF with information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics, including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE), are used to design the optimal sampling strategy. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on the various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, a larger ensemble size improves the parameter estimation and the convergence of the optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied to any other hydrological problem.
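As one concrete example of the information metrics this abstract mentions, the relative entropy between two Gaussian summaries of the parameter distribution (say, prior versus updated ensemble statistics) has a closed form. The sketch below is illustrative, not the paper's estimator, and the example covariances are hypothetical.

```python
import numpy as np

def gaussian_relative_entropy(mu0, S0, mu1, S1):
    # KL( N(mu0, S0) || N(mu1, S1) ) in nats:
    # 0.5 * ( tr(S1^-1 S0) + (mu1-mu0)^T S1^-1 (mu1-mu0)
    #         - k + ln(det S1 / det S0) )
    k = len(mu0)
    S1inv = np.linalg.inv(S1)
    dm = mu1 - mu0
    return 0.5 * (np.trace(S1inv @ S0) + dm @ S1inv @ dm - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

mu = np.zeros(2)
prior = np.eye(2)                 # hypothetical prior ensemble covariance
post = np.diag([0.5, 0.25])       # hypothetical updated (sharper) covariance
info_gain = gaussian_relative_entropy(mu, post, mu, prior)
print(round(info_gain, 3))        # ≈ 0.415 nats gained from the data
```

A sampling design scoring higher on this metric extracts more information from the measurements, which is the intuition behind ranking candidate designs.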
Imprinting and recalling cortical ensembles.
Carrillo-Reid, Luis; Yang, Weijian; Bando, Yuki; Peterka, Darcy S; Yuste, Rafael
2016-08-12
Neuronal ensembles are coactive groups of neurons that may represent building blocks of cortical circuits. These ensembles could be formed by Hebbian plasticity, whereby synapses between coactive neurons are strengthened. Here we report that repetitive activation with two-photon optogenetics of neuronal populations from ensembles in the visual cortex of awake mice builds neuronal ensembles that recur spontaneously after being imprinted and do not disrupt preexisting ones. Moreover, imprinted ensembles can be recalled by single-cell stimulation and remain coactive on consecutive days. Our results demonstrate the persistent reconfiguration of cortical circuits by two-photon optogenetics into neuronal ensembles that can perform pattern completion. Copyright © 2016, American Association for the Advancement of Science.
Modality-Driven Classification and Visualization of Ensemble Variance
Energy Technology Data Exchange (ETDEWEB)
Bensema, Kevin; Gosink, Luke; Obermaier, Harald; Joy, Kenneth I.
2016-10-01
Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that are paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.
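A crude stand-in for the modality-classification idea in this abstract: at a single location, decide whether the ensemble's distribution shows one tendency or several by counting the contiguous runs of histogram bins that reach at least half the peak density. This half-height heuristic is an assumption for illustration, not the paper's classification scheme.

```python
import numpy as np

def classify_modality(samples, bins=15):
    # Count contiguous runs of histogram bins at >= half the peak density.
    # One run -> a single tendency; several runs -> divergent (multimodal)
    # behaviour among the ensemble members at this location.
    h, _ = np.histogram(samples, bins=bins, density=True)
    mask = (h >= 0.5 * h.max()).astype(int)
    runs = (np.diff(np.concatenate(([0], mask))) == 1).sum()
    return "unimodal" if runs <= 1 else "multimodal"

rng = np.random.default_rng(3)
uni = rng.normal(0, 1, 4000)                      # single tendency
bi = np.concatenate([rng.normal(-3, 0.5, 2000),   # two divergent trends
                     rng.normal(3, 0.5, 2000)])
print(classify_modality(uni), classify_modality(bi))
```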
Ensemble Kalman filtering without the intrinsic need for inflation
Directory of Open Access Journals (Sweden)
M. Bocquet
2011-10-01
Full Text Available The main intrinsic source of error in the ensemble Kalman filter (EnKF) is sampling error. External sources of error, such as model error or deviations from Gaussianity, depend on the dynamical properties of the model. Sampling errors can lead to instability of the filter which, as a consequence, often requires inflation and localization. The goal of this article is to derive an ensemble Kalman filter which is less sensitive to sampling errors. A prior probability density function conditional on the forecast ensemble is derived using Bayesian principles. Even though this prior is built upon the assumption that the ensemble is Gaussian-distributed, it is different from the Gaussian probability density function defined by the empirical mean and the empirical error covariance matrix of the ensemble, which is implicitly used in traditional EnKFs. This new prior generates a new class of ensemble Kalman filters, called the finite-size ensemble Kalman filter (EnKF-N). One deterministic variant, the finite-size ensemble transform Kalman filter (ETKF-N), is derived. It is tested on the Lorenz '63 and Lorenz '95 models. In this context, ETKF-N is shown to be stable without inflation for ensemble size greater than the model unstable subspace dimension, at the same numerical cost as the ensemble transform Kalman filter (ETKF). One variant of ETKF-N seems to systematically outperform the ETKF with optimally tuned inflation. However it is shown that ETKF-N does not account for all sampling errors, and necessitates localization like any EnKF, whenever the ensemble size is too small. In order to explore the need for inflation in this small ensemble size regime, a local version of the new class of filters is defined (LETKF-N) and tested on the Lorenz '95 toy model. Whatever the size of the ensemble, the filter is stable. Its performance without inflation is slightly inferior to that of LETKF with optimally tuned inflation for small interval between updates, and
DEFF Research Database (Denmark)
Karnøe, Peter; Garud, Raghu
2012-01-01
This paper employs path creation as a lens to follow the emergence of the Danish wind turbine cluster. Supplier competencies, regulations, user preferences and a market for wind power did not pre-exist; all had to emerge in a transformative manner involving multiple actors and artefacts. Competencies emerged through processes and mechanisms such as co-creation that implicated multiple learning processes. The process was not an orderly linear one, as emergent contingencies influenced the learning processes. An implication is that public policy to catalyse clusters cannot be based...
Fleary, Sasha A; Ettienne, Reynolette
2014-01-01
Objectives. To identify the extent to which (1) beliefs about obesity and obesity-related behaviors distinguish individuals based on weight perception (WP) and (2) beliefs about obesity predict perceived health status and WP and how these in turn predict decisions to try to lose weight. Method. 7456 noninstitutionalized US adults (mean age = 54.13 years, SD = 16.93; 61.2% female; 75.9% White) completed the 2007 Health Information National Trends Survey. Multinomial logistic regressions and structural equation modeling were used to accomplish study objectives. Results. Age, gender, information-seeking, health status, belief that obesity is inherited, and knowledge of fruits and vegetables recommendations distinguished participants based on WP. Beliefs about obesity predicted health status, WP, and trying to lose weight in the general model. The models varied based on gender, race/ethnicity, education, and weight misperception. Conclusion. This study supports the role of beliefs about obesity, WP, and health perceptions in individuals' decisions and actions regarding weight management. This study increases our understanding of gender, race/ethnicity, education, and weight misperceptions differences in decisions to lose weight. This knowledge may lead to targeted interventions, rather than "one size fits all" interventions, to promote health and prevent obesity.
A new method for determining the optimal lagged ensemble
DelSole, T.; Tippett, M. K.; Pegion, K.
2017-01-01
Abstract We propose a general methodology for determining the lagged ensemble that minimizes the mean square forecast error (MSE). The MSE of a lagged ensemble is shown to depend only on a quantity called the cross-lead error covariance matrix, which can be estimated from a short hindcast data set and parameterized in terms of analytic functions of time. The resulting parameterization allows the skill of forecasts to be evaluated for an arbitrary ensemble size and initialization frequency. Remarkably, the parameterization also can estimate the MSE of a burst ensemble simply by taking the limit of an infinitely small interval between initialization times. This methodology is applied to forecasts of the Madden Julian Oscillation (MJO) from version 2 of the Climate Forecast System (CFSv2). For leads greater than a week, little improvement is found in the MJO forecast skill when the lagged ensemble spans more than 5 days or initializations occur more than 4 times per day. We find that if the interval between initializations is too long, important structures of the lagged error covariance matrix are lost. Lastly, we demonstrate that the forecast error at leads ≥10 days can be reduced by optimally weighting the lagged ensemble members. The weights are shown to depend only on the cross-lead error covariance matrix. While the methodology developed here is applied to CFSv2, the technique can be easily adapted to other forecast systems. PMID:28580050
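Assuming unbiased lagged members, the minimum-MSE weights subject to summing to one follow the standard minimum-variance solution w = C⁻¹1 / (1ᵀC⁻¹1), where C is the cross-lead error covariance matrix; a sketch with a hypothetical C:

```python
import numpy as np

def optimal_lag_weights(C):
    # Minimum-variance weights for a weighted lagged ensemble whose member
    # errors have cross-lead covariance C, subject to sum(w) = 1:
    #     w = C^{-1} 1 / (1^T C^{-1} 1)
    ones = np.ones(len(C))
    x = np.linalg.solve(C, ones)
    return x / x.sum()

# Hypothetical cross-lead error covariance: later lags (older
# initializations) carry larger error variance.
C = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.5, 0.7],
              [0.3, 0.7, 2.2]])
w = optimal_lag_weights(C)
equal = np.full(3, 1 / 3)
print(w.round(3))
print(w @ C @ w <= equal @ C @ equal)   # optimal weights never do worse
```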
Protein Remote Homology Detection Based on an Ensemble Learning Approach.
Chen, Junjie; Liu, Bingquan; Huang, Dong
2016-01-01
Protein remote homology detection is one of the central problems in bioinformatics. Although some computational methods have been proposed, the problem is still far from being solved. In this paper, an ensemble classifier for protein remote homology detection, called SVM-Ensemble, was proposed with a weighted voting strategy. SVM-Ensemble combined three basic classifiers based on different feature spaces, including Kmer, ACC, and SC-PseAAC. These features consider the characteristics of proteins from various perspectives, incorporating both the sequence composition and the sequence-order information along the protein sequences. Experimental results on a widely used benchmark dataset showed that the proposed SVM-Ensemble clearly improves the predictive performance for protein remote homology detection. Moreover, it achieved the best performance and outperformed other state-of-the-art methods.
Evolutionary Cluster-Based Synthetic Oversampling Ensemble (ECO-Ensemble) for Imbalance Learning.
Lim, Pin; Goh, Chi Keong; Tan, Kay Chen
2017-09-01
Class imbalance problems, where the number of samples in each class is unequal, is prevalent in numerous real world machine learning applications. Traditional methods which are biased toward the majority class are ineffective due to the relative severity of misclassifying rare events. This paper proposes a novel evolutionary cluster-based oversampling ensemble framework, which combines a novel cluster-based synthetic data generation method with an evolutionary algorithm (EA) to create an ensemble. The proposed synthetic data generation method is based on contemporary ideas of identifying oversampling regions using clusters. The novel use of EA serves a twofold purpose of optimizing the parameters of the data generation method while generating diverse examples leveraging on the characteristics of EAs, reducing overall computational cost. The proposed method is evaluated on a set of 40 imbalance datasets obtained from the University of California, Irvine, database, and outperforms current state-of-the-art ensemble algorithms tackling class imbalance problems.
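The cluster-based synthetic generation step of this abstract can be sketched as follows: locate minority-class clusters, then interpolate between points within the same cluster so synthetic samples never bridge separate clusters. The tiny k-means and the interpolation rule here are illustrative assumptions; the paper additionally tunes the generation parameters with an evolutionary algorithm, which is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    # Tiny k-means, used only to locate minority-class clusters.
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        lab = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return lab

def cluster_oversample(X_min, n_new, k=2):
    # Synthetic minority samples: interpolate between a point and a random
    # peer from the SAME cluster, so new samples stay inside one cluster.
    lab = kmeans(X_min, k)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        peers = np.flatnonzero(lab == lab[i])
        p = X_min[rng.choice(peers)]
        out.append(X_min[i] + rng.random() * (p - X_min[i]))
    return np.array(out)

# Hypothetical minority class with two well-separated clusters
X_min = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(4, 0.3, (10, 2))])
synth = cluster_oversample(X_min, 30)
print(synth.shape)   # (30, 2)
```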
Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation
Vašát, Radim; Kodešová, Radka; Borůvka, Luboš
2017-07-01
A myriad of signal pre-processing strategies and multivariate calibration techniques have been explored in attempts to improve the spectroscopic prediction of soil organic carbon (SOC) over the last few decades. Devising a novel, more powerful and more accurate predictive approach has therefore become a challenging task. One option is to combine several individual predictions into a single final one, following ensemble learning theory. As this approach performs best when it combines intrinsically different predictive algorithms calibrated with structurally different predictor variables, we tested predictors of two kinds: 1) reflectance values (or their transforms) at each wavelength and 2) absorption feature parameters. Accordingly, we applied four calibration techniques, two per type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights assigned to individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured the best of all possible solutions was selected. The approach was tested on soil samples taken from the surface horizon at four sites differing in the prevailing soil units. By employing the ensemble predictive model, the prediction accuracy of SOC improved at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Sites 1, 2, 3 and 4, respectively. Generally, the ensemble model reduced the maximal deviations of predicted vs. observed values relative to the individual predictions, so the correlation cloud became thinner, as desired.
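The weighted-average combination with an automated weight search can be sketched as a brute-force scan over a simplex grid of weights; the toy predictions below are hypothetical, and the paper's selection procedure is not necessarily this grid search.

```python
import numpy as np
from itertools import product

def best_weighted_average(preds, y, step=0.1):
    # Scan all weight vectors on a simplex grid (non-negative, summing
    # to 1) and keep the weighted average with the smallest RMSE.
    P = np.array(preds)
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_rmse, best_w = np.inf, None
    for w in product(grid, repeat=len(preds)):
        if abs(sum(w) - 1.0) > 1e-9:
            continue                        # not on the simplex
        rmse = np.sqrt(((np.array(w) @ P - y) ** 2).mean())
        if rmse < best_rmse:
            best_rmse, best_w = rmse, np.array(w)
    return best_rmse, best_w

# Hypothetical predictions from three calibrated models
y = np.array([1.0, 2.0, 3.0, 4.0])
preds = [y + 0.5,                                # biased high
         y - 0.5,                                # biased low
         y + np.array([0.1, -0.1, 0.1, -0.1])]  # small alternating error
rmse, w = best_weighted_average(preds, y)
print(w, rmse)   # the two biased models cancel: w = [0.5, 0.5, 0.0]
```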
Liu, Li; Gao, Chao; Xuan, Weidong; Xu, Yue-Ping
2017-11-01
Ensemble flood forecasts by hydrological models using numerical weather prediction products as forcing data are becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system comprising an automatically calibrated Variable Infiltration Capacity model and quantitative precipitation forecasts from the TIGGE dataset is constructed for Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by the parallel-programmed ε-NSGA II multi-objective algorithm. According to the solutions by ε-NSGA II, two differently parameterized models are determined to simulate daily flows and peak flows at each of the three hydrological stations. A simple yet effective modular approach is then proposed to combine these daily and peak flows at the same station into one composite series. Five ensemble methods and various evaluation metrics are adopted. The results show that ε-NSGA II can provide an objective determination of parameter estimates, and the parallel program permits a more efficient simulation. It is also demonstrated that the forecasts from ECMWF have more favorable skill scores than the other Ensemble Prediction Systems. The multimodel ensembles have advantages over all the single-model ensembles, and the multimodel methods weighted on members and skill scores outperform the other methods. Furthermore, the overall performance at the three stations can be satisfactory up to ten days; however, hydrological errors can degrade the skill score by approximately 2 days, and their influence persists until a lead time of 10 days with a weakening trend. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from single models or multimodels are generally underestimated, indicating that the ensemble mean can bring overall improvement in forecasting of flows. For
Thompson, Steven K
2012-01-01
Praise for the Second Edition "This book has never had a competitor. It is the only book that takes a broad approach to sampling . . . any good personal statistics library should include a copy of this book." —Technometrics "Well-written . . . an excellent book on an important subject. Highly recommended." —Choice "An ideal reference for scientific researchers and other professionals who use sampling." —Zentralblatt Math Features new developments in the field combined with all aspects of obtaining, interpreting, and using sample data Sampling provides an up-to-date treat
Multivariate localization methods for ensemble Kalman filtering
Roh, S.
2015-12-03
In ensemble Kalman filtering (EnKF), the small number of ensemble members that is feasible to use in a practical data assimilation application leads to sampling variability of the estimates of the background error covariances. The standard approach to reducing the effects of this sampling variability, which has also been found to be highly efficient in improving the performance of EnKF, is the localization of the estimates of the covariances. One family of localization techniques is based on taking the Schur (element-wise) product of the ensemble-based sample covariance matrix and a correlation matrix whose entries are obtained by the discretization of a distance-dependent correlation function. While the proper definition of the localization function for a single state variable has been extensively investigated, a rigorous definition of the localization function for multiple state variables that exist at the same locations has seldom been considered. This paper introduces two strategies for the construction of localization functions for multiple state variables. The proposed localization functions are tested in experiments that assimilate simulated observations into the bivariate Lorenz 95 model.
Multivariate localization methods for ensemble Kalman filtering
Roh, S.
2015-05-08
In ensemble Kalman filtering (EnKF), the small number of ensemble members that is feasible to use in a practical data assimilation application leads to sampling variability of the estimates of the background error covariances. The standard approach to reducing the effects of this sampling variability, which has also been found to be highly efficient in improving the performance of EnKF, is the localization of the estimates of the covariances. One family of localization techniques is based on taking the Schur (entry-wise) product of the ensemble-based sample covariance matrix and a correlation matrix whose entries are obtained by the discretization of a distance-dependent correlation function. While the proper definition of the localization function for a single state variable has been extensively investigated, a rigorous definition of the localization function for multiple state variables has seldom been considered. This paper introduces two strategies for the construction of localization functions for multiple state variables. The proposed localization functions are tested in experiments that assimilate simulated observations into the bivariate Lorenz 95 model.
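The Schur (element-wise) product localization described above can be sketched for a single state variable as follows; a Gaussian decay stands in for the usual Gaspari-Cohn correlation function, and the grid and ensemble sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def localize(sample_cov, coords, L=2.0):
    # Schur (element-wise) product of the ensemble sample covariance with
    # a distance-dependent correlation matrix. A Gaussian decay stands in
    # here for the usual Gaspari-Cohn function.
    d = np.abs(coords[:, None] - coords[None, :])
    rho = np.exp(-0.5 * (d / L) ** 2)
    return sample_cov * rho

n_grid, n_ens = 8, 5            # small ensemble -> noisy sample covariances
truth = np.exp(-np.abs(np.subtract.outer(np.arange(n_grid),
                                         np.arange(n_grid))) / 1.5)
X = rng.multivariate_normal(np.zeros(n_grid), truth, size=n_ens)
B = np.cov(X.T)                 # sample background-error covariance
B_loc = localize(B, np.arange(n_grid, dtype=float))
# Spurious long-range covariances are damped toward zero:
print(abs(B[0, -1]), "->", abs(B_loc[0, -1]))
```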
Directory of Open Access Journals (Sweden)
Gil Rahamim
Full Text Available Most active biopolymers are dynamic structures; thus, ensembles of such molecules should be characterized by distributions of intra- or intermolecular distances and their fast fluctuations. A method of choice to determine intramolecular distances is based on Förster resonance energy transfer (FRET) measurements. Major advances in such measurements were achieved by single-molecule FRET measurements. Here, we show that by global analysis of the decay of the emission of both the donor and the acceptor it is also possible to resolve two sub-populations in a mixture of two ensembles of biopolymers by time-resolved FRET (trFRET) measurements at the ensemble level. We show that two individual intramolecular distance distributions can be determined and characterized in terms of their individual means, full width at half maximum (FWHM), and two corresponding diffusion coefficients which reflect the rates of fast ns fluctuations within each sub-population. An important advantage of the ensemble-level trFRET measurements is the ability to use low-molecular-weight small-sized probes and to determine nanosecond fluctuations of the distance between the probes. The limits of the possible resolution were first tested by simulation and then by preparation of mixtures of two model peptides. The first labeled polypeptide was a relatively rigid Pro7 and the second polypeptide was a flexible molecule consisting of (Gly-Ser)7 repeats. The end-to-end distance distributions and the diffusion coefficients of each peptide were determined. Global analysis of trFRET measurements of a series of mixtures of polypeptides recovered two end-to-end distance distributions and associated intramolecular diffusion coefficients, which were very close to those determined from each of the pure samples. This study is a proof of concept study demonstrating the power of ensemble-level trFRET based methods in resolution of subpopulations in ensembles of flexible macromolecules.
Verma, Gaurav; Chawla, Sanjeev; Nagarajan, Rajakumar; Iqbal, Zohaib; Albert Thomas, M.; Poptani, Harish
2017-04-01
Two-dimensional localized correlated spectroscopy (2D L-COSY) offers greater spectral dispersion than conventional one-dimensional (1D) MRS techniques, yet long acquisition times and limited post-processing support have slowed its clinical adoption. Improving acquisition efficiency and developing versatile post-processing techniques can bolster the clinical viability of 2D MRS. The purpose of this study was to implement a non-uniformly weighted sampling (NUWS) scheme for faster acquisition of 2D-MRS. A NUWS 2D L-COSY sequence was developed for 7T whole-body MRI. A phantom containing metabolites commonly observed in the brain at physiological concentrations was scanned ten times with both the NUWS scheme of 12:48 duration and a 17:04 constant eight-average sequence using a 32-channel head coil. 2D L-COSY spectra were also acquired from the occipital lobe of four healthy volunteers using both the proposed NUWS and the conventional uniformly-averaged L-COSY sequence. The NUWS 2D L-COSY sequence facilitated 25% shorter acquisition time while maintaining comparable SNR in humans (+0.3%) and phantom studies (+6.0%) compared to uniform averaging. NUWS schemes successfully demonstrated improved efficiency of L-COSY, by facilitating a reduction in scan time without affecting signal quality.
Activity recall in a visual cortical ensemble.
Xu, Shengjin; Jiang, Wanchen; Poo, Mu-Ming; Dan, Yang
2012-01-22
Cue-triggered recall of learned temporal sequences is an important cognitive function that has been attributed to higher brain areas. Here recordings in both anesthetized and awake rats demonstrate that after repeated stimulation with a moving spot that evoked sequential firing of an ensemble of primary visual cortex (V1) neurons, just a brief flash at the starting point of the motion path was sufficient to evoke a sequential firing pattern that reproduced the activation order evoked by the moving spot. The speed of recalled spike sequences may reflect the internal dynamics of the network rather than the motion speed. In awake rats, such recall was observed during a synchronized ('quiet wakeful') brain state having large-amplitude, low-frequency local field potential (LFP) but not in a desynchronized ('active') state having low-amplitude, high-frequency LFP. Such conditioning-enhanced, cue-evoked sequential spiking of a V1 ensemble may contribute to experience-based perceptual inference in a brain state-dependent manner.
Ensemble Pruning for Glaucoma Detection in an Unbalanced Data Set.
Adler, Werner; Gefeller, Olaf; Gul, Asma; Horn, Folkert K; Khan, Zardad; Lausen, Berthold
2016-12-07
Random forests are successful classifier ensemble methods consisting of typically 100 to 1000 classification trees. Ensemble pruning techniques reduce the computational cost, especially the memory demand, of random forests by reducing the number of trees without relevant loss of performance or even with increased performance of the sub-ensemble. The application to the problem of an early detection of glaucoma, a severe eye disease with low prevalence, based on topographical measurements of the eye background faces specific challenges. We examine the performance of ensemble pruning strategies for glaucoma detection in an unbalanced data situation. The data set consists of 102 topographical features of the eye background of 254 healthy controls and 55 glaucoma patients. We compare the area under the receiver operating characteristic curve (AUC), and the Brier score on the total data set, in the majority class, and in the minority class of pruned random forest ensembles obtained with strategies based on the prediction accuracy of greedily grown sub-ensembles, the uncertainty weighted accuracy, and the similarity between single trees. To validate the findings and to examine the influence of the prevalence of glaucoma in the data set, we additionally perform a simulation study with lower prevalences of glaucoma. In glaucoma classification all three pruning strategies lead to improved AUC and smaller Brier scores on the total data set with sub-ensembles as small as 30 to 80 trees compared to the classification results obtained with the full ensemble consisting of 1000 trees. In the simulation study, we were able to show that the prevalence of glaucoma is a critical factor and lower prevalence decreases the performance of our pruning strategies. The memory demand for glaucoma classification in an unbalanced data situation based on random forests could effectively be reduced by the application of pruning strategies without loss of performance in a population with increased
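One of the pruning strategies compared above, greedy growth of a sub-ensemble by prediction performance, can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the per-tree probabilities, validation labels, and AUC-based selection criterion below are stand-ins.

```python
# Greedy forward selection of trees from a random forest, scored by
# validation AUC of the averaged vote. All data here is invented.

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (fine for small n)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def greedy_prune(tree_probs, labels, target_size):
    """Greedily pick trees whose averaged vote maximizes validation AUC.

    tree_probs: list of per-tree probability vectors on a validation set.
    """
    chosen = []
    while len(chosen) < target_size:
        best_auc, best_tree = -1.0, None
        for t, probs in enumerate(tree_probs):
            if t in chosen:
                continue
            trial = chosen + [t]
            avg = [sum(tree_probs[i][j] for i in trial) / len(trial)
                   for j in range(len(labels))]
            a = auc(avg, labels)
            if a > best_auc:
                best_auc, best_tree = a, t
        chosen.append(best_tree)
    return chosen, best_auc
```

A real pruning run would evaluate sub-ensembles of 30 to 80 trees out of 1000, as in the study; the mechanics are the same.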
Diurnal Ensemble Surface Meteorology Statistics
U.S. Environmental Protection Agency — Excel file containing diurnal ensemble statistics of 2-m temperature, 2-m mixing ratio and 10-m wind speed. This Excel file contains figures for Figure 2 in the...
DEFF Research Database (Denmark)
2004-01-01
Within the framework of the PSO-Ensemble project (FU2101) a demo application has been created. The application uses ECMWF ensemble forecasts. Two instances of the application are running; one for Nysted Offshore and one for the total production (except Horns Rev) in the Eltra area. The output is available via two password-protected web-pages hosted at IMM and is used daily by Elsam and E2.
Directory of Open Access Journals (Sweden)
Sayed-Mohsen Hosseini
2014-01-01
Full Text Available Background: Growth is one of the most important indices in child health. The best and most effective way to investigate child health is measuring physical growth indices such as weight, height and head circumference. Among these measures, weight growth is the simplest and most effective way to determine child growth status. Weight at a given age is the result of cumulative growth experience, whereas growth velocity represents what is happening at the time. Methods: This longitudinal study was conducted among 606 children repeatedly measured from birth until 2 years of age. We used a linear mixed model to analyze the repeated measures and to determine factors affecting the growth trajectory. A LOWESS smooth curve was used to draw velocity curves. Results: Gender, child rank, birth status and feeding mode had a significant effect on the weight trajectory. Boys had higher weight during the study. Infants with exclusive breast feeding had higher weight than other infants. Boys had higher growth velocity up to 6 months of age. Breast-fed infants had higher growth velocity up to 6 months, but thereafter the velocity was higher in other infants. Conclusions: Many studies have investigated child growth, but most used a cross-sectional design. In this study, we used a longitudinal method to determine the factors affecting weight trend in children from birth until 2 years of age. The effects of perinatal factors on further growth should be considered for the prevention of growth disorders and their late complications.
Seong, Min-Gyu; Suh, Myoung-Seok; Kim, Chansoo
2017-08-01
This study focuses on an objective comparison of eight ensemble methods using the same data, training period, training method, and validation period. The eight ensemble methods are: BMA (Bayesian Model Averaging), HMR (Homogeneous Multiple Regression), EMOS (Ensemble Model Output Statistics), HMR+ with positive coefficients, EMOS+ with positive coefficients, PEA_ROC (Performance-based Ensemble Averaging using ROot mean square error and temporal Correlation coefficient), WEA_Tay (Weighted Ensemble Averaging based on Taylor's skill score), and MME (Multi-Model Ensemble). Forty-five years (1961-2005) of data from 14 CMIP5 models and APHRODITE (Asian Precipitation- Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources) data were used to compare the performance of the eight ensemble methods. Although some models underestimated the variability of monthly mean temperature (MMT), most of the models effectively simulated the spatial distribution of MMT. Regardless of training periods and the number of ensemble members, the prediction skills of BMA and the four multiple linear regressions (MLR) were superior to the other ensemble methods (PEA_ROC, WEA_Tay, MME) in terms of deterministic prediction. In terms of probabilistic prediction, the four MLRs showed better prediction skills than BMA. However, the differences among the four MLRs and BMA were not significant. This resulted from the similarity of BMA weights and regression coefficients. Furthermore, prediction skills of the four MLRs were very similar. Overall, the four MLRs showed the best prediction skills among the eight ensemble methods. However, more comprehensive work is needed to select the best ensemble method among the numerous ensemble methods.
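As a rough illustration of the multiple-linear-regression family of ensemble methods compared above (the HMR/EMOS-style schemes), one can regress observations on the member forecasts over a training period and reuse the fitted weights. The data, shapes, and function names below are invented, and no positivity constraint (as in HMR+/EMOS+) is imposed.

```python
import numpy as np

# Sketch of an unconstrained multiple-linear-regression multi-model
# ensemble: fit an intercept plus one weight per model on training data.

def fit_mlr(train_forecasts, train_obs):
    """train_forecasts: (n_times, n_models); returns intercept + weights."""
    X = np.column_stack([np.ones(len(train_obs)), train_forecasts])
    coef, *_ = np.linalg.lstsq(X, train_obs, rcond=None)
    return coef  # coef[0] is the bias term, coef[1:] the model weights

def predict_mlr(coef, forecasts):
    """Combine new member forecasts with the fitted weights."""
    return coef[0] + forecasts @ coef[1:]
```

The BMA scheme instead fits a weighted mixture of per-model predictive distributions, but in the study its weights turned out close to these regression coefficients.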
Deformed Ginibre ensembles and integrable systems
Energy Technology Data Exchange (ETDEWEB)
Orlov, A.Yu., E-mail: orlovs@ocean.ru
2014-01-17
We consider three Ginibre ensembles (real, complex and quaternion-real) with deformed measures and relate them to known integrable systems by presenting partition functions of these ensembles in form of fermionic expectation values. We also introduce double deformed Dyson–Wigner ensembles and compare their fermionic representations with those of Ginibre ensembles.
Hur, Yoon-Mi
2003-12-01
The degree of assortative mating for psychological and physical traits in Asian societies is relatively unknown. The present study examined assortative mating for educational level, personality traits, religious affiliation, height, weight, and body mass index in a Korean sample. Age-adjusted spouse correlations were high for educational level (r = .63) and religious affiliation (r = .67), modest for most personality traits (rs = -.01 to .26), and trivial for height (r = .04), weight (r = .05), and body mass index (r = .11). These results were remarkably similar to those found in Western samples. Implications of the present findings for behavior genetic studies and human mating patterns are briefly discussed.
Path analysis of risk factors leading to premature birth.
Fields, S J; Livshits, G; Sirotta, L; Merlob, P
1996-01-01
The present study tested whether various sociodemographic, anthropometric, behavioral, and medical/physiological factors act in a direct or indirect manner on the risk of prematurity, using path analysis on a sample of Israeli births. The path model shows that medical complications, primarily toxemia, chorioamnionitis, and a previous low-birth-weight delivery, act directly and significantly on the risk of prematurity, as do low maternal pregnancy weight gain and ethnicity. Other medical complications, including chronic hypertension, preeclampsia, and placental abruption, although significantly correlated with prematurity, act indirectly on prematurity through toxemia. The model further shows that the commonly accepted sociodemographic, anthropometric, and behavioral risk factors act by modifying the development of the medical complications that lead to prematurity, as opposed to having a direct effect on premature delivery. Copyright © 1996 Wiley-Liss, Inc.
Drewnowski, A; Rehm, C D
2016-03-07
Low-calorie sweeteners (LCSs) are said to be a risk factor for obesity and diabetes. Reverse causality may be an alternative explanation. Data on LCS use, from a single 24-h dietary recall, for a representative sample of 22 231 adults were obtained from 5 cycles of the National Health and Nutrition Examination Survey (1999-2008 NHANES). Retrospective data on intent to lose or maintain weight during the prior 12-months and 10-year weight history were obtained from the weight history questionnaire. Objectively measured heights and weights were obtained from the examination. Primary analyses evaluated the association between intent to lose/maintain weight and use of LCSs and specific LCS product types using survey-weighted generalized linear models. We further evaluated whether body mass index (BMI) may mediate the association between weight loss intent and use of LCSs. The association between 10-year weight history and current LCS use was evaluated using restricted cubic splines. In cross-sectional analyses, LCS use was associated with a higher prevalence of obesity and diabetes. Adults who tried to lose weight during the previous 12 months were more likely to consume LCS beverages (prevalence ratio=1.64, 95% confidence interval (CI) 1.54-1.75), tabletop LCS (prevalence ratio=1.68, 95% CI 1.47-1.91) and LCS foods (prevalence ratio=1.93, 95% CI 1.60-2.33) as compared with those who did not. In mediation analyses, BMI only partially mediated the association between weight control history and the use of LCS beverages, tabletop LCS, but not LCS foods. Current LCS use was further associated with a history of prior weight change (for example, weight loss and gain). LCS use was associated with self-reported intent to lose weight during the previous 12 months. This association was only partially mediated by differences in BMI. Any inference of causality between attempts at weight control and LCS use is tempered by the cross-sectional nature of these data and retrospective
Bioprocess optimization under uncertainty using ensemble modeling.
Liu, Yang; Gunawan, Rudiyanto
2017-02-20
The performance of model-based bioprocess optimizations depends on the accuracy of the mathematical model. However, models of bioprocesses often have large uncertainty due to the lack of model identifiability. In the presence of such uncertainty, process optimizations that rely on the predictions of a single "best fit" model, e.g. the model resulting from a maximum likelihood parameter estimation using the available process data, may perform poorly in real life. In this study, we employed ensemble modeling to account for model uncertainty in bioprocess optimization. More specifically, we adopted a Bayesian approach to define the posterior distribution of the model parameters, based on which we generated an ensemble of model parameters using a uniformly distributed sampling of the parameter confidence region. The ensemble-based process optimization involved maximizing the lower confidence bound of the desired bioprocess objective (e.g. yield or product titer), using a mean-standard deviation utility function. We demonstrated the performance and robustness of the proposed strategy in an application to a monoclonal antibody batch production by mammalian hybridoma cell culture. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
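The mean-standard deviation utility described above can be sketched directly: score each candidate operating condition by the mean minus k standard deviations of the predicted objective across the parameter ensemble, then pick the condition with the best lower confidence bound. The toy model and numbers below are illustrative, not the paper's hybridoma culture model.

```python
import statistics

# Ensemble-based robust optimization sketch: maximize the lower
# confidence bound (mean - k*std) of a predicted objective over an
# ensemble of plausible model parameters.

def lcb(values, k=1.0):
    """Lower confidence bound of predicted objective values."""
    return statistics.fmean(values) - k * statistics.pstdev(values)

def ensemble_optimize(candidates, param_ensemble, model, k=1.0):
    """Return the candidate maximizing the LCB of `model` across the ensemble."""
    return max(candidates,
               key=lambda x: lcb([model(x, p) for p in param_ensemble], k))
```

A candidate that performs well on the "best fit" parameters but poorly elsewhere in the ensemble gets penalized by the standard-deviation term, which is the point of the approach.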
Wilkerson, Amanda H; Hackman, Christine L; Rush, Sarah E; Usdan, Stuart L; Smith, Corinne S
2017-10-01
Behaviors of weight-conscious drinkers (BWCD) include disordered eating, excessive physical activity (PA), and heavy episodic drinking. Considering that approximately 25% of college students report BWCD, it is important to investigate which characteristics increase the likelihood of college students engaging in BWCD for both moderate and vigorous PA. A total of 510 college students were recruited from a large public southeastern university. Participants completed a cross-sectional survey during the spring 2015 semester. Of the 510 respondents, 11.2% reported moderate PA-based BWCD and 14.7% reported vigorous PA-based BWCD. Weight loss intention, BMI, and Greek affiliation predicted both moderate and vigorous BWCD. Study findings suggest that Greek-affiliated students and students with weight loss intentions might be at increased risk for BWCD. Along with promoting lower levels of alcohol consumption, college practitioners should consider discussing issues of weight and body image with college students as they relate to maladaptive drinking behavior.
Slof-Op 't Landt, Margarita C T; van Furth, Eric F; van Beijsterveldt, Catharina E M; Bartels, Meike; Willemsen, Gonneke; de Geus, Eco J; Ligthart, Lannie; Boomsma, Dorret I
2017-11-01
The current study aimed to determine the prevalence of dieting and fear of weight gain among men and women across the entire lifespan and to identify factors associated with them. Data were available for 31,636 participants (60.2% women; age 13-98 years) from the Netherlands Twin Register. Dieting and fear of weight gain were described by age and sex. Associations with BMI, exercise behavior, urbanization and educational attainment were examined by regression analyses in 19,294 participants. Dieting was most frequently reported by 35- to 65-year-old women (56.6-63%) and 45- to 65-year-old men (31.7-31.9%). Fear of weight gain was most prevalent in women between 16 and 25 (73.2-74.3%) and in 25- to 55-year-old men (43.2-46.1%). In addition to sex and BMI, dieting and fear of weight gain were associated with each other. Furthermore, fear was associated with the age × sex interaction and educational attainment. Dieting and fear of weight gain are common during the entire lifespan for women, but are also endorsed by a substantial number of men. Given the low rate of overweight in young women, the high levels of fear of weight gain are striking.
Adaptive correction of ensemble forecasts
Pelosi, Anna; Battista Chirico, Giovanni; Van den Bergh, Joris; Vannitsem, Stephane
2017-04-01
Forecasts from numerical weather prediction (NWP) models often suffer from both systematic and non-systematic errors. These are present in both deterministic and ensemble forecasts, and originate from various sources such as model error and subgrid variability. Statistical post-processing techniques can partly remove such errors, which is particularly important when NWP outputs concerning surface weather variables are employed for site specific applications. Many different post-processing techniques have been developed. For deterministic forecasts, adaptive methods such as the Kalman filter are often used, which sequentially post-process the forecasts by continuously updating the correction parameters as new ground observations become available. These methods are especially valuable when long training data sets do not exist. For ensemble forecasts, well-known techniques are ensemble model output statistics (EMOS), and so-called "member-by-member" approaches (MBM). Here, we introduce a new adaptive post-processing technique for ensemble predictions. The proposed method is a sequential Kalman filtering technique that fully exploits the information content of the ensemble. One correction equation is retrieved and applied to all members, however the parameters of the regression equations are retrieved by exploiting the second order statistics of the forecast ensemble. We compare our new method with two other techniques: a simple method that makes use of a running bias correction of the ensemble mean, and an MBM post-processing approach that rescales the ensemble mean and spread, based on minimization of the Continuous Ranked Probability Score (CRPS). We perform a verification study for the region of Campania in southern Italy. We use two years (2014-2015) of daily meteorological observations of 2-meter temperature and 10-meter wind speed from 18 ground-based automatic weather stations distributed across the region, comparing them with the corresponding COSMO
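A minimal sketch of sequential bias correction in the spirit of the Kalman-filter post-processing mentioned above, with a scalar state equal to the forecast bias. This is the simple adaptive baseline, not the authors' ensemble-aware variant; the drift variance q and observation variance r are illustrative tuning knobs.

```python
# Scalar Kalman filter tracking a (possibly drifting) forecast bias as
# ground observations arrive, applied as an online correction.

def kalman_bias_correct(forecasts, observations, q=0.01, r=1.0):
    """Return bias-corrected forecasts, updating the bias estimate online."""
    bias, p = 0.0, 1.0                      # bias estimate and its variance
    corrected = []
    for f, y in zip(forecasts, observations):
        corrected.append(f - bias)          # correct with current estimate
        p += q                              # predict step: bias may drift
        gain = p / (p + r)                  # Kalman gain
        bias += gain * ((f - y) - bias)     # update with observed error
        p *= (1 - gain)
    return corrected
```

Because the gain adapts as observations accumulate, no long training archive is needed, which is the stated advantage of adaptive methods.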
Strings, paths, and standard tableaux
Dasmahapatra, S
1996-01-01
For the vacuum sectors of regime-III ABF models, we observe that two sets of combinatorial objects, the strings which parametrize the row-to-row transfer matrix eigenvectors and the paths which parametrize the corner transfer matrix eigenvectors, can both be expressed in terms of the same set of standard tableaux. Furthermore, the momenta of the strings, the energies of the paths, and the charges of the tableaux are such that there is a weight-preserving bijection between the two sets of eigenvectors, wherein the tableaux play an interpolating role. This bijection is so natural that we conjecture it exists in general.
Genetic programming based ensemble system for microarray data classification.
Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To
2015-01-01
Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.
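The Min, Max, and Average combination of base-classifier outputs, and the idea of scoring (operator, classifier-subset) individuals, can be sketched as follows. The exhaustive search stands in for the GP evolution and forward search, and all probabilities and labels are invented; real GPES individuals are trees over decision-tree classifiers.

```python
import itertools

# Combination operators applied to base classifiers' positive-class
# probabilities, mirroring the three GPES operators.
OPS = {"min": min, "max": max, "avg": lambda v: sum(v) / len(v)}

def accuracy(clf_probs, subset, op, labels, threshold=0.5):
    """Accuracy of the combined positive-class probability on a tiny set."""
    hits = 0
    for j, y in enumerate(labels):
        p = OPS[op]([clf_probs[i][j] for i in subset])
        hits += int((p >= threshold) == bool(y))
    return hits / len(labels)

def best_individual(clf_probs, labels):
    """Exhaustively score (operator, subset) pairs; GP would search instead."""
    n = len(clf_probs)
    candidates = [(op, s) for op in OPS
                  for r in range(1, n + 1)
                  for s in itertools.combinations(range(n), r)]
    return max(candidates,
               key=lambda c: accuracy(clf_probs, c[1], c[0], labels))
```

Balanced subsampling and feature selection, used in the paper to diversify the base classifiers, happen before this combination stage.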
Evolutionary Ensemble for In Silico Prediction of Ames Test Mutagenicity
Chen, Huanhuan; Yao, Xin
Driven by new regulations and animal welfare concerns, the need to develop in silico models has increased recently as an alternative approach to the safety assessment of chemicals without animal testing. This paper describes a novel machine learning ensemble approach to building an in silico model for the prediction of Ames test mutagenicity, one of a battery of the most commonly used experimental in vitro and in vivo genotoxicity tests for the safety evaluation of chemicals. Evolutionary random neural ensemble with negative correlation learning (ERNE) [1] was developed based on neural networks and evolutionary algorithms. ERNE combines bootstrap sampling on the training data with random-subspace feature selection to ensure diversity when creating individuals within the initial ensemble. Furthermore, while evolving individuals within the ensemble, it makes use of negative correlation learning, enabling individual NNs to be trained to be as accurate as possible while keeping them as diverse as possible. Therefore, the resulting individuals in the final ensemble are capable of cooperating collectively to achieve better generalization in prediction. The empirical experiments suggest that ERNE is an effective ensemble approach for predicting the Ames test mutagenicity of chemicals.
Genetic Programming Based Ensemble System for Microarray Data Classification
Directory of Open Access Journals (Sweden)
Kun-Hong Liu
2015-01-01
Full Text Available Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP based new ensemble system (named GPES, which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.
Ensemble: an Architecture for Mission-Operations Software
Norris, Jeffrey; Powell, Mark; Fox, Jason; Rabe, Kenneth; Shu, IHsiang; McCurdy, Michael; Vera, Alonso
2008-01-01
Ensemble is the name of an open architecture for, and a methodology for the development of, spacecraft mission operations software. Ensemble is also potentially applicable to the development of non-spacecraft mission-operations-type software. Ensemble capitalizes on the strengths of the open-source Eclipse software and its architecture to address several issues that have arisen repeatedly in the development of mission-operations software: Heretofore, mission-operations application programs have been developed in disparate programming environments and integrated during the final stages of development of missions. The programs have been poorly integrated, and it has been costly to develop, test, and deploy them. Users of each program have been forced to interact with several different graphical user interfaces (GUIs). Also, the strategy typically used in integrating the programs has yielded serial chains of operational software tools of such a nature that during use of a given tool, it has not been possible to gain access to the capabilities afforded by other tools. In contrast, the Ensemble approach offers a low-risk path towards tighter integration of mission-operations software tools.
An ensemble rank learning approach for gene prioritization.
Lee, Po-Feng; Soo, Von-Wun
2013-01-01
Several different computational approaches have been developed to solve the gene prioritization problem. We intend to use ensemble boosting learning techniques to combine various computational approaches for gene prioritization in order to improve the overall performance. In particular, we add a heuristic weighting function to the RankBoost algorithm according to: 1) the absolute ranks generated by the adopted methods for a certain gene, and 2) the ranking relationships between all gene pairs in each prioritization result. We select 13 known prostate cancer genes in the OMIM database as the training set and protein-coding gene data in the HGNC database as the test set. We adopt a leave-one-out strategy for the ensemble rank-boosting learning. The experimental results show that our ensemble learning approach outperforms the four gene-prioritization methods in the ToppGene suite in ranking the 13 known genes, in terms of mean average precision, ROC and AUC measures.
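This is not the paper's RankBoost variant, but a minimal rank-aggregation sketch in the same spirit: each prioritization method contributes a score per gene, with top-ranked genes weighted more heavily (1/rank here, an assumed weighting). The gene names and rankings are illustrative.

```python
# Weighted Borda-style fusion of several gene prioritization rankings.

def aggregate_ranks(rankings):
    """rankings: list of gene lists, best first. Returns genes by fused score."""
    scores = {}
    for ranking in rankings:
        for pos, gene in enumerate(ranking, start=1):
            scores[gene] = scores.get(gene, 0.0) + 1.0 / pos
    return sorted(scores, key=scores.get, reverse=True)
```

A boosting approach instead learns how much to trust each method from training genes; this unweighted fusion is the baseline such learning improves on.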
WE-E-BRE-05: Ensemble of Graphical Models for Predicting Radiation Pneumonitis Risk
Energy Technology Data Exchange (ETDEWEB)
Lee, S; Ybarra, N; Jeyaseelan, K; El Naqa, I [McGill University, Montreal, Quebec (Canada); Faria, S; Kopek, N [Montreal General Hospital, Montreal, Quebec (Canada)
2014-06-15
Purpose: We propose a prior knowledge-based approach to construct an interaction graph of biological and dosimetric radiation pneumonitis (RP) covariates for the purpose of developing an RP risk classifier. Methods: We recruited 59 NSCLC patients who received curative radiotherapy with a minimum 6-month follow-up. 16 RP events were observed (CTCAE grade ≥2). Blood serum was collected from every patient before (pre-RT) and during RT (mid-RT). From each sample the concentrations of the following five candidate biomarkers were taken as covariates: alpha-2-macroglobulin (α2M), angiotensin converting enzyme (ACE), transforming growth factor β (TGF-β), interleukin-6 (IL-6), and osteopontin (OPN). Dose-volumetric parameters were also included as covariates. The number of biological and dosimetric covariates was reduced by a variable selection scheme implemented by L1-regularized logistic regression (LASSO). The posterior probability distribution of interaction graphs between the selected variables was estimated from the data under literature-based prior knowledge to weight more heavily the graphs that contain the expected associations. A graph ensemble was formed by averaging the most probable graphs weighted by their posterior, creating a Bayesian network (BN)-based RP risk classifier. Results: The LASSO selected the following 7 RP covariates: (1) pre-RT concentration level of α2M, (2) α2M level mid-RT/pre-RT, (3) pre-RT IL-6 level, (4) IL-6 level mid-RT/pre-RT, (5) ACE mid-RT/pre-RT, (6) PTV volume, and (7) mean lung dose (MLD). The ensemble BN model achieved a maximum sensitivity/specificity of 81%/84% and outperformed univariate dosimetric predictors, as shown by larger AUC values (0.78∼0.81) compared with MLD (0.61), V20 (0.65) and V30 (0.70). The ensembles obtained by incorporating the prior knowledge improved classification performance for ensemble sizes of 5∼50. Conclusion: We demonstrated a probabilistic ensemble method to detect robust associations between
Chen, Gui; Guo, Guiping; Gong, Jingbo; Xiao, Shuiyuan
2015-01-01
The current study investigated the moderating effects of gender, age, and weight status on the relationship between body dissatisfaction and depression among adolescents. Data were collected on body dissatisfaction, depression, and demographic characteristics from a convenience sample of 1,101 adolescents (505 girls, 596 boys). The relationship…
Algorithms on ensemble quantum computers.
Boykin, P Oscar; Mor, Tal; Roychowdhury, Vwani; Vatan, Farrokh
2010-06-01
In ensemble (or bulk) quantum computation, all computations are performed on an ensemble of computers rather than on a single computer. Measurements of qubits in an individual computer cannot be performed; instead, only expectation values (over the complete ensemble of computers) can be measured. As a result of this limitation on the model of computation, many algorithms cannot be processed directly on such computers, and must be modified, as the common strategy of delaying the measurements usually does not resolve this ensemble-measurement problem. Here we present several new strategies for resolving this problem. Based on these strategies we provide new versions of some of the most important quantum algorithms, versions that are suitable for implementing on ensemble quantum computers, e.g., on liquid NMR quantum computers. These algorithms are Shor's factorization algorithm, Grover's search algorithm (with several marked items), and an algorithm for quantum fault-tolerant computation. The first two algorithms are simply modified using randomizing and sorting strategies. For the last algorithm, we develop a classical-quantum hybrid strategy for removing measurements. We use it to present a novel quantum fault-tolerant scheme. More explicitly, we present schemes for fault-tolerant, measurement-free implementation of the Toffoli and σ_z^(1/4) gates, as these operations cannot be implemented "bitwise", and their standard fault-tolerant implementations require measurement.
Ensemble algorithms in reinforcement learning.
Wiering, Marco A; van Hasselt, Hado
2008-08-01
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms. We designed and implemented four different ensemble methods combining the following five different RL algorithms: Q-learning, Sarsa, actor-critic (AC), QV-learning, and AC learning automaton. The intuitively designed ensemble methods, namely, majority voting (MV), rank voting, Boltzmann multiplication (BM), and Boltzmann addition, combine the policies derived from the value functions of the different RL algorithms, in contrast to previous work where ensemble methods have been used in RL for representing and learning a single value function. We show experiments on five maze problems of varying complexity; the first problem is simple, but the other four maze tasks are of a dynamic or partially observable nature. The results indicate that the BM and MV ensembles significantly outperform the single RL algorithms.
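Two of the combination rules described above, majority voting and Boltzmann multiplication, can be sketched over per-algorithm Q-values. The Q-tables and temperature below are invented; in the paper each row would come from a different RL algorithm (Q-learning, Sarsa, etc.) evaluated at the current state.

```python
import math

def majority_vote(q_tables):
    """Pick the action chosen greedily by the most algorithms."""
    votes = {}
    for q in q_tables:
        a = max(range(len(q)), key=q.__getitem__)
        votes[a] = votes.get(a, 0) + 1
    return max(votes, key=votes.get)

def boltzmann_multiplication(q_tables, tau=1.0):
    """Combined action probabilities: product of per-algorithm softmaxes."""
    n_actions = len(q_tables[0])
    probs = [1.0] * n_actions
    for q in q_tables:
        z = sum(math.exp(v / tau) for v in q)
        for a in range(n_actions):
            probs[a] *= math.exp(q[a] / tau) / z
    total = sum(probs)
    return [p / total for p in probs]
```

The two rules can disagree: voting only counts greedy choices, while Boltzmann multiplication lets one algorithm's strong preference outweigh two weak ones.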
On Constructing Ensembles for Combinatorial Optimisation.
Hart, Emma; Sim, Kevin
2017-01-10
Although the use of ensemble methods in machine-learning is ubiquitous due to their proven ability to outperform their constituent algorithms, ensembles of optimisation algorithms have received relatively little attention. Existing approaches lag behind machine-learning in both theory and practice, with no principled design guidelines available. In this article, we address fundamental questions regarding ensemble composition in optimisation using the domain of bin-packing as an example. In particular, we investigate the trade-off between accuracy and diversity, and whether diversity metrics can be used as a proxy for constructing an ensemble, proposing a number of novel metrics for comparing algorithm diversity. We find that randomly composed ensembles can outperform ensembles of high-performing algorithms under certain conditions and that judicious choice of diversity metric is required to construct good ensembles. The method and findings can be generalised to any metaheuristic ensemble, and lead to better understanding of how to undertake principled ensemble design.
Fuzzy ARTMAP Ensemble Based Decision Making and Application
Directory of Open Access Journals (Sweden)
Min Jin
2013-01-01
Full Text Available Because the performance of a single FAM is affected by the sequence of sample presentation in the offline mode of training, a fuzzy ARTMAP (FAM) ensemble approach based on the improved Bayesian belief method is proposed to improve classification accuracy. The training samples are input into a committee of FAMs in different sequences, the outputs from these FAMs are combined, and the final decision is derived by the improved Bayesian belief method. The experimental results show that the proposed FAM ensemble can classify the different categories reliably and has better classification performance than a single FAM.
Li, Fang
2011-06-01
The skill of probability density function (PDF) prediction of summer rainfall over East China using optimal ensemble schemes is evaluated based on the precipitation data from five coupled atmosphere-ocean general circulation models that participate in the ENSEMBLES project. The optimal ensemble scheme in each region is the scheme with the highest skill among the four commonly used ones: the equally-weighted ensemble (EE), EE for calibrated model simulations (Cali-EE), the ensemble scheme based on multiple linear regression analysis (MLR), and the Bayesian ensemble scheme (Bayes). The results show that the optimal ensemble scheme is the Bayes in the southern part of East China; the Cali-EE in the Yangtze River valley, the Yangtze-Huaihe River basin, and the central part of northern China; and the MLR in the eastern part of northern China. Their PDF predictions are well calibrated, and are sharper than or of approximately equal interval-width to the climatology prediction. In all regions, these optimal ensemble schemes outperform the climatology prediction, indicating that current commonly used multi-model ensemble schemes are able to produce skillful PDF prediction of summer rainfall over East China, even without incorporating information from other model variables.
Wilkerson, Amanda H.; Hackman, Christine L.; Rush, Sarah E.; Usdan, Stuart L.; Smith, Corinne S.
2017-01-01
Objective: Behaviors of weight conscious drinkers (BWCD) include disordered eating, excessive physical activity (PA), and heavy episodic drinking. Considering that approximately 25% of the college students report BWCD, it is important to investigate what characteristics increase the likelihood of college students engaged in BWCD for both moderate…
Energy Technology Data Exchange (ETDEWEB)
Zhang, R; Baer, E; Jee, K; Sharp, G; Flanz, J; Lu, H [Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States)
2016-06-15
Purpose: For proton therapy, an accurate model of CT HU to relative stopping power (RSP) conversion is essential. In current practice, validation of these models relies solely on measurements of tissue substitutes with standard compositions. Validation based on real tissue samples would be much more direct and can address variations between patients. This study intends to develop an efficient and accurate system, based on the concept of dose extinction, to measure WEPL and retrieve RSP for a large number of biological tissue types. Methods: A broad AP proton beam delivering a spread-out Bragg peak (SOBP) is used to irradiate the samples, with a Matrixx detector positioned immediately below. A water tank was placed on top of the samples, with the water level controllable to sub-millimeter precision by a remotely controlled dosing pump. While gradually lowering the water level with the beam on, the transmission dose was recorded at 1 frame/sec. The WEPL was determined as the difference between the known beam range of the delivered SOBP (80%) and the water level corresponding to 80% of the measured dose profiles in time. A Gammex 467 phantom was used to test the system, and various types of biological tissue were measured. Results: RSPs for all Gammex inserts, except the one made with lung-450 material (<2% error), were determined within ±0.5% error. Depending on the WEPL of the investigated phantom, a measurement takes around 10 min, which can be accelerated by a faster pump. Conclusion: Based on the concept of dose extinction, a system was explored to measure WEPL efficiently and accurately for a large number of samples. This allows the validation of CT HU to stopping power conversions based on a large number of samples and real tissues. It also allows the assessment of beam uncertainties due to variations among patients, an issue that has never been sufficiently studied before.
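The WEPL extraction described in the Methods can be sketched numerically: the sample's WEPL is the known 80% beam range minus the water level at which the transmitted dose crosses 80% of its plateau value. This is a hypothetical helper illustrating the idea, not the study's analysis code:

```python
import numpy as np

def wepl_from_extinction(water_levels_cm, doses, beam_range_cm, frac=0.8):
    """Water-equivalent path length from a dose-extinction scan.

    As the water level drops, the transmitted dose rises; the sample's WEPL
    is the known 80% beam range minus the water level at which the measured
    dose crosses `frac` of its plateau (illustrative sketch).
    """
    d = np.asarray(doses, float) / np.max(doses)        # normalize to plateau dose
    w = np.asarray(water_levels_cm, float)
    order = np.argsort(d)                               # np.interp needs ascending x
    level_at_frac = np.interp(frac, d[order], w[order]) # water level at 80% dose
    return beam_range_cm - level_at_frac

# Example: dose rises as the water level is lowered from 6 cm to 0 cm
wepl = wepl_from_extinction([6, 5, 4, 3, 2, 1, 0],
                            [0.1, 0.2, 0.4, 0.8, 1.0, 1.0, 1.0],
                            beam_range_cm=10.0)
print(wepl)  # 80% crossing at 3 cm of water -> WEPL = 10 - 3 = 7 cm
```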
Directory of Open Access Journals (Sweden)
Shulin Lyu
2018-01-01
The σ function, namely, the derivative of the log of the smallest eigenvalue distribution of the finite-n LUE or JUE, satisfies the Jimbo–Miwa–Okamoto σ form of PV and PVI, although in the shifted Jacobi case, with the weight x^α(1−x)^β, the β parameter does not show up in the equation. We also obtain the asymptotic expansions for the smallest eigenvalue distributions of the Laguerre unitary and Jacobi unitary ensembles after appropriate double scalings, and obtain the constants in the asymptotic expansion of the gap probabilities, expressed in terms of the Barnes G-function evaluated at special points.
Similarity measures for protein ensembles
DEFF Research Database (Denmark)
Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper
2009-01-01
Analyses of similarities and changes in protein conformation can provide important information regarding protein function and evolution. Many scores, including the commonly used root mean square deviation, have therefore been developed to quantify the similarities of different protein conformations. However, instead of examining individual conformations it is in many cases more relevant to analyse ensembles of conformations that have been obtained either through experiments or from methods such as molecular dynamics simulations. We here present three approaches that can be used to compare such ensembles, illustrated with a synthetic example from molecular dynamics simulations. We then apply the algorithms to revisit the problem of ensemble averaging during structure determination of proteins, and find that an ensemble refinement method is able to recover the correct distribution of conformations better than standard single-structure refinement.
Quantifying Monte Carlo uncertainty in ensemble Kalman filter
Energy Technology Data Exchange (ETDEWEB)
Thulin, Kristian; Naevdal, Geir; Skaug, Hans Julius; Aanonsen, Sigurd Ivar
2009-01-15
This report presents results obtained during Kristian Thulin's PhD study, and is a slightly modified form of a paper submitted to SPE Journal. Kristian Thulin did most of his portion of the work while a PhD student at CIPR, University of Bergen. The ensemble Kalman filter (EnKF) is currently considered one of the most promising methods for conditioning reservoir simulation models to production data. The EnKF is a sequential Monte Carlo method based on a low-rank approximation of the system covariance matrix. The posterior probability distribution of model variables may be estimated from the updated ensemble, but because of the low-rank covariance approximation, the updated ensemble members become correlated samples from the posterior distribution. We suggest using multiple EnKF runs, each with a smaller ensemble size, to obtain truly independent samples from the posterior distribution. This allows a point-wise confidence interval for the posterior cumulative distribution function (CDF) to be constructed. We present a methodology for finding an optimal combination of ensemble batch size (n) and number of EnKF runs (m) while keeping the total number of ensemble members (m x n) constant. The optimal combination of n and m is found by minimizing the integrated mean square error (MSE) of the CDFs, and we choose to define an EnKF run with 10,000 ensemble members as having zero Monte Carlo error. The methodology is tested on a simplistic, synthetic 2D model, but should be applicable also to larger, more realistic models. (author). 12 refs., figs., tabs.
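The batch idea behind the point-wise CDF confidence interval can be sketched as follows: run m independent batches, compute each batch's empirical CDF at a point, and use the spread across batches as the Monte Carlo error bar. This is a simplified illustration with names of our own choosing; the paper's integrated-MSE optimization of (n, m) is not reproduced:

```python
import numpy as np

def cdf_confidence(batches, x, z=1.96):
    """Point-wise confidence interval for the posterior CDF at x.

    Each row of `batches` holds the n posterior samples of one independent
    EnKF run; the spread of the per-run empirical CDF values gives a
    normal-approximation error bar for the CDF estimate at x.
    """
    batches = np.asarray(batches, float)
    cdfs = (batches <= x).mean(axis=1)             # one empirical CDF value per run
    half = z * cdfs.std(ddof=1) / np.sqrt(len(cdfs))
    return cdfs.mean(), half

# Example: m = 20 runs of n = 50 samples from a standard normal posterior
rng = np.random.default_rng(1)
mean_cdf, half_width = cdf_confidence(rng.standard_normal((20, 50)), 0.0)
print(mean_cdf, half_width)   # mean close to the true CDF value 0.5
```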
Ensemble method for dengue prediction.
Buczak, Anna L; Baugher, Benjamin; Moniz, Linda J; Bagley, Thomas; Babin, Steven M; Guven, Erhan
2018-01-01
In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., the maximum weekly number of cases during a transmission season); 2) peak week (i.e., the week in which the maximum weekly number of cases occurred); and 3) the total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for a transmission season and the peak height for Iquitos, Peru.
Advanced Atmospheric Ensemble Modeling Techniques
Energy Technology Data Exchange (ETDEWEB)
Buckley, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Chiswell, S. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Kurzeja, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Maze, G. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Viner, B. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Werth, D. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL)
2017-09-29
Ensemble modeling (EM), the creation of multiple atmospheric simulations for a given time period, has become an essential tool for characterizing uncertainties in model predictions. We explore two novel ensemble modeling techniques: (1) perturbation of model parameters (Adaptive Programming, AP), and (2) data assimilation (Ensemble Kalman Filter, EnKF). The current research is an extension of work from last year and examines transport on a small spatial scale (<100 km) in complex terrain, for more rigorous testing of the ensemble technique. Two different release cases were studied: a coastal release (SF6) and an inland release (Freon), which consisted of two release times. Observations of tracer concentration and meteorology are used to judge the ensemble results. In addition, adaptive grid techniques have been developed to reduce the computing resources required for transport calculations. Using a 20-member ensemble, the standard approach generated downwind transport that was quantitatively good for both releases; however, the EnKF method produced additional improvement for the coastal release, where the spatial and temporal differences due to interior valley heating led to the inland movement of the plume. The AP technique showed improvements for both release cases, with more improvement in the inland release. This research demonstrated that transport accuracy can be improved when models are adapted to a particular location/time or when important local data are assimilated into the simulation. This work enhances SRNL's capability in atmospheric transport modeling in support of its current customer base and local site missions, as well as its ability to attract new customers within the intelligence community.
Alvionita; Sutikno; Suharsono, A.
2017-03-01
Cluster analysis is a multivariate technique that groups (classifies) data. Its main purpose is to classify the objects of observation into groups based on their characteristics. Cluster analysis is not only used for numerical or categorical data but has also been developed for mixed data. Several methods exist for analyzing mixed data, such as ensemble methods and the Similarity Weight and Filter Method (SWFM). There is a lot of research on these methods, but previous studies did not compare the performance of the two. Therefore, this paper compares the performance of the ROCK clustering ensemble method and the SWFM ensemble method. These methods are used to cluster cross citrus accessions based on the characteristics of fruit and leaves, which involve a mixture of numerical and categorical variables. The clustering method with the best performance is determined by the ratio of the standard deviation within groups (SW) to the standard deviation between groups (SB); the method with the best performance has the smallest ratio. From the results, we find that the ensemble ROCK method performs better than the ensemble SWFM method.
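The SW/SB selection criterion can be sketched as a short function; this is our own illustrative implementation of the ratio described above, not the paper's code:

```python
import numpy as np

def sw_sb_ratio(X, labels):
    """Ratio of within-group to between-group standard deviation.

    Smaller values indicate tighter, better-separated clusters, matching the
    criterion in the abstract (averaging over features is our simplification).
    """
    X = np.asarray(X, float)
    labels = np.asarray(labels)
    groups = [X[labels == g] for g in np.unique(labels)]
    centroids = np.array([g.mean(axis=0) for g in groups])
    sw = np.mean([g.std(axis=0).mean() for g in groups])  # avg within-group spread
    sb = centroids.std(axis=0).mean()                     # spread between centroids
    return sw / sb

# Example: a correct labeling of two separated clusters scores lower (better)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
good = np.array([0] * 20 + [1] * 20)
bad = np.tile([0, 1], 20)
print(sw_sb_ratio(X, good) < sw_sb_ratio(X, bad))   # the good labeling wins
```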
Ensembl Genomes: extending Ensembl across the taxonomic space.
Kersey, P J; Lawson, D; Birney, E; Derwent, P S; Haimel, M; Herrero, J; Keenan, S; Kerhornou, A; Koscielny, G; Kähäri, A; Kinsella, R J; Kulesha, E; Maheswari, U; Megy, K; Nuhn, M; Proctor, G; Staines, D; Valentin, F; Vilella, A J; Yates, A
2010-01-01
Ensembl Genomes (http://www.ensemblgenomes.org) is a new portal offering integrated access to genome-scale data from non-vertebrate species of scientific interest, developed using the Ensembl genome annotation and visualisation platform. Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. Many of the databases supporting the portal have been built in close collaboration with the scientific community, which we consider as essential for maintaining the accuracy and usefulness of the resource. A common set of user interfaces (which include a graphical genome browser, FTP, BLAST search, a query optimised data warehouse, programmatic access, and a Perl API) is provided for all domains. Data types incorporated include annotation of (protein and non-protein coding) genes, cross references to external resources, and high throughput experimental data (e.g. data from large scale studies of gene expression and polymorphism visualised in their genomic context). Additionally, extensive comparative analysis has been performed, both within defined clades and across the wider taxonomy, and sequence alignments and gene trees resulting from this can be accessed through the site.
Hybrid ensemble 4DVar assimilation of stratospheric ozone using a global shallow water model
Directory of Open Access Journals (Sweden)
D. R. Allen
2016-07-01
Full Text Available Wind extraction from stratospheric ozone (O3) assimilation is examined using a hybrid ensemble 4-D variational assimilation (4DVar) shallow water model (SWM) system coupled to the tracer advection equation. Stratospheric radiance observations are simulated using global observations of the SWM fluid height (Z), while O3 observations represent sampling by a typical polar-orbiting satellite. Four ensemble sizes were examined (25, 50, 100, and 1518 members), with the largest ensemble equal to the number of dynamical state variables. The optimal length scale for ensemble localization was found by tuning an ensemble Kalman filter (EnKF). This scale was then used for localizing the ensemble covariances that were blended with conventional covariances in the hybrid 4DVar experiments. Both optimal length scale and optimal blending coefficient increase with ensemble size, with optimal blending coefficients varying from 0.2–0.5 for small ensembles to 0.5–1.0 for large ensembles. The hybrid system outperforms conventional 4DVar for all ensemble sizes, while for large ensembles the hybrid produces similar results to the offline EnKF. Assimilating O3 in addition to Z benefits the winds in the hybrid system, with the fractional improvement in global vector wind increasing from ∼ 35 % with 25 and 50 members to ∼ 50 % with 1518 members. For the smallest ensembles (25 and 50 members), the hybrid 4DVar assimilation improves the zonal wind analysis over conventional 4DVar in the Northern Hemisphere (winter-like) region and also at the Equator, where Z observations alone have difficulty constraining winds due to lack of geostrophy. For larger ensembles (100 and 1518 members), the hybrid system results in both zonal and meridional wind error reductions, relative to 4DVar, across the globe.
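The core hybrid step, blending a localized ensemble covariance with a conventional (static) covariance under a tunable coefficient, can be sketched as follows. The Gaussian taper here is a common stand-in for the Gaspari-Cohn localization function used in such systems, and all names are ours:

```python
import numpy as np

def hybrid_covariance(ensemble, static_cov, beta, length_scale, coords):
    """Blend a localized ensemble covariance with a static background covariance.

    P_hybrid = beta * (L o P_ens) + (1 - beta) * P_static, where L is a
    Gaussian localization taper over grid distance (illustrative stand-in
    for the Gaspari-Cohn function) and beta is the blending coefficient.
    """
    ensemble = np.asarray(ensemble, float)           # shape (n_members, n_state)
    coords = np.asarray(coords, float)
    anomalies = ensemble - ensemble.mean(axis=0)
    p_ens = anomalies.T @ anomalies / (len(ensemble) - 1)   # sample covariance
    dist = np.abs(coords[:, None] - coords[None, :])
    taper = np.exp(-0.5 * (dist / length_scale) ** 2)       # localization matrix L
    return beta * taper * p_ens + (1 - beta) * static_cov

# Example: 10-member ensemble on a 5-point grid, identity static covariance
rng = np.random.default_rng(2)
P = hybrid_covariance(rng.standard_normal((10, 5)), np.eye(5),
                      beta=0.5, length_scale=2.0, coords=np.arange(5.0))
print(P.shape)   # (5, 5), symmetric
```

Setting beta = 0 recovers conventional 4DVar statistics; beta = 1 uses only the (localized) ensemble covariance, mirroring the tuning range reported in the abstract.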
Baghurst, Timothy; Lirgg, Cathy
2009-06-01
The purpose of this study was to identify differences in traits associated with muscle dysmorphia between collegiate football players (n=66), weight trainers for physique (n=115), competitive non-natural bodybuilders (n=47), and competitive natural bodybuilders (n=65). All participants completed demographic questionnaires in addition to the Muscle Dysmorphia Inventory (Rhea, Lantz, & Cornelius, 2004). Results revealed a significant main effect for group, and post hoc tests found that the non-natural bodybuilding group did not score significantly higher than the natural bodybuilding group on any subscale except Pharmacological Use. Both the non-natural and natural bodybuilding groups scored significantly higher than those who weight trained for physique on the Dietary Behavior and Supplement Use subscales. The collegiate football players scored lowest on all subscales of the Muscle Dysmorphia Inventory except Physique Protection, where they scored highest. Findings are discussed and directions for future research are outlined.
Malignancy and Abnormality Detection of Mammograms using Classifier Ensembling
Directory of Open Access Journals (Sweden)
Nawazish Naveed
2011-07-01
Full Text Available The breast cancer detection and diagnosis is a critical and complex procedure that demands a high degree of accuracy. In computer-aided diagnostic systems, breast cancer detection is a two-stage procedure: first, malignant and benign mammograms are classified, while in the second stage, the type of abnormality is detected. In this paper, we have developed a novel architecture to enhance the classification of malignant and benign mammograms using multi-classification of malignant mammograms into six abnormality classes. DWT (Discrete Wavelet Transform) features are extracted from preprocessed images and passed through different classifiers. To improve accuracy, the results generated by the various classifiers are ensembled. A genetic algorithm is used to find optimal weights, rather than assigning weights to the classifiers' results on the basis of heuristics. The mammograms declared malignant by the ensemble classifiers are divided into six classes. The ensemble classifiers are further used for multi-classification using the one-against-all technique. The outputs of all ensemble classifiers are combined by the product, median, and mean rules. It has been observed that the accuracy of classification of abnormalities is more than 97% in the case of the mean rule. The Mammographic Image Analysis Society dataset is used for experimentation.
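The product, median, and mean combination rules can be sketched generically. This assumes each classifier outputs class probabilities and is our own illustration, not the paper's implementation:

```python
import numpy as np

def combine(probas, rule="mean"):
    """Fuse per-classifier class-probability matrices with a fixed rule.

    `probas` has shape (n_classifiers, n_samples, n_classes); the mean,
    median, and product rules each reduce over the classifier axis, and the
    predicted class for each sample is the argmax of the fused scores.
    """
    probas = np.asarray(probas, float)
    if rule == "mean":
        fused = probas.mean(axis=0)
    elif rule == "median":
        fused = np.median(probas, axis=0)
    elif rule == "product":
        fused = probas.prod(axis=0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return fused.argmax(axis=1)

# Example: three classifiers, two samples, three classes
p = np.array([
    [[0.1, 0.7, 0.2], [0.2, 0.2, 0.6]],
    [[0.2, 0.6, 0.2], [0.1, 0.3, 0.6]],
    [[0.3, 0.5, 0.2], [0.2, 0.1, 0.7]],
])
print(combine(p, "mean"))     # [1 2] -- all three rules agree here
print(combine(p, "product"))  # [1 2]
```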
Orgilés, Mireia; Sanz, Isabel; Piqueras, José Antonio; Espada, José Pedro
2014-08-01
Obesity is a problem with serious implications for physical, psychological, and social health that affects millions of children and adolescents worldwide. This study aimed to obtain updated prevalence data on obesity and overweight in adolescents aged 10 to 12 years in the province of Alicante, along with information on eating habits, physical activity, and selected sociodemographic variables, and to examine their relation to children's obesity and overweight or the risk of developing them. A total of 623 preteens participated, 49.9% male and 50.1% female. BMI was determined following the WHO Child Growth Standards. A high prevalence of obesity and overweight was found in the province: 20.4% and 34%, respectively. The results showed no statistically significant differences between categories by sex, age, or parents' educational level. Regarding eating habits and physical exercise, the results suggest that children with normal weight eat more meals per day, and boys with normal weight eat more often in school canteens. They also suggest that boys with normal weight exercise more often than those who are overweight or obese, and that obese girls spend more hours in sedentary leisure than overweight girls. The results reinforce the need to develop effective prevention and early intervention programs for childhood obesity.
Optimization of multi-model ensemble forecasting of typhoon waves
Directory of Open Access Journals (Sweden)
Shun-qi Pan
2016-01-01
Full Text Available Accurately forecasting ocean waves during typhoon events is extremely important in aiding the mitigation and minimization of their potential damage to coastal infrastructure, and the protection of coastal communities. However, due to the complex hydrological and meteorological interaction and uncertainties arising from different modeling systems, quantifying the uncertainties and improving the forecasting accuracy of modeled typhoon-induced waves remain challenging. This paper presents a practical approach to optimizing model-ensemble wave heights in an attempt to improve the accuracy of real-time typhoon wave forecasting. A locally weighted learning algorithm is used to obtain the weights for the wave heights computed by the WAVEWATCH III wave model driven by winds from four different weather models (model-ensembles). The optimized weights are subsequently used to calculate the resulting wave heights from the model-ensembles. The results show that the optimization is capable of capturing the different behavioral effects of the different weather models on wave generation. Comparison with the measurements at the selected wave buoy locations shows that the optimized weights, obtained through a training process, can significantly improve the accuracy of the forecasted wave heights over the standard mean values, particularly for typhoon-induced peak waves. The results also indicate that the algorithm is easy to implement and practical for real-time wave forecasting.
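A simple stand-in for learning blending weights over model-ensemble wave heights is an ordinary least-squares fit against buoy observations. The paper's locally weighted learning fits such weights per neighbourhood of the input space; this global version is only a sketch, with names of our own choosing:

```python
import numpy as np

def train_weights(model_heights, observed):
    """Least-squares blending weights for model-ensemble wave heights.

    model_heights: (n_times, n_models) hindcast heights from the weather-model
    driven runs; observed: (n_times,) buoy measurements. Returns one weight
    per model (illustrative global fit, not the locally weighted scheme).
    """
    w, *_ = np.linalg.lstsq(model_heights, observed, rcond=None)
    return w

def forecast(model_heights, weights):
    """Blend the model heights with the trained weights."""
    return np.asarray(model_heights) @ weights

# Example: synthetic "truth" is a 30/70 blend of two models
rng = np.random.default_rng(3)
M = rng.uniform(1, 5, (50, 2))
obs = 0.3 * M[:, 0] + 0.7 * M[:, 1]
w = train_weights(M, obs)
print(w)   # recovers approximately [0.3, 0.7]
```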
Application Bayesian Model Averaging method for ensemble system for Poland
Guzikowski, Jakub; Czerwinska, Agnieszka
2014-05-01
The aim of the project is to evaluate methods for generating numerical ensemble weather predictions using meteorological data from the Weather Research & Forecasting (WRF) model and calibrating these data by means of the Bayesian Model Averaging (WRF BMA) approach. We construct high-resolution, short-range ensemble forecasts using meteorological data (temperature) generated by nine WRF model configurations. The WRF models have 35 vertical levels and 2.5 km x 2.5 km horizontal resolution. The main emphasis is that the ensemble members use different parameterizations of the physical phenomena occurring in the boundary layer. To calibrate the ensemble forecast we use the Bayesian Model Averaging (BMA) approach. The BMA predictive probability density function (PDF) is a weighted average of the predictive PDFs associated with each individual ensemble member, with weights that reflect the member's relative skill. As a test case we chose a period with a heat wave and convective weather conditions over Poland, from 23 July to 1 August 2013. From 23 July to 29 July 2013, the temperature fluctuated around 30 degrees Celsius at many meteorological stations and new temperature records were set. During this time, an increase in patients hospitalized with cardiovascular problems was registered. On 29 July 2013, an advection of moist tropical air masses over Poland caused a strong convection event with a mesoscale convective system (MCS). The MCS caused local flooding, damage to transport infrastructure, destroyed buildings and trees, and led to injuries and a direct threat to life. The meteorological data from the ensemble system are compared with data recorded at 74 weather stations in Poland. We prepare a set of model-observation pairs; the data from single ensemble members and the median from the WRF BMA system are then evaluated using the deterministic error statistics root mean square error (RMSE) and mean absolute error (MAE). To evaluate…
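The BMA predictive PDF described above, a skill-weighted mixture of member densities, can be sketched with Gaussian kernels. The Gaussian form and the shared spread are simplifying assumptions of this sketch; in practice the weights and variances are fit by maximum likelihood on training forecasts:

```python
from math import exp, pi, sqrt

def bma_pdf(x, member_forecasts, weights, sigma):
    """BMA predictive density: a weighted sum of Gaussian member PDFs.

    Each ensemble member contributes a normal kernel centred on its
    (bias-corrected) forecast; the weights reflect relative skill and are
    assumed to sum to one. A shared sigma is a simplification.
    """
    norm = 1.0 / (sigma * sqrt(2 * pi))
    return sum(w * norm * exp(-0.5 * ((x - f) / sigma) ** 2)
               for f, w in zip(member_forecasts, weights))

# Example: three temperature forecasts (degrees C) with skill weights
forecasts = [29.0, 30.5, 31.0]
weights = [0.5, 0.3, 0.2]
print(bma_pdf(30.0, forecasts, weights, sigma=1.0))  # density at 30 C
```

Because the weights sum to one and each kernel integrates to one, the mixture is itself a proper probability density.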
Zhu, Guanhua; Liu, Wei; Bao, Chenglong; Tong, Dudu; Ji, Hui; Shen, Zuowei; Yang, Daiwen; Lu, Lanyuan
2018-01-31
The structural variations of multi-domain proteins with flexible parts mediate many biological processes, and a structure ensemble can be determined by selecting a weighted combination of representative structures from a simulated structure pool, producing the best fit to experimental constraints such as inter-atomic distances. In this study, a hybrid structure-based and physics-based atomistic force field with an efficient sampling strategy is adopted to simulate a model di-domain protein against experimental paramagnetic relaxation enhancement (PRE) data that correspond to distance constraints. The molecular dynamics simulations produce a wide range of conformations depicted on a protein energy landscape. Subsequently, a conformational ensemble recovered with low-energy structures and the minimum-size restraint is identified in good agreement with experimental PRE rates, and the result is also supported by chemical shift perturbations and small-angle X-ray scattering data. It is illustrated that the regularizations of energy and ensemble size prevent an arbitrary interpretation of protein conformations. Moreover, energy is found to serve as a critical control to refine the structure pool and prevent data over-fitting, because the absence of energy regularization exposes ensemble construction to noise from high-energy structures and causes a more ambiguous representation of protein conformations. Finally, we perform structure-ensemble optimizations with a topology-based structure pool, to enhance understanding of the ensemble results from different sources of pool candidates.
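The weighted-combination step, choosing non-negative structure weights that reproduce measured constraints, can be sketched with a regularized non-negative least-squares fit. This is a generic stand-in written for illustration; it omits the paper's energy and ensemble-size regularization details, and all names are ours:

```python
import numpy as np

def fit_ensemble_weights(predicted, measured, l2=0.0, iters=5000):
    """Non-negative, normalized weights over a structure pool.

    `predicted[i, j]` is observable j back-calculated from pool structure i;
    `measured[j]` is the experimental value. Weights are found by projected
    gradient descent on a (optionally L2-regularized) least-squares
    objective, then normalized to sum to one.
    """
    A = np.asarray(predicted, float).T       # (n_observables, n_structures)
    b = np.asarray(measured, float)
    n = A.shape[1]
    w = np.full(n, 1.0 / n)                  # start from a uniform ensemble
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + l2 + 1e-12)
    for _ in range(iters):
        grad = A.T @ (A @ w - b) + l2 * w
        w = np.clip(w - step * grad, 0.0, None)   # project onto w >= 0
    s = w.sum()
    return w / s if s > 0 else w

# Example: three pool structures, three observables; the "true" ensemble
# mixes them with weights [0.5, 0.3, 0.2]
pred = np.array([[2., 0., 1.], [0., 1., 0.], [1., 0., 2.]])
meas = pred.T @ np.array([0.5, 0.3, 0.2])
print(fit_ensemble_weights(pred, meas))   # recovers approximately [0.5, 0.3, 0.2]
```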
Global Ensemble Forecast System (GEFS) [1 Deg.
National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...
Schur, Ellen A; Heckbert, Susan R; Goldberg, Jack H
2010-06-01
We investigated the association of restrained eating with BMI and weight gain while controlling for the influence of genes and shared environment. Participants were 1,587 twins enrolled in the University of Washington Twin Registry (UWTR). Restrained eating was assessed by the Herman and Polivy Restraint Scale. Height and weight were self-reported on two occasions. Analyses used generalized estimating equations or multiple linear regression techniques. Restraint Scale scores were positively associated with BMI (adjusted beta = 0.39 kg/m(2); 95% confidence interval (CI) = 0.34-0.44; P < …). High Restraint Scale scorers had an adjusted mean BMI of 27.9 kg/m(2) (95% CI = 27.4-28.4) as compared to intermediate (mean = 25.5 kg/m(2); 95% CI = 25.2-25.8) and low scorers (mean = 23.0 kg/m(2); 95% CI = 22.7-23.3). In within-pair analyses among 598 same-sex twin pairs, the adjusted association between Restraint Scale scores and BMI persisted even when genetic and shared environmental factors were controlled for (adjusted beta = 0.18; 95% CI = 0.12-0.24; P < …). Dizygotic (DZ) twins showed a stronger association between BMI and Restraint Scale score than monozygotic (MZ) twins, for whom genetics are 100% controlled (adjusted beta = 0.32; 95% CI = 0.20-0.44 vs. adjusted beta = 0.10; 95% CI = 0.04-0.16; P = 0.001 for test of interaction). These data demonstrate that observed relationships between BMI, weight gain, and restrained eating, as assessed by the Restraint Scale, have a strong environmental influence and are not solely due to shared genetic factors.
A Path Space Extension for Robust Light Transport Simulation
DEFF Research Database (Denmark)
Hachisuka, Toshiya; Pantaleoni, Jacopo; Jensen, Henrik Wann
2012-01-01
We present a new sampling space for light transport paths that makes it possible to describe Monte Carlo path integration and photon density estimation in the same framework. A key contribution of our paper is the introduction of vertex perturbations, which extends the space of paths with loosely coupled connections. The new framework enables the computation of path probabilities in the same space under the same measure, which allows us to use multiple importance sampling to combine Monte Carlo path integration and photon density estimation. The resulting algorithm, unified path sampling, can…
Abdollahi, Abbas; Abu Talib, Mansor
2016-01-01
To examine the relationships between self-esteem, body-esteem, emotional intelligence, and social anxiety, as well as to examine the moderating role of weight between exogenous variables and social anxiety, 520 university students completed the self-report measures. Structural equation modeling revealed that individuals with low self-esteem, body-esteem, and emotional intelligence were more likely to report social anxiety. The findings indicated that obese and overweight individuals with low body-esteem, emotional intelligence, and self-esteem had higher social anxiety than others. Our results highlight the roles of body-esteem, self-esteem, and emotional intelligence as influencing factors for reducing social anxiety.
The Identity Threat of Weight Stigma in Adolescents.
Hand, Wren B; Robinson, Jennifer C; Stewart, Mary W; Zhang, Lei; Hand, Samuel C
2017-08-01
Obesity remains a serious public health issue in adolescents, who may be subjected to weight stigma leading to increased stress and poor health outcomes. Stigma can be detrimental to adolescents during self-identity formation. The purpose of this study was to examine weight stigma in adolescents in light of the Identity Threat Model of Stigma. A cross-sectional correlational design was used to examine the relationships among the variables of weight stigma, psychosocial stress, coping styles, disordered eating, and physical inactivity. Regression modeling and path analysis were used to analyze the data. Over 90% of the sample had scores indicating weight stigma or antifat bias. Avoidant coping style and psychosocial stress predicted disordered eating. The strongest path in the model was from avoidant coping to disordered eating. The Identity Threat Model of Stigma partially explained adolescents' weight stigma. Nursing practice implications are discussed.
Multicomponent ensemble models to forecast induced seismicity
Király-Proag, E.; Gischig, V.; Zechar, J. D.; Wiemer, S.
2018-01-01
In recent years, human-induced seismicity has become an increasingly relevant topic due to its economic and social implications. Several models and approaches have been developed to explain the underlying physical processes or to forecast induced seismicity. They range from simple statistical models to coupled numerical models incorporating complex physics. We advocate the need for forecast testing, currently the best method for ascertaining whether models reasonably account for the key governing physical processes. Moreover, operational forecast models are of great interest for supporting on-site decision-making in projects entailing induced earthquakes. We previously introduced a standardized framework following the guidelines of the Collaboratory for the Study of Earthquake Predictability, the Induced Seismicity Test Bench, to test, validate, and rank induced seismicity models. In this study, we describe how to construct multicomponent ensemble models based on Bayesian weightings that deliver more accurate forecasts than individual models in the case of the Basel 2006 and Soultz-sous-Forêts 2004 enhanced geothermal stimulation projects. For this, we examine five calibrated variants of two significantly different model groups: (1) Shapiro and Smoothed Seismicity, based on the seismogenic index, simple modified Omori-law-type seismicity decay, and temporally weighted smoothed seismicity; (2) Hydraulics and Seismicity, based on numerically modelled pore pressure evolution that triggers seismicity using the Mohr-Coulomb failure criterion. We also demonstrate how the individual and ensemble models would perform as part of an operational Adaptive Traffic Light System. Investigating seismicity forecasts based on a range of potential injection scenarios, we use forecast periods of different durations to compute the occurrence probabilities of seismic events M ≥ 3. We show that in the case of the Basel 2006 geothermal stimulation the models forecast hazardous levels
Measuring social interaction in music ensembles
Volpe, Gualtiero; D’Ausilio, Alessandro; Badino, Leonardo; Camurri, Antonio; Fadiga, Luciano
2016-01-01
Music ensembles are an ideal test-bed for quantitative analysis of social interaction. Music is an inherently social activity, and music ensembles offer a broad variety of scenarios which are particularly suitable for investigation. Small ensembles, such as string quartets, are deemed a significant example of self-managed teams, where all musicians contribute equally to a task. In bigger ensembles, such as orchestras, the relationship between a leader (the conductor) and a group of followers ...
Statistical Mechanics of Time Domain Ensemble Learning
Miyoshi, Seiji; Uezu, Tatsuya; Okada, Masato
2006-01-01
Conventional ensemble learning combines students in the space domain. In this paper, by contrast, we combine students in the time domain and call this time domain ensemble learning. We analyze the generalization performance of time domain ensemble learning in the framework of online learning using a statistical mechanical method. We treat a model in which both the teacher and the student are linear perceptrons with noise. Time domain ensemble learning is twice as effective ...
Understanding the structural ensembles of a highly extended disordered protein.
Daughdrill, Gary W; Kashtanov, Stepan; Stancik, Amber; Hill, Shannon E; Helms, Gregory; Muschol, Martin; Receveur-Bréchot, Véronique; Ytreberg, F Marty
2012-01-01
Developing a comprehensive description of the equilibrium structural ensembles for intrinsically disordered proteins (IDPs) is essential to understanding their function. The p53 transactivation domain (p53TAD) is an IDP that interacts with multiple protein partners and contains numerous phosphorylation sites. Multiple techniques were used to investigate the equilibrium structural ensemble of p53TAD in its native and chemically unfolded states. The results from these experiments show that the native state of p53TAD has dimensions similar to a classical random coil while the chemically unfolded state is more extended. To investigate the molecular properties responsible for this behavior, a novel algorithm that generates diverse and unbiased structural ensembles of IDPs was developed. This algorithm was used to generate a large pool of plausible p53TAD structures that were reweighted to identify a subset of structures with the best fit to small-angle X-ray scattering data. High-weight structures in the native-state ensemble show features that are localized to protein binding sites and regions with high proline content. The features localized to the protein binding sites are mostly eliminated in the chemically unfolded ensemble, while the regions with high proline content remain relatively unaffected. Data from NMR experiments support these results, showing that residues from the protein binding sites experience larger environmental changes upon unfolding by urea than regions with high proline content. This behavior is consistent with the urea-induced exposure of nonpolar and aromatic side-chains in the protein binding sites that are partially excluded from solvent in the native state ensemble.
A New Path Generation Algorithm Based on Accurate NURBS Curves
Sawssen Jalel; Philippe Marthon; Atef Hamouda
2016-01-01
The process of finding an optimum, smooth and feasible global path for mobile robot navigation usually involves determining the shortest polyline path, which will be subsequently smoothed to satisfy the requirements. Within this context, this paper deals with a novel roadmap algorithm for generating an optimal path in terms of Non-Uniform Rational B-Splines (NURBS) curves. The generated path is well constrained within the curvature limit by exploiting the influence of the weight parameter of ...
Ducrot, Pauline; Méjean, Caroline; Aroumougame, Vani; Ibanez, Gladys; Allès, Benjamin; Kesse-Guyot, Emmanuelle; Hercberg, Serge; Péneau, Sandrine
2017-02-02
Meal planning could be a potential tool to offset time scarcity and therefore encourage home meal preparation, which has been linked with improved diet quality. However, to date, meal planning has received little attention in the scientific literature. The aim of our cross-sectional study was to investigate the association between meal planning and diet quality, including adherence to nutritional guidelines and food variety, as well as weight status. Meal planning, i.e. planning ahead the foods that will be eaten for the next few days, was assessed in 40,554 participants of the web-based observational NutriNet-Santé study. Dietary measurements included intakes of energy, nutrients, and food groups, and adherence to the French nutritional guidelines (mPNNS-GS) estimated through repeated 24-h dietary records. A food variety score was also calculated using a Food Frequency Questionnaire. Weight and height were self-reported. Associations between meal planning and dietary intakes were assessed using ANCOVAs, while associations with quartiles of mPNNS-GS scores, quartiles of food variety score, and weight status categories (overweight, obesity) were evaluated using logistic regression models. A total of 57% of the participants reported planning meals at least occasionally. Meal planners were more likely to have a higher mPNNS-GS (OR quartile 4 vs. 1 = 1.13, 95% CI: [1.07-1.20]) and higher overall food variety (OR quartile 4 vs. 1 = 1.25, 95% CI: [1.18-1.32]). In women, meal planning was associated with lower odds of being overweight (OR = 0.92 [0.87-0.98]) and obese (OR = 0.79 [0.73-0.86]). In men, the association was significant for obesity only (OR = 0.81 [0.69-0.94]). Meal planning was associated with a healthier diet and less obesity. Although no causality can be inferred from the reported associations, these data suggest that meal planning could potentially be relevant for obesity prevention.
Bayesian ensemble approach to error estimation of interatomic potentials
DEFF Research Database (Denmark)
Frederiksen, Søren Lund; Jacobsen, Karsten Wedel; Brown, K.S.
2004-01-01
Using a Bayesian approach a general method is developed to assess error bars on predictions made by models fitted to data. The error bars are estimated from fluctuations in ensembles of models sampling the model-parameter space with a probability density set by the minimum cost. The method is app...
Ensemble Kalman filtering with one-step-ahead smoothing
Raboudi, Naila F.
2018-01-11
The ensemble Kalman filter (EnKF) is widely used for sequential data assimilation. It operates as a succession of forecast and analysis steps. In realistic large-scale applications, EnKFs are implemented with small ensembles and poorly known model error statistics. This limits their representativeness of the background error covariances and, thus, their performance. This work explores the efficiency of the one-step-ahead (OSA) smoothing formulation of the Bayesian filtering problem to enhance the data assimilation performance of EnKFs. Filtering with OSA smoothing introduces an updated step with future observations, conditioning the ensemble sampling with more information. This should provide an improved background ensemble in the analysis step, which may help to mitigate the suboptimal character of EnKF-based methods. Here, the authors demonstrate the efficiency of a stochastic EnKF with OSA smoothing for state estimation. They then introduce a deterministic-like EnKF-OSA based on the singular evolutive interpolated ensemble Kalman (SEIK) filter. The authors show that the proposed SEIK-OSA outperforms both SEIK, as it efficiently exploits the data twice, and the stochastic EnKF-OSA, as it avoids observational error undersampling. They present extensive assimilation results from numerical experiments conducted with the Lorenz-96 model to demonstrate SEIK-OSA’s capabilities.
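The forecast/analysis cycle underlying the EnKF variants discussed in this record rests on the stochastic (perturbed-observation) analysis step. The following is a minimal sketch of that generic step, not the authors' SEIK-OSA implementation; all variable names and the toy numbers are illustrative assumptions:

```python
import numpy as np

def enkf_analysis(X, y, H, R, rng):
    """Stochastic EnKF analysis with perturbed observations.
    X: (n_state, n_ens) forecast ensemble; y: (n_obs,) observation;
    H: linear observation operator; R: observation-error covariance."""
    n_state, n_ens = X.shape
    A = X - X.mean(axis=1, keepdims=True)        # state anomalies
    HX = H @ X
    HA = HX - HX.mean(axis=1, keepdims=True)     # observation-space anomalies
    P_yy = HA @ HA.T / (n_ens - 1) + R           # innovation covariance
    P_xy = A @ HA.T / (n_ens - 1)                # state-observation cross covariance
    K = P_xy @ np.linalg.inv(P_yy)               # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(
        np.zeros(len(y)), R, size=n_ens).T       # perturbed observations
    return X + K @ (Y - HX)                      # updated (analysis) ensemble

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200)) + 5.0          # toy forecast ensemble, mean ~5
H = np.eye(2)
R = 0.25 * np.eye(2)
y = np.array([1.0, 1.0])                         # observation pulls the mean down
Xa = enkf_analysis(X, y, H, R, rng)
```

The analysis mean lands between the forecast mean and the observation, weighted by the relative covariances; OSA smoothing, as described above, inserts an additional smoothing update with the future observation before this step.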
Ensemble size impact on the decadal predictive skill assessment
Directory of Open Access Journals (Sweden)
Frank Sienz
2016-12-01
Full Text Available Retrospective prediction experiments have to be performed to estimate the skill of decadal prediction systems. These are necessarily restricted in number due to computational constraints. From weather and seasonal prediction it is known that the ensemble size is crucial for yielding reliable predictions. Differences are expected for decadal predictions due to the differing time scales of the involved processes and the longer prediction horizon. A conceptual model is applied that enables the systematic analysis of ensemble-size dependencies in a framework close to that of decadal predictions. Differences are quantified in terms of confidence interval coverage and the power of statistical tests for prediction scores. In addition, the concepts are applied to decadal predictions of the MiKlip Baseline1 system. It is shown that small ensemble sizes, as well as small hindcast sample sizes, bias test performance in a way that hampers the detection of existing prediction skill. Experiments with ensemble sizes smaller than 10 are not recommended for evaluating decadal prediction skill or as a basis for prediction system development. For regions with low signal-to-noise ratios much larger ensembles are required, and it is shown that in this case successful decadal predictions are possible for Central European summer temperatures.
Directory of Open Access Journals (Sweden)
Laura J Kingsley
Full Text Available Computational prediction of ligand entry and egress paths in proteins has become an emerging topic in computational biology and has proven useful in fields such as protein engineering and drug design. Geometric tunnel prediction programs, such as Caver3.0 and MolAxis, are computationally efficient methods to identify potential ligand entry and egress routes in proteins. Although many geometric tunnel programs are designed to accommodate a single input structure, the increasingly recognized importance of protein flexibility in tunnel formation and behavior has led to the more widespread use of protein ensembles in tunnel prediction. However, there has not yet been an attempt to directly investigate the influence of ensemble size and composition on geometric tunnel prediction. In this study, we compared tunnels found in a single crystal structure to ensembles of various sizes generated using different methods on both the apo and holo forms of cytochrome P450 enzymes CYP119, CYP2C9, and CYP3A4. Several protein structure clustering methods were tested in an attempt to generate smaller ensembles that were capable of reproducing the data from larger ensembles. Ultimately, we found that by including members from both the apo and holo data sets, we could produce ensembles containing less than 15 members that were comparable to apo or holo ensembles containing over 100 members. Furthermore, we found that, in the absence of either apo or holo crystal structure data, pseudo-apo or -holo ensembles (e.g. adding ligand to apo protein throughout MD simulations could be used to resemble the structural ensembles of the corresponding apo and holo ensembles, respectively. Our findings not only further highlight the importance of including protein flexibility in geometric tunnel prediction, but also suggest that smaller ensembles can be as capable as larger ensembles at capturing many of the protein motions important for tunnel prediction at a lower computational
Exploring diversity in ensemble classification: Applications in large area land cover mapping
Mellor, Andrew; Boukir, Samia
2017-07-01
Ensemble classifiers, such as random forests, are now commonly applied in the field of remote sensing, and have been shown to perform better than single classifier systems, resulting in reduced generalisation error. Diversity across the members of ensemble classifiers is known to have a strong influence on classification performance - whereby classifier errors are uncorrelated and more uniformly distributed across ensemble members. The relationship between ensemble diversity and classification performance has not yet been fully explored in the fields of information science and machine learning and has never been examined in the field of remote sensing. This study is a novel exploration of ensemble diversity and its link to classification performance, applied to a multi-class canopy cover classification problem using random forests and multisource remote sensing and ancillary GIS data, across seven million hectares of diverse dry-sclerophyll-dominated public forests in Victoria, Australia. A particular emphasis is placed on analysing the relationship between ensemble diversity and ensemble margin - two key concepts in ensemble learning. The main novelty of our work is on boosting diversity by emphasizing the contribution of lower-margin instances used in the learning process. Exploring the influence of tree pruning on diversity is also a new empirical analysis that contributes to a better understanding of ensemble performance. Results reveal insights into the trade-off between ensemble classification accuracy and diversity, and, through the ensemble margin, demonstrate how inducing diversity by targeting lower-margin training samples is a means of achieving better classifier performance for more difficult or rarer classes and reducing information redundancy in classification problems. Our findings inform strategies for collecting training data and designing and parameterising ensemble classifiers, such as random forests. This is particularly important in large area
A Flexible Approach for the Statistical Visualization of Ensemble Data
Energy Technology Data Exchange (ETDEWEB)
Potter, K. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute; Wilson, A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bremer, P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Williams, Dean N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, V. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute; Johnson, C. [Univ. of Utah, Salt Lake City, UT (United States). SCI Institute
2009-09-29
Scientists are increasingly moving towards ensemble data sets to explore relationships present in dynamic systems. Ensemble data sets combine spatio-temporal simulation results generated using multiple numerical models, sampled input conditions and perturbed parameters. While ensemble data sets are a powerful tool for mitigating uncertainty, they pose significant visualization and analysis challenges due to their complexity. We present a collection of overview and statistical displays linked through a high level of interactivity to provide a framework for gaining key scientific insight into the distribution of the simulation results as well as the uncertainty associated with the data. In contrast to methods that present large amounts of diverse information in a single display, we argue that combining multiple linked statistical displays yields a clearer presentation of the data and facilitates a greater level of visual data analysis. We demonstrate this approach using driving problems from climate modeling and meteorology and discuss generalizations to other fields.
Describing sequence-ensemble relationships for intrinsically disordered proteins.
Mao, Albert H; Lyle, Nicholas; Pappu, Rohit V
2013-01-15
Intrinsically disordered proteins participate in important protein-protein and protein-nucleic acid interactions and control cellular phenotypes through their prominence as dynamic organizers of transcriptional, post-transcriptional and signalling networks. These proteins challenge the tenets of the structure-function paradigm and their functional mechanisms remain a mystery given that they fail to fold autonomously into specific structures. Solving this mystery requires a first principles understanding of the quantitative relationships between information encoded in the sequences of disordered proteins and the ensemble of conformations they sample. Advances in quantifying sequence-ensemble relationships have been facilitated through a four-way synergy between bioinformatics, biophysical experiments, computer simulations and polymer physics theories. In the present review we evaluate these advances and the resultant insights that allow us to develop a concise quantitative framework for describing the sequence-ensemble relationships of intrinsically disordered proteins.
R.C. Laskar (R.C.); H.M. Mulder (Martyn)
2013-01-01
A path-neighborhood graph is a connected graph in which every neighborhood induces a path. In the main results the 3-sun-free path-neighborhood graphs are characterized. The 3-sun is obtained from a 6-cycle by adding three chords between the three pairs of vertices at distance 2. A Pk
Hoogerheide, L.F.; Opschoor, A.; van Dijk, H.K.
2012-01-01
A class of adaptive sampling methods is introduced for efficient posterior and predictive simulation. The proposed methods are robust in the sense that they can handle target distributions that exhibit non-elliptical shapes such as multimodality and skewness. The basic method makes use of sequences
Super-ensemble techniques: Application to surface drift prediction
Vandenbulcke, L.; Beckers, J.-M.; Lenartz, F.; Barth, A.; Poulain, P.-M.; Aidonidis, M.; Meyrat, J.; Ardhuin, F.; Tonani, M.; Fratianni, C.; Torrisi, L.; Pallela, D.; Chiggiato, J.; Tudor, M.; Book, J. W.; Martin, P.; Peggion, G.; Rixen, M.
2009-09-01
The prediction of the surface drift of floating objects is an important task, with applications such as marine transport, pollutant dispersion, and search-and-rescue activities. But forecasting even the drift of surface waters is very challenging, because it depends on complex interactions of currents driven by the wind, the wave field, and the general prevailing circulation. Furthermore, although each of those can be forecasted by deterministic models, the latter all suffer from limitations, resulting in imperfect predictions. In the present study, we try to predict the drift of two buoys launched during the DART06 (Dynamics of the Adriatic sea in Real-Time 2006) and MREA07 (Maritime Rapid Environmental Assessment 2007) sea trials, using the so-called hyper-ensemble technique: different models are combined in order to minimize departure from independent observations during a training period; the obtained combination is then used in forecasting mode. We review and try out different hyper-ensemble techniques, such as the simple ensemble mean, least-squares weighted linear combinations, and techniques based on data assimilation, which dynamically update the models' weights in the combination when new observations become available. We show that the latter methods alleviate the need to fix the training length a priori, as older information is automatically discarded. When the forecast period is relatively short (12 h), the discussed methods lead to much smaller forecasting errors compared with individual models (at least three times smaller), with the dynamic methods leading to the best results. When many models are available, errors can be further reduced by removing colinearities between them by performing a principal component analysis. At the same time, this reduces the number of weights to be determined. In complex environments where meso- and smaller-scale eddy activity is strong, such as the Ligurian Sea, the skill of individual models may vary over time periods
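The least-squares weighted linear combination mentioned among the hyper-ensemble variants can be sketched in a few lines. The synthetic "model forecasts", the training split, and all names below are illustrative assumptions, not the study's actual data or setup:

```python
import numpy as np

def train_weights(model_forecasts, observations):
    """Least-squares weights minimising ||F w - obs|| over the training period.
    model_forecasts: (n_times, n_models); observations: (n_times,)."""
    w, *_ = np.linalg.lstsq(model_forecasts, observations, rcond=None)
    return w

def combine(model_forecasts, w):
    """Apply the trained combination in forecasting mode."""
    return model_forecasts @ w

# Synthetic example: two imperfect "models" of a known truth signal.
rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 6, 60))
models = np.column_stack([truth + 0.1 * rng.standard_normal(60),
                          0.5 * truth + 0.1 * rng.standard_normal(60)])
w = train_weights(models[:40], truth[:40])    # training period
forecast = combine(models[40:], w)            # forecast period
err = np.abs(forecast - truth[40:]).mean()    # small: the blend tracks the truth
```

The dynamic (data-assimilation-based) variants described above replace this one-shot fit with a sequential update of `w` as new observations arrive.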
An alternative path integral for quantum gravity
Energy Technology Data Exchange (ETDEWEB)
Krishnan, Chethan; Kumar, K.V. Pavan; Raju, Avinash [Center for High Energy Physics, Indian Institute of Science,Bangalore 560012 (India)
2016-10-10
We define a (semi-classical) path integral for gravity with Neumann boundary conditions in D dimensions, and show how to relate this new partition function to the usual picture of Euclidean quantum gravity. We also write down the action in ADM Hamiltonian formulation and use it to reproduce the entropy of black holes and cosmological horizons. A comparison between the (background-subtracted) covariant and Hamiltonian ways of semi-classically evaluating this path integral in flat space reproduces the generalized Smarr formula and the first law. This “Neumann ensemble” perspective on gravitational thermodynamics is parallel to the canonical (Dirichlet) ensemble of Gibbons-Hawking and the microcanonical approach of Brown-York.
Probabilistic Flash Flood Forecasting using Stormscale Ensembles
Hardy, J.; Gourley, J. J.; Kain, J. S.; Clark, A.; Novak, D.; Hong, Y.
2013-12-01
Monte Carlo sampling. This yields an ensemble of flash flood simulations. These simulated flows are compared to historically based flow thresholds at each grid point to identify the basin scales most susceptible to flash flooding, thereby deriving PFFF products. This new approach is shown to: 1) identify the specific basin scales within the broader regions that are forecast to be impacted by flash flooding based on cell movement, rainfall intensity, duration, and the basin's susceptibility factors such as initial soil moisture conditions; 2) yield probabilistic information about the forecast hydrologic response; and 3) improve lead time by using stormscale NWP ensemble forecasts.
A brief history of the introduction of generalized ensembles to Markov chain Monte Carlo simulations
Berg, Bernd A.
2017-03-01
The most efficient weights for Markov chain Monte Carlo calculations of physical observables are not necessarily those of the canonical ensemble. Generalized ensembles, which do not exist in nature but can be simulated on computers, often lead to much faster convergence. In particular, they have been used for simulations of first-order phase transitions and for simulations of complex systems in which conflicting constraints lead to a rugged free energy landscape. Starting off with the Metropolis algorithm and Hastings' extension, I present a minireview which focuses on the explosive use of generalized ensembles in the early 1990s. Illustrations are given, which range from spin models to peptides.
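The canonical-ensemble Metropolis update that the review starts from can be sketched as follows; the one-dimensional harmonic "energy" and all parameter values are toy choices for illustration:

```python
import math
import random

def metropolis_step(x, energy, beta, step, rng):
    """One Metropolis update: propose a move, accept with min(1, e^{-beta*dE})."""
    x_new = x + rng.uniform(-step, step)
    dE = energy(x_new) - energy(x)
    if dE <= 0 or rng.random() < math.exp(-beta * dE):
        return x_new  # accept the proposal
    return x          # reject: keep the current state

# Toy harmonic energy: a long run should sample p(x) ~ exp(-beta * x^2 / 2),
# i.e. mean ~0 and variance ~1/beta.
energy = lambda x: 0.5 * x * x
rng = random.Random(0)
x, samples = 0.0, []
for _ in range(50000):
    x = metropolis_step(x, energy, beta=1.0, step=1.0, rng=rng)
    samples.append(x)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

A generalized-ensemble simulation keeps this loop but replaces the canonical weight exp(-beta*dE) in the acceptance test with a non-canonical weight (e.g. a multicanonical one), which is what flattens the rugged free energy landscape described above.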
Directory of Open Access Journals (Sweden)
Janevski Milica Ranković
2016-04-01
Full Text Available Background: Salivary cortisol measurement is a non-invasive method suitable for use in neonatal research. Mother-infant separation after birth represents stress and skin-to-skin contact (SSC has numerous benefits. The aim of the study was to measure salivary cortisol in mothers and newborns before and after SSC in order to assess the effect of SSC on mothers’ and infants’ stress and to estimate the efficacy of collecting small saliva samples in newborns.
Directory of Open Access Journals (Sweden)
Chad Davis
2014-01-01
Full Text Available Ethnic minorities continue to be disproportionately affected by obesity and are less likely to access healthcare than Caucasians. It is therefore imperative that researchers develop novel methods that will attract these difficult-to-reach groups. The purpose of the present study is to describe characteristics of an urban community sample attracted to a spiritually based, weight loss intervention. Methods. Thirteen participants enrolled in a pilot version of Spiritual Self-Schema Therapy (3S) applied to disordered eating behavior and obesity. Treatment consisted of 12 one-hour sessions in a group therapy format. At baseline, participants were measured for height and weight and completed a battery of self-report measures. Results. The sample was predominantly African-American and Hispanic and a large percentage of the sample was male. Mean baseline scores of the EDE-Q, YFAS, and the CES-D revealed clinically meaningful levels of eating disordered pathology and depression, respectively. The overall attrition rate was quite low for interventions targeting obesity. Discussion. This application of a spiritually centered intervention seemed to attract and retain a predominantly African-American and Hispanic sample. By incorporating a culturally congruent focus, this approach may have been acceptable to individuals who are traditionally more difficult to reach.
Method to detect gravitational waves from an ensemble of known pulsars
Fan, Xilong; Messenger, Christopher
2016-01-01
Combining information from weak sources, such as known pulsars, for gravitational wave detection, is an attractive approach to improve detection efficiency. We propose an optimal statistic for a general ensemble of signals and apply it to an ensemble of known pulsars. Our method combines $\\mathcal F$-statistic values from individual pulsars using weights proportional to each pulsar's expected optimal signal-to-noise ratio to improve the detection efficiency. We also point out that to detect at least one pulsar within an ensemble, different thresholds should be designed for each source based on the expected signal strength. The performance of our proposed detection statistic is demonstrated using simulated sources, with the assumption that all pulsars' ellipticities belong to a common (yet unknown) distribution. Comparing with an equal-weight strategy and with individual source approaches, we show that the weighted-combination of all known pulsars, where weights are assigned based on the pulsars' known informa...
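The weighting rule described above (weights proportional to each pulsar's expected optimal signal-to-noise ratio) can be sketched as follows. The numerical values are made up for illustration, and the simple linear combination stands in for the full $\mathcal F$-statistic machinery:

```python
def weighted_ensemble_statistic(f_stats, expected_snr):
    """Combine per-pulsar detection-statistic values, weighting each source in
    proportion to its expected optimal signal-to-noise ratio."""
    total = sum(expected_snr)
    weights = [s / total for s in expected_snr]       # normalised weights
    return sum(w * f for w, f in zip(weights, f_stats))

f_stats = [4.2, 12.5, 3.1]   # illustrative per-pulsar statistic values
snr = [1.0, 5.0, 0.5]        # the second source is expected to be strongest
stat = weighted_ensemble_statistic(f_stats, snr)      # -> 10.5
```

Compared with the equal-weight average of 6.6, the weighted statistic of 10.5 is dominated by the strongest expected source, which is the point of the SNR-proportional weighting.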
Trends in the predictive performance of raw ensemble weather forecasts
Hemri, Stephan; Scheuerer, Michael; Pappenberger, Florian; Bogner, Konrad; Haiden, Thomas
2015-04-01
Over the last two decades the paradigm in weather forecasting has shifted from being deterministic to probabilistic. Accordingly, numerical weather prediction (NWP) models have been run increasingly as ensemble forecasting systems. The goal of such ensemble forecasts is to approximate the forecast probability distribution by a finite sample of scenarios. Global ensemble forecast systems, like the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble, are prone to probabilistic biases, and are therefore not reliable. They particularly tend to be underdispersive for surface weather parameters. Hence, statistical post-processing is required in order to obtain reliable and sharp forecasts. In this study we apply statistical post-processing to ensemble forecasts of near-surface temperature, 24-hour precipitation totals, and near-surface wind speed from the global ECMWF model. Our main objective is to evaluate the evolution of the difference in skill between the raw ensemble and the post-processed forecasts. The ECMWF ensemble is under continuous development, and hence its forecast skill improves over time. Parts of these improvements may be due to a reduction of probabilistic bias. Thus, we first hypothesize that the gain by post-processing decreases over time. Based on ECMWF forecasts from January 2002 to March 2014 and corresponding observations from globally distributed stations we generate post-processed forecasts by ensemble model output statistics (EMOS) for each station and variable. Parameter estimates are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over rolling training periods that consist of the n days preceding the initialization dates. Given the higher average skill in terms of CRPS of the post-processed forecasts for all three variables, we analyze the evolution of the difference in skill between raw ensemble and EMOS forecasts. The fact that the gap in skill remains almost constant over time, especially for near
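The CRPS minimised in the EMOS training step has a closed form when the predictive distribution is Gaussian, which keeps the rolling-window parameter estimation cheap. A sketch of that standard closed form (not code from the study):

```python
import math

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) against obs y:
    sigma * (z*(2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi)),  z = (y - mu)/sigma."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # phi(z)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # Phi(z)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Even a perfectly centred forecast incurs a sharpness-dependent score,
# and a biased forecast scores worse.
score_centred = crps_normal(0.0, 1.0, 0.0)
score_biased = crps_normal(0.0, 1.0, 2.0)
```

In EMOS training, `mu` and `sigma` are affine functions of the raw ensemble statistics, and their coefficients are fitted by minimising the mean of this score over the training window.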
Ensemble Kalman filtering with residual nudging
Luo, X.
2012-10-03
Covariance inflation and localisation are two important techniques that are used to improve the performance of the ensemble Kalman filter (EnKF) by (in effect) adjusting the sample covariances of the estimates in the state space. In this work, an additional auxiliary technique, called residual nudging, is proposed to monitor and, if necessary, adjust the residual norms of state estimates in the observation space. In an EnKF with residual nudging, if the residual norm of an analysis is larger than a pre-specified value, then the analysis is replaced by a new one whose residual norm is no larger than a pre-specified value. Otherwise, the analysis is considered as a reasonable estimate and no change is made. A rule for choosing the pre-specified value is suggested. Based on this rule, the corresponding new state estimates are explicitly derived in case of linear observations. Numerical experiments in the 40-dimensional Lorenz 96 model show that introducing residual nudging to an EnKF may improve its accuracy and/or enhance its stability against filter divergence, especially in the small ensemble scenario.
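The residual-nudging rule can be sketched for the linear-observation case. This illustration shrinks an oversized residual to the threshold with a simple pseudo-inverse correction, which is one plausible construction rather than the paper's derived estimator; all names and numbers are toy choices:

```python
import numpy as np

def nudge_analysis(x_a, y, H, beta_norm):
    """If the observation-space residual ||y - H x_a|| exceeds beta_norm,
    pull the analysis toward the observation until the norm equals beta_norm;
    otherwise keep the analysis unchanged."""
    r = y - H @ x_a
    norm = np.linalg.norm(r)
    if norm <= beta_norm:
        return x_a                          # residual acceptable: no change
    alpha = 1.0 - beta_norm / norm          # fraction of the residual to remove
    return x_a + np.linalg.pinv(H) @ (alpha * r)

H = np.eye(3)                               # identity observation operator (toy)
x_a = np.zeros(3)                           # analysis far from the data
y = np.array([3.0, 4.0, 0.0])               # residual norm is 5
x_new = nudge_analysis(x_a, y, H, beta_norm=1.0)   # new residual norm is 1
```

An analysis whose residual already satisfies the threshold passes through unchanged, matching the "considered a reasonable estimate" branch above.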
Deterministic Mean-Field Ensemble Kalman Filtering
Law, Kody
2016-05-03
The proof of convergence of the standard ensemble Kalman filter (EnKF) from Le Gland, Monbet, and Tran [Large sample asymptotics for the ensemble Kalman filter, in The Oxford Handbook of Nonlinear Filtering, Oxford University Press, Oxford, UK, 2011, pp. 598--631] is extended to non-Gaussian state-space models. A density-based deterministic approximation of the mean-field limit EnKF (DMFEnKF) is proposed, consisting of a PDE solver and a quadrature rule. Given a certain minimal order of convergence k between the two, this extends to the deterministic filter approximation, which is therefore asymptotically superior to standard EnKF for dimension d<2k. The fidelity of approximation of the true distribution is also established using an extension of the total variation metric to random measures. This is limited by a Gaussian bias term arising from nonlinearity/non-Gaussianity of the model, which arises in both deterministic and standard EnKF. Numerical results support and extend the theory.
Ensemble Kalman filtering with residual nudging
Directory of Open Access Journals (Sweden)
Xiaodong Luo
2012-10-01
Full Text Available Covariance inflation and localisation are two important techniques that are used to improve the performance of the ensemble Kalman filter (EnKF) by (in effect) adjusting the sample covariances of the estimates in the state space. In this work, an additional auxiliary technique, called residual nudging, is proposed to monitor and, if necessary, adjust the residual norms of state estimates in the observation space. In an EnKF with residual nudging, if the residual norm of an analysis is larger than a pre-specified value, then the analysis is replaced by a new one whose residual norm is no larger than a pre-specified value. Otherwise, the analysis is considered as a reasonable estimate and no change is made. A rule for choosing the pre-specified value is suggested. Based on this rule, the corresponding new state estimates are explicitly derived in case of linear observations. Numerical experiments in the 40-dimensional Lorenz 96 model show that introducing residual nudging to an EnKF may improve its accuracy and/or enhance its stability against filter divergence, especially in the small ensemble scenario.
Hosseini, Mostafa; Navidi, Iman; Hesamifard, Bahare; Yousefifard, Mahmoud; Jafari, Nasim; Poorchaloo, Sakine Ranji; Ataei, Neamatollah
2013-12-01
Assessing growth is a useful tool for defining the health and nutritional status of children. The objective of this study was to construct growth reference curves of Iranian infants and children (0-6 years old) and compare them with previous and international references. Weight and height or length of 2107 Iranian infants and children aged 0-6 years were measured in a cross-sectional survey in Tehran in 2010. Standard smooth reference curves for the Iranian population were constructed and compared to the multinational World Health Organization 2006 reference standards as well as to a previous study from two decades ago. Growth index references for Iranian girls have increased compared to the data from two decades ago and are approximately close to the international references. In boys, however, the increment was considerably larger, surpassing the international references. Not only have the values of the indexes changed over two decades, but the age at adiposity rebound has also approached the age of 3, which is an important risk factor for later obesity. Over two decades, the growth indexes of Iranian children rose noticeably. Risk factors for later obesity are now apparent and demand immediate policy formulation. In addition, the reference curves presented in this paper can be used as a diagnostic tool for monitoring the growth of Iranian children.
Directory of Open Access Journals (Sweden)
Mostafa Hosseini
2013-01-01
Full Text Available Background: Assessing growth is a useful tool for defining the health and nutritional status of children. The objective of this study was to construct growth reference curves of Iranian infants and children (0-6 years old) and compare them with previous and international references. Methods: Weight and height or length of 2107 Iranian infants and children aged 0-6 years were measured in a cross-sectional survey in Tehran in 2010. Standard smooth reference curves for the Iranian population were constructed and compared to the multinational World Health Organization 2006 reference standards as well as to a previous study from two decades ago. Results: Growth index references for Iranian girls have increased compared to the data from two decades ago and are approximately close to the international references. In boys, however, the increment was considerably larger, surpassing the international references. Not only have the values of the indexes changed over two decades, but the age at adiposity rebound has also approached the age of 3, which is an important risk factor for later obesity. Conclusions: Over two decades, the growth indexes of Iranian children rose noticeably. Risk factors for later obesity are now apparent and demand immediate policy formulation. In addition, the reference curves presented in this paper can be used as a diagnostic tool for monitoring the growth of Iranian children.
Ensemble hydrometeorological forecasting in Denmark
DEFF Research Database (Denmark)
Lucatero Villasenor, Diana
Meteorological extremes such as floods and droughts cause economic losses and loss of life that could be, if not prevented, at least dampened if sufficient time is given to respond to potential threats. This is the ultimate purpose of forecasting, which then translates into making reliable predictions … of the main sources of uncertainty in hydrological forecasts. This is the reason why substantiated efforts to include information from Numerical Weather Prediction (NWP) or General Circulation Models (GCM) have been made over the last couple of decades. The present thesis expects to advance the field of ensemble hydrometeorological forecasting by evaluating the added value of NWP and GCM ensemble prediction systems (EPS) for hydrological purposes. The use of NWP EPS that differ in both spatial and temporal resolution to feed a hydrological model for discharge forecasts at specific points revealed two …
Symanzik flow on HISQ ensembles
Bazavov, A; Brown, N; DeTar, C; Foley, J; Gottlieb, Steven; Heller, U M; Hetrick, J E; Laiho, J; Levkova, L; Oktay, M; Sugar, R L; Toussaint, D; Van de Water, R S; Zhou, R
2013-01-01
We report on a scale determination with gradient-flow techniques on the $N_f = 2 + 1 + 1$ HISQ ensembles generated by the MILC collaboration. The lattice scale $w_0/a$, originally proposed by the BMW collaboration, is computed using Symanzik flow at four lattice spacings ranging from 0.15 to 0.06 fm. With a Taylor series ansatz, the results are simultaneously extrapolated to the continuum and interpolated to physical quark masses. We give a preliminary determination of the scale $w_0$ in physical units, along with associated systematic errors, and compare with results from other groups. We also present a first estimate of autocorrelation lengths as a function of flowtime for these ensembles.
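The scale-setting condition behind w0 can be illustrated numerically: w0 is defined through W(t) = t d/dt{t^2 <E(t)>} = 0.3 evaluated at t = w0^2. The flow history below is synthetic (made-up coefficients, not MILC data); only the root-finding logic is the point.

```python
import numpy as np

# Synthetic flow-time history of t^2 <E(t)>; coefficients are illustrative.
t = np.linspace(0.05, 2.0, 400)
t2E = 0.1 * t**3 + 0.02 * t**4        # stand-in for t^2 <E(t)>

# W(t) = t * d/dt [ t^2 <E(t)> ], estimated by finite differences
W = t * np.gradient(t2E, t)

# w0^2 is the flow time where W(t) first reaches 0.3; interpolate
# linearly between the two bracketing grid points.
i = np.argmax(W >= 0.3)
w0_sq = t[i - 1] + (0.3 - W[i - 1]) * (t[i] - t[i - 1]) / (W[i] - W[i - 1])
w0 = np.sqrt(w0_sq)
```

On real ensembles, W(t) comes from the measured gauge energy density along the Symanzik flow, and w0/a is then converted to physical units.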
Nonequilibrium statistical mechanics ensemble method
Eu, Byung Chan
1998-01-01
In this monograph, nonequilibrium statistical mechanics is developed by means of ensemble methods on the basis of the Boltzmann equation, the generic Boltzmann equations for classical and quantum dilute gases, and a generalised Boltzmann equation for dense simple fluids. The theories are developed in forms parallel with the equilibrium Gibbs ensemble theory in a way fully consistent with the laws of thermodynamics. The generalised hydrodynamics equations are an integral part of the theory and describe the evolution of macroscopic processes in accordance with the laws of thermodynamics of systems far removed from equilibrium. Audience: This book will be of interest to researchers in the fields of statistical mechanics, condensed matter physics, gas dynamics, fluid dynamics, rheology, irreversible thermodynamics and nonequilibrium phenomena.
van Zutven, K; Mond, J; Latner, J; Rodgers, B
2015-02-01
We examined the relative importance of physical health status, weight/shape concerns and binge eating as mediators of the association between obesity and psychosocial impairment in a community sample of women and men. Self-report measures of eating disorder features, perceived physical health and psychosocial functioning were completed by a general population sample of women and men classified as obese or non-obese (women: obese=276, non-obese=1220; men: obese=169, non-obese=769). Moderated mediation analysis was used to assess the relative importance of each of the putative mediators in accounting for observed associations between obesity and each outcome measure and possible moderation of these effects by sex. Weight/shape concerns and physical health were equally strong mediators of the association between obesity and psychosocial impairment. This was the case for both men and women and for each of three measures of psychosocial functioning-general psychological distress, life satisfaction and social support-employed. The effects of binge eating were modest and reached statistical significance only for the life satisfaction measure in men. A greater focus on body acceptance may be indicated in obesity prevention and weight-management programs.
Link fermions and dynamically correlated paths for lattice gauge theory
Energy Technology Data Exchange (ETDEWEB)
Brower, R.C. (Harvard Univ., Cambridge, MA (USA). Lyman Lab. of Physics); Giles, R.C. (Massachusetts Inst. of Tech., Cambridge (USA). Lab. for Nuclear Science); Kessler, D.A. (Los Alamos National Lab., NM (USA). Theoretical Div.); Maturana, G. (California Univ., Santa Cruz (USA). Physics Dept.)
1983-07-07
The calculation of fermion bound states in lattice QCD is discussed from the point of view of the Feynman path integral and the corresponding lattice 'path sum' representation of the fermion propagator. Path sum methods which correlate the trajectories of valence fermion and antifermion constituents of a meson bound state are presented. The resultant Monte Carlo algorithm for the meson propagator samples predominantly those configurations which are expected to be most important for a tightly bound system. Relative to other techniques, this procedure anticipates cancellations due to gauge field averaging, and in addition, allows a more detailed examination of the bound state wavefunction. Inspired by the fermionic path representation of the 2D Ising model, we also introduce a new class of lattice fermion actions with nearest neighbor interactions between Grassmann variables associated with links. These link fermions are a simple generalization of Wilson's fermions. They have an additional corner weight parameter which can be adjusted to obtain a much improved dispersion relation for moderate and large lattice momenta.
Multimodel ensembles of wheat growth
DEFF Research Database (Denmark)
Martre, Pierre; Wallach, Daniel; Asseng, Senthold
2015-01-01
, but such studies are difficult to organize and have only recently begun. We report on the largest ensemble study to date, of 27 wheat models tested in four contrasting locations for their accuracy in simulating multiple crop growth and yield variables. The relative error averaged over models was 24...... are applicable to other crop species, and hypothesize that they apply more generally to ecological system models....
Dimensionality Reduction Through Classifier Ensembles
Oza, Nikunj C.; Tumer, Kagan; Norwig, Peter (Technical Monitor)
1999-01-01
In data mining, one often needs to analyze datasets with a very large number of attributes. Performing machine learning directly on such data sets is often impractical because of extensive run times, excessive complexity of the fitted model (often leading to overfitting), and the well-known "curse of dimensionality." In practice, to avoid such problems, feature selection and/or extraction are often used to reduce data dimensionality prior to the learning step. However, existing feature selection/extraction algorithms either evaluate features by their effectiveness across the entire data set or simply disregard class information altogether (e.g., principal component analysis). Furthermore, feature extraction algorithms such as principal components analysis create new features that are often meaningless to human users. In this article, we present input decimation, a method that provides "feature subsets" that are selected for their ability to discriminate among the classes. These features are subsequently used in ensembles of classifiers, yielding results superior to single classifiers, ensembles that use the full set of features, and ensembles based on principal component analysis on both real and synthetic datasets.
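A loose sketch of the input-decimation idea described above (not the authors' implementation): rank features by their correlation with the class label, keep the top k, and train a small ensemble on the decimated data. The nearest-centroid base learner and all names here are illustrative choices.

```python
import numpy as np

def input_decimation_ensemble(X, y, k, n_members=5, seed=0):
    """Rank features by |correlation| with the binary class label, keep the
    top k ('input decimation'), then train an ensemble of nearest-centroid
    classifiers on bootstrap samples and combine them by majority vote."""
    rng = np.random.default_rng(seed)
    scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.argsort(scores)[::-1][:k]       # most class-discriminative features
    Xd = X[:, keep]
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(y), len(y))  # bootstrap resample
        c0 = Xd[idx][y[idx] == 0].mean(axis=0)
        c1 = Xd[idx][y[idx] == 1].mean(axis=0)
        members.append((c0, c1))

    def predict(Xq):
        Xq = Xq[:, keep]
        votes = np.zeros(len(Xq))
        for c0, c1 in members:
            d0 = np.linalg.norm(Xq - c0, axis=1)
            d1 = np.linalg.norm(Xq - c1, axis=1)
            votes += (d1 < d0).astype(float)    # vote for class 1
        return (votes > n_members / 2).astype(int)

    return keep, predict
```

Unlike principal components, the retained inputs are original features, so they stay meaningful to human users.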
Kim, Kue Bum; Kwon, Hyun-Han; Han, Dawei
2016-05-01
This study presents a novel bias correction scheme for regional climate model (RCM) precipitation ensembles. A primary advantage of using model ensembles for climate change impact studies is that the uncertainties associated with the systematic error can be quantified through the ensemble spread. Currently, however, most of the conventional bias correction methods adjust all the ensemble members to one reference observation. As a result, the ensemble spread is degraded during bias correction. Since the observation is only one case of many possible realizations due to the climate natural variability, a successful bias correction scheme should preserve the ensemble spread within the bounds of its natural variability (i.e. sampling uncertainty). To demonstrate a new bias correction scheme conforming to RCM precipitation ensembles, an application to the Thorverton catchment in the south-west of England is presented. For the ensemble, 11 members from the Hadley Centre Regional Climate Model (HadRM3-PPE) data are used and monthly bias correction has been done for the baseline time period from 1961 to 1990. In the typical conventional method, monthly mean precipitation of each of the ensemble members is nearly identical to the observation, i.e. the ensemble spread is removed. In contrast, the proposed method corrects the bias while maintaining the ensemble spread within the natural variability of the observations.
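The contrast between the conventional scheme and a spread-preserving one can be sketched for monthly means as below. This is an illustrative simplification of the paper's approach; both function names are assumptions.

```python
import numpy as np

def conventional_bc(ens, obs):
    """Adjust each member's own time mean to the observed mean:
    the ensemble spread in the monthly mean is removed."""
    return ens - ens.mean(axis=1, keepdims=True) + obs.mean()

def spread_preserving_bc(ens, obs):
    """Shift all members by the bias of the ensemble mean, so
    inter-member differences (the spread) are retained."""
    bias = ens.mean() - obs.mean()
    return ens - bias
```

Here `ens` has shape (members, time steps); after the conventional correction every member has an identical mean, while the spread-preserving correction leaves the member-to-member spread within its original range.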
Multsch, S.; Exbrayat, J.-F.; Kirby, M.; Viney, N. R.; Frede, H.-G.; Breuer, L.
2015-04-01
Irrigation agriculture plays an increasingly important role in food supply. Many evapotranspiration models are used today to estimate the water demand for irrigation. They account for different stages of crop growth by empirical crop coefficients that adapt evapotranspiration throughout the vegetation period. We investigate the importance of model structural versus model parametric uncertainty for irrigation simulations by considering six evapotranspiration models and five crop coefficient sets to estimate irrigation water requirements for growing wheat in the Murray-Darling Basin, Australia. The study is carried out using the spatial decision support system SPARE:WATER. We find that structural model uncertainty among reference ET estimates is far more important than the model parametric uncertainty introduced by crop coefficients. These crop coefficients are used to estimate irrigation water requirements following the single crop coefficient approach. Using the reliability ensemble averaging (REA) technique, we are able to reduce the overall predictive model uncertainty by more than 10%. The exceedance probability curve of irrigation water requirements shows that a certain threshold, e.g. an irrigation water limit of 400 mm due to water rights, would be exceeded less frequently for the REA ensemble average (45%) than for the equally weighted ensemble average (66%). We conclude that multi-model ensemble predictions and sophisticated model averaging techniques are helpful in predicting irrigation demand and provide relevant information for decision making.
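The reliability ensemble averaging step can be sketched roughly as follows, after Giorgi and Mearns' REA idea of weighting members by performance and convergence. The exact criteria used in the study above may differ; everything here, including the epsilon scale, is illustrative.

```python
import numpy as np

def rea_average(preds, obs_ref, eps, n_iter=50):
    """Reliability ensemble averaging, loosely after Giorgi & Mearns.
    preds: per-model predictions; obs_ref: observed reference value;
    eps: natural-variability scale. Each model gets a reliability factor
    combining performance (distance to observations) and convergence
    (distance to the REA mean), iterated to a fixed point."""
    mean = preds.mean()
    R = np.ones_like(preds)
    for _ in range(n_iter):
        B = np.abs(preds - obs_ref)          # performance criterion
        D = np.abs(preds - mean)             # convergence criterion
        R = (np.minimum(1.0, eps / np.maximum(B, 1e-12))
             * np.minimum(1.0, eps / np.maximum(D, 1e-12)))
        mean = (R * preds).sum() / R.sum()   # reliability-weighted mean
    return mean, R
```

An outlier model receives a small reliability factor and is strongly downweighted relative to the equally weighted mean.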
Covington, M. A.
2005-01-01
New tests and analyses are reported that were carried out to resolve testing uncertainties in the original development and qualification of a lightweight ablative material used for the Stardust spacecraft forebody heat shield. These additional arcjet tests and analyses confirmed the ablative and thermal performance of low density Phenolic Impregnated Carbon Ablator (PICA) material used for the Stardust design. Testing was done under conditions that simulate the peak convective heating conditions (1200 W/cm2 and 0.5 atm) expected during Earth entry of the Stardust Sample Return Capsule. Test data and predictions from an ablative material response computer code for the in-depth temperatures were compared to guide iterative adjustment of material thermophysical properties used in the code so that the measured and predicted temperatures agreed. The PICA recession rates and maximum internal temperatures were satisfactorily predicted by the computer code with the revised properties. Predicted recession rates were also in acceptable agreement with measured rates for heating conditions 37% greater than the nominal peak heating rate of 1200 W/sq cm. The measured in-depth temperature response data show consistent temperature rise deviations that may be caused by an undocumented endothermic process within the PICA material that is not accurately modeled by the computer code. Predictions of the Stardust heat shield performance based on the present evaluation provide evidence that the maximum adhesive bondline temperature will be much lower than the maximum allowable of 250 C and an earlier design prediction. The re-evaluation also suggests that even with a 25 percent increase in peak heating rates, the total recession of the heat shield would be a small fraction of the as-designed thickness. These results give confidence in the Stardust heat shield design and confirm the potential of PICA material for use in new planetary probe and sample return applications.
Directory of Open Access Journals (Sweden)
Paul D. Loprinzi
2017-06-01
Full Text Available Background: Very few studies have evaluated the independent and combined associations of sedentary behavior (SB), moderate-to-vigorous physical activity (MVPA) and cardiorespiratory fitness (CRF) with obesity. Our recent work has evaluated this paradigm in the adult population, but no study has evaluated this paradigm in the child population, which was the purpose of this study. Methods: A national sample of children (N=680, 6-11 years) was evaluated via the National Youth Fitness Survey; this study was conducted in 2012, employing a nationally representative sample across 15 different geographic regions in the United States. SB and MVPA were assessed via parental recall, with CRF objectively measured via a treadmill-based aerobic test. Obesity was determined from measured body mass index. A PACS (Physical Activity Cardiorespiratory Sedentary) score was created ranging from 0-3, indicating each child's number of positive characteristics (PA, CRF, SB). Results: Meeting MVPA guidelines (OR adjusted=0.47; 95% CI: 0.29-0.77) and above-median CRF (OR adjusted=0.12; 95% CI: 0.07-0.21), but not SB (OR adjusted=0.62; 95% CI: 0.35-1.10), were associated with reduced odds of obesity. Compared to those with a PACS score of 0, the odds of obesity for PACS scores of 1-3, respectively, were: 0.31 (0.18-0.53), 0.12 (0.04-0.34), and 0.05 (0.02-0.10). Conclusion: These findings highlight the need for public health strategies to promote child MVPA and CRF, and to reduce SB.
Directory of Open Access Journals (Sweden)
A Townsend Peterson
Full Text Available Lassa fever is a disease that has been reported from sites across West Africa; it is caused by an arenavirus that is hosted by the rodent M. natalensis. Although it is confined to West Africa, and has been documented in detail in some well-studied areas, the details of the distribution of risk of Lassa virus infection remain poorly known at the level of the broader region. In this paper, we explored the effects of certainty of diagnosis, oversampling in well-studied regions, and error balance on the results of mapping exercises. Each of the three factors assessed in this study had clear and consistent influences on model results, overestimating risk in southern, humid zones in West Africa, and underestimating risk in drier and more northern areas. The final, adjusted risk map indicates broad risk areas across much of West Africa. Although risk maps are increasingly easy to develop from disease occurrence data and raster data sets summarizing aspects of environments and landscapes, this process is highly sensitive to issues of data quality, sampling design, and design of analysis, with macrogeographic implications of each of these issues and the potential for misrepresenting real patterns of risk.
Directory of Open Access Journals (Sweden)
Wei Liu
Full Text Available One important method to obtain continuous surfaces of soil properties from point samples is spatial interpolation. In this paper, we propose a method that combines ensemble learning with ancillary environmental information for improved interpolation of soil properties (hereafter, EL-SP). First, we calculated the trend value for soil potassium contents at the Qinghai Lake region in China based on measured values. Then, based on soil types, geology types, land use types, and slope data, the remaining residual was simulated with the ensemble learning model. Next, the EL-SP method was applied to interpolate soil potassium contents at the study site. To evaluate the utility of the EL-SP method, we compared its performance with other interpolation methods including universal kriging, inverse distance weighting, ordinary kriging, and ordinary kriging combined with geographic information. Results show that EL-SP had a lower mean absolute error and root mean square error than the other models tested in this paper. Notably, the EL-SP maps can describe more locally detailed information and more accurate spatial patterns for soil potassium content than the other methods because of the combined use of different types of environmental information; these maps are capable of showing abrupt boundary information for soil potassium content. Furthermore, the EL-SP method not only reduces prediction errors, but it also complements other environmental information, which makes the spatial interpolation of soil potassium content more reasonable and useful.
Liu, Wei; Du, Peijun; Wang, Dongchen
2015-01-01
One important method to obtain continuous surfaces of soil properties from point samples is spatial interpolation. In this paper, we propose a method that combines ensemble learning with ancillary environmental information for improved interpolation of soil properties (hereafter, EL-SP). First, we calculated the trend value for soil potassium contents at the Qinghai Lake region in China based on measured values. Then, based on soil types, geology types, land use types, and slope data, the remaining residual was simulated with the ensemble learning model. Next, the EL-SP method was applied to interpolate soil potassium contents at the study site. To evaluate the utility of the EL-SP method, we compared its performance with other interpolation methods including universal kriging, inverse distance weighting, ordinary kriging, and ordinary kriging combined with geographic information. Results show that EL-SP had a lower mean absolute error and root mean square error than the other models tested in this paper. Notably, the EL-SP maps can describe more locally detailed information and more accurate spatial patterns for soil potassium content than the other methods because of the combined use of different types of environmental information; these maps are capable of showing abrupt boundary information for soil potassium content. Furthermore, the EL-SP method not only reduces prediction errors, but it also complements other environmental information, which makes the spatial interpolation of soil potassium content more reasonable and useful.
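The trend-plus-residual structure of EL-SP can be sketched as below. A bootstrap-aggregated per-category mean stands in for the paper's ensemble learner, and all names are assumptions.

```python
import numpy as np

def elsp_interpolate(coords, values, categories, query_coords, query_cats,
                     n_boot=20, seed=0):
    """Sketch of a trend-plus-residual interpolator in the spirit of EL-SP:
    1) fit a linear spatial trend by least squares;
    2) model the residual from an ancillary categorical variable (e.g. soil
       type) with bootstrap-aggregated per-category means, a simple stand-in
       for the paper's ensemble learner."""
    A = np.c_[np.ones(len(coords)), coords]
    beta, *_ = np.linalg.lstsq(A, values, rcond=None)
    resid = values - A @ beta
    rng = np.random.default_rng(seed)
    cat_ids = np.unique(categories)
    boot = np.zeros((n_boot, len(cat_ids)))
    for b in range(n_boot):
        idx = rng.integers(0, len(values), len(values))   # bootstrap sample
        for ci, c in enumerate(cat_ids):
            sel = idx[categories[idx] == c]
            boot[b, ci] = resid[sel].mean() if len(sel) else 0.0
    cat_resid = dict(zip(cat_ids, boot.mean(axis=0)))
    Aq = np.c_[np.ones(len(query_coords)), query_coords]
    return Aq @ beta + np.array([cat_resid[c] for c in query_cats])
```

Because the residual model uses categorical ancillary layers, the prediction surface can show abrupt boundaries that distance-based interpolators smooth away.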
Transition path time distributions
Laleman, M.; Carlon, E.; Orland, H.
2017-12-01
Biomolecular folding, at least in simple systems, can be described as a two state transition in a free energy landscape with two deep wells separated by a high barrier. Transition paths are the short part of the trajectories that cross the barrier. Average transition path times and, recently, their full probability distribution have been measured for several biomolecular systems, e.g., in the folding of nucleic acids or proteins. Motivated by these experiments, we have calculated the full transition path time distribution for a single stochastic particle crossing a parabolic barrier, including inertial terms which were neglected in previous studies. These terms influence the short time scale dynamics of a stochastic system and can be of experimental relevance in view of the short duration of transition paths. We derive the full transition path time distribution as well as the average transition path times and discuss the similarities and differences with the high friction limit.
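A minimal numerical counterpart: underdamped Langevin dynamics over an inverted parabolic barrier, collecting the durations of successful crossings. Parameters are illustrative, and this sketch estimates the distribution by brute force rather than using the authors' analytical result.

```python
import numpy as np

def sample_tpt(n_paths=100, k=2.0, gamma=1.0, m=1.0, kT=1.0,
               x0=1.0, dt=1e-3, seed=1):
    """Sample transition path times over the barrier U(x) = -k x^2 / 2:
    launch trajectories inward at -x0 and keep only those that reach +x0
    without first recrossing -x0 (the transition paths). Inertia is kept,
    so short-time dynamics are not purely diffusive."""
    rng = np.random.default_rng(seed)
    sig = np.sqrt(2.0 * gamma * kT * dt) / m      # thermal noise amplitude
    times = []
    while len(times) < n_paths:
        x = -x0
        v = abs(rng.standard_normal()) * np.sqrt(kT / m)  # inward launch speed
        t = 0.0
        for _ in range(200_000):                  # safety cap per attempt
            # F = -dU/dx = k x for the inverted parabola
            v += (k * x / m - gamma * v / m) * dt + sig * rng.standard_normal()
            x += v * dt
            t += dt
            if x <= -x0:       # recrossed the start: not a transition path
                break
            if x >= x0:        # reached the far side: record its duration
                times.append(t)
                break
    return np.array(times)
```

A histogram of the returned times approximates the transition path time distribution that the paper derives analytically.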
Dynamic principle for ensemble control tools
Samoletov, A.; Vasiev, B.
2017-11-01
Dynamical equations describing physical systems in contact with a thermal bath are commonly extended by mathematical tools called "thermostats." These tools are designed for sampling ensembles in statistical mechanics. Here we propose a dynamic principle underlying a range of thermostats which is derived using fundamental laws of statistical physics and ensures invariance of the canonical measure. The principle covers both stochastic and deterministic thermostat schemes. Our method has a clear advantage over a range of proposed and widely used thermostat schemes that are based on formal mathematical reasoning. Following the derivation of the proposed principle, we show its generality and illustrate its applications including design of temperature control tools that differ from the Nosé-Hoover-Langevin scheme.
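As a concrete instance of such a control tool, a Nosé-Hoover-Langevin thermostat applied to a 1-D harmonic oscillator can be sketched as follows (a loose sketch with illustrative parameters; canonical sampling is checked through equipartition, <x^2> = kT/k).

```python
import numpy as np

def nhl_sample(n_steps=300_000, dt=0.005, k=1.0, m=1.0, kT=1.0,
               Q=1.0, gamma=1.0, seed=0):
    """Nosé-Hoover-Langevin thermostat on a 1-D harmonic oscillator.
    The auxiliary variable xi couples to the kinetic energy and is itself
    driven by an Ornstein-Uhlenbeck process, which leaves the canonical
    measure invariant."""
    rng = np.random.default_rng(seed)
    x, v, xi = 1.0, 0.0, 0.0
    sig = np.sqrt(2.0 * gamma * kT / Q * dt)
    xs = np.empty(n_steps)
    for i in range(n_steps):
        v += (-k * x / m - xi * v) * dt          # force plus thermostat drag
        x += v * dt                              # semi-implicit Euler step
        xi += ((m * v * v - kT) / Q - gamma * xi) * dt \
              + sig * rng.standard_normal()      # OU dynamics for xi
        xs[i] = x
    return xs
```

After a burn-in, the sampled position variance should approach kT/k, the canonical value for this oscillator.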
Mielke, Steven L; Dinpajooh, Mohammadhasan; Siepmann, J Ilja; Truhlar, Donald G
2013-01-07
We present a procedure to calculate ensemble averages, thermodynamic derivatives, and coordinate distributions by effective classical potential methods. In particular, we consider the displaced-points path integral (DPPI) method, which yields exact quantal partition functions and ensemble averages for a harmonic potential and approximate quantal ones for general potentials, and we discuss the implementation of the new procedure in two Monte Carlo simulation codes, one that uses uncorrelated samples to calculate absolute free energies, and another that employs Metropolis sampling to calculate relative free energies. The results of the new DPPI method are compared to those from accurate path integral calculations as well as to results of two other effective classical potential schemes for the case of an isolated water molecule. In addition to the partition function, we consider the heat capacity and expectation values of the energy, the potential energy, the bond angle, and the OH distance. We also consider coordinate distributions. The DPPI scheme performs best among the three effective potential schemes considered and achieves very good accuracy for all of the properties considered. A key advantage of the effective potential schemes is that they display much lower statistical sampling variances than those for accurate path integral calculations. The method presented here shows great promise for including quantum effects in calculations on large systems.
A hyper-ensemble forecast of surface drift
Vandenbulcke, L.; Lenartz, F.; Poulain, P. M.; Rixen, M.; DART Consortium; MREA Consortium
2009-04-01
The prediction of the surface drift of water is an important task, with applications such as marine transport, pollutant dispersion, and search-and-rescue activities. However, it is also very challenging, because it depends on ocean models that (usually) do not completely accurately represent wind-induced currents, that do not include wave-driven currents, etc. However, the real surface drift depends on all present physical phenomena, which moreover interact in complex ways. Furthermore, although each of these factors can be forecasted by deterministic models, the latter all suffer from limitations, resulting in imperfect predictions. In the present study, we try to predict the drift of buoys launched during the DART06 (Dynamics of the Adriatic sea in Real-Time 2006) and MREA07 (Maritime Rapid Environmental Assessment 2007) sea trials, using the so-called hyper-ensemble technique: different models are combined in order to minimize departure from independent observations during a training period. The obtained combination is then used in forecasting mode. We review and try out different hyper-ensemble techniques, such as the simple ensemble mean, least-squares weighted linear combinations, and techniques based on data assimilation, which dynamically update the model weights in the combination when new observations become available. We show that the latter methods alleviate the need to fix the training length a priori. When the forecast period is relatively short, the discussed methods lead to much smaller forecasting errors compared with individual models (at least 3 times smaller), with the dynamic methods leading to the best results. When many models are available, errors can be further reduced by removing colinearities between them by performing a principal component analysis. At the same time, this reduces the amount of weights to be determined. In complex environments, the skill of individual models may vary over time periods smaller than the desired
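The least-squares variant of the hyper-ensemble combination can be sketched as follows (illustrative only; the dynamic, assimilation-based weight updates discussed in the abstract are not shown).

```python
import numpy as np

def train_weights(model_fcsts, obs):
    """Least-squares weights (with intercept) combining several model
    forecasts against observations over a training window.
    model_fcsts: array of shape (n_times, n_models)."""
    A = np.c_[np.ones(len(obs)), model_fcsts]
    w, *_ = np.linalg.lstsq(A, obs, rcond=None)
    return w

def hyper_forecast(w, model_fcsts):
    """Apply the trained combination in forecasting mode."""
    return np.c_[np.ones(len(model_fcsts)), model_fcsts] @ w
```

On held-out data the combined forecast should beat each individual model whenever the models carry complementary information.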
Measuring social interaction in music ensembles.
Volpe, Gualtiero; D'Ausilio, Alessandro; Badino, Leonardo; Camurri, Antonio; Fadiga, Luciano
2016-05-05
Music ensembles are an ideal test-bed for quantitative analysis of social interaction. Music is an inherently social activity, and music ensembles offer a broad variety of scenarios which are particularly suitable for investigation. Small ensembles, such as string quartets, are deemed a significant example of self-managed teams, where all musicians contribute equally to a task. In bigger ensembles, such as orchestras, the relationship between a leader (the conductor) and a group of followers (the musicians) clearly emerges. This paper presents an overview of recent research on social interaction in music ensembles with a particular focus on (i) studies from cognitive neuroscience; and (ii) studies adopting a computational approach for carrying out automatic quantitative analysis of ensemble music performances. © 2016 The Author(s).
Analysis of mesoscale forecasts using ensemble methods
Gross, Markus
2016-01-01
Mesoscale forecasts are now routinely performed as elements of operational forecasts and their outputs do appear convincing. However, despite their realistic appearance, at times the comparison to observations is less favorable. At the grid scale these forecasts often do not compare well with observations. This is partly due to the chaotic system underlying the weather. Another key problem is that it is impossible to evaluate the risk of making decisions based on these forecasts because they do not provide a measure of confidence. Ensembles provide this information in the ensemble spread and quartiles. However, running global ensembles at the meso- or sub-mesoscale involves substantial computational resources. National centers do run such ensembles, but the subject of this publication is a method which requires significantly less computation. The ensemble enhanced mesoscale system presented here aims not at the creation of an improved mesoscale forecast model. Also it is not to create an improved ensemble syste...
Attenberger, Ulrike I; Runge, Val M; Williams, Kenneth D; Stemmer, Alto; Michaely, Henrik J; Schoenberg, Stefan O; Reiser, Maximilian F; Wintersperger, Bernd J
2009-03-01
Motion artifacts often markedly degrade image quality in clinical scans. The BLADE technique offers an alternative k-space sampling scheme that reduces the effect of patient-related motion on image quality. The purpose of this study is the comparison of imaging artifacts, signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR) of a new turboFLASH BLADE k-space trajectory with standard Cartesian k-space sampling for brain imaging, using a 32-channel coil at 3T. The results from 32 patients included after informed consent are reported. This study was performed with a 32-channel head coil on a 3T scanner. Sagittal and axial T1-weighted FLASH sequences (TR/TE 250/2.46 milliseconds, flip angle 70 degrees), acquired with Cartesian k-space sampling, and T1-weighted turboFLASH sequences (TR/TE/TIsag/TIax 3200/2.77/1144/1056 milliseconds, flip angle 20 degrees), using the PROPELLER (BLADE) k-space trajectory, were compared. SNR and CNR were evaluated using a paired Student t test. The frequency of motion artifacts was assessed in a blinded read. To analyze the differences between the two techniques a McNemar test was performed. Ghost artifacts were observed more frequently for Cartesian FLASH sequences than for BLADE sequences, in 47% (axial plane) of in-patient data sets and in 68% (sagittal plane) and 73% (axial plane) of out-patient data sets. The BLADE T1 scans had lower SNRmean (BLADEax 179 +/- 98 vs. Cartesianax 475 +/- 145; BLADEsag 171 +/- 51 vs. Cartesiansag 697 +/- 129), with P values indicating a statistically significant difference (Cartesian FLASH vs. turboFLASH BLADE). Differences for CNR were also statistically significant, independent of imaging plane (Pax = 0.001, Psag = 0.02). Results demonstrate that turboFLASH BLADE is applicable at 3T with a 32-channel head coil for T1-weighted imaging, with reduced ghost artifacts. This approach offers the first truly clinically applicable T1-weighted BLADE technique for brain imaging at 3T, with consistently excellent image quality.
Directory of Open Access Journals (Sweden)
Greet Cardon
Full Text Available The aim of this study was to investigate the associations of health-related behaviours (HRB) with Body Mass Index (BMI) in preschoolers, and to study the likelihood of being overweight/obese in relation to compliance with recommended HRB. The sample consisted of 3301 normal-weight and overweight/obese preschoolers (mean age: 4.7 years; 52% boys; 85% normal weight) from six European countries (Belgium, Bulgaria, Germany, Greece, Poland, Spain). Height and weight were measured, total daily step counts were registered during six days, and HRB were assessed with validated parental surveys in 2012. Multiple linear and logistic regression analyses were performed. Only a few HRB were significantly associated with BMI. In boys, higher water intake and higher soft drink and fruit consumption were significantly associated with higher BMI. Boys drinking less water than recommended were less likely to be overweight/obese (OR = 0.60), while boys who consumed soft drinks were more likely to be overweight/obese (OR = 1.52). In girls, higher water intake, higher vegetable consumption, and more TV time on weekend days were significantly associated with higher BMI. Girls eating fewer vegetables than recommended were less likely to be overweight/obese (OR = 0.62), and girls who engaged in quiet play for more than 90 minutes on weekend days were more likely to be overweight/obese (OR = 1.64). In general, the associations between HRB and BMI or being overweight/obese were limited and mainly related to dietary intake. Awareness campaigns for caregivers should stress that HRB of young children are important and independent of children's weight status.
A Localized Ensemble Kalman Smoother
Butala, Mark D.
2012-01-01
Numerous geophysical inverse problems prove difficult because the available measurements are indirectly related to the underlying unknown dynamic state and the physics governing the system may involve imperfect models or unobserved parameters. Data assimilation addresses these difficulties by combining the measurements and physical knowledge. The main challenge in such problems usually involves their high dimensionality and the standard statistical methods prove computationally intractable. This paper develops and addresses the theoretical convergence of a new high-dimensional Monte-Carlo approach called the localized ensemble Kalman smoother.
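The core ensemble Kalman update behind such smoothers can be sketched in a few lines. The following is a minimal stochastic analysis step for a scalar observation, not the localized smoother developed in the paper; the toy observation operator and all numbers are invented:

```python
import random

random.seed(0)

def enkf_update(ensemble, y_obs, obs_var, H=lambda x: x[0]):
    """Stochastic ensemble Kalman analysis step for a scalar observation.
    `ensemble` is a list of state vectors (lists); H maps state -> predicted obs.
    Illustrative only; no localization is applied here."""
    n = len(ensemble)
    dim = len(ensemble[0])
    hx = [H(x) for x in ensemble]
    hbar = sum(hx) / n
    xbar = [sum(x[i] for x in ensemble) / n for i in range(dim)]
    # sample covariances between state components and the predicted observation
    cov_xh = [sum((x[i] - xbar[i]) * (h - hbar) for x, h in zip(ensemble, hx)) / (n - 1)
              for i in range(dim)]
    var_h = sum((h - hbar) ** 2 for h in hx) / (n - 1)
    gain = [c / (var_h + obs_var) for c in cov_xh]  # Kalman gain
    analysis = []
    for x, h in zip(ensemble, hx):
        y_pert = y_obs + random.gauss(0.0, obs_var ** 0.5)  # perturbed observation
        analysis.append([xi + g * (y_pert - h) for xi, g in zip(x, gain)])
    return analysis

prior = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(200)]
post = enkf_update(prior, y_obs=2.0, obs_var=0.25)
post_mean = sum(x[0] for x in post) / len(post)
print(round(post_mean, 2))  # pulled toward the observation
```

With a unit-variance prior and observation variance 0.25, the gain is about 0.8, so the analysis mean of the observed component moves most of the way toward the observation.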
Wind Power Prediction using Ensembles
DEFF Research Database (Denmark)
Giebel, Gregor; Badger, Jake; Landberg, Lars
2005-01-01
offshore wind farm and the whole Jutland/Funen area. The utilities used these forecasts for maintenance planning, fuel consumption estimates and over-the-weekend trading on the Leipzig power exchange. Other notable scientific results include the better accuracy of forecasts made up from a simple...... superposition of two NWP providers (in our case, DMI and DWD), an investigation of the merits of a parameterisation of the turbulent kinetic energy within the delivered wind speed forecasts, and the finding that a “naïve” downscaling of each of the coarse ECMWF ensemble members with higher resolution HIRLAM did......
Kew, William; Mitchell, John B O
2015-09-01
The application of Machine Learning to cheminformatics is a large and active field of research, but there exist few papers which discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. This investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the 'wisdom of crowds' principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. Choice of data preprocessing methodology was found to be crucial to performance of each method too. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
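The greedy ensemble idea described above, in which better-performing models acquire bigger contributions, is commonly implemented as forward selection with replacement. A small illustrative sketch, with synthetic predictions and hypothetical model names, that repeatedly averages in whichever model most reduces validation error:

```python
def greedy_ensemble(preds, truth, max_members):
    """Greedy forward selection (with replacement): at each step add the model
    that most reduces the MAE of the running ensemble average. Models chosen
    repeatedly implicitly receive larger weights. Toy data, not QSPR data."""
    def mae(p):
        return sum(abs(a - b) for a, b in zip(p, truth)) / len(truth)

    chosen, running = [], [0.0] * len(truth)
    for _ in range(max_members):
        best_name, best_err = None, float("inf")
        for name, p in preds.items():
            k = len(chosen) + 1
            cand = [(r * (k - 1) + v) / k for r, v in zip(running, p)]
            err = mae(cand)
            if err < best_err:
                best_name, best_err = name, err
        k = len(chosen) + 1
        running = [(r * (k - 1) + v) / k for r, v in zip(running, preds[best_name])]
        chosen.append(best_name)
    return chosen, mae(running)

truth = [1.0, 2.0, 3.0, 4.0]
preds = {
    "tree":   [1.2, 2.1, 2.7, 4.3],
    "kernel": [0.8, 1.9, 3.2, 3.8],
    "linear": [1.5, 2.5, 3.5, 4.5],
}
members, err = greedy_ensemble(preds, truth, max_members=2)
print(members, round(err, 3))
```

Here the two selected members have partially cancelling errors, so the ensemble MAE (0.025) beats the best single model (0.175).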
A Gaussian mixture ensemble transform filter
Reich, Sebastian
2011-01-01
We generalize the popular ensemble Kalman filter to an ensemble transform filter where the prior distribution can take the form of a Gaussian mixture or a Gaussian kernel density estimator. The design of the filter is based on a continuous formulation of the Bayesian filter analysis step. We call the new filter algorithm the ensemble Gaussian mixture filter (EGMF). The EGMF is implemented for three simple test problems (Brownian dynamics in one dimension, Langevin dynamics in two dimensions, ...
Multi-Dimensional Path Queries
DEFF Research Database (Denmark)
Bækgaard, Lars
1998-01-01
We present the path-relationship model that supports multi-dimensional data modeling and querying. A path-relationship database is composed of sets of paths and sets of relationships. A path is a sequence of related elements (atoms, paths, and sets of paths). A relationship is a binary path...... to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments......
Multi-Model Ensemble Wake Vortex Prediction
Koerner, Stephan; Holzaepfel, Frank; Ahmad, Nash'at N.
2015-01-01
Several multi-model ensemble methods are investigated for predicting wake vortex transport and decay. This study is a joint effort between the National Aeronautics and Space Administration and Deutsches Zentrum für Luft- und Raumfahrt to develop a multi-model ensemble capability using their wake models. An overview of different multi-model ensemble methods and their feasibility for wake applications is presented. The methods include Reliability Ensemble Averaging, Bayesian Model Averaging, and Monte Carlo Simulations. The methodologies are evaluated using data from wake vortex field experiments.
Ensemble methods for handwritten digit recognition
DEFF Research Database (Denmark)
Hansen, Lars Kai; Liisberg, Christian; Salamon, P.
1992-01-01
by optimizing the receptive fields. It is concluded that it is possible to improve performance significantly by introducing moderate-size ensembles; in particular, a 20-25% improvement has been found. The ensemble random LUTs, when trained on a medium-size database, reach a performance (without rejects) of 94....... It is further shown that it is possible to estimate the ensemble performance as well as the learning curve on a medium-size database. In addition the authors present preliminary analysis of experiments on a large database and show that state-of-the-art performance can be obtained using the ensemble approach...
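The simplest way such classifier ensembles are combined is a plurality vote over the member networks' predicted digits. A minimal sketch with invented predictions, not the random-LUT networks of the paper:

```python
from collections import Counter

def plurality_vote(predictions):
    """Combine per-network digit predictions by plurality vote: for each test
    sample, output the label predicted by the most ensemble members."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]

# three hypothetical member networks, four test digits each
nets = [
    [3, 7, 1, 9],
    [3, 1, 1, 9],
    [8, 7, 1, 4],
]
votes = plurality_vote(nets)
print(votes)  # [3, 7, 1, 9]
```

Each member errs on one sample, yet the vote recovers the correct label everywhere, which is the intuition behind the ensemble improvements reported above.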
Insights into the deterministic skill of air quality ensembles from the analysis of AQMEII data
Directory of Open Access Journals (Sweden)
I. Kioutsioukis
2016-12-01
level observations collected from the EMEP (European Monitoring and Evaluation Programme) and AirBase databases. The goal of the study is to quantify to what extent we can extract predictable signals from an ensemble with superior skill over the single models and the ensemble mean. Verification statistics show that the deterministic models simulate better O3 than NO2 and PM10, linked to different levels of complexity in the represented processes. The unconditional ensemble mean achieves higher skill compared to each station's best deterministic model at no more than 60 % of the sites, indicating a combination of members with unbalanced skill difference and error dependence for the rest. The promotion of the right amount of accuracy and diversity within the ensemble results in an average additional skill of up to 31 % compared to using the full ensemble in an unconditional way. The skill improvements were higher for O3 and lower for PM10, associated with the extent of potential changes in the joint distribution of accuracy and diversity in the ensembles. The skill enhancement was superior using the weighting scheme, but the training period required to acquire representative weights was longer compared to the sub-selecting schemes. Further development of the method is discussed in the conclusion.
Lahmiri, Salim; Boukadoum, Mounir
2015-08-01
We present a new ensemble system for stock market returns prediction where continuous wavelet transform (CWT) is used to analyze return series and backpropagation neural networks (BPNNs) for processing CWT-based coefficients, determining the optimal ensemble weights, and providing final forecasts. Particle swarm optimization (PSO) is used for finding optimal weights and biases for each BPNN. To capture symmetry/asymmetry in the underlying data, three wavelet functions with different shapes are adopted. The proposed ensemble system was tested on three Asian stock markets: The Hang Seng, KOSPI, and Taiwan stock market data. Three statistical metrics were used to evaluate the forecasting accuracy; including, mean of absolute errors (MAE), root mean of squared errors (RMSE), and mean of absolute deviations (MADs). Experimental results showed that our proposed ensemble system outperformed the individual CWT-ANN models each with different wavelet function. In addition, the proposed ensemble system outperformed the conventional autoregressive moving average process. As a result, the proposed ensemble system is suitable to capture symmetry/asymmetry in financial data fluctuations for better prediction accuracy.
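The weighted combination at the heart of such an ensemble system can be illustrated without the neural networks. As a stand-in for the PSO-tuned weights, the sketch below uses inverse-MAE weights to combine three synthetic forecasters; MAE and RMSE are two of the metrics named above, and all series are invented:

```python
import math

def combine(forecasts, weights):
    """Weighted average of component model forecasts (weights sum to 1)."""
    return [sum(w * f[i] for w, f in zip(weights, forecasts))
            for i in range(len(forecasts[0]))]

def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# synthetic return series and three component forecasts
obs = [0.5, -0.2, 0.1, 0.4, -0.3]
models = [
    [0.6, -0.1, 0.2, 0.3, -0.2],   # slightly biased high
    [0.4, -0.3, 0.0, 0.5, -0.4],   # slightly biased low
    [0.0,  0.0, 0.0, 0.0,  0.0],   # uninformative
]
inv = [1.0 / mae(m, obs) for m in models]       # better models -> larger weight
weights = [v / sum(inv) for v in inv]
ens = combine(models, weights)
print(round(mae(ens, obs), 4), round(rmse(ens, obs), 4))
```

Because the two biased models err in opposite directions, the weighted average has a lower error than any individual member, which is the effect the paper exploits.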
DEFF Research Database (Denmark)
Bessenrodt, Christine; Olsson, Jørn Børling; Sellers, James A.
2013-01-01
We give a complete classification of the unique path partitions and study congruence properties of the function which enumerates such partitions.
DEFF Research Database (Denmark)
Garud, Raghu; Karnøe, Peter
the place of agency in these theories that take history so seriously. In the end, they are as interested in path creation and destruction as they are in path dependence. This book is compiled of both theoretical and empirical writing. It shows relatively well-known industries such as the automobile...
Enhancing COSMO-DE ensemble forecasts by inexpensive techniques
Directory of Open Access Journals (Sweden)
Zied Ben Bouallègue
2013-02-01
Full Text Available COSMO-DE-EPS, a convection-permitting ensemble prediction system based on the high-resolution numerical weather prediction model COSMO-DE, has been pre-operational since December 2010, providing probabilistic forecasts which cover Germany. This ensemble system comprises 20 members based on variations of the lateral boundary conditions, the physics parameterizations and the initial conditions. In order to increase the sample size in a computationally inexpensive way, COSMO-DE-EPS is combined with alternative ensemble techniques: the neighborhood method and the time-lagged approach. Their impact on the quality of the resulting probabilistic forecasts is assessed. Objective verification is performed over a six-month period; scores based on the Brier score and its decomposition are shown for June 2011. The combination of the ensemble system with the alternative approaches improves probabilistic forecasts of precipitation, in particular for high precipitation thresholds. Moreover, combining COSMO-DE-EPS with only the time-lagged approach improves the skill of area probabilities for precipitation and does not deteriorate the skill of 2 m-temperature and wind gust forecasts.
Joys of Community Ensemble Playing: The Case of the Happy Roll Elastic Ensemble in Taiwan
Hsieh, Yuan-Mei; Kao, Kai-Chi
2012-01-01
The Happy Roll Elastic Ensemble (HREE) is a community music ensemble supported by Tainan Culture Centre in Taiwan. With enjoyment and friendship as its primary goals, it aims to facilitate the joys of ensemble playing and the spirit of social networking. This article highlights the key aspects of HREE's development in its first two years…
A Simple Approach to Account for Climate Model Interdependence in Multi-Model Ensembles
Herger, N.; Abramowitz, G.; Angelil, O. M.; Knutti, R.; Sanderson, B.
2016-12-01
Multi-model ensembles are an indispensable tool for future climate projection and its uncertainty quantification. Ensembles containing multiple climate models generally have increased skill, consistency and reliability. Due to the lack of agreed-on alternatives, most scientists use the equally-weighted multi-model mean as they subscribe to model democracy ("one model, one vote"). Different research groups are known to share sections of code, parameterizations in their model, literature, or even whole model components. Therefore, individual model runs do not represent truly independent estimates. Ignoring this dependence structure might lead to a false model consensus, wrong estimation of uncertainty and of the effective number of independent models. Here, we present a way to partially address this problem by selecting a subset of CMIP5 model runs so that its climatological mean minimizes the RMSE compared to a given observation product. Due to the cancelling out of errors, regional biases in the ensemble mean are reduced significantly. Using a model-as-truth experiment we demonstrate that those regional biases persist into the future and that we are not fitting noise, thus providing improved observationally-constrained projections of the 21st century. The optimally selected ensemble shows significantly higher global mean surface temperature projections than the original ensemble, where all the model runs are considered. Moreover, the spread is decreased well beyond that expected from the decreased ensemble size. Several previous studies have recommended an ensemble selection approach based on performance ranking of the model runs. Here, we show that this approach can perform even worse than randomly selecting ensemble members and can thus be harmful. We suggest that accounting for interdependence in the ensemble selection process is a necessary step for robust projections for use in impact assessments, adaptation and mitigation of climate change.
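The subset-selection step described above — choosing the model runs whose equally weighted mean best matches an observation product — can be sketched exhaustively for a toy problem. Real CMIP5 fields are far larger and would need a smarter search; the "climatologies" below are four invented grid-point values per run:

```python
import itertools
import math

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def best_subset(runs, obs, size):
    """Exhaustively pick the subset of model runs whose equally weighted
    ensemble mean has the lowest RMSE against the observation product."""
    best, best_err = None, float("inf")
    for combo in itertools.combinations(sorted(runs), size):
        mean = [sum(runs[m][i] for m in combo) / size for i in range(len(obs))]
        err = rmse(mean, obs)
        if err < best_err:
            best, best_err = combo, err
    return best, best_err

obs = [14.0, 15.0, 16.0, 17.0]        # observed climatology (toy)
runs = {
    "A": [15.0, 16.0, 17.0, 18.0],    # warm bias +1
    "B": [13.0, 14.0, 15.0, 16.0],    # cold bias -1 (cancels A)
    "C": [14.2, 15.2, 16.2, 17.2],    # small warm bias
    "D": [17.0, 18.0, 19.0, 20.0],    # large warm bias
}
subset, err = best_subset(runs, obs, size=2)
print(subset, round(err, 3))  # ('A', 'B') 0.0
```

The optimum pairs the two runs with opposite biases, illustrating the error cancellation the abstract relies on.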
Forward, backward, and weighted stochastic bridges
Drummond, Peter D.
2017-10-01
We define stochastic bridges as conditional distributions of stochastic paths that leave a specified point in phase-space in the past and arrive at another one in the future. These can be defined relative to either forward or backward stochastic differential equations and with the inclusion of arbitrary path-dependent weights. The underlying stochastic equations are not the same except in linear cases. Accordingly, we generalize the theory of stochastic bridges to include time-reversed and weighted stochastic processes. We show that the resulting stochastic bridges are identical, whether derived from a forward or a backward time stochastic process. A numerical algorithm is obtained to sample these distributions. This technique, which uses partial stochastic equations, is robust and easily implemented. Examples are given, and comparisons are made to previous work. In stochastic equations without a gradient drift, our results confirm an earlier conjecture, while generalizing this to cases with path-dependent weights. An example of a two-dimensional stochastic equation with no potential solution is analyzed and numerically solved. We show how this method can treat unexpectedly large excursions occurring during a tunneling or escape event, in which a system escapes from one quasistable point to arrive at another one at a later time.
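A minimal example of sampling a pinned stochastic path is the discrete Brownian bridge, built by conditioning each forward step on the fixed endpoint. This is the textbook linear-case construction, not the partial-stochastic-equation algorithm of the paper, and all parameters are illustrative:

```python
import random

random.seed(1)

def brownian_bridge(x0, x1, n_steps, sigma=1.0):
    """Sample a discrete Brownian bridge pinned at x0 (t=0) and x1 (t=1):
    each step is the free diffusion step conditioned on arriving at x1."""
    dt = 1.0 / n_steps
    path = [x0]
    x = x0
    for k in range(1, n_steps):
        remaining = 1.0 - (k - 1) * dt        # time left before the pin
        # conditional mean drifts linearly toward the pinned endpoint
        mean = x + (x1 - x) * dt / remaining
        var = sigma ** 2 * dt * (remaining - dt) / remaining
        x = random.gauss(mean, var ** 0.5)
        path.append(x)
    path.append(x1)
    return path

p = brownian_bridge(0.0, 2.0, n_steps=100)
print(p[0], p[-1], len(p))
```

Both endpoints are hit exactly by construction; path-dependent weights, as in the paper, would enter as a reweighting of such sampled bridges.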
Aszyk, Justyna; Kot, Jacek; Tkachenko, Yurii; Woźniak, Michał; Bogucka-Kocka, Anna; Kot-Wasik, Agata
2017-04-15
A simple, fast, sensitive and accurate methodology based on LLE followed by liquid chromatography-tandem mass spectrometry for simultaneous determination of four regioisomers (8-iso prostaglandin F2α, 8-iso-15(R)-prostaglandin F2α, 11β-prostaglandin F2α, 15(R)-prostaglandin F2α) in routine analysis of human plasma samples was developed. Isoprostanes are stable products of arachidonic acid peroxidation and are regarded as the most reliable markers of oxidative stress in vivo. Validation of the method was performed by evaluation of the key analytical parameters such as: matrix effect, analytical curve, trueness, precision, limits of detection and limits of quantification. As homoscedasticity was not met for the analytical data, weighted linear regression was applied in order to improve the accuracy at the lower end points of the calibration curve. The detection limits (LODs) ranged from 1.0 to 2.1 pg/mL. For plasma samples spiked with the isoprostanes at the level of 50 pg/mL, intra- and interday repeatability ranged from 2.1 to 3.5% and 0.1 to 5.1%, respectively. The applicability of the proposed approach has been verified by monitoring of isoprostane isomer levels in plasma samples collected from young patients (n=8) subjected to hyperbaric hyperoxia (100% oxygen at 280 kPa(a) for 30 min) in a multiplace hyperbaric chamber. Copyright © 2017 Elsevier B.V. All rights reserved.
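Weighted linear regression of the kind used for the calibration curve can be sketched as follows: weights such as 1/x² down-weight the noisier high-concentration standards, improving accuracy at the low end. The calibration data below are invented, not from the study:

```python
def weighted_linefit(x, y, w):
    """Weighted least-squares fit of y ~ a + b*x (closed form)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    a = my - b * mx
    return a, b

# hypothetical calibration standards: concentration (pg/mL) vs detector response
conc = [5.0, 10.0, 50.0, 100.0, 500.0]
resp = [0.52, 1.05, 4.9, 10.4, 49.0]
a, b = weighted_linefit(conc, resp, [1.0 / c ** 2 for c in conc])
print(round(a, 3), round(b, 4))
```

With 1/x² weights the low-concentration standards dominate the fit, so back-calculated concentrations near the LOD carry a much smaller relative bias than under ordinary least squares.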
Transition from Poisson to circular unitary ensemble
Indian Academy of Sciences (India)
Transitions to universality classes of random matrix ensembles have been useful in the study of weakly-broken symmetries in quantum chaotic systems. Transitions involving Poisson as the initial ensemble have been particularly interesting. The exact two-point correlation function was derived by one of the present authors ...
Improved customer choice predictions using ensemble methods
M.C. van Wezel (Michiel); R. Potharst (Rob)
2005-01-01
textabstractIn this paper various ensemble learning methods from machine learning and statistics are considered and applied to the customer choice modeling problem. The application of ensemble learning usually improves the prediction quality of flexible models like decision trees and thus leads to
Urban runoff forecasting with ensemble weather predictions
DEFF Research Database (Denmark)
Pedersen, Jonas Wied; Courdent, Vianney Augustin Thomas; Vezzaro, Luca
This research shows how ensemble weather forecasts can be used to generate urban runoff forecasts up to 53 hours into the future. The results highlight systematic differences between ensemble members that needs to be accounted for when these forecasts are used in practice....
Harmonic superposition method for grand-canonical ensembles
Calvo, F.; Wales, D. J.
2015-03-01
The harmonic superposition method provides a unified framework to the equilibrium and relaxation kinetics on complex potential energy landscapes. Here we extend it to grand-canonical statistical ensembles governed by chemical potentials or chemical potential differences, by sampling energy minima corresponding to the various relevant sizes or compositions. The method is applied and validated against conventional Monte Carlo simulations for the problems of chemical equilibrium in nanoalloys and hydrogen absorption in bulk and nanoscale palladium.
Probing RNA native conformational ensembles with structural constraints
DEFF Research Database (Denmark)
Fonseca, Rasmus; van den Bedem, Henry; Bernauer, Julie
2016-01-01
substates, which are difficult to characterize experimentally and computationally. Here, we present an innovative, entirely kinematic computational procedure to efficiently explore the native ensemble of RNA molecules. Our procedure projects degrees of freedom onto a subspace of conformation space defined...... by distance constraints in the tertiary structure. The dimensionality reduction enables efficient exploration of conformational space. We show that the conformational distributions obtained with our method broadly sample the conformational landscape observed in NMR experiments. Compared to normal mode...
Simons, Jacob V., Jr.
2017-01-01
The critical path method/program evaluation and review technique method of project scheduling is based on the importance of managing a project's critical path(s). Although a critical path is the longest path through a network, its location in large projects is facilitated by the computation of activity slack. However, logical fallacies in…
Ensemble models of neutrophil trafficking in severe sepsis.
Directory of Open Access Journals (Sweden)
Sang Ok Song
Full Text Available A hallmark of severe sepsis is systemic inflammation, which activates leukocytes and can result in their misdirection. This leads to both impaired migration to the locus of infection and increased infiltration into healthy tissues. In order to better understand the pathophysiologic mechanisms involved, we developed a coarse-grained phenomenological model of the acute inflammatory response in CLP (cecal ligation and puncture)-induced sepsis in rats. This model incorporates distinct neutrophil kinetic responses to the inflammatory stimulus and the dynamic interactions between components of a compartmentalized inflammatory response. Ensembles of model parameter sets consistent with experimental observations were statistically generated using Markov-chain Monte Carlo sampling. Prediction uncertainty in the model states was quantified over the resulting ensemble parameter sets. Forward simulation of the parameter ensembles successfully captured experimental features and predicted that systemically activated circulating neutrophils display impaired migration to the tissue and neutrophil sequestration in the lung, consequently contributing to tissue damage and mortality. Principal component and multiple regression analyses of the parameter ensembles estimated from survivor and non-survivor cohorts provide insight into pathologic mechanisms dictating outcome in sepsis. Furthermore, the model was extended to incorporate hypothetical mechanisms by which immune modulation using extracorporeal blood purification results in improved outcome in septic rats. Simulations identified a sub-population (about 18% of the treated population) that benefited from blood purification. Survivors displayed enhanced neutrophil migration to tissue and reduced sequestration of lung neutrophils, contributing to improved outcome. The model ensemble presented herein provides a platform for generating and testing hypotheses in silico, as well as motivating further experimental
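The parameter-ensemble generation step rests on Markov-chain Monte Carlo. A plain Metropolis sampler on a one-parameter toy posterior illustrates the mechanics; the paper's model has many parameters and a likelihood built from experimental data, neither of which is reproduced here:

```python
import math
import random

random.seed(3)

def metropolis(logpost, x0, n_samples, step=0.5):
    """Plain Metropolis sampler: propose a Gaussian step, accept with
    probability min(1, posterior ratio). Returns the full chain."""
    x, samples = x0, []
    lp = logpost(x)
    for _ in range(n_samples):
        cand = x + random.gauss(0.0, step)
        lp_c = logpost(cand)
        if random.random() < math.exp(min(0.0, lp_c - lp)):
            x, lp = cand, lp_c
        samples.append(x)
    return samples

# toy posterior: one parameter with a Gaussian log-posterior centred at 1.5
chain = metropolis(lambda t: -0.5 * (t - 1.5) ** 2, x0=0.0, n_samples=5000)
burned = chain[1000:]                       # discard burn-in
post_mean = sum(burned) / len(burned)
print(round(post_mean, 2))
```

Each retained sample is one "parameter set consistent with the data"; forward-simulating the model over the whole chain is what yields the prediction-uncertainty bands described in the abstract.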
Perception of ensemble statistics requires attention.
Jackson-Nielsen, Molly; Cohen, Michael A; Pitts, Michael A
2017-02-01
To overcome inherent limitations in perceptual bandwidth, many aspects of the visual world are represented as summary statistics (e.g., average size, orientation, or density of objects). Here, we investigated the relationship between summary (ensemble) statistics and visual attention. Recently, it was claimed that one ensemble statistic in particular, color diversity, can be perceived without focal attention. However, a broader debate exists over the attentional requirements of conscious perception, and it is possible that some form of attention is necessary for ensemble perception. To test this idea, we employed a modified inattentional blindness paradigm and found that multiple types of summary statistics (color and size) often go unnoticed without attention. In addition, we found attentional costs in dual-task situations, further implicating a role for attention in statistical perception. Overall, we conclude that while visual ensembles may be processed efficiently, some amount of attention is necessary for conscious perception of ensemble statistics. Copyright © 2016 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Leandro Homrich Lorentz
2011-06-01
Full Text Available Breast weight has great economic importance in the poultry industry, and may be associated with other variables. This work aimed to estimate phenotypic correlations between performance (live body weight at 7 and 28 days and at slaughter, and depth of the breast muscle measured by ultrasonography), carcass (eviscerated body weight and leg weight) and body composition (heart, liver and abdominal fat weight) traits in a broiler line, and to quantify the direct and indirect influence of these traits on breast weight. Path analysis was used by expanding the matrix of partial correlations into coefficients which give the direct influence of one trait on another, regardless of the effect of the other traits. The simultaneous maintenance of live body weight at slaughter and eviscerated body weight in the matrix of correlations might be harmful for statistical analyses involving systems of normal equations, like path analysis, due to the observed multicollinearity. The live body weight at slaughter and the depth of the breast muscle as measured by ultrasonography directly affected breast weight and were identified as the factors most responsible for the magnitude of the correlation coefficients obtained between the studied traits and breast weight. Individual pre-selection for these traits could favor an increased breast weight in the future reproducer candidates of this line if the broilers' environmental conditions and housing are maintained, since the live body weight at slaughter and the depth of breast muscle measured by ultrasonography were directly related to breast weight.
Career paths in occupational medicine.
Harber, Philip; Bontemps, Johnny; Saechao, Kaochoy; Wu, Samantha; Liu, Yihang; Elashoff, David
2012-11-01
To describe career path patterns for occupational medicine (OM) physicians. A convenience sample of 129 occupational physicians described work activities and locations at several career points: up to 20 years ago, their first OM position, and their expectations for 10 years in the future. Clinical activities were important throughout (e.g., 41% and 46% of occupational physicians reported frequently treating patients 20 years ago and currently, respectively). Practice locations changed more markedly, with increased multisite clinics and hospital/medical center-based practices. Performing mainly clinical activities in a first job increased from 82% to 97% over the past 20 years. Career transitions between clinical and nonclinical roles were common (40% of participants). Many anticipate a transition to nonclinical work over 10 years. Activities have not fundamentally changed, but practice locations have evolved. Both clinical and management activities remain important, and the path to managerial positions increasingly begins in clinical practice.
Bouallegue, Zied Ben; Theis, Susanne E; Pinson, Pierre
2015-01-01
Probabilistic forecasts in the form of ensembles of scenarios are required for complex decision-making processes. Ensemble forecasting systems provide such products, but the spatio-temporal structure of the forecast uncertainty is lost when statistical calibration of the ensemble forecasts is applied for each lead time and location independently. Non-parametric approaches allow the reconstruction of spatio-temporal joint probability distributions at a low computational cost. For example, the ensemble copula coupling (ECC) method consists in rebuilding the multivariate aspect of the forecast from the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error. The new approach, which preserves the dynamical development of the ensemble members, is called dynamic ensemble copula coupling (...
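The basic ECC step is easy to state: impose the rank order of the raw ensemble, per lead time and location, on the independently calibrated samples. A minimal sketch with invented numbers (the dynamic variant of the paper adds error-autocorrelation statistics on top of this):

```python
def ecc_reorder(raw_ensemble, calibrated_samples):
    """Ensemble copula coupling: return the calibrated samples reordered so
    they inherit the rank structure of the raw ensemble members."""
    m = len(raw_ensemble)
    order = sorted(range(m), key=lambda i: raw_ensemble[i])
    ranks = [0] * m
    for r, i in enumerate(order):
        ranks[i] = r                      # rank of each raw member (0 = smallest)
    cal_sorted = sorted(calibrated_samples)
    return [cal_sorted[ranks[i]] for i in range(m)]

raw = [2.0, 0.5, 1.0]          # member 0 is largest, member 1 smallest
cal = [10.0, 30.0, 20.0]       # calibrated marginal sample, any order
result = ecc_reorder(raw, cal)
print(result)  # [30.0, 10.0, 20.0]
```

Member 0, the largest raw forecast, receives the largest calibrated value, so the multivariate (space-time) dependence encoded in the raw ensemble survives the univariate calibration.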
Gibbs Ensemble Simulations of the Solvent Swelling of Polymer Films
Gartner, Thomas; Epps, Thomas, III; Jayaraman, Arthi
Solvent vapor annealing (SVA) is a useful technique to tune the morphology of block polymer, polymer blend, and polymer nanocomposite films. Despite SVA's utility, standardized SVA protocols have not been established, partly due to a lack of fundamental knowledge regarding the interplay between the polymer(s), solvent, substrate, and free surface during solvent annealing and evaporation. An understanding of how to tune polymer film properties in a controllable manner through SVA processes is needed. Herein, the thermodynamic implications of the presence of solvent in the swollen polymer film are explored through two alternative Gibbs ensemble simulation methods that we have developed and extended: Gibbs ensemble molecular dynamics (GEMD) and hybrid Monte Carlo (MC)/molecular dynamics (MD). In this poster, we will describe these simulation methods and demonstrate their application to polystyrene films swollen by toluene and n-hexane. Polymer film swelling experiments, Gibbs ensemble molecular simulations, and polymer reference interaction site model (PRISM) theory are combined to calculate an effective Flory-Huggins χ (χeff) for polymer-solvent mixtures. The effects of solvent chemistry, solvent content, polymer molecular weight, and polymer architecture on χeff are examined, providing a platform to control and understand the thermodynamics of polymer film swelling.
Subseasonal Predictability of Boreal Summer Monsoon Rainfall from Ensemble Forecasts
Directory of Open Access Journals (Sweden)
Nicolas Vigaud
2017-10-01
Full Text Available Subseasonal forecast skill over the broadly defined North American (NAM), West African (WAM) and Asian (AM) summer monsoon regions is investigated using three Ensemble Prediction Systems (EPS) at sub-monthly lead times. Extended Logistic Regression (ELR) is used to produce probabilistic forecasts of weekly and week 3–4 averages of precipitation with starts in May–Aug, over the 1999–2010 period. The ELR tercile category probabilities for each model gridpoint are then averaged together with equal weight. The resulting Multi-Model Ensemble (MME) forecasts exhibit good reliability, but have generally low sharpness for forecasts beyond 1 week; multi-model ensembling largely removes negative values of the Ranked Probability Skill Score (RPSS) seen in individual forecasts, and broadly improves the skill obtained in any of the three individual models except for the AM. The MME week 3–4 forecasts have generally higher RPSS and comparable reliability over all monsoon regions, compared to week 3 or week 4 forecasts separately. Skill is higher during La Niña compared to El Niño and ENSO-neutral conditions over the 1999–2010 period, especially for the NAM. Regionally averaged RPSS is significantly correlated with the Madden-Julian Oscillation (MJO) for the AM and WAM. Our results indicate potential for skillful predictions at subseasonal time-scales over the three summer monsoon regions of the Northern Hemisphere.
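The equal-weight pooling of tercile probabilities, and the ranked probability score underlying the RPSS, can be sketched directly. The member probabilities below are invented, not taken from the three EPS:

```python
def rps(probs, obs_category):
    """Ranked probability score for one 3-category (tercile) forecast:
    sum of squared differences between cumulative forecast and outcome."""
    cum_f, cum_o, s = 0.0, 0.0, 0.0
    for k, p in enumerate(probs):
        cum_f += p
        cum_o += 1.0 if k == obs_category else 0.0
        s += (cum_f - cum_o) ** 2
    return s

def mme(prob_list):
    """Equal-weight multi-model ensemble of tercile probabilities."""
    n = len(prob_list)
    return [sum(p[k] for p in prob_list) / n for k in range(3)]

# three models' tercile probabilities (below / near / above normal)
models = [[0.2, 0.3, 0.5], [0.5, 0.3, 0.2], [0.3, 0.4, 0.3]]
pooled = mme(models)
obs = 2  # "above normal" verified
print([round(p, 3) for p in pooled], round(rps(pooled, obs), 4))
```

The RPSS then compares this RPS against a reference (usually climatological 1/3-1/3-1/3 probabilities), with positive values indicating skill over the reference.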
Insights into the deterministic skill of air quality ensembles from the analysis of AQMEII data
Kioutsioukis, Ioannis; Im, Ulas; Solazzo, Efisio; Bianconi, Roberto; Badia, Alba; Balzarini, Alessandra; Baró, Rocío; Bellasio, Roberto; Brunner, Dominik; Chemel, Charles; Curci, Gabriele; Denier van der Gon, Hugo; Flemming, Johannes; Forkel, Renate; Giordano, Lea; Jiménez-Guerrero, Pedro; Hirtl, Marcus; Jorba, Oriol; Manders-Groot, Astrid; Neal, Lucy; Pérez, Juan L.; Pirovano, Guidio; San Jose, Roberto; Savage, Nicholas; Schroder, Wolfram; Sokhi, Ranjeet S.; Syrakov, Dimiter; Tuccella, Paolo; Werhahn, Johannes; Wolke, Ralf; Hogrefe, Christian; Galmarini, Stefano
2016-12-01
Simulations from chemical weather models are subject to uncertainties in the input data (e.g. emission inventory, initial and boundary conditions) as well as those intrinsic to the model (e.g. physical parameterization, chemical mechanism). Multi-model ensembles can improve the forecast skill, provided that certain mathematical conditions are fulfilled. In this work, four ensemble methods were applied to two different datasets, and their performance was compared for ozone (O3), nitrogen dioxide (NO2) and particulate matter (PM10). Apart from the unconditional ensemble average, the approach behind the other three methods relies on adding optimum weights to members or constraining the ensemble to those members that meet certain conditions in time or frequency domain. The two different datasets were created for the first and second phase of the Air Quality Model Evaluation International Initiative (AQMEII). The methods are evaluated against ground level observations collected from the EMEP (European Monitoring and Evaluation Programme) and AirBase databases. The goal of the study is to quantify to what extent we can extract predictable signals from an ensemble with superior skill over the single models and the ensemble mean. Verification statistics show that the deterministic models simulate better O3 than NO2 and PM10, linked to different levels of complexity in the represented processes. The unconditional ensemble mean achieves higher skill compared to each station's best deterministic model at no more than 60 % of the sites, indicating a combination of members with unbalanced skill difference and error dependence for the rest. The promotion of the right amount of accuracy and diversity within the ensemble results in an average additional skill of up to 31 % compared to using the full ensemble in an unconditional way. The skill improvements were higher for O3 and lower for PM10, associated with the extent of potential changes in the joint distribution of accuracy and diversity.
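The distinction drawn in this abstract, between the unconditional ensemble mean and an ensemble constrained to members meeting a condition, can be sketched as follows. This is a toy illustration, not the AQMEII methodology; the RMSE threshold and the concentration values are hypothetical.

```python
# Toy sketch (not AQMEII code): an unconditional ensemble mean versus a
# conditional ensemble restricted to members whose RMSE against the
# observations stays below a hypothetical threshold.
import math

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def ensemble_mean(members):
    return [sum(vals) / len(vals) for vals in zip(*members)]

def conditional_ensemble(members, obs, max_rmse):
    """Keep only members below the RMSE threshold; fall back to all members."""
    kept = [m for m in members if rmse(m, obs) <= max_rmse]
    return ensemble_mean(kept) if kept else ensemble_mean(members)

obs = [30.0, 40.0, 50.0]            # e.g. observed O3 concentrations
members = [
    [32.0, 41.0, 49.0],             # skilful member
    [28.0, 39.0, 52.0],             # skilful member
    [60.0, 70.0, 80.0],             # biased outlier dragging down the mean
]
full = ensemble_mean(members)
cond = conditional_ensemble(members, obs, max_rmse=5.0)
```

Excluding the biased member illustrates why a conditional ensemble can beat the unconditional mean when member skill is unbalanced.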
Charkiewicz, A
2000-01-01
Before the Career Path system, jobs were classified according to grades with general statutory definitions, guided by the "Job Catalogue" which defined 6 evaluation criteria with example illustrations in the form of "typical" job descriptions. Career Paths were given concise statutory definitions necessitating a method of description and evaluation adapted to their new wider-band salary concept. Evaluations were derived from the same 6 criteria but the typical descriptions became unusable. In 1999, a sub-group of the Standing Concertation Committee proposed a new guide for describing Career Paths, adapted to their wider career concept by expanding the 6 evaluation criteria into 9. For each criterion several levels were established tracing the expected evolution of job level profiles and personal competencies over their longer salary ranges. While providing more transparency to supervisors and staff, the Guide's official use would be by services responsible for vacancy notices, Career Path evaluations and rela...
Spanier, Graham B.; Glick, Paul C.
1980-01-01
Presents a demographic analysis of the paths to remarriage--the extent and timing of remarriage, social factors associated with remarriage, and the impact of the event which preceded remarriage (divorce or widowhood). (Author)
CSIR Research Space (South Africa)
Khuluse-Makhanya, Sibusisiwe A
2017-10-01
Full Text Available likelihood classification [34]. The training samples were the basis for signature development for the ML classifier and for determining NDVI thresholds for the bare ground, water and vegetation classes. In this step two classes for vegetation (grass...-BS) which are defined by mixture of bare ground and grass (typically degraded) or synthetic materials is derived in the aggregation step of the ensemble classifier. The ensemble classification output consists of aggregating output from three iterations...
Path planning in changeable environments
Nieuwenhuisen, D.
2007-01-01
This thesis addresses path planning in changeable environments. In contrast to traditional path planning that deals with static environments, in changeable environments objects are allowed to change their configurations over time. In many cases, path planning algorithms must facilitate quick
Sugimura, Natsuhiko; Furuya, Asami; Yatsu, Takahiro; Shibue, Toshimichi
2015-01-01
To provide a practical guideline for the selection of a mass spectrometer ion source, we compared the applicability of three types of ion source: direct analysis in real time (DART), electrospray ionization (ESI) and fast atom bombardment (FAB), using an in-house high-resolution mass spectrometry sample library consisting of approximately 600 compounds. The great majority of the compounds (92%), whose molecular weights (MWs) were broadly distributed between 150 and 1000, were detected using all the ion sources. Nevertheless, some compounds were not detected using specific ion sources. The use of FAB resulted in the highest sample detection rate (>98%), whereas the detection rates obtained using DART and ESI were slightly lower (>96%). A scattergram constructed using MW and topological polar surface area (tPSA) as a substitute for molecular polarity showed that the performance of ESI was weak in the low-MW (800) area. These results might provide guidelines for the selection of ion sources for inexperienced mass spectrometry users.
Ovis: A Framework for Visual Analysis of Ocean Forecast Ensembles.
Höllt, Thomas; Magdy, Ahmed; Zhan, Peng; Chen, Guoning; Gopalakrishnan, Ganesh; Hoteit, Ibrahim; Hansen, Charles D; Hadwiger, Markus
2014-08-01
We present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations of the sea surface height that is used in ocean forecasting. The position of eddies can be derived directly from the sea surface height and our visualization approach enables their interactive exploration and analysis.The behavior of eddies is important in different application settings of which we present two in this paper. First, we show an application for interactive planning of placement as well as operation of off-shore structures using real-world ensemble simulation data of the Gulf of Mexico. Off-shore structures, such as those used for oil exploration, are vulnerable to hazards caused by eddies, and the oil and gas industry relies on ocean forecasts for efficient operations. We enable analysis of the spatial domain, as well as the temporal evolution, for planning the placement and operation of structures.Eddies are also important for marine life. They transport water over large distances and with it also heat and other physical properties as well as biological organisms. In the second application we present the usefulness of our tool, which could be used for planning the paths of autonomous underwater vehicles, so called gliders, for marine scientists to study simulation data of the largely unexplored Red Sea.
Ling, Qing-Hua; Song, Yu-Qing; Han, Fei; Yang, Dan; Huang, De-Shuang
2016-01-01
For ensemble learning, how to select and how to combine the candidate classifiers are two key issues that dramatically influence the performance of the ensemble system. Random vector functional link (RVFL) networks without direct input-to-output links are suitable base classifiers for ensemble systems because of their fast learning speed, simple structure and good generalization performance. In this paper, to obtain a more compact ensemble system with improved convergence performance, an improved ensemble of RVFL based on attractive and repulsive particle swarm optimization (ARPSO) with a double optimization strategy is proposed. In the proposed method, ARPSO is applied to select and combine the candidate RVFLs. When selecting the optimal base RVFLs, ARPSO considers both the convergence accuracy on the validation data and the diversity of the candidate ensemble system. In the process of combining the RVFLs, the ensemble weights corresponding to the base RVFLs are initialized by the minimum-norm least-squares method and then further optimized by ARPSO. Finally, a few redundant RVFLs are pruned, and thus a more compact ensemble of RVFL is obtained. Moreover, theoretical analysis and justification of how to prune the base classifiers on classification problems are presented, and a simple and practically feasible strategy for pruning redundant base classifiers on both classification and regression problems is proposed. Since the double optimization is performed on the basis of the single optimization, the ensemble of RVFL built by the proposed method outperforms those built by single-optimization methods. Experimental results on function approximation and classification problems verify that the proposed method improves convergence accuracy while reducing the complexity of the ensemble system.
Jarzynski equality in the context of maximum path entropy
González, Diego; Davis, Sergio
2017-06-01
In the global framework of finding an axiomatic derivation of nonequilibrium statistical mechanics from fundamental principles, such as the maximum path entropy, also known as the Maximum Caliber principle, this work proposes an alternative derivation of the well-known Jarzynski equality, a nonequilibrium identity of great importance today due to its applications to irreversible processes: biological systems (protein folding), mechanical systems, among others. This equality relates the free energy difference between two equilibrium thermodynamic states to the work performed when going between those states, through an average over a path ensemble. In this work the analysis of Jarzynski's equality is performed using the formalism of inference over path space. This derivation highlights the wide generality of Jarzynski's original result, which could even be used in non-thermodynamical settings such as social, financial and ecological systems.
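The identity this abstract discusses, ΔF = -kT ln⟨exp(-W/kT)⟩ with the average taken over a path (work) ensemble, can be checked numerically on a toy model. For Gaussian-distributed work the identity gives ΔF = ⟨W⟩ - Var(W)/(2kT) analytically; the sketch below (not from the paper, parameters hypothetical) verifies the exponential-average estimator against that closed form.

```python
# Toy check of Jarzynski's equality, Delta F = -kT * ln < exp(-W / kT) >,
# using Gaussian work values, for which Delta F = <W> - Var(W) / (2 kT).
import math
import random

def jarzynski_free_energy(work_values, kT=1.0):
    """Exponential-average estimator of Delta F from nonequilibrium work."""
    n = len(work_values)
    avg = sum(math.exp(-w / kT) for w in work_values) / n
    return -kT * math.log(avg)

random.seed(0)
mean_w, sigma = 2.0, 0.5                     # hypothetical work distribution
work = [random.gauss(mean_w, sigma) for _ in range(200000)]
dF = jarzynski_free_energy(work)
exact = mean_w - sigma ** 2 / 2.0            # Gaussian closed form
```

Note the estimator is dominated by rare low-work trajectories, which is why dissipative processes (large Var(W)) need many samples, the practical motivation for the path-sampling approaches in this collection.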
Physiological evaluation of air-fed ensembles.
Turner, Nina L; Powell, Jeffrey B; Sinkule, Edward J; Novak, Debra A
2014-03-01
The goal of this study was to evaluate the respiratory and metabolic stresses of air-fed ensembles used by workers in the nuclear, chemical, and pharmaceutical industries during rest, low-, and moderate-intensity treadmill exercise. Fourteen men and six women wore two different air-fed ensembles (AFE-1 and AFE-2) and one two-piece supplied-air respirator (SA) at rest (REST) and while walking for 6 min at oxygen consumption (V.O2) rates of 1.0 (LOW) and 2.0 l min(-1) (MOD). Inhaled CO2 (FICO2), inhaled O2 (FIO2), pressure, and temperature were measured continuously breath-by-breath. For both LOW and MOD, FICO2 was significantly lower in both air-fed ensembles than in SA. Inhaled gas temperature was significantly lower in SA than in either air-fed ensemble at REST, LOW, and MOD, in both men and women, an observation that has implications for the design of emergency escape protocols for air-fed ensemble wearers. Results show that inhaled gas concentrations may reach physiologically stressful levels in air-fed ensembles during moderate-intensity treadmill walking.
Bayesian refinement of protein structures and ensembles against SAXS data using molecular dynamics.
Shevchuk, Roman; Hub, Jochen S
2017-10-01
Small-angle X-ray scattering (SAXS) is an increasingly popular technique used to detect protein structures and ensembles in solution. However, the refinement of structures and ensembles against SAXS data is often ambiguous due to the low information content of SAXS data, unknown systematic errors, and unknown scattering contributions from the solvent. We offer a solution to such problems by combining Bayesian inference with all-atom molecular dynamics simulations and explicit-solvent SAXS calculations. The Bayesian formulation correctly weights the SAXS data versus prior physical knowledge, it quantifies the precision or ambiguity of fitted structures and ensembles, and it accounts for unknown systematic errors due to poor buffer matching. The method further provides a probabilistic criterion for identifying the number of states required to explain the SAXS data. The method is validated by refining ensembles of a periplasmic binding protein against calculated SAXS curves. Subsequently, we derive the solution ensembles of the eukaryotic chaperone heat shock protein 90 (Hsp90) against experimental SAXS data. We find that the SAXS data of the apo state of Hsp90 is compatible with a single wide-open conformation, whereas the SAXS data of Hsp90 bound to ATP or to an ATP-analogue strongly suggest heterogeneous ensembles of a closed and a wide-open state.
Kant, A K
2003-02-01
This study examined the interaction between body mass index (BMI) and attempting to lose weight for reporting of: (1) macro- and micronutrient intake; (2) intake of low-nutrient-density foods; and (3) serum biomarkers of dietary exposure and cardiovascular disease risk. Dietary, anthropometric and biochemical data were from the third National Health and Nutrition Examination Survey (1988-1994), n=13 095. Multiple regression methods were used to examine the independent associations of BMI, trying to lose weight, or the interaction of BMI-trying to lose weight with reported intakes of energy, nutrients, percentage energy from low-nutrient-density foods (sweeteners, baked and dairy desserts, visible fats and salty snacks), and serum concentrations of vitamins, carotenoids and lipids. BMI was an independent positive predictor, and trying to lose weight an independent negative predictor, of several of these dietary and biomarker measures; few BMI-trying to lose weight interaction effects were noted. There was little evidence of increased nutritional risk in those reportedly trying to lose weight irrespective of weight status.
Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models
Simidjievski, Nikola; Todorovski, Ljupčo; Džeroski, Sašo
2016-01-01
Ensembles are a well established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting), significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase of the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles with sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to the one of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient. PMID:27078633
Multi-objective calibration of forecast ensembles using Bayesian model averaging
Vrugt, J.A.; Clark, M.P.; Diks, C.G.H.; Duan, Q.; Robinson, B.A.
2006-01-01
Bayesian Model Averaging (BMA) has recently been proposed as a method for statistical postprocessing of forecast ensembles from numerical weather prediction models. The BMA predictive probability density function (PDF) of any weather quantity of interest is a weighted average of PDFs centered on the
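The BMA predictive PDF described in this abstract, a weighted average of component PDFs, can be sketched as follows. The Gaussian kernel form and all numbers are assumptions for illustration, not the authors' postprocessing setup.

```python
# Minimal BMA sketch (assumed Gaussian kernels, hypothetical values): the
# predictive PDF is a weighted average of PDFs centered on member forecasts.
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bma_pdf(x, forecasts, weights, sigma):
    """Weighted mixture of Gaussians centered on the member forecasts."""
    return sum(w * gaussian_pdf(x, f, sigma) for f, w in zip(forecasts, weights))

forecasts = [18.0, 20.0, 23.0]   # hypothetical member forecasts (e.g. deg C)
weights = [0.2, 0.5, 0.3]        # BMA weights, summing to one
density = bma_pdf(20.0, forecasts, weights, sigma=1.5)
```

In real BMA the weights and kernel spread are fitted to training data (typically by maximum likelihood via EM), so that better members receive larger weights.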
Ensemble Feature Learning of Genomic Data Using Support Vector Machine.
Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R; Braytee, Ali; Kennedy, Paul J
2016-01-01
The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention but mostly on classification not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of RFE algorithm. The rationale behind this is, building ensemble SVM models using randomly drawn bootstrap samples from the training set, will produce different feature rankings which will be subsequently aggregated as one feature ranking. As a result, the decision for elimination of features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over random forest based approach. The selected genes by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD) which reveals significant clusters with the selected data.
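The ensemble-RFE idea in this abstract, bootstrap models producing feature rankings that are aggregated before each elimination step, can be sketched without the SVM machinery. In the sketch below an absolute-correlation score stands in for the SVM weight magnitudes (a deliberate simplification, not the paper's ESVM-RFE); the data in the usage example are synthetic.

```python
# Sketch of ensemble RFE with a stand-in scorer: each bootstrap model scores
# features by |correlation| with the label (substituting for SVM |w|); the
# per-bootstrap scores are aggregated and the worst feature is eliminated.
import random

def feature_scores(X, y, features):
    """|correlation| score per feature (stand-in for SVM weight magnitude)."""
    n = len(y)
    my = sum(y) / n
    scores = {}
    for j in features:
        col = [row[j] for row in X]
        mx = sum(col) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        vx = sum((a - mx) ** 2 for a in col) or 1e-12
        vy = sum((b - my) ** 2 for b in y) or 1e-12
        scores[j] = abs(cov) / (vx * vy) ** 0.5
    return scores

def ensemble_rfe(X, y, n_keep, n_boot=20, seed=1):
    rng = random.Random(seed)
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        totals = {j: 0.0 for j in features}
        for _ in range(n_boot):                      # bootstrap resamples
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
            for j, s in feature_scores(Xb, yb, features).items():
                totals[j] += s                       # aggregate rankings
        features.remove(min(features, key=lambda j: totals[j]))
    return features
```

Basing each elimination on many bootstrap models rather than one is the point: the decision reflects an aggregated ranking, and resampling can also rebalance skewed class distributions.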
Fluctuations in a quasi-stationary shallow cumulus cloud ensemble
Directory of Open Access Journals (Sweden)
M. Sakradzija
2015-01-01
Full Text Available We propose an approach to stochastic parameterisation of shallow cumulus clouds to represent the convective variability and its dependence on the model resolution. To collect information about the individual cloud lifecycles and the cloud ensemble as a whole, we employ a large eddy simulation (LES) model and a cloud tracking algorithm, followed by conditional sampling of clouds at the cloud-base level. In the case of a shallow cumulus ensemble, the cloud-base mass flux distribution is bimodal, due to the different shallow cloud subtypes, active and passive clouds. Each distribution mode can be approximated using a Weibull distribution, which is a generalisation of the exponential distribution by accounting for the change in distribution shape due to the diversity of cloud lifecycles. The exponential distribution of cloud mass flux previously suggested for deep convection parameterisation is a special case of the Weibull distribution, which opens a way towards unification of the statistical convective ensemble formalism of shallow and deep cumulus clouds. Based on the empirical and theoretical findings, a stochastic model has been developed to simulate a shallow convective cloud ensemble. It is formulated as a compound random process, with the number of convective elements drawn from a Poisson distribution, and the cloud mass flux sampled from a mixed Weibull distribution. Convective memory is accounted for through the explicit cloud lifecycles, making the model formulation consistent with the choice of the Weibull cloud mass flux distribution function. The memory of individual shallow clouds is required to capture the correct convective variability. The resulting distribution of the subgrid convective states in the considered shallow cumulus case is scale-adaptive – the smaller the grid size, the broader the distribution.
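The compound random process described in this abstract can be sketched directly: draw the number of convective elements from a Poisson distribution, then draw each cloud-base mass flux from a two-component (active/passive) Weibull mixture. All parameter values below are illustrative placeholders, not taken from the paper.

```python
# Sketch of the compound Poisson / mixed-Weibull cloud ensemble model.
# Parameters (rate, mixture fraction, Weibull scales and shapes) are
# hypothetical, chosen only to illustrate the sampling structure.
import math
import random

def sample_poisson(lam, rng):
    """Knuth's product method; adequate for the modest rates used here."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_cloud_ensemble(lam, rng, p_active=0.6,
                          active=(2.0e7, 1.2), passive=(2.0e5, 0.8)):
    """One realisation: (number of clouds, total cloud-base mass flux)."""
    n_clouds = sample_poisson(lam, rng)
    total = 0.0
    for _ in range(n_clouds):
        scale, shape = active if rng.random() < p_active else passive
        total += rng.weibullvariate(scale, shape)  # Weibull(alpha, beta)
    return n_clouds, total

rng = random.Random(42)
n, mass_flux = sample_cloud_ensemble(lam=30.0, rng=rng)
```

A shape parameter of 1 recovers the exponential distribution mentioned in the abstract, so the mixture degenerates to the classical deep-convection form in that limit.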
Atomic clock ensemble in space
Cacciapuoti, L.; Salomon, C.
2011-12-01
Atomic Clock Ensemble in Space (ACES) is a mission using high-performance clocks and links to test fundamental laws of physics in space. Operated in the microgravity environment of the International Space Station, the ACES clocks, PHARAO and SHM, will generate a frequency reference reaching instability and inaccuracy at the 1 · 10^-16 level. A link in the microwave domain (MWL) and an optical link (ELT) will make the ACES clock signal available to ground laboratories equipped with atomic clocks. Space-to-ground and ground-to-ground comparisons of atomic frequency standards will be used to test Einstein's theory of general relativity including a precision measurement of the gravitational red-shift, a search for time variations of fundamental constants, and Lorentz Invariance tests. Applications in geodesy, optical time transfer, and ranging will also be supported. ACES has now reached an advanced technology maturity, with engineering models completed and successfully tested and flight hardware under development. This paper presents the ACES mission concept and the status of its main instruments.
LENUS (Irish Health Repository)
DiFranco, Matthew D
2011-01-01
We present a tile-based approach for producing clinically relevant probability maps of prostatic carcinoma in histological sections from radical prostatectomy. Our methodology incorporates ensemble learning for feature selection and classification on expert-annotated images. Random forest feature selection performed over varying training sets provides a subset of generalized CIE L*a*b* co-occurrence texture features, while sample selection strategies with minimal constraints reduce training data requirements to achieve reliable results. Ensembles of classifiers are built using expert-annotated tiles from training images, and scores for the probability of cancer presence are calculated from the responses of each classifier in the ensemble. Spatial filtering of tile-based texture features prior to classification results in increased heat-map coherence as well as AUC values of 95% using ensembles of either random forests or support vector machines. Our approach is designed for adaptation to different imaging modalities, image features, and histological decision domains.
An estimate of the inflation factor and analysis sensitivity in the ensemble Kalman filter
Directory of Open Access Journals (Sweden)
G. Wu
2017-07-01
Full Text Available The ensemble Kalman filter (EnKF) is a widely used ensemble-based assimilation method, which estimates the forecast error covariance matrix using a Monte Carlo approach that involves an ensemble of short-term forecasts. While the accuracy of the forecast error covariance matrix is crucial for achieving accurate forecasts, the estimate given by the EnKF needs to be improved using inflation techniques. Otherwise, the sampling covariance matrix of perturbed forecast states will underestimate the true forecast error covariance matrix because of the limited ensemble size and large model errors, which may eventually result in the divergence of the filter. In this study, the forecast error covariance inflation factor is estimated using a generalized cross-validation technique. The improved EnKF assimilation scheme is tested on the atmosphere-like Lorenz-96 model with spatially correlated observations, and is shown to reduce the analysis error and increase its sensitivity to the observations.
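The multiplicative inflation mentioned in this abstract has a simple mechanical core: scaling ensemble perturbations about the mean by sqrt(λ) multiplies the sample (co)variance by λ. The scalar sketch below fixes λ by hand purely for illustration; the paper's contribution is estimating it by generalized cross-validation.

```python
# Minimal sketch of multiplicative covariance inflation in an EnKF, shown on
# a scalar ensemble. The inflation factor lam is fixed here for illustration;
# the paper estimates it from the data via generalized cross-validation.

def inflate(ensemble, lam):
    """Scale perturbations about the ensemble mean by sqrt(lam)."""
    n = len(ensemble)
    mean = sum(ensemble) / n
    return [mean + (x - mean) * lam ** 0.5 for x in ensemble]

def sample_variance(ensemble):
    n = len(ensemble)
    mean = sum(ensemble) / n
    return sum((x - mean) ** 2 for x in ensemble) / (n - 1)

members = [1.0, 2.0, 2.5, 3.5]      # a (scalar) forecast ensemble
inflated = inflate(members, lam=1.5)
```

The mean is untouched while the spread grows, which is exactly the correction needed when a small ensemble undersamples the true forecast error covariance.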
Using multiple travel paths to estimate daily travel distance in arboreal, group-living primates.
Steel, Ruth Irene
2015-01-01
Primate field studies often estimate daily travel distance (DTD) in order to estimate energy expenditure and/or test foraging hypotheses. In group-living species, the center of mass (CM) method is traditionally used to measure DTD; a point is marked at the group's perceived center of mass at a set time interval or upon each move, and the distance between consecutive points is measured and summed. However, for groups using multiple travel paths, the CM method potentially creates a central path that is shorter than the individual paths and/or traverses unused areas. These problems may compromise tests of foraging hypotheses, since distance and energy expenditure could be underestimated. To better understand the magnitude of these potential biases, I designed and tested the multiple travel paths (MTP) method, in which DTD was calculated by recording all travel paths taken by the group's members, weighting each path's distance based on its proportional use by the group, and summing the weighted distances. To compare the MTP and CM methods, DTD was calculated using both methods in three groups of Udzungwa red colobus monkeys (Procolobus gordonorum; group size 30-43) for a random sample of 30 days between May 2009 and March 2010. Compared to the CM method, the MTP method provided significantly longer estimates of DTD that were more representative of the actual distance traveled and the areas used by a group. The MTP method is more time-intensive and requires multiple observers compared to the CM method. However, it provides greater accuracy for testing ecological and foraging models.
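The MTP calculation described above reduces to a weighted sum of path distances; a minimal sketch, with hypothetical distances and group splits:

```python
def mtp_distance(paths):
    """MTP sketch: weight each path's distance by the fraction of the
    group that used it, then sum (the path data here are hypothetical)."""
    total = sum(n for _, n in paths)
    return sum(dist * n / total for dist, n in paths)

# e.g., 30 monkeys splitting over a 500 m and an 800 m path
print(mtp_distance([(500.0, 20), (800.0, 10)]))  # 600.0
```

A CM estimate for the same movement could fall anywhere between the two path lengths, or shorter, if the central track cuts through unused terrain.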
Force sensor based tool condition monitoring using a heterogeneous ensemble learning model.
Wang, Guofeng; Yang, Yinwei; Li, Zhimeng
2014-11-14
Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability.
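The stacking idea, base classifiers' outputs feeding a meta-learner, can be illustrated with a toy sketch. The base scores below stand in for the SVM/HMM/RBF outputs, and the least-squares meta-learner is an assumption for illustration, not the paper's meta-model.

```python
import numpy as np

# Rows of Z_train: three base classifiers' scores for one sample
# (hypothetical stand-ins for the SVM, HMM and RBF outputs).
Z_train = np.array([[0.9, 0.8, 0.6],
                    [0.2, 0.4, 0.1],
                    [0.8, 0.7, 0.9],
                    [0.1, 0.3, 0.2]])
y_train = np.array([1.0, 0.0, 1.0, 0.0])   # tool wear states (binary toy case)

# Stacking meta-learner: least-squares weights over the base outputs.
w, *_ = np.linalg.lstsq(Z_train, y_train, rcond=None)

def stack_predict(z):
    """Combine base-classifier scores z with the learned meta-weights."""
    return int(z @ w > 0.5)

print(stack_predict(np.array([0.85, 0.75, 0.8])))  # 1
```

The point of the heterogeneous design is that the meta-learner can discount a base classifier that is systematically unreliable for certain wear states.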
Ensemble Machine Learning Methods and Applications
Ma, Yunqian
2012-01-01
It is common wisdom that gathering a variety of views and inputs improves the process of decision making, and, indeed, underpins a democratic society. Dubbed “ensemble learning” by researchers in computational intelligence and machine learning, it is known to improve a decision system’s robustness and accuracy. Now, fresh developments are allowing researchers to unleash the power of ensemble learning in an increasing range of real-world applications. Ensemble learning algorithms such as “boosting” and “random forest” facilitate solutions to key computational issues such as face detection and are now being applied in areas as diverse as object tracking and bioinformatics. Responding to a shortage of literature dedicated to the topic, this volume offers comprehensive coverage of state-of-the-art ensemble learning techniques, including various contributions from researchers in leading industrial research labs. At once a solid theoretical study and a practical guide, the volume is a windfall for r...
Genetic Algorithm Optimized Neural Networks Ensemble as ...
African Journals Online (AJOL)
NJD
Genetic Algorithm Optimized Neural Networks Ensemble as Calibration Model for Simultaneous Spectrophotometric Estimation of Atenolol and Losartan Potassium in Tablets. Dondeti Satyanarayana*, Kamarajan Kannan and Rajappan Manavalan. Department of Pharmacy, Annamalai University, Annamalainagar, Tamil ...
Abrams, Gene; Siles Molina, Mercedes
2017-01-01
This book offers a comprehensive introduction by three of the leading experts in the field, collecting fundamental results and open problems in a single volume. Since Leavitt path algebras were first defined in 2005, interest in these algebras has grown substantially, with ring theorists as well as researchers working in graph C*-algebras, group theory and symbolic dynamics attracted to the topic. Providing a historical perspective on the subject, the authors review existing arguments, establish new results, and outline the major themes and ring-theoretic concepts, such as the ideal structure, Z-grading and the close link between Leavitt path algebras and graph C*-algebras. The book also presents key lines of current research, including the Algebraic Kirchberg Phillips Question, various additional classification questions, and connections to noncommutative algebraic geometry. Leavitt Path Algebras will appeal to graduate students and researchers working in the field and related areas, such as C*-algebras and...
Towards a GME ensemble forecasting system: Ensemble initialization using the breeding technique
Directory of Open Access Journals (Sweden)
Jan D. Keller
2008-12-01
Full Text Available The quantitative forecast of precipitation requires a probabilistic background, particularly with regard to forecast lead times of more than 3 days. As only ensemble simulations can provide useful information on the underlying probability density function, we built a new ensemble forecasting system (GME-EFS) based on the GME model of the German Meteorological Service (DWD). For the generation of appropriate initial ensemble perturbations we chose the breeding technique developed by Toth and Kalnay (1993, 1997), which develops perturbations by estimating the regions of largest model-error-induced uncertainty. This method is applied and tested in the framework of quasi-operational forecasts for a three-month period in 2007. The performance of the resulting ensemble forecasts is compared to the operational ensemble prediction systems ECMWF EPS and NCEP GFS by means of ensemble spread of free atmosphere parameters (geopotential and temperature) and ensemble skill of precipitation forecasting. This comparison indicates that the GME ensemble forecasting system (GME-EFS) provides reasonable forecasts with a spread skill score comparable to that of the NCEP GFS. An analysis with the continuous ranked probability score exhibits a lack of resolution for the GME forecasts compared to the operational ensembles. However, with significant enhancements during the 3-month test period, the first results of our work with the GME-EFS indicate possibilities for further development as well as the potential for later operational usage.
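The breeding cycle of Toth and Kalnay can be sketched generically: grow a perturbation through the model alongside a control run, then rescale the grown difference back to a fixed size each cycle. The toy map below is a stand-in for the forecast model, and the sizes are arbitrary.

```python
import numpy as np

def breed(step, x0, pert0, n_cycles, size):
    """Breeding sketch: run control and perturbed forecasts, and each
    cycle rescale the grown difference (the bred vector) to `size`."""
    x, p = x0.copy(), pert0.copy()
    for _ in range(n_cycles):
        x_next = step(x)
        bv = step(x + p) - x_next        # perturbation grown by the model
        p = size * bv / np.linalg.norm(bv)
        x = x_next
    return p

step = lambda x: np.sin(2.0 * x)          # toy nonlinear "model" (assumption)
rng = np.random.default_rng(1)
bv = breed(step, np.ones(5), 1e-3 * rng.normal(size=5), 20, 0.01)
print(np.isclose(np.linalg.norm(bv), 0.01))  # rescaled to the target size
```

After enough cycles the bred vector aligns with the fastest-growing error directions of the flow, which is what makes it a cheap source of initial ensemble perturbations.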
Directory of Open Access Journals (Sweden)
Camillo Constantini
2003-10-01
Full Text Available We prove that the hyperspace of closed bounded sets with the Hausdorff topology, over an almost convex metric space, is an absolute retract. Dense subspaces of normed linear spaces are examples of, not necessarily connected, almost convex metric spaces. We give some necessary conditions for the path-wise connectedness of the Hausdorff metric topology on closed bounded sets. Finally, we describe properties of a separable metric space, under which its hyperspace with the Wijsman topology is path-wise connected.
DEFF Research Database (Denmark)
Garud, Raghu; Karnøe, Peter
the place of agency in these theories that take history so seriously. In the end, they are as interested in path creation and destruction as they are in path dependence. This book is compiled of both theoretical and empirical writing. It shows relatively well-known industries such as the automobile, biotechnology and semi-conductor industries in a new light. It also invites the reader to learn more about medical practices, wind power, lasers and synthesizers. Primarily for academicians, researchers and PhD students in fields related to technology management, this book is a research-oriented textbook...
Ensemble Prediction Model with Expert Selection for Electricity Price Forecasting
Directory of Open Access Journals (Sweden)
Bijay Neupane
2017-01-01
Full Text Available Forecasting of electricity prices is important in deregulated electricity markets for all of the stakeholders: energy wholesalers, traders, retailers and consumers. Electricity price forecasting is an inherently difficult problem due to its special characteristics of dynamicity and non-stationarity. In this paper, we present a robust price forecasting mechanism that shows resilience towards the aggregate demand response effect and provides highly accurate forecasted electricity prices to the stakeholders in a dynamic environment. We employ an ensemble prediction model in which a group of different algorithms participates in forecasting the price 1 h ahead for each hour of a day. We propose two different strategies, namely, the Fixed Weight Method (FWM) and the Varying Weight Method (VWM), for selecting each hour’s expert algorithm from the set of participating algorithms. In addition, we utilize a carefully engineered set of features selected from a pool of features extracted from the past electricity price data, weather data and calendar data. The proposed ensemble model offers better results than the Autoregressive Integrated Moving Average (ARIMA) method, the Pattern Sequence-based Forecasting (PSF) method and our previous work using Artificial Neural Networks (ANN) alone on the datasets for the New York, Australian and Spanish electricity markets.
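The Fixed Weight Method amounts to choosing, once from validation data, the best expert per hour of day; a minimal sketch with hypothetical validation errors (the algorithm labels in the comments are illustrative only):

```python
import numpy as np

# val_err[a, h]: validation error of algorithm a at hour-of-day h
# (hypothetical numbers; the participating algorithms are assumed)
val_err = np.array([[1.2, 0.8, 1.1],    # e.g., ARIMA
                    [0.9, 1.0, 1.4],    # e.g., PSF
                    [1.0, 1.3, 0.7]])   # e.g., ANN

# FWM: fix, once and for all, the best expert for each hour.
expert_for_hour = val_err.argmin(axis=0)
print(expert_for_hour.tolist())  # [1, 0, 2]
```

The VWM would instead re-evaluate these per-hour choices as new error observations arrive, rather than fixing them from a single validation pass.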
Conductor gestures influence evaluations of ensemble performance.
Morrison, Steven J; Price, Harry E; Smedley, Eric M; Meals, Cory D
2014-01-01
Previous research has found that listener evaluations of ensemble performances vary depending on the expressivity of the conductor's gestures, even when performances are otherwise identical. It was the purpose of the present study to test whether this effect of visual information was evident in the evaluation of specific aspects of ensemble performance: articulation and dynamics. We constructed a set of 32 music performances that combined auditory and visual information and were designed to feature a high degree of contrast along one of two target characteristics: articulation and dynamics. We paired each of four music excerpts recorded by a chamber ensemble in both a high- and low-contrast condition with video of four conductors demonstrating high- and low-contrast gesture specifically appropriate to either articulation or dynamics. Using one of two equivalent test forms, college music majors and non-majors (N = 285) viewed sixteen 30 s performances and evaluated the quality of the ensemble's articulation, dynamics, technique, and tempo along with overall expressivity. Results showed significantly higher evaluations for performances featuring high rather than low conducting expressivity regardless of the ensemble's performance quality. Evaluations for both articulation and dynamics were strongly and positively correlated with evaluations of overall ensemble expressivity.
Ensemble forecasts of road surface temperatures
Sokol, Zbyněk; Bližňák, Vojtěch; Sedlák, Pavel; Zacharov, Petr; Pešice, Petr; Škuthan, Miroslav
2017-05-01
This paper describes a new ensemble technique for road surface temperature (RST) forecasting using an energy balance and heat conduction model. Compared to currently used deterministic forecasts, the proposed technique allows the estimation of forecast uncertainty and probabilistic forecasts. The ensemble technique is applied to the METRo-CZ model and stems from error covariance analyses of the forecasted air temperature and humidity 2 m above the ground, wind speed at 10 m and total cloud cover N in octas by the numerical weather prediction (NWP) model. N is used to estimate the shortwave and longwave radiation fluxes. These variables are used to calculate the boundary conditions in the METRo-CZ model. We found that the variable N is crucial for generating the ensembles. Nevertheless, the ensemble spread is too small and underestimates the uncertainty in the RST forecast. One reason is that errors in the rain and snow forecasts from the NWP model are not considered when generating the ensembles. Technical issues, such as incorrect sky view factors and the current state of road surface conditions, also contribute to errors. Although the ensemble technique underestimates the uncertainty in the RST forecasts, it provides additional information to road authorities who provide winter road maintenance.
Directory of Open Access Journals (Sweden)
André Luiz Galo
2009-01-01
Full Text Available We describe the design and tests of a set-up mounted in a conventional double-beam spectrophotometer, which allows the determination of the optical density of samples confined in a long liquid core waveguide (LCW) capillary. A very long optical path length can be achieved with the capillary cell, allowing measurements of samples with very low optical densities. The device uses a custom optical concentrator optically coupled to the LCW (Teflon® AF). Optical density measurements, carried out using an LCW of ~45 cm, were in accordance with the Beer-Lambert law. Thus, it was possible to analyze quantitatively samples at concentrations 45-fold lower than those regularly used in spectrophotometric measurements.
An approach to localization for ensemble-based data assimilation.
Wang, Bin; Liu, Juanjuan; Liu, Li; Xu, Shiming; Huang, Wenyu
2018-01-01
Localization techniques are commonly used in ensemble-based data assimilation (e.g., the Ensemble Kalman Filter (EnKF) method) because of insufficient ensemble samples. They can effectively ameliorate the spurious long-range correlations between the background and observations. However, localization is very expensive when the problem to be solved is of high dimension (say 10^6 or higher) for assimilating observations simultaneously. To reduce the cost of localization for high-dimension problems, an approach is proposed in this paper, which approximately expands the correlation function of the localization matrix using a limited number of principal eigenvectors so that the Schur product between the localization matrix and a high-dimension covariance matrix is reduced to the sum of a series of Schur products between two simple vectors. These eigenvectors are actually sine functions with different periods and phases. Numerical experiments show that when the number of principal eigenvectors used reaches 20, the approximate expansion of the correlation function is very close to the exact one in the one-dimensional (1D) and two-dimensional (2D) cases. The new approach is then applied to localization in the EnKF method, and its performance is evaluated in assimilation-cycle experiments with the Lorenz-96 model and single assimilation experiments using a barotropic shallow water model. The results suggest that the approach is feasible in providing comparable assimilation analysis with far less cost.
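The eigenvector expansion described above works because a Schur product with a rank-one term v vᵀ acts on a vector as diag(v) P diag(v), so the localized matrix never has to be formed. A numerical sketch, in which a Gaussian correlation matrix and a random covariance are stand-ins for the paper's localization and background matrices:

```python
import numpy as np

n = 200
d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
L = np.exp(-(d / 20.0) ** 2)         # smooth localization (correlation) matrix

vals, vecs = np.linalg.eigh(L)        # principal eigenpairs of L
k = 40
lam, V = vals[-k:], vecs[:, -k:]

rng = np.random.default_rng(2)
A = rng.normal(size=(n, n))
P = A @ A.T / n                       # stand-in high-dimension covariance
x = np.ones(n)

exact = (L * P) @ x                   # explicit Schur product (expensive path)
# (v v^T) o P applied to x equals v * (P @ (v * x)): no Schur product formed
approx = sum(l * (v * (P @ (v * x))) for l, v in zip(lam, V.T))
rel = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(rel < 1e-4)  # a modest number of modes captures this smooth L
```

In the EnKF setting P is only available through ensemble perturbations, so each term above costs a few matrix-vector products instead of an n-by-n elementwise product.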
Local Ensemble Kalman Particle Filters for efficient data assimilation
Robert, Sylvain
2016-01-01
Ensemble methods such as the Ensemble Kalman Filter (EnKF) are widely used for data assimilation in large-scale geophysical applications, as for example in numerical weather prediction (NWP). There is a growing interest for physical models with higher and higher resolution, which brings new challenges for data assimilation techniques because of the presence of non-linear and non-Gaussian features that are not adequately treated by the EnKF. We propose two new localized algorithms based on the Ensemble Kalman Particle Filter (EnKPF), a hybrid method combining the EnKF and the Particle Filter (PF) in a way that maintains scalability and sample diversity. Localization is a key element of the success of EnKFs in practice, but it is much more challenging to apply to PFs. The algorithms that we introduce in the present paper provide a compromise between the EnKF and the PF while avoiding some of the problems of localization for pure PFs. Numerical experiments with a simplified model of cumulus convection based on a...
Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.
2007-01-01
To improve the accuracy in prediction, a Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting the carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.
Disease-associated mutations that alter the RNA structural ensemble.
Directory of Open Access Journals (Sweden)
Matthew Halvorsen
2010-08-01
Full Text Available Genome-wide association studies (GWAS) often identify disease-associated mutations in intergenic and non-coding regions of the genome. Given the high percentage of the human genome that is transcribed, we postulate that for some observed associations the disease phenotype is caused by a structural rearrangement in a regulatory region of the RNA transcript. To identify such mutations, we have performed a genome-wide analysis of all known disease-associated Single Nucleotide Polymorphisms (SNPs) from the Human Gene Mutation Database (HGMD) that map to the untranslated regions (UTRs) of a gene. Rather than using minimum free energy approaches (e.g., mFold), we use a partition function calculation that takes into consideration the ensemble of possible RNA conformations for a given sequence. We identified in the human genome disease-associated SNPs that significantly alter the global conformation of the UTR to which they map. For six disease states (Hyperferritinemia Cataract Syndrome, beta-Thalassemia, Cartilage-Hair Hypoplasia, Retinoblastoma, Chronic Obstructive Pulmonary Disease (COPD), and Hypertension), we identified multiple SNPs in UTRs that alter the mRNA structural ensemble of the associated genes. Using a Boltzmann sampling procedure for sub-optimal RNA structures, we are able to characterize and visualize the nature of the conformational changes induced by the disease-associated mutations in the structural ensemble. We observe in several cases (specifically the 5' UTRs of FTL and RB1) SNP-induced conformational changes analogous to those observed in bacterial regulatory riboswitches when specific ligands bind. We propose that the UTR and SNP combinations we identify constitute a "RiboSNitch," that is, a regulatory RNA in which a specific SNP has a structural consequence that results in a disease phenotype. Our SNPfold algorithm can help identify RiboSNitches by leveraging GWAS data and an analysis of the mRNA structural ensemble.
Polynomial Chaos Based Acoustic Uncertainty Predictions from Ocean Forecast Ensembles
Dennis, S.
2016-02-01
Most significant ocean acoustic propagation occurs over tens of kilometers, at scales small compared to ocean basins and to most fine-scale ocean modeling. To address the increased emphasis on uncertainty quantification, for example transmission loss (TL) probability density functions (PDF) within some radius, a polynomial chaos (PC) based method is utilized. In order to capture uncertainty in ocean modeling, the Navy Coastal Ocean Model (NCOM) now includes ensembles distributed to reflect the ocean analysis statistics. Since the ensembles are included in the data assimilation for the new forecast ensembles, the acoustic modeling uses the ensemble predictions in a similar fashion for creating the sound speed distribution over an acoustically relevant domain. Within an acoustic domain, singular value decomposition over the combined time-space structure of the sound speeds can be used to create Karhunen-Loève expansions of sound speed, subject to multivariate normality testing. These sound speed expansions serve as a basis for Hermite polynomial chaos expansions of derived quantities, in particular TL. The PC expansion coefficients result from so-called non-intrusive methods, involving evaluation of TL at multi-dimensional Gauss-Hermite quadrature collocation points. Traditional TL calculation from standard acoustic propagation modeling could be prohibitively time consuming at all multi-dimensional collocation points. This method employs Smolyak order and gridding methods to allow adaptive sub-sampling of the collocation points to determine only the most significant PC expansion coefficients to within a preset tolerance. Practically, the Smolyak order and grid sizes grow only polynomially in the number of Karhunen-Loève terms, alleviating the curse of dimensionality. The resulting TL PC coefficients allow the determination of TL PDF normality and its mean and standard deviation. In the non-normal case, PC Monte Carlo methods are used to rapidly establish the PDF. This work was
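The Karhunen-Loève expansion via singular value decomposition mentioned above can be sketched on stand-in data; random fields below replace the NCOM sound-speed ensembles, and the mode count is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
fields = rng.normal(size=(30, 500))       # 30 members x 500 grid points (stand-in)
mean = fields.mean(axis=0)
U, s, Vt = np.linalg.svd(fields - mean, full_matrices=False)

k = 10                                     # truncate to leading KL modes
recon_k = mean + (U[:, :k] * s[:k]) @ Vt[:k]
energy = (s[:k] ** 2).sum() / (s ** 2).sum()   # variance captured by k modes

recon_full = mean + (U * s) @ Vt           # all modes reproduce the data exactly
print(np.allclose(recon_full, fields))
```

For correlated oceanographic fields the singular values decay quickly, so a small k captures most of the variance; the retained mode amplitudes are then the random inputs to the Hermite PC expansion.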
Directory of Open Access Journals (Sweden)
Peter Juhasz
2017-03-01
Full Text Available While risk management has gained popularity during the last decades, even some of the basic risk types are still far out of focus. One of these is path dependency, which refers to the uncertainty of how we reach a certain level of total performance over time. While decision makers are careful in assessing what their position will look like at the end of certain periods, little attention is given to how they will get there through the period. The uncertainty of how a process will develop across a shorter period of time is often “eliminated” by simply choosing a longer planning time interval, which makes path dependency one of the most often overlooked business risk types. After reviewing the origin of the problem, we propose and compare seven risk measures to assess path dependency. Traditional risk measures like the standard deviation of sub-period cash flows fail to capture this risk type. We conclude that in most cases considering the distribution of the expected cash flow effect caused by the path dependency may offer the best method, but we may need to use several measures at the same time to include all the optimisation limits of the given firm.
McGarvey, Lynn M.; Sterenberg, Gladys Y.; Long, Julie S.
2013-01-01
The authors elucidate what they saw as three important challenges to overcome along the path to becoming elementary school mathematics teacher leaders: marginal interest in math, low self-confidence, and teaching in isolation. To illustrate how these challenges were mitigated, they focus on the stories of two elementary school teachers--Laura and…
Bill, R. C.; Johnson, R. D. (Inventor)
1979-01-01
A gas path seal suitable for use with a turbine engine or compressor is described. A shroud, wearable or abradable by the abrasion of the rotor blades of the turbine or compressor, surrounds the rotor blades. A compliant backing surrounds the shroud. The backing is a yieldingly deformable porous material covered with a thin ductile layer. A mounting fixture surrounds the backing.
DEFF Research Database (Denmark)
Schürmann, Carsten; Sarnat, Jeffrey
2009-01-01
an induction principle that combines the comfort of structural induction with the expressive strength of transfinite induction. Using lexicographic path induction, we give a consistency proof of Martin-Löf’s intuitionistic theory of inductive definitions. The consistency of Heyting arithmetic follows directly...
Taylor, Patrick C.; Baker, Noel C.
2015-01-01
Earth's climate is changing and will continue to change into the foreseeable future. Expected changes in the climatological distribution of precipitation, surface temperature, and surface solar radiation will significantly impact agriculture. Adaptation strategies are, therefore, required to reduce the agricultural impacts of climate change. Climate change projections of precipitation, surface temperature, and surface solar radiation distributions are necessary input for adaptation planning studies. These projections are conventionally constructed from an ensemble of climate model simulations (e.g., the Coupled Model Intercomparison Project 5 (CMIP5)) as an equal weighted average, one model one vote. Each climate model, however, represents the array of climate-relevant physical processes with varying degrees of fidelity, influencing the projection of individual climate variables differently. Presented here is a new approach, termed the "Intelligent Ensemble," that constructs climate variable projections by weighting each model according to its ability to represent key physical processes, e.g., the precipitation probability distribution. This approach provides added value over the equal weighted average method. Physical process metrics applied in the "Intelligent Ensemble" method are created using a combination of NASA and NOAA satellite and surface-based cloud, radiation, temperature, and precipitation data sets. The "Intelligent Ensemble" method is applied to the RCP4.5 and RCP8.5 anthropogenic climate forcing simulations within the CMIP5 archive to develop a set of climate change scenarios for precipitation, temperature, and surface solar radiation in each USDA Farm Resource Region for use in climate change adaptation studies.
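The weighting scheme can be illustrated in a few lines; the skill scores and projected changes below are hypothetical:

```python
import numpy as np

skill = np.array([0.9, 0.5, 0.7, 0.3])  # per-model process-fidelity scores
proj = np.array([2.1, 3.0, 2.4, 3.5])   # per-model projected change (same units)

equal = proj.mean()                      # "one model, one vote"
w = skill / skill.sum()                  # skill-based weights
weighted = w @ proj                      # "Intelligent Ensemble"-style average
print(round(float(equal), 2), round(float(weighted), 2))  # 2.75 2.55
```

Here the two low-fidelity models happen to project the largest changes, so the skill-weighted estimate is pulled below the equal-weighted one.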
Graphs and matroids weighted in a bounded incline algebra.
Lu, Ling-Xia; Zhang, Bei
2014-01-01
Firstly, for a graph weighted in a bounded incline algebra (or called a dioid), a longest path problem (LPP, for short) is presented, which can be considered a uniform approach to the famous shortest path problem, the widest path problem, and the most reliable path problem. The solutions for LPP and related algorithms are given. Secondly, for a matroid weighted in a linear matroid, the maximum independent set problem is studied.
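The uniform treatment of the three classical problems can be made concrete: a Floyd-Warshall-style solver parameterized by the two semiring operations covers all of them. This is a generic sketch, not the paper's incline-algebra formulation; the edge data are arbitrary.

```python
import math

def best_paths(adj, n, plus, times, zero, one):
    """All-pairs path solver over a generic semiring (plus, times):
    shortest path  -> plus=min, times=+,   zero=inf, one=0
    widest path    -> plus=max, times=min, zero=0,   one=inf
    most reliable  -> plus=max, times=*,   zero=0,   one=1
    """
    D = [[adj.get((i, j), zero) if i != j else one for j in range(n)]
         for i in range(n)]
    for k in range(n):                    # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                D[i][j] = plus(D[i][j], times(D[i][k], D[k][j]))
    return D

edges = {(0, 1): 2.0, (1, 2): 3.0, (0, 2): 7.0}
shortest = best_paths(edges, 3, min, lambda a, b: a + b, math.inf, 0.0)
widest = best_paths(edges, 3, max, min, 0.0, math.inf)
print(shortest[0][2], widest[0][2])  # 5.0 7.0
```

The same graph illustrates why the algebra matters: the two-hop route wins under (min, +) but the direct edge wins under (max, min).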
Jenness, Samuel M; Neaigus, Alan; Murrill, Christopher S; Gelpi-Acosta, Camila; Wendel, Travis; Hagan, Holly
2011-01-01
We investigated the impact of recruitment bias within the venue-based sampling (VBS) method, which is widely used to estimate disease prevalence and risk factors among groups, such as men who have sex with men (MSM), that congregate at social venues. In a 2008 VBS study of 479 MSM in New York City, we calculated venue-specific approach rates (MSM approached/MSM counted) and response rates (MSM interviewed/MSM approached), and then compared crude estimates of HIV risk factors and seroprevalence with estimates weighted to address the lower selection probabilities of MSM who attend social venues infrequently or were recruited at high-volume venues. Our approach rates were lowest at dance clubs, gay pride events, and public sex strolls, where venue volumes were highest; response rates ranged from 39% at gay pride events to 95% at community-based organizations. Sixty-seven percent of respondents attended MSM-oriented social venues at least weekly, and 21% attended such events once a month or less often in the past year. In estimates adjusted for these variations, the prevalence of several past-year risk factors (e.g., unprotected anal intercourse with casual/exchange partners, ≥5 total partners, group sex encounters, at least weekly binge drinking, and hard-drug use) was significantly lower compared with crude estimates. Adjusted HIV prevalence was lower than unadjusted prevalence (15% vs. 18%), but not significantly. Not adjusting VBS data for recruitment biases could overestimate HIV risk and prevalence when the selection probability is greater for higher-risk MSM. While further examination of recruitment-adjustment methods for VBS data is needed, presentation of both unadjusted and adjusted estimates is currently indicated.
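The adjustment amounts to inverse-probability weighting: a respondent's weight is inversely proportional to his chance of being sampled. A toy sketch, with attendance frequency and venue volume as hypothetical proxies for selection probability and made-up respondents:

```python
# Inverse-probability weighting sketch: selection probability is taken
# proportional to venue attendance frequency times venue volume
# (both hypothetical proxies, as are the respondents below).
respondents = [
    {"hiv": 1, "attends_per_month": 8, "venue_volume": 120},
    {"hiv": 0, "attends_per_month": 1, "venue_volume": 15},
    {"hiv": 0, "attends_per_month": 4, "venue_volume": 60},
]
for r in respondents:
    r["w"] = 1.0 / (r["attends_per_month"] * r["venue_volume"])

total_w = sum(r["w"] for r in respondents)
crude = sum(r["hiv"] for r in respondents) / len(respondents)
adjusted = sum(r["w"] * r["hiv"] for r in respondents) / total_w
print(adjusted < crude)  # over-sampled frequent attenders inflate the crude rate
```

This mirrors the study's finding: when higher-risk men are also the most frequently sampled, the unadjusted prevalence estimate is biased upward.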
Barnes, Timothy L; French, Simone A; Mitchell, Nathan R; Wolfson, Julian
2016-04-01
To examine the association between fast-food consumption, diet quality and body weight in a community sample of working adults. Cross-sectional and prospective analysis of anthropometric, survey and dietary data from adults recruited to participate in a worksite nutrition intervention. Participants self-reported frequency of fast-food consumption per week. Nutrient intakes and diet quality, using the Healthy Eating Index-2010 (HEI-2010), were computed from dietary recalls collected at baseline and 6 months. Metropolitan medical complex, Minneapolis, MN, USA. Two hundred adults, aged 18-60 years. Cross-sectionally, fast-food consumption was significantly associated with higher daily total energy intake (β=72·5, P=0·005), empty calories (β=0·40, P=0·006) and BMI (β=0·73, P=0·011), and lower HEI-2010 score (β=-1·23, P=0·012), total vegetables (β=-0·14, P=0·004), whole grains (β=-0·39, P=0·005), fibre (β=-0·83, P=0·002), Mg (β=-6·99, P=0·019) and K (β=-57·5, P=0·016). Over 6 months, change in fast-food consumption was not significantly associated with changes in energy intake or BMI, but was significantly inversely associated with total intake of vegetables (β=-0·14, P=0·034). Frequency of fast-food consumption was significantly associated with higher energy intake and poorer diet quality cross-sectionally. Six-month change in fast-food intake was small, and not significantly associated with overall diet quality or BMI.
Phenotype Recognition with Combined Features and Random Subspace Classifier Ensemble
Directory of Open Access Journals (Sweden)
Pham Tuan D
2011-04-01
Full Text Available Abstract Background Automated, image-based high-content screening is a fundamental tool for discovery in biological science. Modern robotic fluorescence microscopes are able to capture thousands of images from massively parallel experiments such as RNA interference (RNAi) or small-molecule screens. As such, efficient computational methods are required for automatic cellular phenotype identification capable of dealing with large image data sets. In this paper we investigated an efficient method for the extraction of quantitative features from images by combining second-order statistics, or Haralick features, with the curvelet transform. A random subspace based classifier ensemble with the multilayer perceptron (MLP) as the base classifier was then exploited for classification. Haralick features estimate image properties related to second-order statistics based on the grey level co-occurrence matrix (GLCM), which has been extensively used for various image processing applications. The curvelet transform has a sparser representation of the image than the wavelet, thus offering a description with higher time-frequency resolution and a high degree of directionality and anisotropy, which is particularly appropriate for many images rich with edges and curves. A combined feature description from Haralick features and the curvelet transform can further increase the accuracy of classification by taking their complementary information. We then investigate the applicability of the random subspace (RS) ensemble method for phenotype classification based on microscopy images. A base classifier is trained with a RS sampled subset of the original feature set and the ensemble assigns a class label by majority voting. Results Experimental results on phenotype recognition from three benchmarking image sets including HeLa, CHO and RNAi show the effectiveness of the proposed approach. The combined feature is better than any individual one in the classification accuracy. The
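The random subspace scheme can be sketched compactly; a nearest-centroid rule stands in for the paper's MLP base classifier, and the data are synthetic:

```python
import numpy as np

def rs_predict(X_tr, y_tr, X_te, n_learners=11, frac=0.5, seed=0):
    """Random subspace ensemble: each base learner sees a random feature
    subset (nearest-centroid rule as a stand-in for the MLP base
    classifier); the ensemble label is the majority vote."""
    rng = np.random.default_rng(seed)
    k = max(1, int(frac * X_tr.shape[1]))
    votes = np.zeros(len(X_te))
    for _ in range(n_learners):
        idx = rng.choice(X_tr.shape[1], size=k, replace=False)
        c0 = X_tr[y_tr == 0][:, idx].mean(axis=0)
        c1 = X_tr[y_tr == 1][:, idx].mean(axis=0)
        d0 = ((X_te[:, idx] - c0) ** 2).sum(axis=1)
        d1 = ((X_te[:, idx] - c1) ** 2).sum(axis=1)
        votes += d1 < d0                  # this learner votes for class 1
    return (votes > n_learners / 2).astype(int)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 10)), rng.normal(2.0, 1.0, (50, 10))])
y = np.repeat([0, 1], 50)
labels = rs_predict(X, y, np.array([[0.0] * 10, [2.0] * 10]))
print(labels.tolist())  # [0, 1]
```

Training each learner on a feature subset decorrelates their errors, which is what makes the majority vote more accurate than any single base classifier on the combined feature set.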
Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing
Toye, Habib; Zhan, Peng; Gopalakrishnan, Ganesh; Kartadikaria, Aditya R.; Huang, Huang; Knio, Omar; Hoteit, Ibrahim
2017-05-26
We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation and of the Data Assimilation Research Testbed (DART) for ensemble data assimilation. DART has been configured to integrate all members of an ensemble adjustment Kalman filter (EAKF) in parallel, based on which we adapted the ensemble operations in DART to use an invariant ensemble, i.e., an ensemble Optimal Interpolation (EnOI) algorithm. This approach requires only a single forward model integration in the forecast step and therefore saves substantial computational cost. To deal with the strong seasonal variability of the Red Sea, the EnOI ensemble is then seasonally selected from a climatology of long-term model outputs. Remote sensing observations of sea surface height (SSH) and sea surface temperature (SST) are assimilated every 3 days. Real-time atmospheric fields from the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) are used as forcing in different assimilation experiments. We investigate the behaviors of the EAKF and (seasonal-) EnOI and compare their performances for assimilating and forecasting the circulation of the Red Sea. We further assess the sensitivity of the assimilation system to various filtering parameters (ensemble size, inflation) and atmospheric forcing.
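The computational appeal of EnOI noted in the abstract — covariances taken from a fixed ensemble, so only one forward integration is needed per cycle — can be illustrated with a minimal analysis step. This sketch assumes a single scalar observation with a linear observation operator `H` given as a row vector; the names and numbers are illustrative and have nothing to do with DART's actual interfaces.

```python
def enoi_analysis(x_f, static_ens, H, y_obs, r_var):
    """One EnOI analysis step: the forecast state x_f is corrected using
    covariances estimated from a fixed (e.g. seasonally selected) ensemble."""
    n = len(static_ens)
    dim = len(x_f)
    means = [sum(m[i] for m in static_ens) / n for i in range(dim)]
    anoms = [[m[i] - means[i] for i in range(dim)] for m in static_ens]
    # Project each ensemble anomaly through H (scalar observation)
    hx = [sum(h * a for h, a in zip(H, am)) for am in anoms]
    # P H^T and H P H^T from the sample covariance of the static ensemble
    pht = [sum(am[i] * hxm for am, hxm in zip(anoms, hx)) / (n - 1)
           for i in range(dim)]
    hpht = sum(v * v for v in hx) / (n - 1)
    gain = [p / (hpht + r_var) for p in pht]            # Kalman gain
    innov = y_obs - sum(h * xi for h, xi in zip(H, x_f))
    return [xi + k * innov for xi, k in zip(x_f, gain)]
```

Because `static_ens` never changes between cycles, the covariance factors could be precomputed once per season, which is exactly where the cost saving over the EAKF comes from.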
Ensemble Integration of Forest Disturbance Maps for the Landscape Change Monitoring System (LCMS)
Cohen, W. B.; Healey, S. P.; Yang, Z.; Zhu, Z.; Woodcock, C. E.; Kennedy, R. E.; Huang, C.; Steinwand, D.; Vogelmann, J. E.; Stehman, S. V.; Loveland, T. R.
2014-12-01
The recent convergence of free, high quality Landsat data and acceleration in the development of dense Landsat time series algorithms has spawned a nascent interagency effort known as the Landscape Change Monitoring System (LCMS). LCMS is being designed to map historic land cover changes associated with all major disturbance agents and land cover types in the US. Currently, five existing algorithms are being evaluated for inclusion in LCMS. The priorities of these five algorithms overlap to some degree, but each has its own strengths. This has led to the adoption of a novel approach, within LCMS, to integrate the map outputs (i.e., base learners) from these change detection algorithms using empirical ensemble models. Training data are derived from independent datasets representing disturbances such as harvest, fire, insects, wind, and land use change. Ensemble modeling is expected to produce significant increases in predictive accuracy relative to the results of the individual base learners. The non-parametric models used in LCMS also provide a framework for matching output ensemble maps to independent sample-based statistical estimates of disturbance area. Multiple decision trees "vote" on class assignment, and it is possible to manipulate vote thresholds to ensure that ensemble maps reflect areas of disturbance derived from sources such as national-scale ground or image-based inventories. This talk will focus on results of the first ensemble integration of the base learners for six Landsat scenes distributed across the US. We will present an assessment of base learner performance across different types of disturbance against an independently derived, sample-based disturbance dataset (derived from the TimeSync Landsat time series visualization tool). The goal is to understand the contribution of each base learner to the quality of the ensemble map products. We will also demonstrate how the ensemble map products can be manipulated to match sample-based annual
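The vote-threshold manipulation described above — tuning how many trees must vote "disturbed" so that the mapped area matches an independent sample-based area estimate — can be sketched in a few lines. The function names and toy numbers are hypothetical, not from LCMS.

```python
def disturbance_map(vote_fractions, threshold):
    """Label a pixel 'disturbed' when the fraction of decision trees
    voting for disturbance meets the threshold."""
    return [v >= threshold for v in vote_fractions]

def calibrate_threshold(vote_fractions, target_area, thresholds):
    """Pick the vote threshold whose mapped disturbance area (pixel count)
    best matches an independent, sample-based area estimate."""
    def area(t):
        return sum(disturbance_map(vote_fractions, t))
    return min(thresholds, key=lambda t: abs(area(t) - target_area))
```

Raising the threshold shrinks the mapped disturbance area, so a monotone search over candidate thresholds is enough to pin the map to the inventory estimate.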
Inter-relationships among traits and path analysis for yield ...
African Journals Online (AJOL)
Multiple regression and simple phenotypic correlation coefficients revealed that storage root number and storage root weight were important components in storage root yield across locations. The path analysis identified leaf area, storage root number, storage root girth and storage root weight as the main yield components ...
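Path analysis of the kind summarized above decomposes each trait-yield correlation into a direct effect (the path coefficient) plus indirect effects through correlated traits. For two predictor traits this reduces to solving the 2x2 normal equations in correlation form; the sketch below uses hypothetical names and illustrates the identity r1y = p1 + r12·p2.

```python
def path_coefficients(r12, r1y, r2y):
    """Direct path coefficients p1, p2 for two predictor traits with mutual
    correlation r12 and trait-yield correlations r1y, r2y: solve
    [[1, r12], [r12, 1]] @ [p1, p2] = [r1y, r2y] by Cramer's rule."""
    det = 1.0 - r12 ** 2
    p1 = (r1y - r12 * r2y) / det
    p2 = (r2y - r12 * r1y) / det
    return p1, p2

def decompose(r12, r1y, r2y):
    """Split trait 1's correlation with yield into its direct effect and
    its indirect effect transmitted via trait 2."""
    p1, p2 = path_coefficients(r12, r1y, r2y)
    return p1, r12 * p2
```

The direct and indirect parts always sum back to the raw correlation, which is the usual sanity check in a path-analysis table.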
Real-time hydrological ensemble forecasts: experience gained from MAP D-PHASE
Hegg, C.; Ranzi, R.
2009-04-01
A hydrological ensemble prediction system for the Alpine region was set up and tested in real time from June to November 2007 within the MAP D-PHASE project. Several flood forecasting chains, combining meteorological ensemble predictions, surface measurements or radar data, and hydrological models, issued ensemble runoff forecasts for a number of basins in the Alps. Hydrological aspects of these operational forecasting chains are presented, from the definition of meteorological and hydrological attention, alert and alarm thresholds for each basin to the interaction with end users, including their feedback. The experience gained from this experiment demonstrates that hydrological ensemble prediction systems provide a wealth of information useful for the decisions of end users such as managers of water resources and hydropower systems, civil protection agencies and hydrometeorological services. However, some convective events, which occurred in June and September 2007 on both the southern and northern sides of the Alps, were not well captured by the models, confirming the importance of model initialisation and the assimilation of surface observations. Some 'outlier', or 'crazy', members of the ensemble predictions, resulting in very high rainfall peaks even for ordinary events, might alarm a 'risk-averse' end user. The information available to inexperienced end users can be overwhelming and can exhibit too large a variability; it therefore needs to be interpreted and condensed in a simple way to become useful for their decisions. Experienced end users, instead, are capable of weighing the importance of forecasts versus observations when taking decisions, indicating that end-user training in ensemble prediction systems is needed. Nowcasting systems were assessed as very helpful, because they helped determine in what range of the ensemble spread an event was going to develop. In any case a close collaboration between end users and forecasters is essential so that
DEFF Research Database (Denmark)
Ben Bouallègue, Zied; Heppelmann, Tobias; Theis, Susanne E.
2016-01-01
Probabilistic forecasts in the form of ensembles of scenarios are required for complex decision-making processes. Ensemble forecasting systems provide such products, but the spatio-temporal structure of the forecast uncertainty is lost when statistical calibration of the ensemble forecasts...... is applied for each lead time and location independently. Non-parametric approaches allow the reconstruction of spatio-temporal joint probability distributions at a low computational cost. For example, the ensemble copula coupling (ECC) method rebuilds the multivariate aspect of the forecast from...... the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error. The new
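The ECC step named in the abstract — rebuilding the multivariate dependence of the forecast from the raw ensemble — amounts to reordering independently calibrated samples so that they inherit the rank structure of the raw ensemble at each margin. A minimal sketch for a single margin (hypothetical names, not the authors' code):

```python
def ecc_reorder(raw_ensemble, calibrated_samples):
    """Ensemble copula coupling for one lead time/location: impose the rank
    order of the raw ensemble members on the calibrated sample values."""
    # Indices of the raw members from smallest to largest value
    ranks = sorted(range(len(raw_ensemble)), key=lambda i: raw_ensemble[i])
    sorted_cal = sorted(calibrated_samples)
    out = [0.0] * len(raw_ensemble)
    for pos, idx in enumerate(ranks):
        out[idx] = sorted_cal[pos]    # smallest calibrated value to the
    return out                        # member that was smallest in the raw ensemble
```

Applying this per lead time and location preserves the calibrated margins while copying the raw ensemble's spatio-temporal dependence, which is exactly what independent per-point calibration destroys.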
DEFF Research Database (Denmark)
Ben Bouallègue, Zied; Heppelmann, Tobias; Theis, Susanne E.
2015-01-01
Probabilistic forecasts in the form of ensembles of scenarios are required for complex decision-making processes. Ensemble forecasting systems provide such products, but the spatio-temporal structure of the forecast uncertainty is lost when statistical calibration of the ensemble forecasts...... is applied for each lead time and location independently. Non-parametric approaches allow the reconstruction of spatio-temporal joint probability distributions at a low computational cost. For example, the ensemble copula coupling (ECC) method consists in rebuilding the multivariate aspect of the forecast...... from the original ensemble forecasts. Based on the assumption of error stationarity, parametric methods aim to fully describe the forecast dependence structures. In this study, the concept of ECC is combined with past data statistics in order to account for the autocorrelation of the forecast error...
A new deterministic Ensemble Kalman Filter with one-step-ahead smoothing for storm surge forecasting
Raboudi, Naila
2016-11-01
The Ensemble Kalman Filter (EnKF) is a popular data assimilation method for state-parameter estimation. Following a sequential assimilation strategy, it breaks the problem into alternating cycles of forecast and analysis steps. In the forecast step, the dynamical model is used to integrate a stochastic sample approximating the state analysis distribution (called the analysis ensemble) to obtain a forecast ensemble. In the analysis step, the forecast ensemble is updated with the incoming observation using a Kalman-like correction, which is then used for the next forecast step. In realistic large-scale applications, EnKFs are implemented with limited ensembles and often poorly known model error statistics, leading to a crude approximation of the forecast covariance. This strongly limits the filter performance. Recently, a new EnKF was proposed in [1] following a one-step-ahead smoothing strategy (EnKF-OSA), which involves an OSA smoothing of the state between two successive analyses. At each time step, EnKF-OSA exploits the observation twice. The incoming observation is first used to smooth the ensemble at the previous time step. The resulting smoothed ensemble is then integrated forward to compute a "pseudo forecast" ensemble, which is again updated with the same observation. The idea of constraining the state with future observations is to add more information to the estimation process in order to mitigate the sub-optimal character of EnKF-like methods. The second EnKF-OSA "forecast" is computed from the smoothed ensemble and should therefore provide an improved background. In this work, we propose a deterministic variant of the EnKF-OSA, based on the Singular Evolutive Interpolated Kalman (SEIK) filter. The motivation behind this is to avoid the observation perturbations of the EnKF in order to improve the scheme's behavior when assimilating big data sets with small ensembles. The new SEIK-OSA scheme is implemented and its efficiency is demonstrated
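One possible reading of the OSA cycle described above, for a scalar linear model, is sketched below. Note that this uses the stochastic (perturbed-observation) EnKF update — the very ingredient the proposed SEIK-OSA variant is designed to avoid — and every name and number is illustrative, not taken from the paper.

```python
import random

def enkf_osa_step(ens_prev, model, h, y, r_var, rng):
    """One EnKF-OSA cycle for a scalar state: the observation y is used
    twice, first to smooth the previous analysis ensemble, then to update
    the 'pseudo forecast' obtained by propagating the smoothed ensemble."""
    n = len(ens_prev)
    fcst = [model(x) for x in ens_prev]

    def cov(a, b):
        ma, mb = sum(a) / n, sum(b) / n
        return sum((x - ma) * (z - mb) for x, z in zip(a, b)) / (n - 1)

    hx = [h * x for x in fcst]
    perturbed = [y + rng.gauss(0.0, r_var ** 0.5) for _ in range(n)]
    # 1) Smooth the previous ensemble with the incoming observation,
    #    via the cross-covariance between previous states and predicted obs
    gain_s = cov(ens_prev, hx) / (cov(hx, hx) + r_var)
    smoothed = [x + gain_s * (yp - z)
                for x, z, yp in zip(ens_prev, hx, perturbed)]
    # 2) Propagate the smoothed ensemble ("pseudo forecast") and update it
    #    once more with the same observation
    pseudo = [model(x) for x in smoothed]
    hps = [h * x for x in pseudo]
    gain_a = cov(pseudo, hps) / (cov(hps, hps) + r_var)
    return [x + gain_a * (yp - z) for x, z, yp in zip(pseudo, hps, perturbed)]
```

The smoothed ensemble carries information from the future observation back one step, so the second "forecast" starts from an improved background, which is the stated rationale of the OSA strategy.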
Multiscale macromolecular simulation: role of evolving ensembles.
Singharoy, A; Joshi, H; Ortoleva, P J
2012-10-22
Multiscale analysis provides an algorithm for the efficient simulation of macromolecular assemblies. This algorithm involves the coevolution of a quasiequilibrium probability density of atomic configurations and the Langevin dynamics of spatial coarse-grained variables denoted order parameters (OPs) characterizing nanoscale system features. In practice, implementation of the probability density involves the generation of constant-OP ensembles of atomic configurations. Such ensembles are used to construct thermal forces and diffusion factors that mediate the stochastic OP dynamics. Generation of all-atom ensembles at every Langevin time step is computationally expensive. Here, multiscale computation for macromolecular systems is made more efficient by a method that self-consistently folds in ensembles of all-atom configurations constructed in an earlier step, the history, of the Langevin evolution. This procedure accounts for the temporal evolution of these ensembles, accurately providing thermal forces and diffusion factors. It is shown that the efficiency and accuracy of the OP-based simulations are increased via the integration of this historical information. Accuracy improves with the square root of the number of historical time steps included in the calculation. As a result, CPU usage can be decreased by a factor of 3-8 without loss of accuracy. The algorithm is implemented in our existing force-field based multiscale simulation platform and demonstrated via the structural dynamics of viral capsomers.
The Hydrologic Ensemble Prediction Experiment (HEPEX)
Wood, A. W.; Thielen, J.; Pappenberger, F.; Schaake, J. C.; Hartman, R. K.
2012-12-01
The Hydrologic Ensemble Prediction Experiment was established in March 2004, at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF). With support from the US National Weather Service (NWS) and the European Commission (EC), the HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support in the emergency management and water resources sectors. The strategy to meet this goal includes meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. HEPEX has organized about a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Today, the HEPEX mission is to demonstrate the added value of hydrological ensemble prediction systems (HEPS) for the emergency management and water resources sectors in making decisions that have important consequences for the economy, public health, safety, and the environment. HEPEX is now organised around six major themes that represent core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.
An ensemble framework for identifying essential proteins.
Zhang, Xue; Xiao, Wangxin; Acencio, Marcio Luis; Lemke, Ney; Wang, Xujing
2016-08-25
Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small. In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins. This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.
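The core idea above — that different PPIs should contribute differently to a protein's essentiality score — can be illustrated by weighting each interaction edge with the co-expression of its endpoints before computing a centrality such as degree. This is a hedged sketch with hypothetical names, not the authors' framework.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def weighted_degree(edges, expr):
    """Degree centrality in which each PPI edge contributes the absolute
    co-expression of its endpoints, so edges differ in their contribution."""
    score = {}
    for u, v in edges:
        w = abs(pearson(expr[u], expr[v]))
        score[u] = score.get(u, 0.0) + w
        score[v] = score.get(v, 0.0) + w
    return score
```

The same expression-derived edge weights could be plugged into the other four centralities (betweenness, closeness, eigenvector, subgraph) in an analogous way.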
Krenn, Mario; Hochrainer, Armin; Lahiri, Mayukh; Zeilinger, Anton
2017-02-01
Quantum entanglement is one of the most prominent features of quantum mechanics and forms the basis of quantum information technologies. Here we present a novel method for the creation of quantum entanglement in multipartite and high-dimensional systems. The two ingredients are (i) superposition of photon pairs with different origins and (ii) aligning photons such that their paths are identical. We explain the experimentally feasible creation of various classes of multiphoton entanglement encoded in polarization as well as in high-dimensional Hilbert spaces—starting only from nonentangled photon pairs. For two photons, arbitrary high-dimensional entanglement can be created. The idea of generating entanglement by path identity could also apply to quantum entities other than photons. We discovered the technique by analyzing the output of a computer algorithm. This shows that computer designed quantum experiments can be inspirations for new techniques.
GACEM: Genetic Algorithm Based Classifier Ensemble in a Multi-sensor System
Xu, Rongwu; He, Lin
2008-01-01
Multi-sensor systems (MSS) have been increasingly applied in pattern classification, while the search for an optimal classification framework remains an open problem. The development of the classifier ensemble seems to provide a promising solution. The classifier ensemble is a learning paradigm in which many classifiers are jointly used to solve a problem, and it has been proven an effective method for enhancing classification ability. In this paper, by introducing the concepts of Meta-feature (MF) and Trans-function (TF) for describing the relationship between the nature and the measurement of the observed phenomenon, classification in a multi-sensor system can be unified in the classifier ensemble framework. An approach called Genetic Algorithm based Classifier Ensemble in Multi-sensor system (GACEM) is then presented, in which a genetic algorithm is utilized to optimize both the selection of feature subsets and the decision combination simultaneously. GACEM first trains a number of classifiers based on different combinations of feature vectors and then selects the classifiers whose weights are higher than a pre-set threshold to make up the ensemble. An empirical study shows that, compared with conventional feature-level and decision-level voting, GACEM not only achieves better and more robust performance but also markedly simplifies the system. PMID:27873866
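A genetic search over which classifiers to include in a voting ensemble, as GACEM's selection stage does, can be sketched with a toy GA over inclusion masks. This is a simplified illustration under assumed fitness (majority-vote accuracy) and operators (tournament selection, one-point crossover, bit-flip mutation); it does not reproduce GACEM's weight evolution or feature-subset encoding.

```python
import random

def majority(ballots):
    return max(set(ballots), key=ballots.count)

def vote_accuracy(mask, clf_preds, y):
    """Accuracy of the majority vote over the classifiers selected by mask."""
    chosen = [p for p, m in zip(clf_preds, mask) if m]
    if not chosen:
        return 0.0
    hits = sum(majority(col) == yi for col, yi in zip(zip(*chosen), y))
    return hits / len(y)

def ga_select(clf_preds, y, pop=20, gens=30, seed=0):
    """Toy genetic algorithm evolving a 0/1 inclusion mask per classifier."""
    rng = random.Random(seed)
    n = len(clf_preds)
    popn = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        def pick():  # binary tournament on validation accuracy
            a, b = rng.sample(popn, 2)
            fa = vote_accuracy(a, clf_preds, y)
            fb = vote_accuracy(b, clf_preds, y)
            return a if fa >= fb else b
        nxt = []
        for _ in range(pop):
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:             # bit-flip mutation
                j = rng.randrange(n)
                child[j] ^= 1
            nxt.append(child)
        popn = nxt
    return max(popn, key=lambda m: vote_accuracy(m, clf_preds, y))
```

Here `clf_preds` holds each classifier's predictions on a validation set; in GACEM the fitness would additionally reflect the evolved feature subsets and decision weights.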
The effect of varying path properties in path steering tasks
L. Liu (Lei); R. van Liere (Robert)
2010-01-01
Path steering is a primitive 3D interaction task that requires the user to navigate through a path of a given length and width. In a previous paper, we conducted controlled experiments in which users operated a pen input device to steer a cursor through a 3D path subject to
PATHS groundwater hydrologic model
Energy Technology Data Exchange (ETDEWEB)
Nelson, R.W.; Schur, J.A.
1980-04-01
A preliminary evaluation capability for two-dimensional groundwater pollution problems was developed as part of the Transport Modeling Task for the Waste Isolation Safety Assessment Program (WISAP). Our approach was to use the data limitations as a guide in setting the level of modeling detail. The PATHS Groundwater Hydrologic Model is the first-level (simplest) idealized hybrid analytical/numerical model for two-dimensional, saturated groundwater flow and single-component transport in homogeneous geology. This document describes the PATHS groundwater hydrologic model, covering the preliminary evaluation capability prepared for WISAP, including the enhancements made as a result of the authors' experience using the earlier capability. Appendixes A through D supplement the report as follows: complete derivations of the background equations are provided in Appendix A. Appendix B is a comprehensive set of instructions for users of PATHS; it is written for users who have little or no experience with computers. Appendix C is for the programmer: it contains information on how input parameters are passed between programs in the system, along with program listings and a test case listing. Appendix D is a definition of terms.
Ensemble Variability of Near-Infrared-Selected Active Galactic Nuclei
Kouzuma, Shinjirou; Yamaoka, Hitoshi
2011-01-01
We present the properties of the ensemble variability $V$ for nearly 5000 near-infrared (NIR) AGNs selected from the catalog of Quasars and Active Galactic Nuclei (13th Ed.) and the SDSS-DR7 quasar catalog. From 2MASS, DENIS, and UKIDSS/LAS point source catalogs, we extract 2MASS-DENIS and 2MASS-UKIDSS counterparts for cataloged AGNs by catalog cross-identification. We further select variable AGNs based on an optimal criterion for selecting the variable sources. The sample objects are divided...
Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses
Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong
2017-04-01
Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive, and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial post-processing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location, or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high-resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited-area model ensemble prediction system also operated by ZAMG. The so-called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The SAMOS approach thus statistically combines the in-house developed high-resolution analysis and ensemble prediction system. The station-based validation of 6 hour precipitation sums
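The standardized-anomaly transformation at the heart of SAMOS, and the single domain-wide regression it enables, can be sketched as follows. This is a minimal illustration with hypothetical names; the operational SAMOS fits a full predictive distribution, not just a mean regression.

```python
def to_anomaly(field, clim_mean, clim_sd):
    """Transform a forecast or observation field into standardized anomalies
    using site-specific climatological means and standard deviations."""
    return [(x - m) / s for x, m, s in zip(field, clim_mean, clim_sd)]

def from_anomaly(anom, clim_mean, clim_sd):
    """Invert the transformation to recover physical units at each site."""
    return [a * s + m for a, m, s in zip(anom, clim_mean, clim_sd)]

def fit_samos(anom_fcst, anom_obs):
    """One least-squares regression for the whole domain in anomaly space:
    obs_anom ~ b0 + b1 * fcst_anom. Returns (b0, b1)."""
    n = len(anom_fcst)
    mx, my = sum(anom_fcst) / n, sum(anom_obs) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(anom_fcst, anom_obs)) / \
         sum((x - mx) ** 2 for x in anom_fcst)
    return my - b1 * mx, b1
```

Because the climatology absorbs the site-to-site differences, pairs from every grid point can be pooled into this one regression, which is where the computational saving comes from.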
An ensemble-based approach for breast mass classification in mammography images
Ribeiro, Patricia B.; Papa, João. P.; Romero, Roseli A. F.
2017-03-01
Mammography analysis is an important tool that helps detect breast cancer at the very early stages of the disease, thus increasing the quality of life of hundreds of thousands of patients worldwide. In computer-aided detection systems, the identification of mammograms with and without masses (i.e., without clinical findings) is highly needed to reduce the false positive rate of the automatic selection of regions of interest that may contain suspicious content. In this work, we introduce a variant of the Optimum-Path Forest (OPF) classifier for breast mass identification, and we employ an ensemble-based approach that can enhance the effectiveness of the individual classifiers for this purpose. The experimental results also comprise the naïve OPF and a traditional neural network, with the most accurate results obtained through the ensemble of classifiers, at an accuracy of nearly 86%.
A sum-over-paths impulse-response moment-extraction algorithm for RLC IC-interconnect networks
Le Coz, Y. L.; Krishna, D.; Hariharan, G.; Petranovic, D. M.
2005-10-01
We have created a new impulse-response (IR) moment-extraction algorithm for RLC circuit networks. It employs a Feynman sum-over-paths postulate. Our approach begins with generation of s-domain nodal-voltage equations. We then perform a Taylor-series expansion of the circuit transfer function. These expansions yield transition diagrams involving mathematical coupling constants, or weight factors, in integral powers of complex frequency s. Our sum-over-paths postulate supports stochastic evaluation of path sums within the circuit transition diagram to any desired order of s. The specific order of s in the sum corresponds, as well, to the order of IR moment we seek to extract. In developing the algorithm, importantly, we maintain computational efficiency and full parallelism. Initial verification studies of uncoupled and coupled RLC lines furnished promising results: approximately 5% and 10% 1-σ error for first- and second-order IR moments, respectively, after only 100 sampled path-sum terms. In addition, we observed excellent convergence to exact, analytical moment values with increasing number of samples. Our sum-over-paths postulate, in fact, implies generality for arbitrary RLC-interconnect networks, beyond those specific examples presented in this work. We believe, in conclusion, that this type of IR-moment extraction algorithm may find useful application in massively coupled electrical systems, such as those encountered in high-end digital-IC interconnects.
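The stochastic path-sum evaluation described above can be illustrated on a generic weighted transition diagram: the sum over all length-k paths of the product of coupling weights is estimated by sampling paths uniformly and reweighting, which is unbiased and converges as the sample count grows. This is a generic Monte Carlo sketch with hypothetical names, not the authors' circuit-specific algorithm.

```python
import random

def exact_path_sum(W, i, k):
    """Sum over all length-k paths starting at node i of the product of
    transition weights, i.e. the i-th row sum of the matrix power W^k."""
    n = len(W)
    v = [1.0 if j == i else 0.0 for j in range(n)]
    for _ in range(k):
        v = [sum(v[m] * W[m][j] for m in range(n)) for j in range(n)]
    return sum(v)

def mc_path_sum(W, i, k, samples, seed=0):
    """Unbiased Monte Carlo estimate of the same path sum: sample each step
    uniformly over nodes and multiply the weight by n to correct for the
    uniform proposal probability 1/n per step."""
    rng = random.Random(seed)
    n = len(W)
    total = 0.0
    for _ in range(samples):
        node, prod = i, 1.0
        for _ in range(k):
            nxt = rng.randrange(n)
            prod *= W[node][nxt] * n
            node = nxt
        total += prod
    return total / samples
```

The estimator's variance shrinks as 1/samples, mirroring the reported convergence of the sampled path-sum moments to their exact analytical values.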
Embedded random matrix ensembles in quantum physics
Kota, V K B
2014-01-01
Although used with increasing frequency in many branches of physics, random matrix ensembles are not always sufficiently specific to account for important features of the physical system at hand. One refinement which retains the basic stochastic approach but allows for such features consists in the use of embedded ensembles. The present text is an exhaustive introduction to and survey of this important field. Starting with an easy-to-read introduction to general random matrix theory, the text then develops the necessary concepts from the beginning, accompanying the reader to the frontiers of present-day research. With some notable exceptions, to date these ensembles have primarily been applied in nuclear spectroscopy. A characteristic example is the use of a random two-body interaction in the framework of the nuclear shell model. Yet, topics in atomic physics, mesoscopic physics, quantum information science and statistical mechanics of isolated finite quantum systems can also be addressed using these ensemb...
Statistical ensembles and fragmentation of finite nuclei
Das, P.; Mallik, S.; Chaudhuri, G.
2017-09-01
Statistical models based on different ensembles are very commonly used to describe the nuclear multifragmentation reaction in heavy ion collisions at intermediate energies. Canonical model results are more appropriate for finite nuclei calculations while those obtained from the grand canonical ones are more easily calculable. A transformation relation has been worked out for converting results of finite nuclei from grand canonical to canonical and vice versa. The formula shows that, irrespective of the particle number fluctuation in the grand canonical ensemble, exact canonical results can be recovered for observables varying linearly or quadratically with the number of particles. This result is of great significance since the baryon and charge conservation constraints can make the exact canonical calculations extremely difficult in general. This concept developed in this work can be extended in future for transformation to ensembles where analytical solutions do not exist. The applicability of certain equations (isoscaling, etc.) in the regime of finite nuclei can also be tested using this transformation relation.
Total probabilities of ensemble runoff forecasts
Olav Skøien, Jon; Bogner, Konrad; Salamon, Peter; Smith, Paul; Pappenberger, Florian
2017-04-01
Ensemble forecasting has a long history in meteorological modelling, as an indication of the uncertainty of the forecasts. However, it is necessary to calibrate and post-process the ensembles, as they often exhibit both bias and dispersion errors. Two of the most common methods for this are Bayesian Model Averaging (Raftery et al., 2005) and Ensemble Model Output Statistics (EMOS) (Gneiting et al., 2005). There are also methods for regionalizing these approaches (Berrocal et al., 2007) and for incorporating the correlation between lead times (Hemri et al., 2013). Engeland and Steinsland (2014) developed a framework which can estimate post-processing parameters varying in space and time, while giving a spatially and temporally consistent output. However, their method is computationally too complex for our large number of stations, which makes it unsuitable for our purpose. Our post-processing method for the ensembles is developed in the framework of the European Flood Awareness System (EFAS - http://www.efas.eu), where we are making forecasts for the whole of Europe, based on observations from around 700 catchments. As the target is flood forecasting, we are also more interested in improving the forecast skill for high flows than in a good prediction of the entire flow regime. EFAS uses a combination of ensemble forecasts and deterministic forecasts from different meteorological forecasters to force a distributed hydrologic model and to compute runoff ensembles for each river pixel within the model domain. Instead of showing the mean and the variability of each forecast ensemble individually, we now post-process all model outputs to estimate the total probability, the post-processed mean and the uncertainty of all ensembles. The post-processing parameters are first calibrated for each calibration location, but we add a spatial penalty in the calibration process to force a spatial correlation of the parameters. The penalty takes
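The EMOS-style calibration mentioned above maps the raw ensemble to a predictive distribution whose parameters are fitted against past observations. The sketch below is a deliberately simplified, moment-based variant with hypothetical names: the full EMOS of Gneiting et al. minimizes the CRPS and lets the predictive variance depend on the ensemble spread, neither of which is done here.

```python
def fit_emos(ens_means, obs):
    """Simplified EMOS: linear bias correction of the ensemble mean by least
    squares, with a constant predictive variance taken from the residuals.
    Returns (a, b, var) defining a Gaussian predictive N(a + b*m, var)."""
    n = len(obs)
    mx, my = sum(ens_means) / n, sum(obs) / n
    b = sum((x - mx) * (y - my) for x, y in zip(ens_means, obs)) / \
        sum((x - mx) ** 2 for x in ens_means)
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(ens_means, obs)]
    var = sum(r * r for r in resid) / (n - 2)   # unbiased for 2 fitted params
    return a, b, var
```

A spatial penalty of the kind the abstract describes would then be added to this per-station objective, shrinking neighbouring stations' (a, b) towards each other.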
Matrix averages relating to Ginibre ensembles
Energy Technology Data Exchange (ETDEWEB)
Forrester, Peter J [Department of Mathematics and Statistics, University of Melbourne, Victoria 3010 (Australia); Rains, Eric M [Department of Mathematics, California Institute of Technology, Pasadena, CA 91125 (United States)], E-mail: p.forrester@ms.unimelb.edu.au
2009-09-25
The theory of zonal polynomials is used to compute the average of a Schur polynomial of argument AX, where A is a fixed matrix and X is from the real Ginibre ensemble. This generalizes a recent result of Sommers and Khoruzhenko (2009 J. Phys. A: Math. Theor. 42 222002), and furthermore allows analogous results to be obtained for the complex and real quaternion Ginibre ensembles. As applications, the positive integer moments of the general variance Ginibre ensembles are computed in terms of generalized hypergeometric functions; these are written in terms of averages over matrices of the same size as the moment to give duality formulas, and the averages of the power sums of the eigenvalues are expressed as finite sums of zonal polynomials.
Ensemble approach for differentiation of malignant melanoma
Rastgoo, Mojdeh; Morel, Olivier; Marzani, Franck; Garcia, Rafael
2015-04-01
Melanoma is the deadliest type of skin cancer, yet it is the most treatable kind if diagnosed early. The early diagnosis of melanoma is a challenging task for both clinicians and dermatologists. Due to the importance of early diagnosis, and in order to assist dermatologists, we propose an automated framework based on ensemble learning methods and dermoscopy images to differentiate melanoma from dysplastic and benign lesions. The evaluation of our framework on a recent public dermoscopy benchmark (the PH2 dataset) indicates the potential of the proposed method. Our evaluation, using only global features, revealed that ensembles such as random forest perform better than a single learner. Using the random forest ensemble and a combination of color and texture features, our framework achieved the highest sensitivity of 94% and specificity of 92%.
Ensemble simulations with discrete classical dynamics
DEFF Research Database (Denmark)
Toxværd, Søren
2013-01-01
{E}(h)$ is employed to determine the relation with the corresponding energy, $E$, for the analytic dynamics with $h=0$ and the zero-order estimate $E_0(h)$ of the energy for discrete dynamics, appearing in the literature for MD with VA. We derive a corresponding time-reversible VA algorithm for canonical dynamics...... for the $(NV\tilde{T}(h))$ ensemble and determine the relations between the energies and temperatures for the different ensembles, including the $(NVE_0(h))$ and $(NVT_0(h))$ ensembles. The differences in the energies and temperatures are proportional to $h^2$ and are of the order of a few tenths...... of a percent for a traditional value of $h$. The relations between $(NV\tilde{E}(h))$ and $(NVE)$, and $(NV\tilde{T}(h))$ and $(NVT)$, are easily determined for a given density and temperature, and allow for using larger time increments in MD. The accurate determinations of the energies are used to determine...
Directory of Open Access Journals (Sweden)
Mohammad Nazmul Haque
Full Text Available Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β)-k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction and we expect that the proposed GA-EoC would perform consistently in other cases.
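The search loop described above can be sketched as follows. This is a toy reconstruction under stated assumptions (fitness here is plain majority-vote accuracy on a labelled set rather than 10-fold cross-validation, and the GA operators are minimal), not the GA-EoC implementation:

```python
import random

def majority_vote(preds):
    # preds: list of per-classifier prediction lists, all the same length
    return [max(set(col), key=col.count) for col in zip(*preds)]

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

def ga_select(base_preds, y, pop=20, gens=30, seed=1):
    """Evolve a bitmask over base classifiers; fitness = majority-vote accuracy."""
    rng = random.Random(seed)
    n = len(base_preds)

    def fitness(mask):
        chosen = [p for p, m in zip(base_preds, mask) if m]
        return accuracy(majority_vote(chosen), y) if chosen else 0.0

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]                    # elitism: keep the best two
        while len(next_gen) < pop:
            p1, p2 = rng.sample(population[:10], 2)  # select parents from top half
            cut = rng.randrange(1, n)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:                   # bit-flip mutation
                child[rng.randrange(n)] ^= 1
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)
```

Thanks to elitism the best fitness never decreases across generations, so the returned ensemble is at least as good as the best randomly initialised one.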
Shortest Paths and Vehicle Routing
DEFF Research Database (Denmark)
Petersen, Bjørn
This thesis presents how to parallelize a shortest path labeling algorithm. It is shown how to handle Chvátal-Gomory rank-1 cuts in a column generation context. A Branch-and-Cut algorithm is given for the Elementary Shortest Paths Problem with Capacity Constraint. A reformulation of the Vehicle Routing Problem based on partial paths is presented. Finally, a practical application of finding shortest paths in the telecommunication industry is shown.
Cluster ensembles, quantization and the dilogarithm
DEFF Research Database (Denmark)
Fock, Vladimir; Goncharov, Alexander B.
2009-01-01
A cluster ensemble is a pair of positive spaces (i.e. varieties equipped with positive atlases), coming with an action of a symmetry group . The space is closely related to the spectrum of a cluster algebra [ 12 ]. The two spaces are related by a morphism . The space is equipped with a closed -form......, possibly degenerate, and the space has a Poisson structure. The map is compatible with these structures. The dilogarithm together with its motivic and quantum avatars plays a central role in the cluster ensemble structure. We define a non-commutative -deformation of the -space. When is a root of unity...
Ensemble analysis of adaptive compressed genome sequencing strategies
2014-01-01
Background Acquiring genomes at single-cell resolution has many applications, such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth-first search strategy. We simulated the whole process, which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. Results In this paper, we modify our previous breadth-first search strategy and introduce the depth-first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and also the expected total sequenced nucleotides for a given population profile. Our results suggest that the expected total sequenced nucleotides grows in proportion to the logarithm of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and the ratio of its size over the maximum genome size in the sample. The modified resource
Ensemble Averaged Probability Density Function (APDF) for Compressible Turbulent Reacting Flows
Shih, Tsan-Hsing; Liu, Nan-Suey
2012-01-01
In this paper, we present a concept of the averaged probability density function (APDF) for studying compressible turbulent reacting flows. The APDF is defined as an ensemble average of the fine-grained probability density function (FG-PDF) with a mass density weighting. It can be used to exactly deduce the mass density weighted, ensemble averaged turbulent mean variables. The transport equation for APDF can be derived in two ways. One is the traditional way that starts from the transport equation of FG-PDF, in which the compressible Navier-Stokes equations are embedded. The resulting transport equation of APDF is then in a traditional form that contains conditional means of all terms from the right hand side of the Navier-Stokes equations except for the chemical reaction term. These conditional means are new unknown quantities that need to be modeled. Another way of deriving the transport equation of APDF is to start directly from the ensemble averaged Navier-Stokes equations. The resulting transport equation of APDF derived from this approach appears in a closed form without any need for additional modeling. The methodology of ensemble averaging presented in this paper can be extended to other averaging procedures: for example, the Reynolds time averaging for statistically steady flow and the Reynolds spatial averaging for statistically homogeneous flow. It can also be extended to a time or spatial filtering procedure to construct the filtered density function (FDF) for the large eddy simulation (LES) of compressible turbulent reacting flows.
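For orientation, the fine-grained PDF and its mass-density-weighted ensemble average can be written in the standard form below; the notation is the generic density-weighted PDF formalism and may differ in detail from the paper's:

```latex
% Fine-grained PDF of the composition vector \phi(x,t):
F(\psi; x, t) = \delta\big(\psi - \phi(x,t)\big)
% Mass-density-weighted ensemble average (APDF):
\mathcal{F}(\psi; x, t) = \big\langle \rho(x,t)\, \delta\big(\psi - \phi(x,t)\big) \big\rangle
% from which density-weighted (Favre) means follow as
\tilde{Q}(x,t) = \frac{\int Q(\psi)\, \mathcal{F}(\psi; x, t)\, d\psi}{\langle \rho(x,t) \rangle}
```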
Kelishadi, Roya; Marashinia, Farzad; Heshmat, Ramin; Motlagh, Mohammad-Esmaeil; Qorbani, Mostafa; Taslimi, Mahnaz; Nourbakhsh, Mohsen; Ardalan, Gelayol; Poursafa, Parinaz
2013-04-20
This study explores the associations of weight perceptions with actual body mass index (BMI) and attempts to lose weight in a nationally representative sample of a pediatric population. Data were collected from school students of 27 provinces in Iran, as part of "the national survey of school student high risk behaviors". We used t-test for continuous data and chi square test for categorical data. The correlation between categorical variables was assessed by Cramer's phi test. A multiple nominal logistic regression model was fitted to data to assess the association between perceived body weight and gender by adjusting for potential confounding variables. The study participants consisted of 5570 (2784 girls, 70% urban) students with mean age of 14.7 ±2.4 years. Overall, 17.3% of students were underweight, and 17.7% were overweight or obese. Nearly 25% and 50% of participants reported themselves as appropriate weight and very obese, respectively. In both genders, the strength of association between perceived weight and actual BMI was quite high (Cramer's phi coefficient = 0.5, p body weight with trying to lose weight was moderate (Cramer's phi coefficient = 0.2, p peers to try to lose weight. After adjusting for possible confounders, the chance of perceiving oneself as very obese compared to perceiving oneself as very thin was 1.56-fold higher in girls than in boys, i.e. OR (95% CI): 1.56 (1.27-1.91). This study revealed a considerably frequent "mismatch" between actual weight status and body shape dissatisfaction, which supports the necessity of increasing public awareness in this regard.
Integrating path dependency and path creation in a general understanding of path constitution
Meyer, Uli; Schubert, Cornelius
2007-01-01
Path dependency as it is described by Arthur and David portrays technological developments as historically embedded, emergent processes. In contrast, Garud and Karnøe's notion of path creation emphasises the role of strategic change and deliberate action for the development of new technologies. In this article, we integrate both concepts into a general understanding of path processes which accounts for emergent as well as deliberate modes of path constitution. In addition, we distinguish betw...
Two Generations of Path Dependence
DEFF Research Database (Denmark)
Madsen, Mogens Ove
Even if there is no fully articulated and generally accepted theory of Path Dependence it has eagerly been taken up across a wide range of social sciences - primarily coming from economics. Path Dependence is most of all a metaphor that offers reason to believe that some political, social or economic processes have multiple possible paths of outcomes, rather than a unique path of equilibria. The selection among outcomes may depend on contingent choices or events - outcomes of path-dependent processes require a very relevant study - a perception of history....
Directory of Open Access Journals (Sweden)
David Middleton
2005-01-01
Full Text Available The hillside’s tidal waves of yellow-green Break downward into full-grown stalks of wheat In which a peasant, shouldering his hoe Passes along a snaking narrow path -- A teeming place through which his hard thighs press And where his head just barely stays above The swaying grain, drunken in abundance, Farm buildings almost floating on the swells Beyond which sea gulls gliding white in air Fly down on out of sight to salty fields, Taking the channel fish off Normandy, A surfeit fit for Eden i...
Directory of Open Access Journals (Sweden)
Jamie Waters
2014-09-01
Full Text Available This project uses Newton's Second Law of Motion, Euler's method, basic physics, and basic calculus to model the flight path of a rocket. From this, one can find the height and velocity at any point from launch to the maximum altitude, or apogee. This can then be compared to the actual values to see if the method of estimation is plausible. The rocket used for this project is modeled after Bullistic-1, which was launched by the Society of Aeronautics and Rocketry at the University of South Florida.
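A minimal sketch of such an Euler-method flight model; the parameters (thrust, burn time, mass, drag coefficient) are made-up illustrative values, not Bullistic-1 data:

```python
# Euler-method model of a vertical rocket flight: constant thrust during the
# burn, quadratic aerodynamic drag, then coast to apogee. Illustrative only.
def simulate(thrust=200.0, burn_time=3.0, m=2.0, drag_k=0.001,
             g=9.81, dt=0.01):
    t, v, h = 0.0, 0.0, 0.0
    apogee = 0.0
    while v >= 0.0 or t <= burn_time:
        f_thrust = thrust if t < burn_time else 0.0
        # Newton's second law: a = (thrust - drag)/m - g
        a = (f_thrust - drag_k * v * abs(v)) / m - g
        v += a * dt                     # Euler step for velocity
        h += v * dt                     # Euler step for altitude
        apogee = max(apogee, h)
        t += dt
        if t > 120.0:                   # safety cut-off
            break
    return apogee
```

Shrinking `dt` refines the estimate, which is exactly the trade-off the project above examines when comparing Euler predictions against measured flight data.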
Mehlitz, Peter
2005-01-01
JPF is an explicit-state software model checker for Java bytecode. Today, JPF is a Swiss army knife for all sorts of runtime-based verification purposes. This basically means JPF is a Java virtual machine that executes your program not just once (like a normal VM), but theoretically in all possible ways, checking for property violations such as deadlocks or unhandled exceptions along all potential execution paths. If it finds an error, JPF reports the whole execution that leads to it. Unlike a normal debugger, JPF keeps track of every step of how it got to the defect.
Kalman plus weights: a time scale algorithm
Greenhall, C. A.
2001-01-01
KPW is a time scale algorithm that combines Kalman filtering with the basic time scale equation (BTSE). A single Kalman filter that estimates all clocks simultaneously is used to generate the BTSE frequency estimates, while the BTSE weights are inversely proportional to the white FM variances of the clocks. Results from simulated clock ensembles are compared to previous simulation results from other algorithms.
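The weighting rule described above can be sketched directly: assuming per-clock white-FM variance estimates are given, the BTSE combination step reduces to an inverse-variance weighted average (function names are illustrative, not from the KPW paper):

```python
# BTSE weighting sketch: each clock's deviation estimate is combined with a
# weight inversely proportional to its white-FM variance, as described above.
def btse_weights(white_fm_vars):
    inv = [1.0 / v for v in white_fm_vars]
    s = sum(inv)
    return [w / s for w in inv]          # normalised to sum to 1

def ensemble_time(clock_estimates, white_fm_vars):
    w = btse_weights(white_fm_vars)
    return sum(wi * xi for wi, xi in zip(w, clock_estimates))
```

In KPW itself the per-clock estimates come from a single Kalman filter run over all clocks simultaneously; here they are simply taken as inputs.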
NYYD Ensemble ja Riho Sibul / Anneli Remme
Remme, Anneli, 1968-
2001-01-01
Gavin Bryars' work "Jesus' Blood Never Failed Me Yet" performed by the NYYD Ensemble and Riho Sibul on 27 December at St. Paul's Church in Tartu and on 28 December at the Swedish St. Michael's Church in Tallinn. With the participation of the Tartu University Chamber Choir (in Tartu) and the chamber choir Voces Musicales (in Tallinn). Artistic director Olari Elts
Genetic Algorithm Optimized Neural Networks Ensemble as ...
African Journals Online (AJOL)
Improvements in neural network calibration models by a novel approach using neural network ensemble (NNE) for the simultaneous spectrophotometric multicomponent analysis are suggested, with a study on the estimation of the components of an antihypertensive combination, namely, atenolol and losartan potassium.
Canonical Ensemble Model for Black Hole Radiation
Indian Academy of Sciences (India)
In this paper, a canonical ensemble model for the black hole quantum tunnelling radiation is introduced. In this model the probability distribution function corresponding to the emission shell is calculated to second order. The formula of pressure and internal energy of the thermal system is modified, and the fundamental ...
Squeezing of Collective Excitations in Spin Ensembles
DEFF Research Database (Denmark)
Kraglund Andersen, Christian; Mølmer, Klaus
2012-01-01
and analytic calculations, and we obtain squeezing for a wide range of parameters. We also investigate the transfer of the squeezing properties to the cavity field and to an output mode from the cavity. Finally, we investigate how the squeezing is affected by inhomogeneities which would be present in solid-state implementations of the spin ensembles.
Quantifying Uncertainty Through Global and Mesoscale Ensembles
2009-09-30
transition to operations is in progress, with porting to the FNMOC large operational LINUX cluster (OPAL A2) complete. The mesoscale system has been...ensembles to understand model behavior within parameter space, and the NOAA-funded Hurricane Forecast Improvement Project (HFIP). For the mesoscale
The Hydrologic Ensemble Prediction Experiment (HEPEX)
Wood, Andy; Wetterhall, Fredrik; Ramos, Maria-Helena
2015-04-01
The Hydrologic Ensemble Prediction Experiment was established in March 2004, at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF), and co-sponsored by the US National Weather Service (NWS) and the European Commission (EC). The HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support. HEPEX pursues this goal through research efforts and practical implementations involving six core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. HEPEX has grown through meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. In the last decade, HEPEX has organized over a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Through these interactions and an active online blog (www.hepex.org), HEPEX has built a strong and active community of nearly 400 researchers and practitioners around the world. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.
A method for ensemble wildland fire simulation
Mark A. Finney; Isaac C. Grenfell; Charles W. McHugh; Robert C. Seli; Diane Trethewey; Richard D. Stratton; Stuart Brittain
2011-01-01
An ensemble simulation system that accounts for uncertainty in long-range weather conditions and two-dimensional wildland fire spread is described. Fuel moisture is expressed based on the energy release component, a US fire danger rating index, and its variation throughout the fire season is modeled using time series analysis of historical weather data. This analysis...
Nonlocal inhomogeneous broadening in plasmonic nanoparticle ensembles
DEFF Research Database (Denmark)
Tserkezis, Christos; Maack, Johan Rosenkrantz; Liu, Z.
)] or realistic noble metals, following sharp or smooth size distribution functions around an ensemble mean value, and identify examples where inhomogeneous broadening can be significant. We also explain how its role becomes less important in most experimentally accessible situations, provided that all...
Frederick, David A; Sandhu, Gaganjyot; Morse, Patrick J; Swami, Viren
2016-06-01
We examined the prevalence and correlates of satisfaction with appearance and weight. Participants (N=12,176) completed an online survey posted on the NBCNews.com and Today.com websites. Few men and women were very to extremely dissatisfied with their physical appearances (6%; 9%), but feeling very to extremely dissatisfied with weight was more common (15%; 20%). Only about one-fourth of men and women felt very to extremely satisfied with their appearances (28%; 26%) and weights (24%; 20%). Men and women with higher body masses reported higher appearance and weight dissatisfaction. Dissatisfied people had higher Neuroticism, more preoccupied and fearful attachment styles, and spent more hours watching television. In contrast, satisfied people had higher Openness, Conscientiousness, and Extraversion, were more secure in attachment style, and had higher self-esteem and life satisfaction. These findings highlight the high prevalence of body dissatisfaction and the factors linked to dissatisfaction among U.S. adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
A Theoretical Analysis of Why Hybrid Ensembles Work
Directory of Open Access Journals (Sweden)
Kuo-Wei Hsu
2017-01-01
Full Text Available Inspired by the group decision making process, ensembles, or combinations of classifiers, have been found favorable in a wide variety of application domains. Some researchers propose using a mixture of two different types of classification algorithms to create a hybrid ensemble. Why does such an ensemble work? The question remains. Following the concept of diversity, which is one of the fundamental elements of the success of ensembles, we conduct a theoretical analysis of why hybrid ensembles work, connecting the use of different algorithms to accuracy gain. We also conduct experiments on the classification performance of hybrid ensembles of classifiers created by the decision tree and naïve Bayes classification algorithms, each of which is a top data mining algorithm and often used to create non-hybrid ensembles. Therefore, through this paper, we provide a complement to the theoretical foundation of creating and using hybrid ensembles.
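The diversity argument has a simple quantitative core: for k classifiers that err independently with individual accuracy p > 1/2, the accuracy of the majority vote is a binomial tail probability that exceeds p. A small illustration of this textbook Condorcet-style bound (not the paper's own analysis):

```python
from math import comb

def majority_accuracy(p, k):
    """Probability that more than half of k independent classifiers,
    each correct with probability p, are correct (odd k assumed)."""
    return sum(comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range((k // 2) + 1, k + 1))
```

For example, three independent classifiers at 70% accuracy vote their way to about 78.4%, and adding more independent members keeps improving the ensemble.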
Impact of hybrid GSI analysis using ETR ensembles
Indian Academy of Sciences (India)
NCMRWF Global Forecast System) with ETR (Ensemble Transform with Rescaling) based Global Ensemble Forecast (GEFS) of resolution T-190L28 is investigated. The experiment is conducted for a period of one week in June 2013 and forecast ...
An educational model for ensemble streamflow simulation and uncertainty analysis
National Research Council Canada - National Science Library
AghaKouchak, A; Nakhjiri, N; Habib, E
2013-01-01
...) are interconnected. The educational toolbox includes a MATLAB Graphical User Interface (GUI) and an ensemble simulation scheme that can be used for teaching uncertainty analysis, parameter estimation, ensemble simulation and model sensitivity...
Space Applications for Ensemble Detection and Analysis Project
National Aeronautics and Space Administration — Ensemble Detection is both a measurement technique and analysis tool. Like a prism that separates light into spectral bands, an ensemble detector mixes a signal with...
Global Ensemble Forecast System (GEFS) [2.5 Deg.
National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...
Using ensemble forecasting for wind power
Energy Technology Data Exchange (ETDEWEB)
Giebel, G.; Landberg, L.; Badger, J. [Risoe National Lab., Roskilde (Denmark); Sattler, K.
2003-07-01
Short-term prediction of wind power has a long tradition in Denmark. It is an essential tool for the operators to keep the grid from becoming unstable in a region like Jutland, where more than 27% of the electricity consumption comes from wind power. This means that the minimum load is already lower than the maximum production from wind energy alone. Danish utilities have therefore used short-term prediction of wind energy since the mid-1990s. However, the accuracy is still far from sufficient in the eyes of the utilities (accustomed to load forecasts accurate to within 5% on a one-week horizon). The Ensemble project tries to alleviate the dependency of the forecast quality on a single model by using multiple models, and will also investigate the possibility of using the model spread of multiple models, or of dedicated ensemble runs, to predict the uncertainty of the forecast. Usually, short-term forecasting works (especially for horizons beyond 6 hours) by gathering input from a Numerical Weather Prediction (NWP) model. This input data is used together with online data in statistical models (as is the case, e.g., in Zephyr/WPPT) to yield the output of the wind farms or of a whole region for the next 48 hours (limited only by the NWP model horizon). For the accuracy of the final production forecast, the accuracy of the NWP prediction is paramount. While many efforts are underway to increase the accuracy of the NWP forecasts themselves (which ultimately are limited by the amount of computing power available, the lack of a tight observational network on the Atlantic and limited physics modelling), another approach is to use ensembles of different models or different model runs. This can be either an ensemble of different models' output for the same area, using different data assimilation schemes and different model physics, or a dedicated ensemble run by a large institution, where the same model is run with slight variations in initial conditions and
Filter ensemble regularized common spatial pattern for EEG classification
Su, Yuxi; Li, Yali; Wang, Shengjin
2015-07-01
Common Spatial Pattern (CSP) is one of the most effective feature extraction algorithms for Brain-Computer Interfaces (BCI). Despite its advantages of wide versatility and high efficiency, CSP is shown to be non-robust to noise and prone to overfitting when the number of training samples is limited. In order to overcome these problems, Regularized Common Spatial Pattern (RCSP) was further proposed. RCSP regularizes the covariance matrix estimation with two parameters, which reduces the estimation variance and improves stationarity under small-sample conditions. However, RCSP does not make full use of the frequency information. In this paper, we present a filter ensemble technique for RCSP (FERCSP) to further extract frequency information and aggregate all the RCSPs efficiently to get an ensemble-based solution. The performance of the proposed algorithm is evaluated on data set IVa of BCI Competition III against five other RCSP-based algorithms. The experimental results show that FERCSP significantly outperforms the existing methods in classification accuracy. FERCSP outperforms the CSP algorithm and the R-CSP-A algorithm in all five subjects with an average improvement of 6% in accuracy.
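The two-parameter regularization mentioned above is commonly written as a double shrinkage of the covariance estimate. The sketch below follows the widely used formulation of Lu et al. (an assumption on my part; the exact parameterization in the paper above may differ): beta blends the subject's covariance with a generic one estimated from other subjects, and gamma shrinks the result toward a scaled identity:

```python
import numpy as np

def rcsp_cov(cov_subject, cov_generic, beta, gamma):
    """Two-parameter regularized covariance (hedged sketch, not FERCSP code).

    beta in [0, 1]: blend subject-specific and generic covariance.
    gamma in [0, 1]: shrink toward (trace/d) * I to stabilise small samples.
    """
    c = (1.0 - beta) * cov_subject + beta * cov_generic
    d = c.shape[0]
    return (1.0 - gamma) * c + gamma * (np.trace(c) / d) * np.eye(d)
```

With beta = gamma = 0 this reduces to the plain sample covariance, recovering ordinary CSP; the spatial filters themselves then come from the usual generalized eigendecomposition of the two class covariances.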
Directory of Open Access Journals (Sweden)
Kesztyüs, Dorothea
2014-01-01
Full Text Available [english] Aim: To study associations between health-related quality of life (HRQoL), frequency of illness, and weight in primary school children in southern Germany. Methods: Data from baseline measurements of the outcome evaluation of a teacher-based health promotion programme ("Join the Healthy Boat") were analysed. Parents provided information about their children's HRQoL (KINDL, EQ5D-Y Visual Analogue Scale). The number of visits to a physician, children's days of absence because of sickness, and parental days of absence from work due to their children's illness during the last year of school/kindergarten were queried. Children's weight status was determined by body mass index (BMI), central obesity by a waist-to-height ratio (WHtR) ≥0.5. Results: Of 1,888 children (7.1±0.6 years), 7.8% were underweight, 82% had normal weight, 5.7% were overweight and 4.4% obese. 8.4% of all children were centrally obese. Bivariate analysis showed no significant differences for parental absence and visits to a physician in weight groups classified by BMI, but obese children had more sick days than non-obese children. Centrally obese children differed significantly from the rest in the number of sick days and visits to a physician, but not in the frequency of parental absence. In regression analyses, central obesity correlated significantly with the EQ5D-Y VAS, the KINDL total score and the subscales "psyche", "family" and "friends". BMI weight groups showed no significant associations. Conclusions: Central obesity, but not BMI-derived overweight and obesity, is associated with HRQoL and visits to a physician in primary school children. Future studies should include WHtR. Preventive measures for children should focus on a reduction of, or slowed increase in, waist circumference.
Oellingrath, Inger M; Hestetun, Ingebjørg; Svendsen, Martin V
2016-02-01
To examine gender-specific associations of weight perception and appearance satisfaction with slimming attempts and eating patterns among young Norwegian adolescents. Cross-sectional study. Adolescent dietary data were reported by parents using a retrospective FFQ. Eating patterns were identified using principal component analysis. Adolescents' reported weight perception, appearance satisfaction and slimming attempts were analysed using cross-tabulation and Pearson's χ² test. Associations between perceived weight, appearance satisfaction and slimming attempts/eating patterns were examined using multiple logistic regression analysis. Primary schools, Telemark, Norway. Children (n 469), mean age 12·7 (sd 0·3) years, and parents. Gender differences were observed in self-perceived weight and appearance satisfaction. Girls were most satisfied with appearance when feeling thin, boys when feeling just the right weight. Perceived overweight was the main predictor of slimming attempts across genders (adjusted OR=15·3; 95 % CI 6·0, 39·1 for girls; adjusted OR=18·2; 95 % CI 5·8, 57·3 for boys). Low appearance satisfaction was associated with slimming attempts (adjusted OR=3·3; 95 % CI 1·0, 10·5) and a dieting eating pattern (adjusted OR=2·8; 95 % CI 1·5, 5·2) in girls. Perceived underweight was associated with a junk/convenience eating pattern in boys (adjusted OR=2·8; 95 % CI 1·2, 6·4). Gender differences were observed in subjective body concerns. Perceived overweight was the main predictor of slimming attempts by both genders. Different aspects of body dissatisfaction were related to different food behaviours in boys and girls. Health professionals should be aware of these gender differences when planning health promotion programmes targeting young adolescents.
Internet's critical path horizon
Valverde, S.; Solé, R. V.
2004-03-01
Internet is known to display a highly heterogeneous structure and complex fluctuations in its traffic dynamics. Congestion seems to be an inevitable result of users' behavior coupled to the network dynamics, and its effects should be minimized by choosing appropriate routing strategies. But what are the requirements of routing depth in order to optimize the traffic flow? In this paper we analyse the behavior of Internet traffic with a topologically realistic spatial structure as described in a previous study [S.-H. Yook et al., Proc. Natl Acad. Sci. USA 99, 13382 (2002)]. The model involves self-regulation of packet generation and different levels of routing depth. It is shown that it reproduces the key statistical features of Internet's traffic. Moreover, we also report the existence of a critical path horizon defining a transition from low-efficiency traffic to highly efficient flow. This transition is actually a direct consequence of the web's small world architecture exploited by the routing algorithm. Once routing tables reach the network diameter, the traffic experiences a sudden transition from low-efficiency to highly efficient behavior. It is conjectured that routing policies might have spontaneously reached such a compromise in a distributed manner. Internet would thus be operating close to such critical path horizon.
Optimal initial perturbations for El Nino ensemble prediction with ensemble Kalman filter
Energy Technology Data Exchange (ETDEWEB)
Ham, Yoo-Geun; Kang, In-Sik [Seoul National University, School of Earth and Environment Sciences, Seoul (Korea); Kug, Jong-Seong [Korea Ocean Research and Development Institute, Ansan (Korea)
2009-12-15
A method for selecting optimal initial perturbations is developed within the framework of an ensemble Kalman filter (EnKF). Among the initial conditions generated by EnKF, ensemble members with fast growing perturbations are selected to optimize the ENSO seasonal forecast skills. Seasonal forecast experiments show that the forecast skills with the selected ensemble members are significantly improved compared with other ensemble members for up to 1-year lead forecasts. In addition, it is found that there is a strong relationship between the forecast skill improvements and flow-dependent instability. That is, correlation skills are significantly improved over the region where the predictable signal is relatively small (i.e. an inverse relationship). It is also shown that forecast skills are significantly improved during ENSO onset and decay phases, which are the most unpredictable periods among the ENSO events. (orig.)
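The member-selection step described above can be outlined generically. The following is an illustrative sketch (not the authors' implementation; function and array names are assumptions), where growth is measured as the ratio of perturbation norms between two times:

```python
import numpy as np

def select_fast_growing(ens_t0, ens_t1, n_select):
    """Return indices of the ensemble members whose perturbations about
    the ensemble mean grew the most between two analysis times."""
    pert0 = ens_t0 - ens_t0.mean(axis=0)   # initial perturbations
    pert1 = ens_t1 - ens_t1.mean(axis=0)   # evolved perturbations
    growth = np.linalg.norm(pert1, axis=1) / np.linalg.norm(pert0, axis=1)
    # indices of the n_select fastest-growing members
    return np.argsort(growth)[::-1][:n_select]

rng = np.random.default_rng(0)
e0 = rng.normal(size=(10, 5))                   # 10 members, 5 state variables
e1 = e0 * rng.uniform(0.5, 3.0, size=(10, 1))   # member-dependent growth
idx = select_fast_growing(e0, e1, 3)
```

The selected subset would then be integrated forward for the seasonal forecast, while the slower-growing members are discarded.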
Quantum canonical ensemble: A projection operator approach
Magnus, Wim; Lemmens, Lucien; Brosens, Fons
2017-09-01
Knowing the exact number of particles N, and taking this knowledge into account, the quantum canonical ensemble imposes a constraint on the occupation number operators. The constraint particularly hampers the systematic calculation of the partition function and any relevant thermodynamic expectation value for arbitrary but fixed N. On the other hand, fixing only the average number of particles, one may remove the above constraint and simply factorize the traces in Fock space into traces over single-particle states. As is well known, that would be the strategy of the grand-canonical ensemble which, however, comes with an additional Lagrange multiplier to impose the average number of particles. The appearance of this multiplier can be avoided by invoking a projection operator that enables a constraint-free computation of the partition function and its derived quantities in the canonical ensemble, at the price of an angular or contour integration. Introduced in the recent past to handle various issues related to particle-number projected statistics, the projection operator approach proves beneficial to a wide variety of problems in condensed matter physics for which the canonical ensemble offers a natural and appropriate environment. In this light, we present a systematic treatment of the canonical ensemble that embeds the projection operator into the formalism of second quantization while explicitly fixing N, the very number of particles rather than the average. The approach applies to both bosonic and fermionic systems in arbitrary dimensions, and transparent integral representations are provided for the partition function Z_N and the Helmholtz free energy F_N as well as for two- and four-point correlation functions. The chemical potential is not a Lagrange multiplier regulating the average particle number but can be extracted from F_{N+1} - F_N, as illustrated for a two-dimensional fermion gas.
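The particle-number projection alluded to above can be made concrete. In standard notation (with N̂ the number operator and P̂_N the projector onto the N-particle sector), the canonical partition function is commonly written as the angular integral below; this is the standard textbook form, not necessarily the exact representation used by the authors:

```latex
Z_N \;=\; \mathrm{Tr}\!\left[\hat{P}_N\, e^{-\beta \hat{H}}\right]
     \;=\; \frac{1}{2\pi}\int_{0}^{2\pi} \! d\phi\;
           e^{-iN\phi}\,\mathrm{Tr}\!\left[e^{i\phi \hat{N}}\, e^{-\beta \hat{H}}\right],
\qquad
F_N \;=\; -\frac{1}{\beta}\ln Z_N,
\qquad
\mu \;\approx\; F_{N+1} - F_N .
```

The phase factor e^{-iNφ} filters out all Fock-space sectors except the one with exactly N particles, which is what removes the occupation-number constraint from the trace.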
Observation bias correction with an ensemble Kalman filter
Fertig, Elana J.; Baek, Seung-Jong; Hunt, Brian R.; Ott, Edward; Szunyogh, Istvan; Aravéquia, José A.; Kalnay, Eugenia; Li, Hong; Liu, Junjie
2009-01-01
This paper considers the use of an ensemble Kalman filter to correct satellite radiance observations for state dependent biases. Our approach is to use state-space augmentation to estimate satellite biases as part of the ensemble data assimilation procedure. We illustrate our approach by applying it to a particular ensemble scheme—the local ensemble transform Kalman filter (LETKF)—to assimilate simulated biased atmospheric infrared sounder brightness temperature observations from 15 channels ...
pathChirp: Efficient Available Bandwidth Estimation for Network Paths
Energy Technology Data Exchange (ETDEWEB)
Cottrell, Les
2003-04-30
This paper presents pathChirp, a new active probing tool for estimating the available bandwidth on a communication network path. Based on the concept of "self-induced congestion," pathChirp features an exponential flight pattern of probes we call a chirp. Chirps offer several significant advantages over current probing schemes based on packet pairs or packet trains. By rapidly increasing the probing rate within each chirp, pathChirp obtains a rich set of information from which to dynamically estimate the available bandwidth. Since it uses only packet interarrival times for estimation, pathChirp does not require synchronized or highly stable clocks at the sender and receiver. We test pathChirp with simulations and Internet experiments and find that it provides good estimates of the available bandwidth while using only a fraction of the number of probe bytes that current state-of-the-art techniques use.
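The exponential flight pattern of a chirp can be sketched in a few lines. This is an illustrative schedule generator (function and parameter names are assumptions, not part of the pathChirp tool), showing how geometrically shrinking inter-packet gaps produce an exponentially increasing instantaneous probing rate:

```python
def chirp_schedule(packet_size_bytes, base_gap_s, spread_factor, n_packets):
    """Send times and instantaneous probing rates for one chirp: each
    inter-packet gap shrinks by `spread_factor`, so the probing rate
    grows exponentially across the chirp (self-induced congestion)."""
    times, rates = [0.0], []
    gap = base_gap_s
    for _ in range(n_packets - 1):
        rates.append(8 * packet_size_bytes / gap)   # bits per second
        times.append(times[-1] + gap)
        gap /= spread_factor
    return times, rates

send_times, probe_rates = chirp_schedule(1000, 0.01, 1.2, 6)
```

The receiver would look for the point in the chirp where queuing delays start to build up; the probing rate at that point is the available-bandwidth estimate.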
Chen, Jie; Brissette, François; Arsenault, Richard; Gatien, Philippe; Roy, Pierre-Olivier; Li, Zhi; Turcotte, Richard
2013-04-01
Probabilistic streamflow prediction based on past climate records or meteorological forecasts has drawn much attention in recent years. It is usually incorporated into operational forecasting systems by government agencies and industries to deal with water resources management and regulation problems. This work presents an operational prototype for short- to medium-term ensemble streamflow predictions over Quebec, Canada. The system uses ensemble meteorological forecasts for short-term (up to 7 days) forecasting, transitioning to a stochastic weather generator conditioned on historical data for the period exceeding 7 days. The precipitation and temperature series are then fed into a combination of 32 hydrology models to account for both the meteorological and hydrology modelling uncertainties. A novel post-processing approach was implemented to correct the biases and the under-dispersion of ensemble meteorological forecasts. This post-processing approach links the mean of the ensemble meteorological forecast to parameters of a stochastic weather generator (absolute probability of precipitation and observed precipitation mean in the case of precipitation). The stochastic weather generator is then used to generate unbiased time series with accurate spread. Results show that the post-processed meteorological forecasts displayed skill for a period up to 7 days for both precipitation and temperature. The ensemble streamflow prediction displayed more skill than when using the deterministic forecast or the stochastic weather generator not conditioned on the ensemble meteorological forecasts. To tackle the uncertainty linked to the hydrology model, 4 different models were calibrated with up to 9 different efficiency metrics (for a combination of 32 model/calibration pairs). Nine different averaging schemes were compared to attribute weights to the 32 combinations. The best averaging method (Granger-Ramanathan) produced estimates with a much better efficiency than the best
Epidemic extinction paths in complex networks
Hindes, Jason; Schwartz, Ira B.
2017-05-01
We study the extinction of long-lived epidemics on finite complex networks induced by intrinsic noise. Applying analytical techniques to the stochastic susceptible-infected-susceptible model, we predict the distribution of large fluctuations, the most probable or optimal path through a network that leads to a disease-free state from an endemic state, and the average extinction time in general configurations. Our predictions agree with Monte Carlo simulations on several networks, including synthetic weighted and degree-distributed networks with degree correlations, and an empirical high school contact network. In addition, our approach quantifies characteristic scaling patterns for the optimal path and distribution of large fluctuations, both near and away from the epidemic threshold, in networks with heterogeneous eigenvector centrality and degree distributions.
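A minimal stochastic SIS simulation of the kind such analytical predictions are checked against can be sketched as follows. This is a generic Gillespie implementation, not the authors' code; the network, rates, and names are illustrative:

```python
import random

def sis_extinction_time(adj, beta, mu, seed=0, t_max=1e4):
    """Gillespie simulation of SIS dynamics on a network given as
    adjacency lists; returns the time at which the infection goes
    extinct (capped at t_max).  All nodes start infected."""
    rng = random.Random(seed)
    infected = set(range(len(adj)))
    t = 0.0
    while infected and t < t_max:
        rec_rate = mu * len(infected)                      # total recovery rate
        si_edges = [(i, j) for i in infected
                    for j in adj[i] if j not in infected]  # S-I contact edges
        total = rec_rate + beta * len(si_edges)
        t += rng.expovariate(total)                        # time to next event
        if rng.random() < rec_rate / total:
            infected.remove(rng.choice(sorted(infected)))  # a node recovers
        else:
            infected.add(rng.choice(si_edges)[1])          # infection spreads
    return t

# complete graph on 5 nodes; recovery dominates, so extinction is fast
adj = [[j for j in range(5) if j != i] for i in range(5)]
t_ext = sis_extinction_time(adj, beta=0.01, mu=1.0)
```

Averaging the extinction time over many seeds and network realizations gives the Monte Carlo estimates that the optimal-path predictions are compared against.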
Han, Longfei; Luo, Senlin; Yu, Jianmin; Pan, Limin; Chen, Songjing
2015-03-01
Diabetes mellitus is a chronic disease and a worldwide public health challenge. It has been shown that 50-80% of T2DM cases are undiagnosed. In this paper, support vector machines are utilized to screen for diabetes, and an ensemble learning module is added, which turns the "black box" of SVM decisions into comprehensible and transparent rules and also helps address the class-imbalance problem. Results on China Health and Nutrition Survey data show that the proposed ensemble learning method generates rule sets with a weighted average precision of 94.2% and a weighted average recall of 93.9% over all classes. Furthermore, the hybrid system can provide a tool for the diagnosis of diabetes, and it supports a second opinion for lay users.
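The "weighted average" figures quoted above are support-weighted means of per-class metrics. A minimal sketch of that aggregation, with illustrative numbers rather than the study's data:

```python
def weighted_average(metric_per_class, support_per_class):
    """Support-weighted average of a per-class metric, the aggregation
    behind 'weighted average precision/recall' style reporting."""
    total = sum(support_per_class)
    return sum(m * s for m, s in zip(metric_per_class, support_per_class)) / total

# illustrative per-class precisions and class sizes (not from the study)
wap = weighted_average([0.96, 0.88], [800, 200])
```

Weighting by support means the majority class dominates the headline figure, which is why imbalance-aware methods are usually also evaluated with per-class recall.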
Exploring and Listening to Chinese Classical Ensembles in General Music
Zhang, Wenzhuo
2017-01-01
Music diversity is valued in theory, but the extent to which it is efficiently presented in music class remains limited. Within this article, I aim to bridge this gap by introducing four genres of Chinese classical ensembles--Qin and Xiao duets, Jiang Nan bamboo and silk ensembles, Cantonese ensembles, and contemporary Chinese orchestras--into the…
Ensemble methods for robust 3D face recognition using commodity depth sensors
Schimbinschi, F.; Schomaker, L.; Wiering, M.
2016-01-01
In this paper we introduce a new dataset and pose invariant sampling method and describe the ensemble methods used for recognizing faces in 3D scenes, captured using commodity depth sensors. We use the 3D SIFT key point detector to take advantage of the similarities between faces, which leads to a
The Relationship between English Language Learner Status and Music Ensemble Participation
Lorah, Julie A.; Sanders, Elizabeth A.; Morrison, Steven J.
2014-01-01
Authors of previous research have reported that U.S. English language learner (ELL) students participate in school-sponsored music ensembles (band, orchestra, and choir) at a lower rate than their native-English-speaking peers (non-ELLs). The current study examined this phenomenon using a nationally representative sample of U.S. 10th graders (14-…
Grilo, Carlos M; White, Marney A; Masheb, Robin M
2012-05-01
Undue influence of shape or weight on self-evaluation--referred to as overvaluation--is a core feature across eating disorders, but is not a diagnostic requirement for binge-eating disorder (BED). This study examined overvaluation of shape/weight in ethnically diverse obese patients with BED seeking treatment in primary care. Participants were a consecutive series of 142 (105 female and 37 male) participants with BED; 43% were Caucasian, 37% were African-American, 13% were Hispanic-American, and 7% were of "other" ethnicity. Participants categorized with overvaluation (N=97; 68%) versus without clinical overvaluation (N=45; 32%) did not differ significantly in ethnicity/race, age, gender, body mass index, or binge-eating frequency. The overvaluation group had significantly greater levels of eating disorder psychopathology, poorer psychological functioning (higher depression, lower self-esteem), and greater anxiety disorder co-morbidity than the group who did not overvalue their shape/weight. The greater eating disorder and psychological disturbance levels in the overvaluation group relative to the non-overvaluation group persisted after controlling for psychiatric co-morbidity. Our findings, based on an ethnically diverse series of patients seeking treatment in general primary care settings, are consistent with findings from specialist clinics and suggest that overvaluation does not simply reflect concerns commensurate with being obese or with frequency of binge-eating, but is strongly associated with heightened eating-related psychopathology and psychological distress. Overvaluation of shape/weight warrants consideration as a diagnostic specifier for BED as it provides important information about severity. Copyright © 2012 Elsevier Ltd. All rights reserved.
Berger, Uwe; Weitkamp, Katharina; Strauss, Bernhard
2009-03-01
From a clinical point of view, a high 'objective' BMI or an early biological onset of puberty are well-known risk factors for eating disorders. In contrast, little is known about irrational beliefs and subjective meanings of body weight and pubertal timing. Mostly using standardised questionnaires, 136 girls with an average age of 12 years were asked to report their eating behaviour, (body) self-esteem, body dissatisfaction, weight limits, estimations of future BMI, subjective pubertal timing and appearance-related social comparisons. Results showed significant correlations between disturbed eating behaviour and the existence of a weight limit, which was reported by 45% of the girls. Twenty-two per cent wished to have a future BMI beneath the 10th percentile. In terms of pubertal timing, girls who perceived themselves as either 'early starters' or 'late starters' reported significantly more risky eating behaviour. Results are discussed with a focus on the psychotherapeutic use of our findings as well as the opportunity for the development of preventive strategies.
The role of model dynamics in ensemble Kalman filter performance for chaotic systems
Ng, G.-H.C.; McLaughlin, D.; Entekhabi, D.; Ahanin, A.
2011-01-01
The ensemble Kalman filter (EnKF) is susceptible to losing track of observations, or 'diverging', when applied to large chaotic systems such as atmospheric and ocean models. Past studies have demonstrated the adverse impact of sampling error during the filter's update step. We examine how system dynamics affect EnKF performance, and whether the absence of certain dynamic features in the ensemble may lead to divergence. The EnKF is applied to a simple chaotic model, and ensembles are checked against singular vectors of the tangent linear model (corresponding to short-term growth) and Lyapunov vectors (corresponding to long-term growth). Results show that the ensemble strongly aligns itself with the subspace spanned by unstable Lyapunov vectors. Furthermore, the filter avoids divergence only if the full linearized long-term unstable subspace is spanned. However, short-term dynamics also become important as non-linearity in the system increases. Non-linear movement prevents errors in the long-term stable subspace from decaying indefinitely. If these errors then undergo linear intermittent growth, a small ensemble may fail to properly represent all important modes, causing filter divergence. A combination of long- and short-term growth dynamics is thus critical to EnKF performance. These findings can help in developing practical robust filters based on model dynamics. © 2011 The Authors. Tellus A © 2011 John Wiley & Sons A/S.
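For reference, the EnKF update step discussed in several of these abstracts can be sketched in its generic stochastic form (perturbed observations, diagonal observation-error covariance). This is an illustrative textbook form, not the particular filter configuration used in the study:

```python
import numpy as np

def enkf_update(ens, y, H, r_var, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    ens   : (n_members, n_state) forecast ensemble
    y     : (n_obs,) observation vector
    H     : (n_obs, n_state) linear observation operator
    r_var : observation-error variance (diagonal R assumed)
    """
    n = ens.shape[0]
    X = ens - ens.mean(axis=0)                  # state perturbations
    Y = X @ H.T                                 # observation-space perturbations
    Pyy = Y.T @ Y / (n - 1) + r_var * np.eye(len(y))
    Pxy = X.T @ Y / (n - 1)
    K = Pxy @ np.linalg.inv(Pyy)                # Kalman gain
    # perturbing the observations keeps the analysis spread consistent
    y_pert = y + rng.normal(0.0, np.sqrt(r_var), size=(n, len(y)))
    return ens + (y_pert - ens @ H.T) @ K.T

rng = np.random.default_rng(1)
ens = rng.normal(5.0, 2.0, size=(50, 3))        # forecast ensemble
H = np.eye(3)[:1]                               # observe the first component only
post = enkf_update(ens, np.array([0.0]), H, r_var=0.5, rng=rng)
```

The analysis mean is pulled toward the observation and the spread shrinks in the observed component; localization and inflation, which are central to the divergence issues discussed above, are omitted for brevity.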
Directory of Open Access Journals (Sweden)
Elias D. Nino-Ruiz
2017-07-01
Full Text Available In this paper, a matrix-free posterior ensemble Kalman filter implementation based on a modified Cholesky decomposition is proposed. The method works as follows: the precision matrix of the background error distribution is estimated based on a modified Cholesky decomposition. The resulting estimator can be expressed in terms of Cholesky factors which can be updated based on a series of rank-one matrices in order to approximate the precision matrix of the analysis distribution. By using this matrix, the posterior ensemble can be built by either sampling from the posterior distribution or using synthetic observations. Furthermore, the computational effort of the proposed method is linear with regard to the model dimension and the number of observed components from the model domain. Experimental tests are performed making use of the Lorenz-96 model. The results reveal that the accuracy of the proposed implementation in terms of root-mean-square error is similar to, and in some cases better than, that of a well-known ensemble Kalman filter (EnKF) implementation: the local ensemble transform Kalman filter. In addition, the results are comparable to those obtained by the EnKF with large ensemble sizes.
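The modified Cholesky estimate of the background precision matrix can be sketched as a sequence of localized regressions. The following is an illustrative reconstruction under stated assumptions (state components are assumed ordered so that nearby indices are physically close; names and the localization radius are not from the paper):

```python
import numpy as np

def precision_modified_cholesky(ens, radius):
    """Estimate a localized precision matrix from an ensemble via the
    modified Cholesky decomposition: regress each component on its
    predecessors within `radius`, giving B ≈ L.T @ inv(D) @ L with
    L unit lower triangular and D diagonal (residual variances)."""
    X = ens - ens.mean(axis=0)
    n, m = X.shape
    L = np.eye(m)
    d = np.empty(m)
    d[0] = X[:, 0].var(ddof=1)
    for i in range(1, m):
        lo = max(0, i - radius)                 # localization window
        Z = X[:, lo:i]
        beta, *_ = np.linalg.lstsq(Z, X[:, i], rcond=None)
        resid = X[:, i] - Z @ beta
        L[i, lo:i] = -beta
        d[i] = resid.var(ddof=1)
    return L.T @ np.diag(1.0 / d) @ L

rng = np.random.default_rng(2)
ens = rng.normal(size=(200, 6))                 # 200 members, 6 state variables
B = precision_modified_cholesky(ens, radius=2)
```

Because L is unit lower triangular and D is positive, the estimator is symmetric positive definite by construction, and the per-component regressions are what make the cost linear in the model dimension.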
A Link-Based Cluster Ensemble Approach For Improved Gene Expression Data Analysis
Directory of Open Access Journals (Sweden)
P.Balaji
2015-01-01
Full Text Available It is difficult to select the most suitable clustering algorithm and configuration for a given set of gene expression data, because there are a huge number of candidate methods and a huge number of gene expression datasets. At present many researchers prefer to use hierarchical clustering in its different forms, but this is not always optimal. Cluster ensemble research can address this problem by automatically merging multiple data partitions, drawn from a wide range of different clusterings of any dimensions, to improve both the quality and robustness of the clustering result. However, many existing ensemble approaches use an association matrix to condense sample-cluster and co-occurrence statistics, so relations within the ensemble are captured only at a coarse level while relations among clusters are discarded. Recovering these missing associations can greatly expand the capability of such ensemble methodologies for microarray data clustering. We propose a general k-means cluster ensemble approach for clustering categorical data into the required number of partitions.
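The evidence-accumulation idea behind association-matrix ensembles can be sketched compactly. The following is an illustrative implementation with a simple threshold-based consensus (not the link-based method proposed in the paper); all names and the toy partitions are assumptions:

```python
import numpy as np

def coassociation_matrix(partitions):
    """Build the co-association matrix from a list of flat partitions:
    entry (i, j) is the fraction of partitions in which samples i and j
    share a cluster."""
    partitions = np.asarray(partitions)          # (n_partitions, n_samples)
    r, n = partitions.shape
    C = np.zeros((n, n))
    for labels in partitions:
        C += (labels[:, None] == labels[None, :])
    return C / r

def consensus_by_threshold(C, tau):
    """Simple consensus: connected components of the graph whose edges
    join pairs co-clustered in more than a fraction tau of partitions."""
    n = len(C)
    labels, current = -np.ones(n, dtype=int), 0
    for i in range(n):
        if labels[i] < 0:
            stack = [i]
            while stack:
                k = stack.pop()
                if labels[k] < 0:
                    labels[k] = current
                    stack.extend(np.flatnonzero((C[k] > tau) & (labels < 0)))
            current += 1
    return labels

parts = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]   # three base partitions
C = coassociation_matrix(parts)
final = consensus_by_threshold(C, tau=0.5)
```

The "missing associations" the paper targets are precisely the relations this matrix flattens away: C records only pairwise sample co-occurrence, not how whole clusters relate across partitions.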
Portfolio optimization using local linear regression ensembles in RapidMiner
Nagy, Gabor; Barta, Gergo; Henk, Tamas
2015-01-01
In this paper we implement a Local Linear Regression Ensemble Committee (LOLREC) to predict 1-day-ahead returns of 453 assets from the S&P 500. The estimates and the historical returns of the committees are used to compute the weights of the portfolio from the 453 stocks. The proposed method outperforms benchmark portfolio selection strategies that optimize the growth rate of the capital. We investigate the effect of algorithm parameter m (the number of selected stocks) on achieved average annua...
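The abstract leaves the portfolio-weighting scheme unspecified. As a hedged illustration of the general step, turning per-asset return estimates into portfolio weights, here is a simple long-only normalization (an illustrative scheme, not LOLREC's actual method):

```python
def long_only_weights(predicted_returns):
    """Convert 1-day-ahead return estimates into long-only portfolio
    weights: keep only assets with positive predicted return and weight
    them proportionally."""
    pos = [max(r, 0.0) for r in predicted_returns]
    total = sum(pos)
    if total == 0.0:
        return [1.0 / len(pos)] * len(pos)   # fall back to equal weight
    return [p / total for p in pos]

w = long_only_weights([0.02, -0.01, 0.03, 0.0])
```

Growth-optimal benchmarks of the kind the paper compares against instead choose weights maximizing the expected log-return of the portfolio.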