Multiple-point statistical prediction on fracture networks at Yucca Mountain
International Nuclear Information System (INIS)
Liu, X.Y; Zhang, C.Y.; Liu, Q.S.; Birkholzer, J.T.
2009-01-01
In many underground nuclear waste repository systems, such as at Yucca Mountain, water flow rate and amount of water seepage into the waste emplacement drifts are mainly determined by hydrological properties of fracture network in the surrounding rock mass. Natural fracture network system is not easy to describe, especially with respect to its connectivity which is critically important for simulating the water flow field. In this paper, we introduced a new method for fracture network description and prediction, termed multi-point-statistics (MPS). The process of the MPS method is to record multiple-point statistics concerning the connectivity patterns of a fracture network from a known fracture map, and to reproduce multiple-scale training fracture patterns in a stochastic manner, implicitly and directly. It is applied to fracture data to study flow field behavior at the Yucca Mountain waste repository system. First, the MPS method is used to create a fracture network with an original fracture training image from Yucca Mountain dataset. After we adopt a harmonic and arithmetic average method to upscale the permeability to a coarse grid, THM simulation is carried out to study near-field water flow in the surrounding waste emplacement drifts. Our study shows that connectivity or patterns of fracture networks can be grasped and reconstructed by MPS methods. In theory, it will lead to better prediction of fracture system characteristics and flow behavior. Meanwhile, we can obtain variance from flow field, which gives us a way to quantify model uncertainty even in complicated coupled THM simulations. It indicates that MPS can potentially characterize and reconstruct natural fracture networks in a fractured rock mass with advantages of quantifying connectivity of fracture system and its simulation uncertainty simultaneously.
Multiple point statistical simulation using uncertain (soft) conditional data
Hansen, Thomas Mejer; Vu, Le Thanh; Mosegaard, Klaus; Cordua, Knud Skou
2018-05-01
Geostatistical simulation methods have been used to quantify spatial variability of reservoir models since the 80s. In the last two decades, state of the art simulation methods have changed from being based on covariance-based 2-point statistics to multiple-point statistics (MPS), that allow simulation of more realistic Earth-structures. In addition, increasing amounts of geo-information (geophysical, geological, etc.) from multiple sources are being collected. This pose the problem of integration of these different sources of information, such that decisions related to reservoir models can be taken on an as informed base as possible. In principle, though difficult in practice, this can be achieved using computationally expensive Monte Carlo methods. Here we investigate the use of sequential simulation based MPS simulation methods conditional to uncertain (soft) data, as a computational efficient alternative. First, it is demonstrated that current implementations of sequential simulation based on MPS (e.g. SNESIM, ENESIM and Direct Sampling) do not account properly for uncertain conditional information, due to a combination of using only co-located information, and a random simulation path. Then, we suggest two approaches that better account for the available uncertain information. The first make use of a preferential simulation path, where more informed model parameters are visited preferentially to less informed ones. The second approach involves using non co-located uncertain information. For different types of available data, these approaches are demonstrated to produce simulation results similar to those obtained by the general Monte Carlo based approach. These methods allow MPS simulation to condition properly to uncertain (soft) data, and hence provides a computationally attractive approach for integration of information about a reservoir model.
History Matching Through a Smooth Formulation of Multiple-Point Statistics
DEFF Research Database (Denmark)
Melnikova, Yulia; Zunino, Andrea; Lange, Katrine
2014-01-01
and the mismatch with multiple-point statistics. As a result, in the framework of the Bayesian approach, such a solution belongs to a high posterior region. The methodology, while applicable to any inverse problem with a training-image-based prior, is especially beneficial for problems which require expensive......We propose a smooth formulation of multiple-point statistics that enables us to solve inverse problems using gradient-based optimization techniques. We introduce a differentiable function that quantifies the mismatch between multiple-point statistics of a training image and of a given model. We...... show that, by minimizing this function, any continuous image can be gradually transformed into an image that honors the multiple-point statistics of the discrete training image. The solution to an inverse problem is then found by minimizing the sum of two mismatches: the mismatch with data...
Jandrisevits, Carmen; Marschallinger, Robert
2014-05-01
Quarternary sediments in overdeepened alpine valleys and basins in the Eastern Alps bear substantial groundwater resources. The associated aquifer systems are generally geometrically complex with highly variable hydraulic properties. 3D geological models provide predictions of both geometry and properties of the subsurface required for subsequent modelling of groundwater flow and transport. In hydrology, geostatistical Kriging and Kriging based conditional simulations are widely used to predict the spatial distribution of hydrofacies. In the course of investigating the shallow aquifer structures in the Zell basin in the Upper Salzach valley (Salzburg, Austria), a benchmark of available geostatistical modelling and simulation methods was performed: traditional variogram based geostatistical methods, i.e. Indicator Kriging, Sequential Indicator Simulation and Sequential Indicator Co - Simulation were used as well as Multiple Point Statistics. The ~ 6 km2 investigation area is sampled by 56 drillings with depths of 5 to 50 m; in addition, there are 2 geophysical sections with lengths of 2 km and depths of 50 m. Due to clustered drilling sites, indicator Kriging models failed to consistently model the spatial variability of hydrofacies. Using classical variogram based geostatistical simulation (SIS), equally probable realizations were generated with differences among the realizations providing an uncertainty measure. The yielded models are unstructured from a geological point - they do not portray the shapes and lateral extensions of associated sedimentary units. Since variograms consider only two - point spatial correlations, they are unable to capture the spatial variability of complex geological structures. The Multiple Point Statistics approach overcomes these limitations of two point statistics as it uses a Training image instead of variograms. The 3D Training Image can be seen as a reference facies model where geological knowledge about depositional
Pilot points method for conditioning multiple-point statistical facies simulation on flow data
Ma, Wei; Jafarpour, Behnam
2018-05-01
We propose a new pilot points method for conditioning discrete multiple-point statistical (MPS) facies simulation on dynamic flow data. While conditioning MPS simulation on static hard data is straightforward, their calibration against nonlinear flow data is nontrivial. The proposed method generates conditional models from a conceptual model of geologic connectivity, known as a training image (TI), by strategically placing and estimating pilot points. To place pilot points, a score map is generated based on three sources of information: (i) the uncertainty in facies distribution, (ii) the model response sensitivity information, and (iii) the observed flow data. Once the pilot points are placed, the facies values at these points are inferred from production data and then are used, along with available hard data at well locations, to simulate a new set of conditional facies realizations. While facies estimation at the pilot points can be performed using different inversion algorithms, in this study the ensemble smoother (ES) is adopted to update permeability maps from production data, which are then used to statistically infer facies types at the pilot point locations. The developed method combines the information in the flow data and the TI by using the former to infer facies values at selected locations away from the wells and the latter to ensure consistent facies structure and connectivity where away from measurement locations. Several numerical experiments are used to evaluate the performance of the developed method and to discuss its important properties.
Accelerating simulation for the multiple-point statistics algorithm using vector quantization
Zuo, Chen; Pan, Zhibin; Liang, Hao
2018-03-01
Multiple-point statistics (MPS) is a prominent algorithm to simulate categorical variables based on a sequential simulation procedure. Assuming training images (TIs) as prior conceptual models, MPS extracts patterns from TIs using a template and records their occurrences in a database. However, complex patterns increase the size of the database and require considerable time to retrieve the desired elements. In order to speed up simulation and improve simulation quality over state-of-the-art MPS methods, we propose an accelerating simulation for MPS using vector quantization (VQ), called VQ-MPS. First, a variable representation is presented to make categorical variables applicable for vector quantization. Second, we adopt a tree-structured VQ to compress the database so that stationary simulations are realized. Finally, a transformed template and classified VQ are used to address nonstationarity. A two-dimensional (2D) stationary channelized reservoir image is used to validate the proposed VQ-MPS. In comparison with several existing MPS programs, our method exhibits significantly better performance in terms of computational time, pattern reproductions, and spatial uncertainty. Further demonstrations consist of a 2D four facies simulation, two 2D nonstationary channel simulations, and a three-dimensional (3D) rock simulation. The results reveal that our proposed method is also capable of solving multifacies, nonstationarity, and 3D simulations based on 2D TIs.
Directory of Open Access Journals (Sweden)
Yin Yanshu
2017-12-01
Full Text Available In this paper, a location-based multiple point statistics method is developed to model a non-stationary reservoir. The proposed method characterizes the relationship between the sedimentary pattern and the deposit location using the relative central position distance function, which alleviates the requirement that the training image and the simulated grids have the same dimension. The weights in every direction of the distance function can be changed to characterize the reservoir heterogeneity in various directions. The local integral replacements of data events, structured random path, distance tolerance and multi-grid strategy are applied to reproduce the sedimentary patterns and obtain a more realistic result. This method is compared with the traditional Snesim method using a synthesized 3-D training image of Poyang Lake and a reservoir model of Shengli Oilfield in China. The results indicate that the new method can reproduce the non-stationary characteristics better than the traditional method and is more suitable for simulation of delta-front deposits. These results show that the new method is a powerful tool for modelling a reservoir with non-stationary characteristics.
Høyer, Anne-Sophie; Vignoli, Giulio; Mejer Hansen, Thomas; Thanh Vu, Le; Keefer, Donald A.; Jørgensen, Flemming
2017-12-01
Most studies on the application of geostatistical simulations based on multiple-point statistics (MPS) to hydrogeological modelling focus on relatively fine-scale models and concentrate on the estimation of facies-level structural uncertainty. Much less attention is paid to the use of input data and optimal construction of training images. For instance, even though the training image should capture a set of spatial geological characteristics to guide the simulations, the majority of the research still relies on 2-D or quasi-3-D training images. In the present study, we demonstrate a novel strategy for 3-D MPS modelling characterized by (i) realistic 3-D training images and (ii) an effective workflow for incorporating a diverse group of geological and geophysical data sets. The study covers an area of 2810 km2 in the southern part of Denmark. MPS simulations are performed on a subset of the geological succession (the lower to middle Miocene sediments) which is characterized by relatively uniform structures and dominated by sand and clay. The simulated domain is large and each of the geostatistical realizations contains approximately 45 million voxels with size 100 m × 100 m × 5 m. Data used for the modelling include water well logs, high-resolution seismic data, and a previously published 3-D geological model. We apply a series of different strategies for the simulations based on data quality, and develop a novel method to effectively create observed spatial trends. The training image is constructed as a relatively small 3-D voxel model covering an area of 90 km2. We use an iterative training image development strategy and find that even slight modifications in the training image create significant changes in simulations. Thus, this study shows how to include both the geological environment and the type and quality of input information in order to achieve optimal results from MPS modelling. We present a practical workflow to build the training image and
DEFF Research Database (Denmark)
He, Xiulan; Sonnenborg, Torben; Jørgensen, Flemming
2017-01-01
-stationary geological system characterized by a network of connected buried valleys that incise deeply into layered Miocene sediments (case study in Denmark). The results show that, based on fragmented information of the formation boundaries, the MPS partition method is able to simulate a non-stationary system......Stationarity has traditionally been a requirement of geostatistical simulations. A common way to deal with non-stationarity is to divide the system into stationary sub-regions and subsequently merge the realizations for each region. Recently, the so-called partition approach that has...... the flexibility to model non-stationary systems directly was developed for multiple-point statistics simulation (MPS). The objective of this study is to apply the MPS partition method with conventional borehole logs and high-resolution airborne electromagnetic (AEM) data, for simulation of a real-world non...
DEFF Research Database (Denmark)
Barfod, Adrian
The PhD thesis presents a new method for analyzing the relationship between resistivity and lithology, as well as a method for quantifying the hydrostratigraphic modeling uncertainty related to Multiple-Point Statistical (MPS) methods. Three-dimensional (3D) geological models are im...... is to improve analysis and research of the resistivity-lithology relationship and ensemble geological/hydrostratigraphic modeling. The groundwater mapping campaign in Denmark, beginning in the 1990’s, has resulted in the collection of large amounts of borehole and geophysical data. The data has been compiled...... in two publicly available databases, the JUPITER and GERDA databases, which contain borehole and geophysical data, respectively. The large amounts of available data provided a unique opportunity for studying the resistivity-lithology relationship. The method for analyzing the resistivity...
Directory of Open Access Journals (Sweden)
A.-S. Høyer
2017-12-01
Full Text Available Most studies on the application of geostatistical simulations based on multiple-point statistics (MPS to hydrogeological modelling focus on relatively fine-scale models and concentrate on the estimation of facies-level structural uncertainty. Much less attention is paid to the use of input data and optimal construction of training images. For instance, even though the training image should capture a set of spatial geological characteristics to guide the simulations, the majority of the research still relies on 2-D or quasi-3-D training images. In the present study, we demonstrate a novel strategy for 3-D MPS modelling characterized by (i realistic 3-D training images and (ii an effective workflow for incorporating a diverse group of geological and geophysical data sets. The study covers an area of 2810 km2 in the southern part of Denmark. MPS simulations are performed on a subset of the geological succession (the lower to middle Miocene sediments which is characterized by relatively uniform structures and dominated by sand and clay. The simulated domain is large and each of the geostatistical realizations contains approximately 45 million voxels with size 100 m × 100 m × 5 m. Data used for the modelling include water well logs, high-resolution seismic data, and a previously published 3-D geological model. We apply a series of different strategies for the simulations based on data quality, and develop a novel method to effectively create observed spatial trends. The training image is constructed as a relatively small 3-D voxel model covering an area of 90 km2. We use an iterative training image development strategy and find that even slight modifications in the training image create significant changes in simulations. Thus, this study shows how to include both the geological environment and the type and quality of input information in order to achieve optimal results from MPS modelling. We present a practical
Directory of Open Access Journals (Sweden)
Barash Danny
2008-04-01
Full Text Available Abstract Background RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(nm for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3, for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary
Supporting Multiple Pointing Devices in Microsoft Windows
DEFF Research Database (Denmark)
Westergaard, Michael
2002-01-01
In this paper the implementation of a Microsoft Windows driver including APIs supporting multiple pointing devices is presented. Microsoft Windows does not natively support multiple pointing devices controlling independent cursors, and a number of solutions to this have been implemented by us and...... and others. Here we motivate and describe a general solution, and how user applications can use it by means of a framework. The device driver and the supporting APIs will be made available free of charge. Interested parties can contact the author for more information....
Learning Predictive Statistics: Strategies and Brain Mechanisms.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-08-30
When immersed in a new environment, we are challenged to decipher initially incomprehensible streams of sensory information. However, quite rapidly, the brain finds structure and meaning in these incoming signals, helping us to predict and prepare ourselves for future actions. This skill relies on extracting the statistics of event streams in the environment that contain regularities of variable complexity from simple repetitive patterns to complex probabilistic combinations. Here, we test the brain mechanisms that mediate our ability to adapt to the environment's statistics and predict upcoming events. By combining behavioral training and multisession fMRI in human participants (male and female), we track the corticostriatal mechanisms that mediate learning of temporal sequences as they change in structure complexity. We show that learning of predictive structures relates to individual decision strategy; that is, selecting the most probable outcome in a given context (maximizing) versus matching the exact sequence statistics. These strategies engage distinct human brain regions: maximizing engages dorsolateral prefrontal, cingulate, sensory-motor regions, and basal ganglia (dorsal caudate, putamen), whereas matching engages occipitotemporal regions (including the hippocampus) and basal ganglia (ventral caudate). Our findings provide evidence for distinct corticostriatal mechanisms that facilitate our ability to extract behaviorally relevant statistics to make predictions. SIGNIFICANCE STATEMENT Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. Past work has studied how humans identify repetitive patterns and associative pairings. However, the natural environment contains regularities that vary in complexity from simple repetition to complex probabilistic combinations. Here, we combine behavior and multisession fMRI to track the brain mechanisms that mediate our ability to adapt to
Statistical prediction of Late Miocene climate
Digital Repository Service at National Institute of Oceanography (India)
Fernandes, A.A; Gupta, S.M.
by making certain simplifying assumptions; for example in modelling ocean 4 currents, the geostrophic approximation is made. In case of statistical prediction no such a priori assumption need be made. statistical prediction comprises of using observed data... the number of equations. In this case the equations are overdetermined, and therefore one has to look for a solution that best fits the sample data in a least squares sense. To this end we express the sample .... (2.1)+ ry = y + data as follows: n L c. (x...
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2000-01-01
New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2004-01-01
Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric
Statistical short-term earthquake prediction.
Kagan, Y Y; Knopoff, L
1987-06-19
A statistical procedure, derived from a theoretical model of fracture growth, is used to identify a foreshock sequence while it is in progress. As a predictor, the procedure reduces the average uncertainty in the rate of occurrence for a future strong earthquake by a factor of more than 1000 when compared with the Poisson rate of occurrence. About one-third of all main shocks with local magnitude greater than or equal to 4.0 in central California can be predicted in this way, starting from a 7-year database that has a lower magnitude cut off of 1.5. The time scale of such predictions is of the order of a few hours to a few days for foreshocks in the magnitude range from 2.0 to 5.0.
Statistical predictions from anarchic field theory landscapes
International Nuclear Information System (INIS)
Balasubramanian, Vijay; Boer, Jan de; Naqvi, Asad
2010-01-01
Consistent coupling of effective field theories with a quantum theory of gravity appears to require bounds on the rank of the gauge group and the amount of matter. We consider landscapes of field theories subject to such to boundedness constraints. We argue that appropriately 'coarse-grained' aspects of the randomly chosen field theory in such landscapes, such as the fraction of gauge groups with ranks in a given range, can be statistically predictable. To illustrate our point we show how the uniform measures on simple classes of N=1 quiver gauge theories localize in the vicinity of theories with certain typical structures. Generically, this approach would predict a high energy theory with very many gauge factors, with the high rank factors largely decoupled from the low rank factors if we require asymptotic freedom for the latter.
Improving the Pattern Reproducibility of Multiple-Point-Based Prior Models Using Frequency Matching
DEFF Research Database (Denmark)
Cordua, Knud Skou; Hansen, Thomas Mejer; Mosegaard, Klaus
2014-01-01
Some multiple-point-based sampling algorithms, such as the snesim algorithm, rely on sequential simulation. The conditional probability distributions that are used for the simulation are based on statistics of multiple-point data events obtained from a training image. During the simulation, data...... events with zero probability in the training image statistics may occur. This is handled by pruning the set of conditioning data until an event with non-zero probability is found. The resulting probability distribution sampled by such algorithms is a pruned mixture model. The pruning strategy leads...... to a probability distribution that lacks some of the information provided by the multiple-point statistics from the training image, which reduces the reproducibility of the training image patterns in the outcome realizations. When pruned mixture models are used as prior models for inverse problems, local re...
Multiplicative point process as a model of trading activity
Gontis, V.; Kaulakys, B.
2004-11-01
Signals consisting of a sequence of pulses show that inherent origin of the 1/ f noise is a Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting power spectral density S( f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such model system exhibits power-law spectral density S( f)∼1/ fβ for various values of β, including β= {1}/{2}, 1 and {3}/{2}. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events is analyzed analytically and numerically, as well. The specific interest of our analysis is related with the financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces spectral properties of the real markets and explains the mechanism of power-law distribution of trading activity. The study provides evidence that the statistical properties of the financial markets are enclosed in the statistics of the time interval between trades. A multiplicative point process serves as a consistent model generating this statistics.
A statistical model for predicting muscle performance
Byerly, Diane Leslie De Caix
The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing
Predicting radiotherapy outcomes using statistical learning techniques
International Nuclear Information System (INIS)
El Naqa, Issam; Bradley, Jeffrey D; Deasy, Joseph O; Lindsay, Patricia E; Hope, Andrew J
2009-01-01
Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model
Statistical Basis for Predicting Technological Progress
Nagy, Béla; Farmer, J. Doyne; Bui, Quan M.; Trancik, Jessika E.
2013-01-01
Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation. PMID:23468837
Statistical basis for predicting technological progress.
Directory of Open Access Journals (Sweden)
Béla Nagy
Full Text Available Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.
Predicting Statistical Distributions of Footbridge Vibrations
DEFF Research Database (Denmark)
Pedersen, Lars; Frier, Christian
2009-01-01
The paper considers vibration response of footbridges to pedestrian loading. Employing Newmark and Monte Carlo simulation methods, a statistical distribution of bridge vibration levels is calculated modelling walking parameters such as step frequency and stride length as random variables...
DEFF Research Database (Denmark)
Cordua, Knud Skou; Hansen, Thomas Mejer; Lange, Katrine
In order to move beyond simplified covariance based a priori models, which are typically used for inverse problems, more complex multiple-point-based a priori models have to be considered. By means of marginal probability distributions ‘learned’ from a training image, sequential simulation has...... proven to be an efficient way of obtaining multiple realizations that honor the same multiple-point statistics as the training image. The frequency matching method provides an alternative way of formulating multiple-point-based a priori models. In this strategy the pattern frequency distributions (i.......e. marginals) of the training image and a subsurface model are matched in order to obtain a solution with the same multiple-point statistics as the training image. Sequential Gibbs sampling is a simulation strategy that provides an efficient way of applying sequential simulation based algorithms as a priori...
Statistical prediction of parametric roll using FORM
DEFF Research Database (Denmark)
Jensen, Jørgen Juncher; Choi, Ju-hyuck; Nielsen, Ulrik Dam
2017-01-01
Previous research has shown that the First Order Reliability Method (FORM) can be an efficient method for estimation of outcrossing rates and extreme value statistics for stationary stochastic processes. This is so also for bifurcation type of processes like parametric roll of ships. The present...
Time series prediction: statistical and neural techniques
Zahirniak, Daniel R.; DeSimio, Martin P.
1996-03-01
In this paper we compare the performance of nonlinear neural network techniques to those of linear filtering techniques in the prediction of time series. Specifically, we compare the results of using the nonlinear systems, known as multilayer perceptron and radial basis function neural networks, with the results obtained using the conventional linear Wiener filter, Kalman filter and Widrow-Hoff adaptive filter in predicting future values of stationary and non- stationary time series. Our results indicate the performance of each type of system is heavily dependent upon the form of the time series being predicted and the size of the system used. In particular, the linear filters perform adequately for linear or near linear processes while the nonlinear systems perform better for nonlinear processes. Since the linear systems take much less time to be developed, they should be tried prior to using the nonlinear systems when the linearity properties of the time series process are unknown.
FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants.
Bednar, David; Beerens, Koen; Sebestova, Eva; Bendl, Jaroslav; Khare, Sagar; Chaloupkova, Radka; Prokop, Zbynek; Brezovsky, Jan; Baker, David; Damborsky, Jiri
2015-11-01
There is great interest in increasing proteins' stability to enhance their utility as biocatalysts, therapeutics, diagnostics and nanomaterials. Directed evolution is a powerful, but experimentally strenuous approach. Computational methods offer attractive alternatives. However, due to the limited reliability of predictions and potentially antagonistic effects of substitutions, only single-point mutations are usually predicted in silico, experimentally verified and then recombined in multiple-point mutants. Thus, substantial screening is still required. Here we present FireProt, a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. FireProt's reliability and applicability was demonstrated by validating its predictions against 656 mutations from the ProTherm database. We demonstrate that thermostability of the model enzymes haloalkane dehalogenase DhaA and γ-hexachlorocyclohexane dehydrochlorinase LinA can be substantially increased (ΔTm = 24°C and 21°C) by constructing and characterizing only a handful of multiple-point mutants. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available, and will facilitate the rapid development of robust proteins for biomedical and biotechnological applications.
FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants.
Directory of Open Access Journals (Sweden)
David Bednar
2015-11-01
Full Text Available There is great interest in increasing proteins' stability to enhance their utility as biocatalysts, therapeutics, diagnostics and nanomaterials. Directed evolution is a powerful, but experimentally strenuous approach. Computational methods offer attractive alternatives. However, due to the limited reliability of predictions and potentially antagonistic effects of substitutions, only single-point mutations are usually predicted in silico, experimentally verified and then recombined in multiple-point mutants. Thus, substantial screening is still required. Here we present FireProt, a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. FireProt's reliability and applicability was demonstrated by validating its predictions against 656 mutations from the ProTherm database. We demonstrate that thermostability of the model enzymes haloalkane dehalogenase DhaA and γ-hexachlorocyclohexane dehydrochlorinase LinA can be substantially increased (ΔTm = 24°C and 21°C by constructing and characterizing only a handful of multiple-point mutants. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available, and will facilitate the rapid development of robust proteins for biomedical and biotechnological applications.
Using machine learning, neural networks and statistics to predict bankruptcy
Pompe, P.P.M.; Feelders, A.J.; Feelders, A.J.
1997-01-01
Recent literature strongly suggests that machine learning approaches to classification outperform "classical" statistical methods. We make a comparison between the performance of linear discriminant analysis, classification trees, and neural networks in predicting corporate bankruptcy. Linear
DEFF Research Database (Denmark)
Kessler, Timo Christian; Nilsson, Bertel; Klint, Knud Erik
2010-01-01
(TPROGS) of alternating geological facies. The second method, multiple-point statistics, uses training images to estimate the conditional probability of sand-lenses at a certain location. Both methods respect field observations such as local stratigraphy, however, only the multiple-point statistics can...... of sand-lenses in clay till. Sand-lenses mainly account for horizontal transport and are prioritised in this study. Based on field observations, the distribution has been modeled using two different geostatistical approaches. One method uses a Markov chain model calculating the transition probabilities...
Wind speed prediction using statistical regression and neural network
Indian Academy of Sciences (India)
Prediction of wind speed in the atmospheric boundary layer is important for wind energy assess- ment,satellite launching and aviation,etc.There are a few techniques available for wind speed prediction,which require a minimum number of input parameters.Four different statistical techniques,viz.,curve ﬁtting,Auto Regressive ...
Fatigue crack initiation and growth life prediction with statistical consideration
International Nuclear Information System (INIS)
Kwon, J.D.; Choi, S.H.; Kwak, S.G.; Chun, K.O.
1991-01-01
Life prediction or residual life prediction of structures or machines is one of the most strongly world wide needed problems as requirement in the stage of slowly developing economy which comes after rapidly and highly developing stage. For the purpose of statistical life prediction, fatigue test was conducted under the 3 stress levels, and for each stress level, 20 specimens are used. The statistical properties of the crack growth parameter m and C in the fatigue crack growth law of da/dN = C(ΔK) m , and the relationship between m and C, and the statistical distribution pattern of fatigue crack initiation, growth and fracture lives can be obtained by experimental results
Learning predictive statistics from temporal sequences: Dynamics and strategies.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-10-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
A neighborhood statistics model for predicting stream pathogen indicator levels.
Pandey, Pramod K; Pasternack, Gregory B; Majumder, Mahbubul; Soupir, Michelle L; Kaiser, Mark S
2015-03-01
Because elevated levels of water-borne Escherichia coli in streams are a leading cause of water quality impairments in the U.S., water-quality managers need tools for predicting aqueous E. coli levels. Presently, E. coli levels may be predicted using complex mechanistic models that have a high degree of unchecked uncertainty or simpler statistical models. To assess spatio-temporal patterns of instream E. coli levels, herein we measured E. coli, a pathogen indicator, at 16 sites (at four different times) within the Squaw Creek watershed, Iowa, and subsequently, the Markov Random Field model was exploited to develop a neighborhood statistics model for predicting instream E. coli levels. Two observed covariates, local water temperature (degrees Celsius) and mean cross-sectional depth (meters), were used as inputs to the model. Predictions of E. coli levels in the water column were compared with independent observational data collected from 16 in-stream locations. The results revealed that spatio-temporal averages of predicted and observed E. coli levels were extremely close. Approximately 66 % of individual predicted E. coli concentrations were within a factor of 2 of the observed values. In only one event, the difference between prediction and observation was beyond one order of magnitude. The mean of all predicted values at 16 locations was approximately 1 % higher than the mean of the observed values. The approach presented here will be useful while assessing instream contaminations such as pathogen/pathogen indicator levels at the watershed scale.
Risk prediction model: Statistical and artificial neural network approach
Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim
2017-04-01
Prediction models are increasingly gaining popularity and had been used in numerous areas of studies to complement and fulfilled clinical reasoning and decision making nowadays. The adoption of such models assist physician's decision making, individual's behavior, and consequently improve individual outcomes and the cost-effectiveness of care. The objective of this paper is to reviewed articles related to risk prediction model in order to understand the suitable approach, development and the validation process of risk prediction model. A qualitative review of the aims, methods and significant main outcomes of the nineteen published articles that developed risk prediction models from numerous fields were done. This paper also reviewed on how researchers develop and validate the risk prediction models based on statistical and artificial neural network approach. From the review done, some methodological recommendation in developing and validating the prediction model were highlighted. According to studies that had been done, artificial neural network approached in developing the prediction model were more accurate compared to statistical approach. However currently, only limited published literature discussed on which approach is more accurate for risk prediction model development.
Statistical timing for parametric yield prediction of digital integrated circuits
Jess, J.A.G.; Kalafala, K.; Naidu, S.R.; Otten, R.H.J.M.; Visweswariah, C.
2006-01-01
Uncertainty in circuit performance due to manufacturing and environmental variations is increasing with each new generation of technology. It is therefore important to predict the performance of a chip as a probabilistic quantity. This paper proposes three novel path-based algorithms for statistical
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Model output statistics applied to wind power prediction
Energy Technology Data Exchange (ETDEWEB)
Joensen, A; Giebel, G; Landberg, L [Risoe National Lab., Roskilde (Denmark); Madsen, H; Nielsen, H A [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)
1999-03-01
Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as better possibility to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approaches is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared, Extended Kalman Filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics.
Hoyt, Robert Eugene; Snider, Dallas; Thompson, Carla; Mantravadi, Sarita
2016-10-11
We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes. To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs. The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets. Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix
Spatial statistics for predicting flow through a rock fracture
International Nuclear Information System (INIS)
Coakley, K.J.
1989-03-01
Fluid flow through a single rock fracture depends on the shape of the space between the upper and lower pieces of rock which define the fracture. In this thesis, the normalized flow through a fracture, i.e. the equivalent permeability of a fracture, is predicted in terms of spatial statistics computed from the arrangement of voids, i.e. open spaces, and contact areas within the fracture. Patterns of voids and contact areas, with complexity typical of experimental data, are simulated by clipping a correlated Gaussian process defined on a N by N pixel square region. The voids have constant aperture; the distance between the upper and lower surfaces which define the fracture is either zero or a constant. Local flow is assumed to be proportional to local aperture cubed times local pressure gradient. The flow through a pattern of voids and contact areas is solved using a finite-difference method. After solving for the flow through simulated 10 by 10 by 30 pixel patterns of voids and contact areas, a model to predict equivalent permeability is developed. The first model is for patterns with 80% voids where all voids have the same aperture. The equivalent permeability of a pattern is predicted in terms of spatial statistics computed from the arrangement of voids and contact areas within the pattern. Four spatial statistics are examined. The change point statistic measures how often adjacent pixel alternate from void to contact area (or vice versa ) in the rows of the patterns which are parallel to the overall flow direction. 37 refs., 66 figs., 41 tabs
Statistical models for expert judgement and wear prediction
International Nuclear Information System (INIS)
Pulkkinen, U.
1994-01-01
This thesis studies the statistical analysis of expert judgements and prediction of wear. The point of view adopted is the one of information theory and Bayesian statistics. A general Bayesian framework for analyzing both the expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied in analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components are studied. Monte-Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)
Predicting statistical properties of open reading frames in bacterial genomes.
Directory of Open Access Journals (Sweden)
Katharina Mir
Full Text Available An analytical model based on the statistical properties of Open Reading Frames (ORFs of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
Probably not future prediction using probability and statistical inference
Dworsky, Lawrence N
2008-01-01
An engaging, entertaining, and informative introduction to probability and prediction in our everyday lives Although Probably Not deals with probability and statistics, it is not heavily mathematical and is not filled with complex derivations, proofs, and theoretical problem sets. This book unveils the world of statistics through questions such as what is known based upon the information at hand and what can be expected to happen. While learning essential concepts including "the confidence factor" and "random walks," readers will be entertained and intrigued as they move from chapter to chapter. Moreover, the author provides a foundation of basic principles to guide decision making in almost all facets of life including playing games, developing winning business strategies, and managing personal finances. Much of the book is organized around easy-to-follow examples that address common, everyday issues such as: How travel time is affected by congestion, driving speed, and traffic lights Why different gambling ...
Prediction of slant path rain attenuation statistics at various locations
Goldhirsh, J.
1977-01-01
The paper describes a method for predicting slant path attenuation statistics at arbitrary locations for variable frequencies and path elevation angles. The method involves the use of median reflectivity factor-height profiles measured with radar as well as the use of long-term point rain rate data and assumed or measured drop size distributions. The attenuation coefficient due to cloud liquid water in the presence of rain is also considered. Absolute probability fade distributions are compared for eight cases: Maryland (15 GHz), Texas (30 GHz), Slough, England (19 and 37 GHz), Fayetteville, North Carolina (13 and 18 GHz), and Cambridge, Massachusetts (13 and 18 GHz).
Statistical Approaches for Spatiotemporal Prediction of Low Flows
Fangmann, A.; Haberlandt, U.
2017-12-01
An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchments areas. Four different modeling approaches are analyzed. Basis for all pose multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be
Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields
Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.
2018-01-01
This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
Uncertainty propagation for statistical impact prediction of space debris
Hoogendoorn, R.; Mooij, E.; Geul, J.
2018-01-01
Predictions of the impact time and location of space debris in a decaying trajectory are highly influenced by uncertainties. The traditional Monte Carlo (MC) method can be used to perform accurate statistical impact predictions, but requires a large computational effort. A method is investigated that directly propagates a Probability Density Function (PDF) in time, which has the potential to obtain more accurate results with less computational effort. The decaying trajectory of Delta-K rocket stages was used to test the methods using a six degrees-of-freedom state model. The PDF of the state of the body was propagated in time to obtain impact-time distributions. This Direct PDF Propagation (DPP) method results in a multi-dimensional scattered dataset of the PDF of the state, which is highly challenging to process. No accurate results could be obtained, because of the structure of the DPP data and the high dimensionality. Therefore, the DPP method is less suitable for practical uncontrolled entry problems and the traditional MC method remains superior. Additionally, the MC method was used with two improved uncertainty models to obtain impact-time distributions, which were validated using observations of true impacts. For one of the two uncertainty models, statistically more valid impact-time distributions were obtained than in previous research.
Statistical characterization of pitting corrosion process and life prediction
International Nuclear Information System (INIS)
Sheikh, A.K.; Younas, M.
1995-01-01
In order to prevent corrosion failures of machines and structures, it is desirable to know in advance when the corrosion damage will take place, and appropriate measures are needed to mitigate the damage. The corrosion predictions are needed both at development as well as operational stage of machines and structures. There are several forms of corrosion process through which varying degrees of damage can occur. Under certain conditions these corrosion processes at alone and in other set of conditions, several of these processes may occur simultaneously. For a certain type of machine elements and structures, such as gears, bearing, tubes, pipelines, containers, storage tanks etc., are particularly prone to pitting corrosion which is an insidious form of corrosion. The corrosion predictions are usually based on experimental results obtained from test coupons and/or field experiences of similar machines or parts of a structure. Considerable scatter is observed in corrosion processes. The probabilities nature and kinetics of pitting process makes in necessary to use statistical method to forecast the residual life of machine of structures. The focus of this paper is to characterization pitting as a time-dependent random process, and using this characterization the prediction of life to reach a critical level of pitting damage can be made. Using several data sets from literature on pitting corrosion, the extreme value modeling of pitting corrosion process, the evolution of the extreme value distribution in time, and their relationship to the reliability of machines and structure are explained. (author)
Statistical prediction of nanoparticle delivery: from culture media to cell
Rowan Brown, M.; Hondow, Nicole; Brydson, Rik; Rees, Paul; Brown, Andrew P.; Summers, Huw D.
2015-04-01
The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase efficacy of administered therapeutic dose. However, quantification of heterogeneity associated with absolute dose of a nanotherapeutic (NP number), how this is trafficked across biological barriers has proven difficult to achieve. The main issue being the quantitative assessment of NP number at the spatial scale of the individual NP, data which is essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be individually identified. High spatial resolution however reduces the sample frequency and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provide a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large population flow cytometry measurements to quantify exact NP dose. Here we extended this concept to link TEM measurements of NP agglomerates in cell culture media to that encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for
Statistical prediction of nanoparticle delivery: from culture media to cell
International Nuclear Information System (INIS)
Brown, M Rowan; Rees, Paul; Summers, Huw D; Hondow, Nicole; Brydson, Rik; Brown, Andrew P
2015-01-01
The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase efficacy of administered therapeutic dose. However, quantification of heterogeneity associated with absolute dose of a nanotherapeutic (NP number), how this is trafficked across biological barriers has proven difficult to achieve. The main issue being the quantitative assessment of NP number at the spatial scale of the individual NP, data which is essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be individually identified. High spatial resolution however reduces the sample frequency and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provide a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large population flow cytometry measurements to quantify exact NP dose. Here we extended this concept to link TEM measurements of NP agglomerates in cell culture media to that encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for
Estimating Predictive Variance for Statistical Gas Distribution Modelling
International Nuclear Information System (INIS)
Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo
2009-01-01
Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, since the models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis
Directory of Open Access Journals (Sweden)
Charles Frank
2018-03-01
Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.
Statistics and predictions of population, energy and environment problems
International Nuclear Information System (INIS)
Sobajima, Makoto
1999-03-01
In the situation that world's population, especially in developing countries, is rapidly growing, humankind is facing to global problems that they cannot steadily live unless they find individual places to live, obtain foods, and peacefully get energy necessary for living for centuries. For this purpose, humankind has to think what behavior they should take in the finite environment, talk, agree and execute. Though energy has been long respected as a symbol for improving living, demanded and used, they have come to limit the use making the global environment more serious. If there is sufficient energy not loading cost to the environment. If nuclear energy regarded as such one sustain the resource for long and has market competitiveness. What situation of realization of compensating new energy is now in the case the use of nuclear energy is restricted by the society fearing radioactivity. If there are promising ones for the future. One concerning with the study of energy cannot go without knowing these. The statistical materials compiled here are thought to be useful for that purpose, and are collected mainly from ones viewing future prediction based on past practices. Studies on the prediction is so important to have future measures that these data bases are expected to be improved for better accuracy. (author)
DEFF Research Database (Denmark)
Barfod, Adrian; Vilhelmsen, Troels Norvin; Jørgensen, Flemming
2017-01-01
of the TI was studied. An example was presented where a geological model from a neighboring area was used to simulate hydrostratigraphic models of the Kasted area. It was shown that as long as the geological setting is similar in nature, the realizations, although different, still reflect...
DEFF Research Database (Denmark)
Barfod, Adrian; Straubhaar, Julien; Høyer, Anne-Sophie
2017-01-01
the incorporation of elaborate datasets and provides a framework for stochastic hydrostratigraphic modelling. This paper focuses on comparing three MPS methods: snesim, DS and iqsim. The MPS methods are tested and compared on a real-world hydrogeophysical survey from Kasted in Denmark, which covers an area of 45 km......2. The comparison of the stochastic hydrostratigraphic MPS models is carried out in an elaborate scheme of visual inspection, mathematical similarity and consistency with boreholes. Using the Kasted survey data, a practical example for modelling new survey areas is presented. A cognitive...
Prediction of dimethyl disulfide levels from biosolids using statistical modeling.
Gabriel, Steven A; Vilalai, Sirapong; Arispe, Susanna; Kim, Hyunook; McConnell, Laura L; Torrents, Alba; Peot, Christopher; Ramirez, Mark
2005-01-01
Two statistical models were used to predict the concentration of dimethyl disulfide (DMDS) released from biosolids produced by an advanced wastewater treatment plant (WWTP) located in Washington, DC, USA. The plant concentrates sludge from primary sedimentation basins in gravity thickeners (GT) and sludge from secondary sedimentation basins in dissolved air flotation (DAF) thickeners. The thickened sludge is pumped into blending tanks and then fed into centrifuges for dewatering. The dewatered sludge is then conditioned with lime before trucking out from the plant. DMDS, along with other volatile sulfur and nitrogen-containing chemicals, is known to contribute to biosolids odors. These models identified oxidation/reduction potential (ORP) values of a GT and DAF, the amount of sludge dewatered by centrifuges, and the blend ratio between GT thickened sludge and DAF thickened sludge in blending tanks as control variables. The accuracy of the developed regression models was evaluated by checking the adjusted R2 of the regression as well as the signs of coefficients associated with each variable. In general, both models explained observed DMDS levels in sludge headspace samples. The adjusted R2 value of the regression models 1 and 2 were 0.79 and 0.77, respectively. Coefficients for each regression model also had the correct sign. Using the developed models, plant operators can adjust the controllable variables to proactively decrease this odorant. Therefore, these models are a useful tool in biosolids management at WWTPs.
Fast Quantum Algorithm for Predicting Descriptive Statistics of Stochastic Processes
Williams Colin P.
1999-01-01
Stochastic processes are used as a modeling tool in several sub-fields of physics, biology, and finance. Analytic understanding of the long term behavior of such processes is only tractable for very simple types of stochastic processes such as Markovian processes. However, in real world applications more complex stochastic processes often arise. In physics, the complicating factor might be nonlinearities; in biology it might be memory effects; and in finance is might be the non-random intentional behavior of participants in a market. In the absence of analytic insight, one is forced to understand these more complex stochastic processes via numerical simulation techniques. In this paper we present a quantum algorithm for performing such simulations. In particular, we show how a quantum algorithm can predict arbitrary descriptive statistics (moments) of N-step stochastic processes in just O(square root of N) time. That is, the quantum complexity is the square root of the classical complexity for performing such simulations. This is a significant speedup in comparison to the current state of the art.
Monthly to seasonal low flow prediction: statistical versus dynamical models
Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke
2016-04-01
the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO) the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated showing regions where streamflow and climate variable from previous months are significantly correlated in a 21 (31) years moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of a valuable predictability of the low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.
Quantifying natural delta variability using a multiple-point geostatistics prior uncertainty model
Scheidt, Céline; Fernandes, Anjali M.; Paola, Chris; Caers, Jef
2016-10-01
We address the question of quantifying uncertainty associated with autogenic pattern variability in a channelized transport system by means of a modern geostatistical method. This question has considerable relevance for practical subsurface applications as well, particularly those related to uncertainty quantification relying on Bayesian approaches. Specifically, we show how the autogenic variability in a laboratory experiment can be represented and reproduced by a multiple-point geostatistical prior uncertainty model. The latter geostatistical method requires selection of a limited set of training images from which a possibly infinite set of geostatistical model realizations, mimicking the training image patterns, can be generated. To that end, we investigate two methods to determine how many training images and what training images should be provided to reproduce natural autogenic variability. The first method relies on distance-based clustering of overhead snapshots of the experiment; the second method relies on a rate of change quantification by means of a computer vision algorithm termed the demon algorithm. We show quantitatively that with either training image selection method, we can statistically reproduce the natural variability of the delta formed in the experiment. In addition, we study the nature of the patterns represented in the set of training images as a representation of the "eigenpatterns" of the natural system. The eigenpattern in the training image sets display patterns consistent with previous physical interpretations of the fundamental modes of this type of delta system: a highly channelized, incisional mode; a poorly channelized, depositional mode; and an intermediate mode between the two.
Estimation of Multiple Point Sources for Linear Fractional Order Systems Using Modulating Functions
Belkhatir, Zehor; Laleg-Kirati, Taous-Meriem
2017-01-01
This paper proposes an estimation algorithm for the characterization of multiple point inputs for linear fractional order systems. First, using polynomial modulating functions method and a suitable change of variables the problem of estimating
Tang, Yunwei; Atkinson, Peter M.; Zhang, Jingxiong
2015-03-01
A cross-scale data integration method was developed and tested based on the theory of geostatistics and multiple-point geostatistics (MPG). The goal was to downscale remotely sensed images while retaining spatial structure by integrating images at different spatial resolutions. During the process of downscaling, a rich spatial correlation model in the form of a training image was incorporated to facilitate reproduction of similar local patterns in the simulated images. Area-to-point cokriging (ATPCK) was used as locally varying mean (LVM) (i.e., soft data) to deal with the change of support problem (COSP) for cross-scale integration, which MPG cannot achieve alone. Several pairs of spectral bands of remotely sensed images were tested for integration within different cross-scale case studies. The experiment shows that MPG can restore the spatial structure of the image at a fine spatial resolution given the training image and conditioning data. The super-resolution image can be predicted using the proposed method, which cannot be realised using most data integration methods. The results show that ATPCK-MPG approach can achieve greater accuracy than methods which do not account for the change of support issue.
Statistical and Machine Learning Models to Predict Programming Performance
Bergin, Susan
2006-01-01
This thesis details a longitudinal study on factors that influence introductory programming success and on the development of machine learning models to predict incoming student performance. Although numerous studies have developed models to predict programming success, the models struggled to achieve high accuracy in predicting the likely performance of incoming students. Our approach overcomes this by providing a machine learning technique, using a set of three significant...
Statistical model based gender prediction for targeted NGS clinical panels
Directory of Open Access Journals (Sweden)
Palani Kannan Kandavel
2017-12-01
The reference test dataset are being used to test the model. The sensitivity on predicting the gender has been increased from the current “genotype composition in ChrX” based approach. In addition, the prediction score given by the model can be used to evaluate the quality of clinical dataset. The higher prediction score towards its respective gender indicates the higher quality of sequenced data.
Statistical models to predict flows at monthly level in Salvajina
International Nuclear Information System (INIS)
Gonzalez, Harold O
1994-01-01
It thinks about and models of lineal regression evaluate at monthly level that they allow to predict flows in Salvajina, with base in predictions variable, like the difference of pressure between Darwin and Tahiti, precipitation in Piendamo Cauca), temperature in Port Chicama (Peru) and pressure in Tahiti
PGT: A Statistical Approach to Prediction and Mechanism Design
Wolpert, David H.; Bono, James W.
One of the biggest challenges facing behavioral economics is the lack of a single theoretical framework that is capable of directly utilizing all types of behavioral data. One of the biggest challenges of game theory is the lack of a framework for making predictions and designing markets in a manner that is consistent with the axioms of decision theory. An approach in which solution concepts are distribution-valued rather than set-valued (i.e. equilibrium theory) has both capabilities. We call this approach Predictive Game Theory (or PGT). This paper outlines a general Bayesian approach to PGT. It also presents one simple example to illustrate the way in which this approach differs from equilibrium approaches in both prediction and mechanism design settings.
DEFF Research Database (Denmark)
Cordua, Knud Skou; Hansen, Thomas Mejer; Mosegaard, Klaus
2012-01-01
We present a general Monte Carlo full-waveform inversion strategy that integrates a priori information described by geostatistical algorithms with Bayesian inverse problem theory. The extended Metropolis algorithm can be used to sample the a posteriori probability density of highly nonlinear...... inverse problems, such as full-waveform inversion. Sequential Gibbs sampling is a method that allows efficient sampling of a priori probability densities described by geostatistical algorithms based on either two-point (e.g., Gaussian) or multiple-point statistics. We outline the theoretical framework......) Based on a posteriori realizations, complicated statistical questions can be answered, such as the probability of connectivity across a layer. (3) Complex a priori information can be included through geostatistical algorithms. These benefits, however, require more computing resources than traditional...
Multivariate statistical models for disruption prediction at ASDEX Upgrade
International Nuclear Information System (INIS)
Aledda, R.; Cannas, B.; Fanni, A.; Sias, G.; Pautasso, G.
2013-01-01
In this paper, a disruption prediction system for ASDEX Upgrade has been proposed that does not require disruption terminated experiments to be implemented. The system consists of a data-based model, which is built using only few input signals coming from successfully terminated pulses. A fault detection and isolation approach has been used, where the prediction is based on the analysis of the residuals of an auto regressive exogenous input model. The prediction performance of the proposed system is encouraging when it is applied to the same set of campaigns used to implement the model. However, the false alarms significantly increase when we tested the system on discharges coming from experimental campaigns temporally far from those used to train the model. This is due to the well know aging effect inherent in the data-based models. The main advantage of the proposed method, with respect to other data-based approaches in literature, is that it does not need data on experiments terminated with a disruption, as it uses a normal operating conditions model. This is a big advantage in the prospective of a prediction system for ITER, where a limited number of disruptions can be allowed
Statistical tests for equal predictive ability across multiple forecasting methods
DEFF Research Database (Denmark)
Borup, Daniel; Thyrsgaard, Martin
We develop a multivariate generalization of the Giacomini-White tests for equal conditional predictive ability. The tests are applicable to a mixture of nested and non-nested models, incorporate estimation uncertainty explicitly, and allow for misspecification of the forecasting model as well as ...
Foundations of Complex Systems Nonlinear Dynamics, Statistical Physics, and Prediction
Nicolis, Gregoire
2007-01-01
Complexity is emerging as a post-Newtonian paradigm for approaching a large body of phenomena of concern at the crossroads of physical, engineering, environmental, life and human sciences from a unifying point of view. This book outlines the foundations of modern complexity research as it arose from the cross-fertilization of ideas and tools from nonlinear science, statistical physics and numerical simulation. It is shown how these developments lead to an understanding, both qualitative and quantitative, of the complex systems encountered in nature and in everyday experience and, conversely, h
Prediction of lacking control power in power plants using statistical models
DEFF Research Database (Denmark)
Odgaard, Peter Fogh; Mataji, B.; Stoustrup, Jakob
2007-01-01
Prediction of the performance of plants like power plants is of interest, since the plant operator can use these predictions to optimize the plant production. In this paper the focus is addressed on a special case where a combination of high coal moisture content and a high load limits the possible...... plant load, meaning that the requested plant load cannot be met. The available models are in this case uncertain. Instead statistical methods are used to predict upper and lower uncertainty bounds on the prediction. Two different methods are used. The first relies on statistics of recent prediction...... errors; the second uses operating point depending statistics of prediction errors. Using these methods on the previous mentioned case, it can be concluded that the second method can be used to predict the power plant performance, while the first method has problems predicting the uncertain performance...
A statistical model for aggregating judgments by incorporating peer predictions
McCoy, John; Prelec, Drazen
2017-01-01
We propose a probabilistic model to aggregate the answers of respondents answering multiple-choice questions. The model does not assume that everyone has access to the same information, and so does not assume that the consensus answer is correct. Instead, it infers the most probable world state, even if only a minority vote for it. Each respondent is modeled as receiving a signal contingent on the actual world state, and as using this signal to both determine their own answer and predict the ...
Statistical Models for Predicting Threat Detection From Human Behavior
Kelley, Timothy; Amon, Mary J.; Bertenthal, Bennett I.
2018-01-01
Users must regularly distinguish between secure and insecure cyber platforms in order to preserve their privacy and safety. Mouse tracking is an accessible, high-resolution measure that can be leveraged to understand the dynamics of perception, categorization, and decision-making in threat detection. Researchers have begun to utilize measures like mouse tracking in cyber security research, including in the study of risky online behavior. However, it remains an empirical question to what extent real-time information about user behavior is predictive of user outcomes and demonstrates added value compared to traditional self-report questionnaires. Participants navigated through six simulated websites, which resembled either secure “non-spoof” or insecure “spoof” versions of popular websites. Websites also varied in terms of authentication level (i.e., extended validation, standard validation, or partial encryption). Spoof websites had modified Uniform Resource Locator (URL) and authentication level. Participants chose to “login” to or “back” out of each website based on perceived website security. Mouse tracking information was recorded throughout the task, along with task performance. After completing the website identification task, participants completed a questionnaire assessing their security knowledge and degree of familiarity with the websites simulated during the experiment. Despite being primed to the possibility of website phishing attacks, participants generally showed a bias for logging in to websites versus backing out of potentially dangerous sites. Along these lines, participant ability to identify spoof websites was around the level of chance. Hierarchical Bayesian logistic models were used to compare the accuracy of two-factor (i.e., website security and encryption level), survey-based (i.e., security knowledge and website familiarity), and real-time measures (i.e., mouse tracking) in predicting risky online behavior during phishing
Statistical Models for Predicting Threat Detection From Human Behavior
Directory of Open Access Journals (Sweden)
Timothy Kelley
2018-04-01
Full Text Available Users must regularly distinguish between secure and insecure cyber platforms in order to preserve their privacy and safety. Mouse tracking is an accessible, high-resolution measure that can be leveraged to understand the dynamics of perception, categorization, and decision-making in threat detection. Researchers have begun to utilize measures like mouse tracking in cyber security research, including in the study of risky online behavior. However, it remains an empirical question to what extent real-time information about user behavior is predictive of user outcomes and demonstrates added value compared to traditional self-report questionnaires. Participants navigated through six simulated websites, which resembled either secure “non-spoof” or insecure “spoof” versions of popular websites. Websites also varied in terms of authentication level (i.e., extended validation, standard validation, or partial encryption. Spoof websites had modified Uniform Resource Locator (URL and authentication level. Participants chose to “login” to or “back” out of each website based on perceived website security. Mouse tracking information was recorded throughout the task, along with task performance. After completing the website identification task, participants completed a questionnaire assessing their security knowledge and degree of familiarity with the websites simulated during the experiment. Despite being primed to the possibility of website phishing attacks, participants generally showed a bias for logging in to websites versus backing out of potentially dangerous sites. Along these lines, participant ability to identify spoof websites was around the level of chance. Hierarchical Bayesian logistic models were used to compare the accuracy of two-factor (i.e., website security and encryption level, survey-based (i.e., security knowledge and website familiarity, and real-time measures (i.e., mouse tracking in predicting risky online behavior
Statistical Models for Predicting Threat Detection From Human Behavior.
Kelley, Timothy; Amon, Mary J; Bertenthal, Bennett I
2018-01-01
Users must regularly distinguish between secure and insecure cyber platforms in order to preserve their privacy and safety. Mouse tracking is an accessible, high-resolution measure that can be leveraged to understand the dynamics of perception, categorization, and decision-making in threat detection. Researchers have begun to utilize measures like mouse tracking in cyber security research, including in the study of risky online behavior. However, it remains an empirical question to what extent real-time information about user behavior is predictive of user outcomes and demonstrates added value compared to traditional self-report questionnaires. Participants navigated through six simulated websites, which resembled either secure "non-spoof" or insecure "spoof" versions of popular websites. Websites also varied in terms of authentication level (i.e., extended validation, standard validation, or partial encryption). Spoof websites had modified Uniform Resource Locator (URL) and authentication level. Participants chose to "login" to or "back" out of each website based on perceived website security. Mouse tracking information was recorded throughout the task, along with task performance. After completing the website identification task, participants completed a questionnaire assessing their security knowledge and degree of familiarity with the websites simulated during the experiment. Despite being primed to the possibility of website phishing attacks, participants generally showed a bias for logging in to websites versus backing out of potentially dangerous sites. Along these lines, participant ability to identify spoof websites was around the level of chance. Hierarchical Bayesian logistic models were used to compare the accuracy of two-factor (i.e., website security and encryption level), survey-based (i.e., security knowledge and website familiarity), and real-time measures (i.e., mouse tracking) in predicting risky online behavior during phishing attacks
Prediction of Frost Occurrences Using Statistical Modeling Approaches
Directory of Open Access Journals (Sweden)
Hyojin Lee
2016-01-01
Full Text Available We developed the frost prediction models in spring in Korea using logistic regression and decision tree techniques. Hit Rate (HR, Probability of Detection (POD, and False Alarm Rate (FAR from both models were calculated and compared. Threshold values for the logistic regression models were selected to maximize HR and POD and minimize FAR for each station, and the split for the decision tree models was stopped when change in entropy was relatively small. Average HR values were 0.92 and 0.91 for logistic regression and decision tree techniques, respectively, average POD values were 0.78 and 0.80 for logistic regression and decision tree techniques, respectively, and average FAR values were 0.22 and 0.28 for logistic regression and decision tree techniques, respectively. The average numbers of selected explanatory variables were 5.7 and 2.3 for logistic regression and decision tree techniques, respectively. Fewer explanatory variables can be more appropriate for operational activities to provide a timely warning for the prevention of the frost damages to agricultural crops. We concluded that the decision tree model can be more useful for the timely warning system. It is recommended that the models should be improved to reflect local topological features.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Directory of Open Access Journals (Sweden)
Zhenxiang Jiang
2016-01-01
Full Text Available The traditional methods of diagnosing dam service status are always suitable for single measuring point. These methods also reflect the local status of dams without merging multisource data effectively, which is not suitable for diagnosing overall service. This study proposes a new method involving multiple points to diagnose dam service status based on joint distribution function. The function, including monitoring data of multiple points, can be established with t-copula function. Therefore, the possibility, which is an important fusing value in different measuring combinations, can be calculated, and the corresponding diagnosing criterion is established with typical small probability theory. Engineering case study indicates that the fusion diagnosis method can be conducted in real time and the abnormal point can be detected, thereby providing a new early warning method for engineering safety.
Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A.; van t Veld, Aart A.
2012-01-01
PURPOSE: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
International Nuclear Information System (INIS)
2005-01-01
For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees
International Nuclear Information System (INIS)
2001-01-01
For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
International Nuclear Information System (INIS)
1999-01-01
For the year 1998 and the year 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Methods of fast, multiple-point in vivo T1 determination
International Nuclear Information System (INIS)
Zhang, Y.; Spigarelli, M.; Fencil, L.E.; Yeung, H.N.
1989-01-01
Two methods of rapid, multiple-point determination of T1 in vivo have been evaluated with a phantom consisting of vials of gel in different Mn + + concentrations. The first method was an inversion-recovery- on-the-fly technique, and the second method used a variable- tip-angle (α) progressive saturation with two sub- sequences of different repetition times. In the first method, 1/T1 was evaluated by an exponential fit. In the second method, 1/T1 was obtained iteratively with a linear fit and then readjusted together with α to a model equation until self-consistency was reached
Error analysis of dimensionless scaling experiments with multiple points using linear regression
International Nuclear Information System (INIS)
Guercan, Oe.D.; Vermare, L.; Hennequin, P.; Bourdelle, C.
2010-01-01
A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)
Coexistence of different vacua in the effective quantum field theory and multiple point principle
International Nuclear Information System (INIS)
Volovik, G.E.
2004-01-01
According to the multiple point principle our Universe in on the coexistence curve of two or more phases of the quantum vacuum. The coexistence of different quantum vacua can be regulated by the exchange of the global fermionic charges between the vacua. If the coexistence is regulated by the baryonic charge, all the coexisting vacua exhibit the baryonic asymmetry. Due to the exchange of the baryonic charge between the vacuum and matter which occurs above the electroweak transition, the baryonic asymmetry of the vacuum induces the baryonic asymmetry of matter in our Standard-Model phase of the quantum vacuum [ru
On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.
Directory of Open Access Journals (Sweden)
Paul B Conn
Full Text Available Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook's notion of an independent variable hull (IVH, developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models.
International Nuclear Information System (INIS)
2003-01-01
For the year 2002, part of the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees on energy products
International Nuclear Information System (INIS)
2004-01-01
For the year 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees
International Nuclear Information System (INIS)
2000-01-01
For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy also includes historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products
Cameron, Enrico; Pilla, Giorgio; Stella, Fabio A.
2018-01-01
The application of statistical classification methods is investigated—in comparison also to spatial interpolation methods—for predicting the acceptability of well-water quality in a situation where an effective quantitative model of the hydrogeological system under consideration cannot be developed. In the example area in northern Italy, in particular, the aquifer is locally affected by saline water and the concentration of chloride is the main indicator of both saltwater occurrence and groundwater quality. The goal is to predict if the chloride concentration in a water well will exceed the allowable concentration so that the water is unfit for the intended use. A statistical classification algorithm achieved the best predictive performances and the results of the study show that statistical classification methods provide further tools for dealing with groundwater quality problems concerning hydrogeological systems that are too difficult to describe analytically or to simulate effectively.
Mariethoz, Gregoire; Lefebvre, Sylvain
2014-05-01
Multiple-Point Simulations (MPS) is a family of geostatistical tools that has received a lot of attention in recent years for the characterization of spatial phenomena in geosciences. It relies on the definition of training images to represent a given type of spatial variability, or texture. We show that the algorithmic tools used are similar in many ways to techniques developed in computer graphics, where there is a need to generate large amounts of realistic textures for applications such as video games and animated movies. Similarly to MPS, these texture synthesis methods use training images, or exemplars, to generate realistic-looking graphical textures. Both domains of multiple-point geostatistics and example-based texture synthesis present similarities in their historic development and share similar concepts. These disciplines have however remained separated, and as a result significant algorithmic innovations in each discipline have not been universally adopted. Texture synthesis algorithms present drastically increased computational efficiency, patterns reproduction and user control. At the same time, MPS developed ways to condition models to spatial data and to produce 3D stochastic realizations, which have not been thoroughly investigated in the field of texture synthesis. In this paper we review the possible links between these disciplines and show the potential and limitations of using concepts and approaches from texture synthesis in MPS. We also provide guidelines on how recent developments could benefit both fields of research, and what challenges remain open.
Statistical model for prediction of hearing loss in patients receiving cisplatin chemotherapy.
Johnson, Andrew; Tarima, Sergey; Wong, Stuart; Friedland, David R; Runge, Christina L
2013-03-01
This statistical model might be used to predict cisplatin-induced hearing loss, particularly in patients undergoing concomitant radiotherapy. To create a statistical model based on pretreatment hearing thresholds to provide an individual probability for hearing loss from cisplatin therapy and, secondarily, to investigate the use of hearing classification schemes as predictive tools for hearing loss. Retrospective case-control study. Tertiary care medical center. A total of 112 subjects receiving chemotherapy and audiometric evaluation were evaluated for the study. Of these subjects, 31 met inclusion criteria for analysis. The primary outcome measurement was a statistical model providing the probability of hearing loss following the use of cisplatin chemotherapy. Fifteen of the 31 subjects had significant hearing loss following cisplatin chemotherapy. American Academy of Otolaryngology-Head and Neck Society and Gardner-Robertson hearing classification schemes revealed little change in hearing grades between pretreatment and posttreatment evaluations for subjects with or without hearing loss. The Chang hearing classification scheme could effectively be used as a predictive tool in determining hearing loss with a sensitivity of 73.33%. Pretreatment hearing thresholds were used to generate a statistical model, based on quadratic approximation, to predict hearing loss (C statistic = 0.842, cross-validated = 0.835). The validity of the model improved when only subjects who received concurrent head and neck irradiation were included in the analysis (C statistic = 0.91). A calculated cutoff of 0.45 for predicted probability has a cross-validated sensitivity and specificity of 80%. Pretreatment hearing thresholds can be used as a predictive tool for cisplatin-induced hearing loss, particularly with concomitant radiotherapy.
Pearce, Marcus T
2018-05-11
Music perception depends on internal psychological models derived through exposure to a musical culture. It is hypothesized that this musical enculturation depends on two cognitive processes: (1) statistical learning, in which listeners acquire internal cognitive models of statistical regularities present in the music to which they are exposed; and (2) probabilistic prediction based on these learned models that enables listeners to organize and process their mental representations of music. To corroborate these hypotheses, I review research that uses a computational model of probabilistic prediction based on statistical learning (the information dynamics of music (IDyOM) model) to simulate data from empirical studies of human listeners. The results show that a broad range of psychological processes involved in music perception-expectation, emotion, memory, similarity, segmentation, and meter-can be understood in terms of a single, underlying process of probabilistic prediction using learned statistical models. Furthermore, IDyOM simulations of listeners from different musical cultures demonstrate that statistical learning can plausibly predict causal effects of differential cultural exposure to musical styles, providing a quantitative model of cultural distance. Understanding the neural basis of musical enculturation will benefit from close coordination between empirical neuroimaging and computational modeling of underlying mechanisms, as outlined here. © 2018 The Authors. Annals of the New York Academy of Sciences published by Wiley Periodicals, Inc. on behalf of New York Academy of Sciences.
Wang, Ming; Long, Qi
2016-09-01
Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
A Statistical Approach for Gain Bandwidth Prediction of Phoenix-Cell Based Reflect arrays
Directory of Open Access Journals (Sweden)
Hassan Salti
2018-01-01
Full Text Available A new statistical approach to predict the gain bandwidth of Phoenix-cell based reflectarrays is proposed. It combines the effects of both main factors that limit the bandwidth of reflectarrays: spatial phase delays and intrinsic bandwidth of radiating cells. As an illustration, the proposed approach is successfully applied to two reflectarrays based on new Phoenix cells.
Statistical model predictions for p+p and Pb+Pb collisions at LHC
Kraus, I.; Cleymans, J.; Oeschler, H.; Redlich, K.; Wheaton, S.
2009-01-01
Particle production in p+p and central collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in
Statistical prediction of biomethane potentials based on the composition of lignocellulosic biomass
DEFF Research Database (Denmark)
Thomsen, Sune Tjalfe; Spliid, Henrik; Østergård, Hanne
2014-01-01
Mixture models are introduced as a new and stronger methodology for statistical prediction of biomethane potentials (BPM) from lignocellulosic biomass compared to the linear regression models previously used. A large dataset from literature combined with our own data were analysed using canonical...
Statistical Analysis of a Method to Predict Drug-Polymer Miscibility
DEFF Research Database (Denmark)
Knopp, Matthias Manne; Olesen, Niels Erik; Huang, Yanbin
2016-01-01
In this study, a method proposed to predict drug-polymer miscibility from differential scanning calorimetry measurements was subjected to statistical analysis. The method is relatively fast and inexpensive and has gained popularity as a result of the increasing interest in the formulation of drug...... as provided in this study. © 2015 Wiley Periodicals, Inc. and the American Pharmacists Association J Pharm Sci....
Estimation of Multiple Point Sources for Linear Fractional Order Systems Using Modulating Functions
Belkhatir, Zehor
2017-06-28
This paper proposes an estimation algorithm for the characterization of multiple point inputs for linear fractional order systems. First, using polynomial modulating functions method and a suitable change of variables the problem of estimating the locations and the amplitudes of a multi-pointwise input is decoupled into two algebraic systems of equations. The first system is nonlinear and solves for the time locations iteratively, whereas the second system is linear and solves for the input’s amplitudes. Second, closed form formulas for both the time location and the amplitude are provided in the particular case of single point input. Finally, numerical examples are given to illustrate the performance of the proposed technique in both noise-free and noisy cases. The joint estimation of pointwise input and fractional differentiation orders is also presented. Furthermore, a discussion on the performance of the proposed algorithm is provided.
Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop
Morrison, Joseph H.
2010-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.
Comparison of four statistical and machine learning methods for crash severity prediction.
Iranitalab, Amirfarrokh; Khattak, Aemal
2017-11-01
Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparison of the performance of four statistical and machine learning methods including Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States was obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset and the correct prediction rates for each crash severity level, overall correct prediction rate and a proposed crash costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed NNC had the best prediction performance in overall and in more severe crashes. RF and SVM had the next two sufficient performances and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC caused improvement in MNL and RF but weakened the performance of NNC. Overall correct prediction rate had almost the exact opposite results compared to the proposed approach, showing that neglecting the crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Output from Statistical Predictive Models as Input to eLearning Dashboards
Directory of Open Access Journals (Sweden)
Marlene A. Smith
2015-06-01
Full Text Available We describe how statistical predictive models might play an expanded role in educational analytics by giving students automated, real-time information about what their current performance means for eventual success in eLearning environments. We discuss how an online messaging system might tailor information to individual students using predictive analytics. The proposed system would be data-driven and quantitative; e.g., a message might furnish the probability that a student will successfully complete the certificate requirements of a massive open online course. Repeated messages would prod underperforming students and alert instructors to those in need of intervention. Administrators responsible for accreditation or outcomes assessment would have ready documentation of learning outcomes and actions taken to address unsatisfactory student performance. The article’s brief introduction to statistical predictive models sets the stage for a description of the messaging system. Resources and methods needed to develop and implement the system are discussed.
Comparison of statistical and clinical predictions of functional outcome after ischemic stroke.
Directory of Open Access Journals (Sweden)
Douglas D Thompson
Full Text Available To determine whether the predictions of functional outcome after ischemic stroke made at the bedside using a doctor's clinical experience were more or less accurate than the predictions made by clinical prediction models (CPMs.A prospective cohort study of nine hundred and thirty one ischemic stroke patients recruited consecutively at the outpatient, inpatient and emergency departments of the Western General Hospital, Edinburgh between 2002 and 2005. Doctors made informal predictions of six month functional outcome on the Oxford Handicap Scale (OHS. Patients were followed up at six months with a validated postal questionnaire. For each patient we calculated the absolute predicted risk of death or dependence (OHS≥3 using five previously described CPMs. The specificity of a doctor's informal predictions of OHS≥3 at six months was good 0.96 (95% CI: 0.94 to 0.97 and similar to CPMs (range 0.94 to 0.96; however the sensitivity of both informal clinical predictions 0.44 (95% CI: 0.39 to 0.49 and clinical prediction models (range 0.38 to 0.45 was poor. The prediction of the level of disability after stroke was similar for informal clinical predictions (ordinal c-statistic 0.74 with 95% CI 0.72 to 0.76 and CPMs (range 0.69 to 0.75. No patient or clinician characteristic affected the accuracy of informal predictions, though predictions were more accurate in outpatients.CPMs are at least as good as informal clinical predictions in discriminating between good and bad functional outcome after ischemic stroke. The place of these models in clinical practice has yet to be determined.
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5 + , 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. ?? 1999 Elsevier
International Nuclear Information System (INIS)
Nedic, Vladimir; Despotovic, Danijela; Cvetanovic, Slobodan; Despotovic, Milan; Babic, Sasa
2014-01-01
Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the proposed structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level in the given time period L eq . Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise using the originally developed user friendly software package. It is shown that the artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate equivalent noise level by means of classical methods, and comparative analysis is given. The results clearly show that ANN approach is superior in traffic noise level prediction to any other statistical method. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed originally designed user friendly software package. • The results are compared with classical statistical methods. • The results are much better predictive capabilities of ANN model
BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.
Kaur, Harpreet; Raghava, G P S
2002-03-01
beta-turns play an important role from a structural and functional point of view. beta-turns are the most common type of non-repetitive structures in proteins and comprise on average, 25% of the residues. In the past numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-TURNS in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows to predict different types of beta-TURNS e.g. type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-TURNS in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/
Energy Technology Data Exchange (ETDEWEB)
Nedic, Vladimir, E-mail: vnedic@kg.ac.rs [Faculty of Philology and Arts, University of Kragujevac, Jovana Cvijića bb, 34000 Kragujevac (Serbia); Despotovic, Danijela, E-mail: ddespotovic@kg.ac.rs [Faculty of Economics, University of Kragujevac, Djure Pucara Starog 3, 34000 Kragujevac (Serbia); Cvetanovic, Slobodan, E-mail: slobodan.cvetanovic@eknfak.ni.ac.rs [Faculty of Economics, University of Niš, Trg kralja Aleksandra Ujedinitelja, 18000 Niš (Serbia); Despotovic, Milan, E-mail: mdespotovic@kg.ac.rs [Faculty of Engineering, University of Kragujevac, Sestre Janjic 6, 34000 Kragujevac (Serbia); Babic, Sasa, E-mail: babicsf@yahoo.com [College of Applied Mechanical Engineering, Trstenik (Serbia)
2014-11-15
Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the proposed structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level in the given time period L{sub eq}. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise using the originally developed user friendly software package. It is shown that the artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate equivalent noise level by means of classical methods, and comparative analysis is given. The results clearly show that ANN approach is superior in traffic noise level prediction to any other statistical method. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed originally designed user friendly software package. • The results are compared with classical statistical methods. • The results are much better predictive capabilities of ANN model.
Monroy, Claire D; Gerson, Sarah A; Hunnius, Sabine
2018-05-01
Humans are sensitive to the statistical regularities in action sequences carried out by others. In the present eyetracking study, we investigated whether this sensitivity can support the prediction of upcoming actions when observing unfamiliar action sequences. In two between-subjects conditions, we examined whether observers would be more sensitive to statistical regularities in sequences performed by a human agent versus self-propelled 'ghost' events. Secondly, we investigated whether regularities are learned better when they are associated with contingent effects. Both implicit and explicit measures of learning were compared between agent and ghost conditions. Implicit learning was measured via predictive eye movements to upcoming actions or events, and explicit learning was measured via both uninstructed reproduction of the action sequences and verbal reports of the regularities. The findings revealed that participants, regardless of condition, readily learned the regularities and made correct predictive eye movements to upcoming events during online observation. However, different patterns of explicit-learning outcomes emerged following observation: Participants were most likely to re-create the sequence regularities and to verbally report them when they had observed an actor create a contingent effect. These results suggest that the shift from implicit predictions to explicit knowledge of what has been learned is facilitated when observers perceive another agent's actions and when these actions cause effects. These findings are discussed with respect to the potential role of the motor system in modulating how statistical regularities are learned and used to modify behavior.
Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James
2018-06-01
Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict the component size to a very close range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected the data of 1992 primary Persona Knee System TKA procedures. Of them, 199 procedures were randomly selected as testing data and the rest of the data were randomly partitioned between model training data and model evaluation data with a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within 1 size difference 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance the templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
Statistical approach to predict compressive strength of high workability slag-cement mortars
International Nuclear Information System (INIS)
Memon, N.A.; Memon, N.A.; Sumadi, S.R.
2009-01-01
This paper reports an attempt made to develop empirical expressions to estimate/ predict the compressive strength of high workability slag-cement mortars. Experimental data of 54 mix mortars were used. The mortars were prepared with slag as cement replacement of the order of 0, 50 and 60%. The flow (workability) was maintained at 136+-3%. The numerical and statistical analysis was performed by using database computer software Microsoft Office Excel 2003. Three empirical mathematical models were developed to estimate/predict 28 days compressive strength of high workability slag cement-mortars with 0, 50 and 60% slag which predict the values accurate between 97 and 98%. Finally a generalized empirical mathematical model was proposed which can predict 28 days compressive strength of high workability mortars up to degree of accuracy 95%. (author)
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Energy Technology Data Exchange (ETDEWEB)
Berghausen, P.E. Jr.; Mathews, T.W.
1987-01-01
The security plans of nuclear power plants generally require that all personnel who are to have access to protected areas or vital islands be screened for emotional stability. In virtually all instances, the screening involves the administration of one or more psychological tests, usually including the Minnesota Multiphasic Personality Inventory (MMPI). At some plants, all employees receive a structured clinical interview after they have taken the MMPI and results have been obtained. At other plants, only those employees with dirty MMPI are interviewed. This latter protocol is referred to as interviews by exception. Behaviordyne Psychological Corp. has succeeded in removing some of the uncertainty associated with interview-by-exception protocols by developing an empirically based, predictive equation. This equation permits utility companies to make informed choices regarding the risks they are assuming. A conceptual problem exists with the predictive equation, however. Like most predictive equations currently in use, it is based on Fisherian statistics, involving least-squares analyses. Consequently, Behaviordyne Psychological Corp., in conjunction with T.W. Mathews and Associates, has just developed a second predictive equation, one based on contingent probability statistics. The particular technique used in the multi-contingent analysis of probability systems (MAPS) approach. The present paper presents a comparison of predictive accuracy of the two equations: the one derived using Fisherian techniques versus the one thing contingent probability techniques.
International Nuclear Information System (INIS)
Berghausen, P.E. Jr.; Mathews, T.W.
1987-01-01
The security plans of nuclear power plants generally require that all personnel who are to have access to protected areas or vital islands be screened for emotional stability. In virtually all instances, the screening involves the administration of one or more psychological tests, usually including the Minnesota Multiphasic Personality Inventory (MMPI). At some plants, all employees receive a structured clinical interview after they have taken the MMPI and results have been obtained. At other plants, only those employees with dirty MMPI are interviewed. This latter protocol is referred to as interviews by exception. Behaviordyne Psychological Corp. has succeeded in removing some of the uncertainty associated with interview-by-exception protocols by developing an empirically based, predictive equation. This equation permits utility companies to make informed choices regarding the risks they are assuming. A conceptual problem exists with the predictive equation, however. Like most predictive equations currently in use, it is based on Fisherian statistics, involving least-squares analyses. Consequently, Behaviordyne Psychological Corp., in conjunction with T.W. Mathews and Associates, has just developed a second predictive equation, one based on contingent probability statistics. The particular technique used in the multi-contingent analysis of probability systems (MAPS) approach. The present paper presents a comparison of predictive accuracy of the two equations: the one derived using Fisherian techniques versus the one thing contingent probability techniques
Prediction of transmission loss through an aircraft sidewall using statistical energy analysis
Ming, Ruisen; Sun, Jincai
1989-06-01
The transmission loss of randomly incident sound through an aircraft sidewall is investigated using statistical energy analysis. Formulas are also obtained for the simple calculation of sound transmission loss through single- and double-leaf panels. Both resonant and nonresonant sound transmissions can be easily calculated using the formulas. The formulas are used to predict sound transmission losses through a Y-7 propeller airplane panel. The panel measures 2.56 m x 1.38 m and has two windows. The agreement between predicted and measured values through most of the frequency ranges tested is quite good.
Statistical Analysis of CFD Solutions From the Fifth AIAA Drag Prediction Workshop
Morrison, Joseph H.
2013-01-01
A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from North America, Europe, Asia, and South America using a common grid sequence and multiple turbulence models for the June 2012 fifth Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was the Common Research Model subsonic transport wing-body previously used for the 4th Drag Prediction Workshop. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with previous workshops.
Statistical Model Predictions for p+p and Pb+Pb Collisions at LHC
Kraus, I; Oeschler, H; Redlich, K; Wheaton, S
2009-01-01
Particle production in p+p and central Pb+Pb collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in detail, and some of them, which are particularly appropriate to determine the chemical freeze-out point experimentally, are indicated. Considering elementary interactions on the other hand, we focus on strangeness production and its possible suppression. Extrapolating the thermal parameters to LHC energy, we present predictions of the statistical model for particle yields in p+p collisions. We quantify the strangeness suppression by the correlation volume parameter and discuss its influence on particle production. We propose observables that can provide deeper insight into the mechanism of strangeness production and suppression at LHC.
Yin, Gaohong
2016-05-01
Since the failure of the Scan Line Corrector (SLC) instrument on Landsat 7, observable gaps occur in the acquired Landsat 7 imagery, impacting the spatial continuity of observed imagery. Due to the highly geometric and radiometric accuracy provided by Landsat 7, a number of approaches have been proposed to fill the gaps. However, all proposed approaches have evident constraints for universal application. The main issues in gap-filling are an inability to describe the continuity features such as meandering streams or roads, or maintaining the shape of small objects when filling gaps in heterogeneous areas. The aim of the study is to validate the feasibility of using the Direct Sampling multiple-point geostatistical method, which has been shown to reconstruct complicated geological structures satisfactorily, to fill Landsat 7 gaps. The Direct Sampling method uses a conditional stochastic resampling of known locations within a target image to fill gaps and can generate multiple reconstructions for one simulation case. The Direct Sampling method was examined across a range of land cover types including deserts, sparse rural areas, dense farmlands, urban areas, braided rivers and coastal areas to demonstrate its capacity to recover gaps accurately for various land cover types. The prediction accuracy of the Direct Sampling method was also compared with other gap-filling approaches, which have been previously demonstrated to offer satisfactory results, under both homogeneous area and heterogeneous area situations. Studies have shown that the Direct Sampling method provides sufficiently accurate prediction results for a variety of land cover types from homogeneous areas to heterogeneous land cover types. Likewise, it exhibits superior performances when used to fill gaps in heterogeneous land cover types without input image or with an input image that is temporally far from the target image in comparison with other gap-filling approaches.
Qi, D.; Majda, A.
2017-12-01
A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with
LSHSIM: A Locality Sensitive Hashing based method for multiple-point geostatistics
Moura, Pedro; Laber, Eduardo; Lopes, Hélio; Mesejo, Daniel; Pavanelli, Lucas; Jardim, João; Thiesen, Francisco; Pujol, Gabriel
2017-10-01
Reservoir modeling is a very important task that permits the representation of a geological region of interest, so as to generate a considerable number of possible scenarios. Since its inception, many methodologies have been proposed and, in the last two decades, multiple-point geostatistics (MPS) has been the dominant one. This methodology is strongly based on the concept of training image (TI) and the use of its characteristics, which are called patterns. In this paper, we propose a new MPS method that combines the application of a technique called Locality Sensitive Hashing (LSH), which permits to accelerate the search for patterns similar to a target one, with a Run-Length Encoding (RLE) compression technique that speeds up the calculation of the Hamming similarity. Experiments with both categorical and continuous images show that LSHSIM is computationally efficient and produce good quality realizations. In particular, for categorical data, the results suggest that LSHSIM is faster than MS-CCSIM, one of the state-of-the-art methods.
International Nuclear Information System (INIS)
Sarwar, M.; Gillibrand, D.; Bradbury, S.R.
1991-01-01
Advanced surface engineering technologies (physical and chemical vapour deposition) have been successfully applied to high speed steel and carbide cutting tools, and the potential benefits in terms of both performance and longer tool life, are now well established. Although major achievements have been reported by many manufacturers and users, there are a number of applications where surface engineering has been unsuccessful. Considerable attention has been given to the film characteristics and the variables associated with its properties; however, very little attention has been directed towards the benefits to the tool user. In order to apply surface engineering technology effectively to cutting tools, the coater needs to have accurate information relating to cutting conditions, i.e. cutting forces, stress and temperature etc. The present paper describes results obtained with single- and multiple-point cutting tools with examples of failures, which should help the surface coater to appreciate the significance of the cutting conditions, and in particular the magnitude of the forces and stresses present during cutting processes. These results will assist the development of a systems approach to cutting tool technology and surface engineering with a view to developing an improved product. (orig.)
The statistical prediction of offshore winds from land-based data for wind-energy applications
DEFF Research Database (Denmark)
Walmsley, J.L.; Barthelmie, R.J.; Burrows, W.R.
2001-01-01
Land-based meteorological measurements at two locations on the Danish coast are used to predict offshore wind speeds. Offshore wind-speed data are used only for developing the statistical prediction algorithms and for verification. As a first step, the two datasets were separated into nine...... percentile-based bins, with a minimum of 30 data records in each bin. Next, the records were randomly selected with approximately 70% of the data in each bin being used as a training set for development of the prediction algorithms, and the remaining 30% being reserved as a test set for evaluation purposes....... The binning procedure ensured that both training and test sets fairly represented the overall data distribution. To base the conclusions on firmer ground, five permutations of these training and test sets were created. Thus, all calculations were based on five cases, each one representing a different random...
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
James, Ryan G.; Mahoney, John R.; Crutchfield, James P.
2017-06-01
One of the most basic characterizations of the relationship between two random variables, X and Y , is the value of their mutual information. Unfortunately, calculating it analytically and estimating it empirically are often stymied by the extremely large dimension of the variables. One might hope to replace such a high-dimensional variable by a smaller one that preserves its relationship with the other. It is well known that either X (or Y ) can be replaced by its minimal sufficient statistic about Y (or X ) while preserving the mutual information. While intuitively reasonable, it is not obvious or straightforward that both variables can be replaced simultaneously. We demonstrate that this is in fact possible: the information X 's minimal sufficient statistic preserves about Y is exactly the information that Y 's minimal sufficient statistic preserves about X . We call this procedure information trimming. As an important corollary, we consider the case where one variable is a stochastic process' past and the other its future. In this case, the mutual information is the channel transmission rate between the channel's effective states. That is, the past-future mutual information (the excess entropy) is the amount of information about the future that can be predicted using the past. Translating our result about minimal sufficient statistics, this is equivalent to the mutual information between the forward- and reverse-time causal states of computational mechanics. We close by discussing multivariate extensions to this use of minimal sufficient statistics.
Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images
Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina
2012-03-01
The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in-slope (WIS), wash-out-slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA analysis was done to select features that differ significantly between node positive and node negative women. Using the computed kinetic statistic features a leave-one-out SVM classifier was learned that performs with AUC=0.77 under the ROC curve, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER), (AUCs of 0.61 and 0.57 respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
Aegisdottir, Stefania; White, Michael J.; Spengler, Paul M.; Maugherman, Alan S.; Anderson, Linda A.; Cook, Robert S.; Nichols, Cassandra N.; Lampropoulos, Georgios K.; Walker, Blain S.; Cohen, Genna; Rush, Jeffrey D.
2006-01-01
Clinical predictions made by mental health practitioners are compared with those using statistical approaches. Sixty-seven studies were identified from a comprehensive search of 56 years of research; 92 effect sizes were derived from these studies. The overall effect of clinical versus statistical prediction showed a somewhat greater accuracy for…
International Nuclear Information System (INIS)
Ayodele, T.R.; Ogunjuyigbe, A.S.O.
2015-01-01
In this paper, probability distribution of clearness index is proposed for the prediction of global solar radiation. First, the clearness index is obtained from the past data of global solar radiation, then, the parameters of the appropriate distribution that best fit the clearness index are determined. The global solar radiation is thereafter predicted from the clearness index using inverse transformation of the cumulative distribution function. To validate the proposed method, eight years global solar radiation data (2000–2007) of Ibadan, Nigeria are used to determine the parameters of appropriate probability distribution for clearness index. The calculated parameters are then used to predict the future monthly average global solar radiation for the following year (2008). The predicted values are compared with the measured values using four statistical tests: the Root Mean Square Error (RMSE), MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error) and the coefficient of determination (R"2). The proposed method is also compared to the existing regression models. The results show that logistic distribution provides the best fit for clearness index of Ibadan and the proposed method is effective in predicting the monthly average global solar radiation with overall RMSE of 0.383 MJ/m"2/day, MAE of 0.295 MJ/m"2/day, MAPE of 2% and R"2 of 0.967. - Highlights: • Distribution of clearnes index is proposed for prediction of global solar radiation. • The clearness index is obtained from the past data of global solar radiation. • The parameters of distribution that best fit the clearness index are determined. • Solar radiation is predicted from the clearness index using inverse transformation. • The method is effective in predicting the monthly average global solar radiation.
International Nuclear Information System (INIS)
Ghazimirsaied, Ahmad; Koch, Charles Robert
2012-01-01
Highlights: ► Misfire reduction in a combustion engine based on chaotic theory methods. ► Chaotic theory analysis of cyclic variation of a HCCI engine near misfire. ► Symbol sequence approach is used to predict ignition timing one cycle-ahead. ► Prediction is combined with feedback control to lower HCCI combustion variation. ► Feedback control extends the HCCI operating range into the misfire region. -- Abstract: Cyclic variation of a Homogeneous Charge Compression Ignition (HCCI) engine near misfire is analyzed using chaotic theory methods and feedback control is used to stabilize high cyclic variations. Variation of consecutive cycles of θ Pmax (the crank angle of maximum cylinder pressure over an engine cycle) for a Primary Reference Fuel engine is analyzed near misfire operation for five test points with similar conditions but different octane numbers. The return map of the time series of θ Pmax at each combustion cycle reveals the deterministic and random portions of the dynamics near misfire for this HCCI engine. A symbol-statistic approach is used to predict θ Pmax one cycle-ahead. Predicted θ Pmax has similar dynamical behavior to the experimental measurements. Based on this cycle ahead prediction, and using fuel octane as the input, feedback control is used to stabilize the instability of θ Pmax variations at this engine condition near misfire.
Directory of Open Access Journals (Sweden)
Abut F
2015-08-01
Full Text Available Fatih Abut, Mehmet Fatih AkayDepartment of Computer Engineering, Çukurova University, Adana, TurkeyAbstract: Maximal oxygen uptake (VO2max indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance
A statistical approach to the prediction of pressure tube fracture toughness
International Nuclear Information System (INIS)
Pandey, M.D.; Radford, D.D.
2008-01-01
The fracture toughness of the zirconium alloy (Zr-2.5Nb) is an important parameter in determining the flaw tolerance for operation of pressure tubes in a nuclear reactor. Fracture toughness data have been generated by performing rising pressure burst tests on sections of pressure tubes removed from operating reactors. The test data were used to generate a lower-bound fracture toughness curve, which is used in defining the operational limits of pressure tubes. The paper presents a comprehensive statistical analysis of burst test data and develops a multivariate statistical model to relate toughness with material chemistry, mechanical properties, and operational history. The proposed model can be useful in predicting fracture toughness of specific in-service pressure tubes, thereby minimizing conservatism associated with a generic lower-bound approach
International Nuclear Information System (INIS)
Carvajal Escobar Yesid; Munoz, Flor Matilde
2007-01-01
The project this centred in the revision of the state of the art of the ocean-atmospheric phenomena that you affect the Colombian hydrology especially The Phenomenon Enos that causes a socioeconomic impact of first order in our country, it has not been sufficiently studied; therefore it is important to approach the thematic one, including the variable macroclimates associated to the Enos in the analyses of water planning. The analyses include revision of statistical techniques of analysis of consistency of hydrological data with the objective of conforming a database of monthly flow of the river reliable and homogeneous Cauca. Statistical methods are used (Analysis of data multivariante) specifically The analysis of principal components to involve them in the development of models of prediction of flows monthly means in the river Cauca involving the Lineal focus as they are the model autoregressive AR, ARX and Armax and the focus non lineal Net Artificial Network.
Khomenko, B A; Rijllart, A; Sanfilippo, S; Siemko, A
2001-01-01
Premature training quenches are usually caused by the transient energy released within the magnet coil as it is energised. Two distinct varieties of disturbances exist. They are thought to be electrical and mechanical in origin. The first type of disturbance comes from non-uniform current distribution in superconducting cables whereas the second one usually originates from conductor motions or micro-fractures of insulating materials under the action of Lorentz forces. All of these mechanical events produce in general a rapid variation of the voltages in the so-called quench antennas and across the magnet coil, called spikes. A statistical method to treat the spatial localisation and the time occurrence of spikes will be presented. It allows identification of the mechanical weak points in the magnet without need to increase the current to provoke a quench. The prediction of the quench level from detailed analysis of the spike statistics can be expected.
Hydrogen-bond coordination in organic crystal structures: statistics, predictions and applications.
Galek, Peter T A; Chisholm, James A; Pidcock, Elna; Wood, Peter A
2014-02-01
Statistical models to predict the number of hydrogen bonds that might be formed by any donor or acceptor atom in a crystal structure have been derived using organic structures in the Cambridge Structural Database. This hydrogen-bond coordination behaviour has been uniquely defined for more than 70 unique atom types, and has led to the development of a methodology to construct hypothetical hydrogen-bond arrangements. Comparing the constructed hydrogen-bond arrangements with known crystal structures shows promise in the assessment of structural stability, and some initial examples of industrially relevant polymorphs, co-crystals and hydrates are described.
Prediction of noise in ships by the application of “statistical energy analysis.”
DEFF Research Database (Denmark)
Jensen, John Ødegaard
1979-01-01
If it will be possible effectively to reduce the noise level in the accomodation on board ships, by introducing appropriate noise abatement measures already at an early design stage, it is quite essential that sufficiently accurate prediction methods are available for the naval architects...... or for a special noise abatement measure, e.g., increased structural damping. The paper discusses whether it might be possible to derive an alternative calculation model based on the “statistical energy analysis” approach (SEA). By considering the hull of a ship to be constructed from plate elements connected...
International Nuclear Information System (INIS)
DUDEK, J; SZPAK, B; FORNAL, B; PORQUET, M-G
2011-01-01
In this and the follow-up article we briefly discuss what we believe represents one of the most serious problems in contemporary nuclear structure: the question of statistical significance of parametrizations of nuclear microscopic Hamiltonians and the implied predictive power of the underlying theories. In the present Part I, we introduce the main lines of reasoning of the so-called Inverse Problem Theory, an important sub-field in the contemporary Applied Mathematics, here illustrated on the example of the Nuclear Mean-Field Approach.
Energy Technology Data Exchange (ETDEWEB)
Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van' t [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands)
2012-03-15
Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A
2012-03-15
To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright Â© 2012 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Amany E. Aly
2016-04-01
Full Text Available When a system consisting of independent components of the same type, some appropriate actions may be done as soon as a portion of them have failed. It is, therefore, important to be able to predict later failure times from earlier ones. One of the well-known failure distributions commonly used to model component life, is the modified Weibull distribution (MWD. In this paper, two pivotal quantities are proposed to construct prediction intervals for future unobservable lifetimes based on generalized order statistics (gos from MWD. Moreover, a pivotal quantity is developed to reconstruct missing observations at the beginning of experiment. Furthermore, Monte Carlo simulation studies are conducted and numerical computations are carried out to investigate the efficiency of presented results. Finally, two illustrative examples for real data sets are analyzed.
International Nuclear Information System (INIS)
Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van’t
2012-01-01
Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
Predicting energy performance of a net-zero energy building: A statistical approach
International Nuclear Information System (INIS)
Kneifel, Joshua; Webb, David
2016-01-01
Highlights: • A regression model is applied to actual energy data from a net-zero energy building. • The model is validated through a rigorous statistical analysis. • Comparisons are made between model predictions and those of a physics-based model. • The model is a viable baseline for evaluating future models from the energy data. - Abstract: Performance-based building requirements have become more prevalent because it gives freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid Climate Zone, and compares these
Wind gust estimation by combining numerical weather prediction model and statistical post-processing
Patlakas, Platon; Drakaki, Eleni; Galanis, George; Spyrou, Christos; Kallos, George
2017-04-01
The continuous rise of off-shore and near-shore activities as well as the development of structures, such as wind farms and various offshore platforms, requires the employment of state-of-the-art risk assessment techniques. Such analysis is used to set the safety standards and can be characterized as a climatologically oriented approach. Nevertheless, a reliable operational support is also needed in order to minimize cost drawbacks and human danger during the construction and the functioning stage as well as during maintenance activities. One of the most important parameters for this kind of analysis is the wind speed intensity and variability. A critical measure associated with this variability is the presence and magnitude of wind gusts as estimated in the reference level of 10m. The latter can be attributed to different processes that vary among boundary-layer turbulence, convection activities, mountain waves and wake phenomena. The purpose of this work is the development of a wind gust forecasting methodology combining a Numerical Weather Prediction model and a dynamical statistical tool based on Kalman filtering. To this end, the parameterization of Wind Gust Estimate method was implemented to function within the framework of the atmospheric model SKIRON/Dust. The new modeling tool combines the atmospheric model with a statistical local adaptation methodology based on Kalman filters. This has been tested over the offshore west coastline of the United States. The main purpose is to provide a useful tool for wind analysis and prediction and applications related to offshore wind energy (power prediction, operation and maintenance). The results have been evaluated by using observational data from the NOAA's buoy network. As it was found, the predicted output shows a good behavior that is further improved after the local adjustment post-process.
The use of machine learning and nonlinear statistical tools for ADME prediction.
Sakiyama, Yojiro
2009-02-01
Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.
Predicting The Exit Time Of Employees In An Organization Using Statistical Model
Directory of Open Access Journals (Sweden)
Ahmed Al Kuwaiti
2015-08-01
Full Text Available Employees are considered as an asset to any organization and each organization provide a better and flexible working environment to retain its best and resourceful workforce. As such continuous efforts are being taken to avoid or extend the exitwithdrawal of employees from the organization. Human resource managers are facing a challenge to predict the exit time of employees and there is no precise model existing at present in the literature. This study has been conducted to predict the probability of exit of an employee in an organization using appropriate statistical model. Accordingly authors designed a model using Additive Weibull distribution to predict the expected exit time of employee in an organization. In addition a Shock model approach is also executed to check how well the Additive Weibull distribution suits in an organization. The analytical results showed that when the inter-arrival time increases the expected time for the employees to exit also increases. This study concluded that Additive Weibull distribution can be considered as an alternative in the place of Shock model approach to predict the exit time of employee in an organization.
Directory of Open Access Journals (Sweden)
Fabrizio Pucci
Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, that distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm's is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.
Abut, Fatih; Akay, Mehmet Fatih
2015-01-01
Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.
Manning, Robert M.
1990-01-01
A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of the optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics data base is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0-5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.
Yang, Jing; Zammit, Christian; Dudley, Bruce
2017-04-01
The phenomenon of losing and gaining in rivers normally takes place in lowland where often there are various, sometimes conflicting uses for water resources, e.g., agriculture, industry, recreation, and maintenance of ecosystem function. To better support water allocation decisions, it is crucial to understand the location and seasonal dynamics of these losses and gains. We present a statistical methodology to predict losing and gaining river reaches in New Zealand based on 1) information surveys with surface water and groundwater experts from regional government, 2) A collection of river/watershed characteristics, including climate, soil and hydrogeologic information, and 3) the random forests technique. The surveys on losing and gaining reaches were conducted face-to-face at 16 New Zealand regional government authorities, and climate, soil, river geometry, and hydrogeologic data from various sources were collected and compiled to represent river/watershed characteristics. The random forests technique was used to build up the statistical relationship between river reach status (gain and loss) and river/watershed characteristics, and then to predict for river reaches at Strahler order one without prior losing and gaining information. Results show that the model has a classification error of around 10% for "gain" and "loss". The results will assist further research, and water allocation decisions in lowland New Zealand.
Predicting future protection of respirator users: Statistical approaches and practical implications.
Hu, Chengcheng; Harber, Philip; Su, Jing
2016-01-01
The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide reasonable likelihood that a worker will be adequately protected in the future; and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.
Predicting tube repair at French nuclear steam generators using statistical modeling
Energy Technology Data Exchange (ETDEWEB)
Mathon, C., E-mail: cedric.mathon@edf.fr [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Chaudhary, A. [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Gay, N.; Pitner, P. [EDF Generation, Nuclear Operation Division (UNIE), Saint-Denis (France)
2014-04-01
Electricité de France (EDF) currently operates a total of 58 Nuclear Pressurized Water Reactors (PWR) which are composed of 34 units of 900 MWe, 20 units of 1300 MWe and 4 units of 1450 MWe. This report provides an overall status of SG tube bundles on the 1300 MWe units. These units are 4 loop reactors using the AREVA 68/19 type SG model which are equipped either with Alloy 600 thermally treated (TT) tubes or Alloy 690 TT tubes. As of 2011, the effective full power years of operation (EFPY) ranges from 13 to 20 and during this time, the main degradation mechanisms observed on SG tubes are primary water stress corrosion cracking (PWSCC) and wear at anti-vibration bars (AVB) level. Statistical models have been developed for each type of degradation in order to predict the growth rate and number of affected tubes. Additional plugging is also performed to prevent other degradations such as tube wear due to foreign objects or high-cycle flow-induced fatigue. The contribution of these degradation mechanisms on the rate of tube plugging is described. The results from the statistical models are then used in predicting the long-term life of the steam generators and therefore providing a useful tool toward their effective life management and possible replacement.
Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data
Directory of Open Access Journals (Sweden)
Alexander P. Kartun-Giles
2018-04-01
Full Text Available A projective network model is a model that enables predictions to be made based on a subsample of the network data, with the predictions remaining unchanged if a larger sample is taken into consideration. An exchangeable model is a model that does not depend on the order in which nodes are sampled. Despite a large variety of non-equilibrium (growing and equilibrium (static sparse complex network models that are widely used in network science, how to reconcile sparseness (constant average degree with the desired statistical properties of projectivity and exchangeability is currently an outstanding scientific problem. Here we propose a network process with hidden variables which is projective and can generate sparse power-law networks. Despite the model not being exchangeable, it can be closely related to exchangeable uncorrelated networks as indicated by its information theory characterization and its network entropy. The use of the proposed network process as a null model is here tested on real data, indicating that the model offers a promising avenue for statistical network modelling.
A generic statistical methodology to predict the maximum pit depth of a localized corrosion process
International Nuclear Information System (INIS)
Jarrah, A.; Bigerelle, M.; Guillemot, G.; Najjar, D.; Iost, A.; Nianga, J.-M.
2011-01-01
Highlights: → We propose a methodology to predict the maximum pit depth in a corrosion process. → Generalized Lambda Distribution and the Computer Based Bootstrap Method are combined. → GLD fit a large variety of distributions both in their central and tail regions. → Minimum thickness preventing perforation can be estimated with a safety margin. → Considering its applications, this new approach can help to size industrial pieces. - Abstract: This paper outlines a new methodology to predict accurately the maximum pit depth related to a localized corrosion process. It combines two statistical methods: the Generalized Lambda Distribution (GLD), to determine a model of distribution fitting with the experimental frequency distribution of depths, and the Computer Based Bootstrap Method (CBBM), to generate simulated distributions equivalent to the experimental one. In comparison with conventionally established statistical methods that are restricted to the use of inferred distributions constrained by specific mathematical assumptions, the major advantage of the methodology presented in this paper is that both the GLD and the CBBM enable a statistical treatment of the experimental data without making any preconceived choice neither on the unknown theoretical parent underlying distribution of pit depth which characterizes the global corrosion phenomenon nor on the unknown associated theoretical extreme value distribution which characterizes the deepest pits. Considering an experimental distribution of depths of pits produced on an aluminium sample, estimations of maximum pit depth using a GLD model are compared to similar estimations based on usual Gumbel and Generalized Extreme Value (GEV) methods proposed in the corrosion engineering literature. The GLD approach is shown having smaller bias and dispersion in the estimation of the maximum pit depth than the Gumbel approach both for its realization and mean. This leads to comparing the GLD approach to the GEV one
Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso
2018-03-01
The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (day 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1 to 7 days weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km2. Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases
Mattonen, Sarah A.; Palma, David A.; Haasbeek, Cornelis J. A.; Senan, Suresh; Ward, Aaron D.
2014-03-01
Benign radiation-induced lung injury is a common finding following stereotactic ablative radiotherapy (SABR) for lung cancer, and is often difficult to differentiate from a recurring tumour due to the ablative doses and highly conformal treatment with SABR. Current approaches to treatment response assessment have shown limited ability to predict recurrence within 6 months of treatment. The purpose of our study was to evaluate the accuracy of second order texture statistics for prediction of eventual recurrence based on computed tomography (CT) images acquired within 6 months of treatment, and compare with the performance of first order appearance and lesion size measures. Consolidative and ground-glass opacity (GGO) regions were manually delineated on post-SABR CT images. Automatic consolidation expansion was also investigated to act as a surrogate for GGO position. The top features for prediction of recurrence were all texture features within the GGO and included energy, entropy, correlation, inertia, and first order texture (standard deviation of density). These predicted recurrence with 2-fold cross validation (CV) accuracies of 70-77% at 2- 5 months post-SABR, with energy, entropy, and first order texture having leave-one-out CV accuracies greater than 80%. Our results also suggest that automatic expansion of the consolidation region could eliminate the need for manual delineation, and produced reproducible results when compared to manually delineated GGO. If validated on a larger data set, this could lead to a clinically useful computer-aided diagnosis system for prediction of recurrence within 6 months of SABR and allow for early salvage therapy for patients with recurrence.
REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH
Directory of Open Access Journals (Sweden)
R. SATISHKUMAR
2017-01-01
Full Text Available The role of bearings is significant in reducing the down time of all rotating machineries. The increasing trend of bearing failures in recent times has triggered the need and importance of deployment of condition monitoring. There are multiple factors associated to a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of the bearings in operation. In past, predictive models with regression techniques were widely used for bearing lifetime estimations. The Objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach. The ultimate objective of this study is to strengthen the predictive maintenance. The present study was done using classification approach following the concepts of machine learning and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment wherein the bearings are made to run till it fails naturally. It should be noted that the experiment was carried out with new bearings at pre-defined load and speed conditions until the bearing fails on its own. In the present work, statistical features were deployed and feature selection process was carried out using J48 decision tree and selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique is made use of in building a predictive model to estimate the lifetime of bearings. The performance of classifier was cross validated with distinct data. The result shows that the K-Star classification model gives 98.56% classification accuracy with selected features.
International Nuclear Information System (INIS)
Ferng, Y.-M.; Fan, C.N.; Pei, B.S.; Li, H.-N.
2008-01-01
A steam generator (SG) plays a significant role not only with respect to the primary-to-secondary heat transfer but also as a fission product barrier to prevent the release of radionuclides. Tube plugging is an efficient way to avoid releasing radionuclides when SG tubes are severely degraded. However, this remedial action may cause the decrease of SG heat transfer capability, especially in transient or accident conditions. It is therefore crucial for the plant staff to understand the trend of plugged tubes for the SG operation and maintenance. Statistical methodologies are proposed in this paper to predict this trend. The accumulated numbers of SG plugged tubes versus the operation time are predicted using the Weibull and log-normal distributions, which correspond well with the plant measured data from a selected pressurized water reactor (PWR). With the help of these predictions, the accumulated number of SG plugged tubes can be reasonably extrapolated to the 40-year operation lifetime (or even longer than 40 years) of a PWR. This information can assist the plant policymakers to determine whether or when a SG must be replaced
Laepple, Thomas; Jewson, Stephen; Meagher, Jonathan; O'Shay, Adam; Penzer, Jeremy
2007-01-01
We are developing schemes that predict future hurricane numbers by first predicting future sea surface temperatures (SSTs), and then apply the observed statistical relationship between SST and hurricane numbers. As part of this overall goal, in this study we compare the historical performance of three simple statistical methods for making five-year SST forecasts. We also present SST forecasts for 2006-2010 using these methods and compare them to forecasts made from two structural time series ...
Statistical and Biophysical Models for Predicting Total and Outdoor Water Use in Los Angeles
Mini, C.; Hogue, T. S.; Pincetl, S.
2012-04-01
Modeling water demand is a complex exercise in the choice of the functional form, techniques and variables to integrate in the model. The goal of the current research is to identify the determinants that control total and outdoor residential water use in semi-arid cities and to utilize that information in the development of statistical and biophysical models that can forecast spatial and temporal urban water use. The City of Los Angeles is unique in its highly diverse socio-demographic, economic and cultural characteristics across neighborhoods, which introduces significant challenges in modeling water use. Increasing climate variability also contributes to uncertainties in water use predictions in urban areas. Monthly individual water use records were acquired from the Los Angeles Department of Water and Power (LADWP) for the 2000 to 2010 period. Study predictors of residential water use include socio-demographic, economic, climate and landscaping variables at the zip code level collected from US Census database. Climate variables are estimated from ground-based observations and calculated at the centroid of each zip code by inverse-distance weighting method. Remotely-sensed products of vegetation biomass and landscape land cover are also utilized. Two linear regression models were developed based on the panel data and variables described: a pooled-OLS regression model and a linear mixed effects model. Both models show income per capita and the percentage of landscape areas in each zip code as being statistically significant predictors. The pooled-OLS model tends to over-estimate higher water use zip codes and both models provide similar RMSE values.Outdoor water use was estimated at the census tract level as the residual between total water use and indoor use. This residual is being compared with the output from a biophysical model including tree and grass cover areas, climate variables and estimates of evapotranspiration at very high spatial resolution. A
Recchia, Gabriel L; Louwerse, Max M
2016-11-01
Computational techniques comparing co-occurrences of city names in texts allow the relative longitudes and latitudes of cities to be estimated algorithmically. However, these techniques have not been applied to estimate the provenance of artifacts with unknown origins. Here, we estimate the geographic origin of artifacts from the Indus Valley Civilization, applying methods commonly used in cognitive science to the Indus script. We show that these methods can accurately predict the relative locations of archeological sites on the basis of artifacts of known provenance, and we further apply these techniques to determine the most probable excavation sites of four sealings of unknown provenance. These findings suggest that inscription statistics reflect historical interactions among locations in the Indus Valley region, and they illustrate how computational methods can help localize inscribed archeological artifacts of unknown origin. The success of this method offers opportunities for the cognitive sciences in general and for computational anthropology specifically. Copyright © 2015 Cognitive Science Society, Inc.
International Nuclear Information System (INIS)
Yokoyama, Masayuki
2014-01-01
A statistical approach is proposed to predict thermal diffusivity profiles as a transport “model” in fusion plasmas. It can provide regression expressions for the ion and electron heat diffusivities (χ i and χ e ), separately, to construct their radial profiles. An approach that this letter is proposing outstrips the conventional scaling laws for the global confinement time (τ E ) since it also deals with profiles (temperature, density, heating depositions etc.). This approach has become possible with the analysis database accumulated by the extensive application of the integrated transport analysis suite to experiment data. In this letter, TASK3D-a analysis database for high-ion-temperature (high-T i ) plasmas in the LHD (Large Helical Device) is used as an example to describe an approach. (author)
Shape-correlated deformation statistics for respiratory motion prediction in 4D lung
Liu, Xiaoxiao; Oguz, Ipek; Pizer, Stephen M.; Mageras, Gig S.
2010-02-01
4D image-guided radiation therapy (IGRT) for free-breathing lungs is challenging due to the complicated respiratory dynamics. Effective modeling of respiratory motion is crucial to account for the motion affects on the dose to tumors. We propose a shape-correlated statistical model on dense image deformations for patient-specic respiratory motion estimation in 4D lung IGRT. Using the shape deformations of the high-contrast lungs as the surrogate, the statistical model trained from the planning CTs can be used to predict the image deformation during delivery verication time, with the assumption that the respiratory motion at both times are similar for the same patient. Dense image deformation fields obtained by diffeomorphic image registrations characterize the respiratory motion within one breathing cycle. A point-based particle optimization algorithm is used to obtain the shape models of lungs with group-wise surface correspondences. Canonical correlation analysis (CCA) is adopted in training to maximize the linear correlation between the shape variations of the lungs and the corresponding dense image deformations. Both intra- and inter-session CT studies are carried out on a small group of lung cancer patients and evaluated in terms of the tumor location accuracies. The results suggest potential applications using the proposed method.
International Nuclear Information System (INIS)
Yang Ming; Xiao Gang; Peng Youlin
2010-01-01
Objective: to investigate the therapeutic efficacy of multiple-point injection with absolute alcohol and pinyangmycin under digital fluoroscopic guidance for superficial venous malformations. Methods: By using a disposal venous transfusion needle the superficial venous malformation was punctured and then contrast media lohexol was injected in to visualize the tumor body, which was followed by the injection of ethanol and pinyangmycin when the needle was confirmed in the correct position. The procedure was successfully performed in 31 patients. The clinical results were observed and analyzed. Results: After one treatment complete cure was achieved in 21 cases and marked effect was obtained in 8 cases, with a total effectiveness of 93.5%. Conclusion: Multiple-point injection with ethanol and pinyangmycin under digital fluoroscopic guidance is an effective and safe technique for the treatment of superficial venous malformations, especially for the lesions that are deeply located and ill-defined. (authors)
Wang, Tongfei; Kang, Xiaomin; He, Liying; Liu, Zhilan; Xu, Haijing; Zhao, Aimin
2017-09-01
To establish a statistical model to predict thrombophilia in patients with unexplained recurrent pregnancy loss (URPL). A retrospective case-control study was conducted at Ren Ji Hospital, Shanghai, China, from March 2014 to October 2016. The levels of D-dimer (DD), fibrinogen degradation products (FDP), activated partial thromboplastin time (APTT), prothrombin time (PT), thrombin time (TT), fibrinogen (Fg), and platelet aggregation in response to arachidonic acid (AA) and adenosine diphosphate (ADP) were collected. Receiver operating characteristic curve analysis was used to analyze data from 158 UPRL patients (≥3 previous first trimester pregnancy losses with unexplained etiology) and 131 non-RPL patients (no history of recurrent pregnancy loss). A logistic regression model (LRM) was built and the model was externally validated in another group of patients. The LRM included AA, DD, FDP, TT, APTT, and PT. The overall accuracy of the LRM was 80.9%, with sensitivity and specificity of 78.5% and 78.3%, respectively. The diagnostic threshold of the possibility of the LRM was 0.6492, with a sensitivity of 78.5% and a specificity of 78.3%. Subsequently, the LRM was validated with an overall accuracy of 83.6%. The LRM is a valuable model for prediction of thrombophilia in URPL patients. © 2017 International Federation of Gynecology and Obstetrics.
Development of the statistical ARIMA model: an application for predicting the upcoming of MJO index
Hermawan, Eddy; Nurani Ruchjana, Budi; Setiawan Abdullah, Atje; Gede Nyoman Mindra Jaya, I.; Berliana Sipayung, Sinta; Rustiana, Shailla
2017-10-01
This study is mainly concerned in development one of the most important equatorial atmospheric phenomena that we call as the Madden Julian Oscillation (MJO) which having strong impacts to the extreme rainfall anomalies over the Indonesian Maritime Continent (IMC). In this study, we focused to the big floods over Jakarta and surrounded area that suspecting caused by the impacts of MJO. We concentrated to develop the MJO index using the statistical model that we call as Box-Jenkis (ARIMA) ini 1996, 2002, and 2007, respectively. They are the RMM (Real Multivariate MJO) index as represented by RMM1 and RMM2, respectively. There are some steps to develop that model, starting from identification of data, estimated, determined model, before finally we applied that model for investigation some big floods that occurred at Jakarta in 1996, 2002, and 2007 respectively. We found the best of estimated model for the RMM1 and RMM2 prediction is ARIMA (2,1,2). Detailed steps how that model can be extracted and applying to predict the rainfall anomalies over Jakarta for 3 to 6 months later is discussed at this paper.
International Nuclear Information System (INIS)
Bogachev, Mikhail I; Bunde, Armin; Kireenkov, Igor S; Nifontov, Eugene M
2009-01-01
We study the statistics of return intervals between large heartbeat intervals (above a certain threshold Q) in 24 h records obtained from healthy subjects. We find that both the linear and the nonlinear long-term memory inherent in the heartbeat intervals lead to power-laws in the probability density function P Q (r) of the return intervals. As a consequence, the probability W Q (t; Δt) that at least one large heartbeat interval will occur within the next Δt heartbeat intervals, with an increasing elapsed number of intervals t after the last large heartbeat interval, follows a power-law. Based on these results, we suggest a method of obtaining a priori information about the occurrence of the next large heartbeat interval, and thus to predict it. We show explicitly that the proposed method, which exploits long-term memory, is superior to the conventional precursory pattern recognition technique, which focuses solely on short-term memory. We believe that our results can be straightforwardly extended to obtain more reliable predictions in other physiological signals like blood pressure, as well as in other complex records exhibiting multifractal behaviour, e.g. turbulent flow, precipitation, river flows and network traffic.
Directory of Open Access Journals (Sweden)
Mark V Albert
2012-12-01
Full Text Available Due to multiple factors such as fatigue, muscle strengthening, and neural plasticity, the responsiveness of the motor apparatus to neural commands changes over time. To enable precise movements the nervous system must adapt to compensate for these changes. Recent models of motor adaptation derive from assumptions about the way the motor apparatus changes. Characterizing these changes is difficult because motor adaptation happens at the same time, masking most of the effects of ongoing changes. Here, we analyze eye movements of monkeys with lesions to the posterior cerebellar vermis that impair adaptation. Their fluctuations better reveal the underlying changes of the motor system over time. When these measured, unadapted changes are used to derive optimal motor adaptation rules the prediction precision significantly improves. Among three models that similarly fit single-day adaptation results, the model that also matches the temporal correlations of the nonadapting saccades most accurately predicts multiple day adaptation. Saccadic gain adaptation is well matched to the natural statistics of fluctuations of the oculomotor plant.
A RANS knock model to predict the statistical occurrence of engine knock
International Nuclear Information System (INIS)
D'Adamo, Alessandro; Breda, Sebastiano; Fontanesi, Stefano; Irimescu, Adrian; Merola, Simona Silvia; Tornatore, Cinzia
2017-01-01
Highlights: • Development of a new RANS model for SI engine knock probability. • Turbulence-derived transport equations for variances of mixture fraction and enthalpy. • Gasoline autoignition delay times calculated from detailed chemical kinetics. • Knock probability validated against experiments on optically accessible GDI unit. • PDF-based knock model accounting for the random nature of SI engine knock in RANS simulations. - Abstract: In the recent past engine knock emerged as one of the main limiting aspects for the achievement of higher efficiency targets in modern spark-ignition (SI) engines. To attain these requirements, engine operating points must be moved as close as possible to the onset of abnormal combustions, although the turbulent nature of flow field and SI combustion leads to possibly ample fluctuations between consecutive engine cycles. This forces engine designers to distance the target condition from its theoretical optimum in order to prevent abnormal combustion, which can potentially damage engine components because of few individual heavy-knocking cycles. A statistically based RANS knock model is presented in this study, whose aim is the prediction not only of the ensemble average knock occurrence, poorly meaningful in such a stochastic event, but also of a knock probability. The model is based on look-up tables of autoignition times from detailed chemistry, coupled with transport equations for the variance of mixture fraction and enthalpy. The transported perturbations around the ensemble average value are based on variable gradients and on a local turbulent time scale. A multi-variate cell-based Gaussian-PDF model is proposed for the unburnt mixture, resulting in a statistical distribution for the in-cell reaction rate. An average knock precursor and its variance are independently calculated and transported; this results in the prediction of an earliest knock probability preceding the ensemble average knock onset, as confirmed by
Blashfield, Roger K; Fuller, A Kenneth
2016-06-01
Twenty years ago, slightly after the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition was published, we predicted the characteristics of the future Diagnostic and Statistical Manual of Mental Disorders (fifth edition) (). Included in our predictions were how many diagnoses it would contain, the physical size of the Diagnostic and Statistical Manual of Mental Disorders (fifth edition), who its leader would be, how many professionals would be involved in creating it, the revenue generated, and the color of its cover. This article reports on the accuracy of our predictions. Our largest prediction error concerned financial revenue. The earnings growth of the DSM's has been remarkable. Drug company investments, insurance benefits, the financial need of the American Psychiatric Association, and the research grant process are factors that have stimulated the growth of the DSM's. Restoring order and simplicity to the classification of mental disorders will not be a trivial task.
Mann, Michael E.; Steinman, Byron A.; Miller, Sonya K.; Frankcombe, Leela M.; England, Matthew H.; Cheung, Anson H.
2016-04-01
The temporary slowdown in large-scale surface warming during the early 2000s has been attributed to both external and internal sources of climate variability. Using semiempirical estimates of the internal low-frequency variability component in Northern Hemisphere, Atlantic, and Pacific surface temperatures in concert with statistical hindcast experiments, we investigate whether the slowdown and its recent recovery were predictable. We conclude that the internal variability of the North Pacific, which played a critical role in the slowdown, does not appear to have been predictable using statistical forecast methods. An additional minor contribution from the North Atlantic, by contrast, appears to exhibit some predictability. While our analyses focus on combining semiempirical estimates of internal climatic variability with statistical hindcast experiments, possible implications for initialized model predictions are also discussed.
Ratner, Bruce
2011-01-01
The second edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The first edition, titled Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, contained 17 chapters of innovative and practical statistical data mining techniques. In this second edition, renamed to reflect the increased coverage of machine-learning data mining techniques, the author has
Apel, Heiko; Gafurov, Abror; Gerlitz, Lars; Unger-Shayesteh, Katy; Vorogushyn, Sergiy; Merkushkin, Aleksandr; Merz, Bruno
2016-04-01
The semi-arid regions of Central Asia crucially depend on the water resources supplied by the mountainous areas of the Tien-Shan and Pamirs. During the summer months the snow and glacier melt water of the rivers originating in the mountains provides the only water resource available for agricultural production but also for water collection in reservoirs for energy production in winter months. Thus a reliable seasonal forecast of the water resources is crucial for a sustainable management and planning of water resources.. In fact, seasonal forecasts are mandatory tasks of national hydro-meteorological services in the region. Thus this study aims at a statistical forecast of the seasonal water availability, whereas the focus is put on the usage of freely available data in order to facilitate an operational use without data access limitations. The study takes the Naryn basin as a test case, at which outlet the Toktogul reservoir stores the discharge of the Naryn River. As most of the water originates form snow and glacier melt, a statistical forecast model should use data sets that can serve as proxy data for the snow masses and snow water equivalent in late spring, which essentially determines the bulk of the seasonal discharge. CRU climate data describing the precipitation and temperature in the basin during winter and spring was used as base information, which was complemented by MODIS snow cover data processed through ModSnow tool, discharge during the spring and also GRACE gravimetry anomalies. For the construction of linear forecast models monthly as well as multi-monthly means over the period January to April were used to predict the seasonal mean discharge of May-September at the station Uchterek. An automatic model selection was performed in multiple steps, whereas the best models were selected according to several performance measures and their robustness in a leave-one-out cross validation. It could be shown that the seasonal discharge can be predicted with
Directory of Open Access Journals (Sweden)
Tsafnat Guy
2011-04-01
Full Text Available Abstract Background The identification of drug characteristics is a clinically important task, but it requires much expert knowledge and consumes substantial resources. We have developed a statistical text-mining approach (BInary Characteristics Extractor and biomedical Properties Predictor: BICEPP to help experts screen drugs that may have important clinical characteristics of interest. Results BICEPP first retrieves MEDLINE abstracts containing drug names, then selects tokens that best predict the list of drugs which represents the characteristic of interest. Machine learning is then used to classify drugs using a document frequency-based measure. Evaluation experiments were performed to validate BICEPP's performance on 484 characteristics of 857 drugs, identified from the Australian Medicines Handbook (AMH and the PharmacoKinetic Interaction Screening (PKIS database. Stratified cross-validations revealed that BICEPP was able to classify drugs into all 20 major therapeutic classes (100% and 157 (of 197 minor drug classes (80% with areas under the receiver operating characteristic curve (AUC > 0.80. Similarly, AUC > 0.80 could be obtained in the classification of 173 (of 238 adverse events (73%, up to 12 (of 15 groups of clinically significant cytochrome P450 enzyme (CYP inducers or inhibitors (80%, and up to 11 (of 14 groups of narrow therapeutic index drugs (79%. Interestingly, it was observed that the keywords used to describe a drug characteristic were not necessarily the most predictive ones for the classification task. Conclusions BICEPP has sufficient classification power to automatically distinguish a wide range of clinical properties of drugs. This may be used in pharmacovigilance applications to assist with rapid screening of large drug databases to identify important characteristics for further evaluation.
Predictive statistical modelling of cadmium content in durum wheat grain based on soil parameters.
Viala, Yoann; Laurette, Julien; Denaix, Laurence; Gourdain, Emmanuelle; Méléard, Benoit; Nguyen, Christophe; Schneider, André; Sappin-Didier, Valérie
2017-09-01
Regulatory limits on cadmium (Cd) content in food products are tending to become stricter, especially in cereals, which are a major contributor to dietary intake of Cd by humans. This is of particular importance for durum wheat, which accumulates more Cd than bread wheat. The contamination of durum wheat grain by Cd depends not only on the genotype but also to a large extent on soil Cd availability. Assessing the phytoavailability of Cd for durum wheat is thus crucial, and appropriate methods are required. For this purpose, we propose a statistical model to predict Cd accumulation in durum wheat grain based on soil geochemical properties related to Cd availability in French agricultural soils with low Cd contents and neutral to alkaline pH (soils commonly used to grow durum wheat). The best model is based on the concentration of total Cd in the soil solution, the pH of a soil CaCl 2 extract, the cation exchange capacity (CEC), and the content of manganese oxides (Tamm's extraction) in the soil. The model variables suggest a major influence of cadmium buffering power of the soil and of Cd speciation in solution. The model successfully explains 88% of Cd variability in grains with, generally, below 0.02 mg Cd kg -1 prediction error in wheat grain. Monte Carlo cross-validation indicated that model accuracy will suffice for the European Community project to reduce the regulatory limit from 0.2 to 0.15 mg Cd kg -1 grain, but not for the intermediate step at 0.175 mg Cd kg -1 . The model will help farmers assess the risk that the Cd content of their durum wheat grain will exceed regulatory limits, and help food safety authorities test different regulatory thresholds to find a trade-off between food safety and the negative impact a too strict regulation could have on farmers.
Pedretti, Daniele; Bianchi, Marco
2018-03-01
Breakthrough curves (BTCs) observed during tracer tests in highly heterogeneous aquifers display strong tailing. Power laws are popular models for both the empirical fitting of these curves, and the prediction of transport using upscaling models based on best-fitted estimated parameters (e.g. the power law slope or exponent). The predictive capacity of power law based upscaling models can be however questioned due to the difficulties to link model parameters with the aquifers' physical properties. This work analyzes two aspects that can limit the use of power laws as effective predictive tools: (a) the implication of statistical subsampling, which often renders power laws undistinguishable from other heavily tailed distributions, such as the logarithmic (LOG); (b) the difficulties to reconcile fitting parameters obtained from models with different formulations, such as the presence of a late-time cutoff in the power law model. Two rigorous and systematic stochastic analyses, one based on benchmark distributions and the other on BTCs obtained from transport simulations, are considered. It is found that a power law model without cutoff (PL) results in best-fitted exponents (αPL) falling in the range of typical experimental values reported in the literature (1.5 tailing becomes heavier. Strong fluctuations occur when the number of samples is limited, due to the effects of subsampling. On the other hand, when the power law model embeds a cutoff (PLCO), the best-fitted exponent (αCO) is insensitive to the degree of tailing and to the effects of subsampling and tends to a constant αCO ≈ 1. In the PLCO model, the cutoff rate (λ) is the parameter that fully reproduces the persistence of the tailing and is shown to be inversely correlated to the LOG scale parameter (i.e. with the skewness of the distribution). The theoretical results are consistent with the fitting analysis of a tracer test performed during the MADE-5 experiment. It is shown that a simple
Nacher, Jose C; Ochiai, Tomoshiro
2012-05-01
Increasingly accessible financial data allow researchers to infer market-dynamics-based laws and to propose models that are able to reproduce them. In recent years, several stylized facts have been uncovered. Here we perform an extensive analysis of foreign exchange data that leads to the unveiling of a statistical financial law. First, our findings show that, on average, volatility increases more when the price exceeds the highest (or lowest) value, i.e., breaks the resistance line. We call this the breaking-acceleration effect. Second, our results show that the probability P(T) to break the resistance line in the past time T follows power law in both real data and theoretically simulated data. However, the probability calculated using real data is rather lower than the one obtained using a traditional Black-Scholes (BS) model. Taken together, the present analysis characterizes a different stylized fact of financial markets and shows that the market exceeds a past (historical) extreme price fewer times than expected by the BS model (the resistance effect). However, when the market does, we predict that the average volatility at that time point will be much higher. These findings indicate that any Markovian model does not faithfully capture the market dynamics.
Tarasova, Irina A; Goloborodko, Anton A; Perlova, Tatyana Y; Pridatchenko, Marina L; Gorshkov, Alexander V; Evreinov, Victor V; Ivanov, Alexander R; Gorshkov, Mikhail V
2015-07-07
The theory of critical chromatography for biomacromolecules (BioLCCC) describes polypeptide retention in reversed-phase HPLC using the basic principles of statistical thermodynamics. However, whether this theory correctly depicts a variety of empirical observations and laws introduced for peptide chromatography over the last decades remains to be determined. In this study, by comparing theoretical results with experimental data, we demonstrate that the BioLCCC: (1) fits the empirical dependence of the polypeptide retention on the amino acid sequence length with R(2) > 0.99 and allows in silico determination of the linear regression coefficients of the log-length correction in the additive model for arbitrary sequences and lengths and (2) predicts the distribution coefficients of polypeptides with an accuracy from 0.98 to 0.99 R(2). The latter enables direct calculation of the retention factors for given solvent compositions and modeling of the migration dynamics of polypeptides separated under isocratic or gradient conditions. The obtained results demonstrate that the suggested theory correctly relates the main aspects of polypeptide separation in reversed-phase HPLC.
Xie, Dexuan; Volkmer, Hans W.; Ying, Jinyong
2016-04-01
The nonlocal dielectric approach has led to new models and solvers for predicting electrostatics of proteins (or other biomolecules), but how to validate and compare them remains a challenge. To promote such a study, in this paper, two typical nonlocal dielectric models are revisited. Their analytical solutions are then found in the expressions of simple series for a dielectric sphere containing any number of point charges. As a special case, the analytical solution of the corresponding Poisson dielectric model is also derived in simple series, which significantly improves the well known Kirkwood's double series expansion. Furthermore, a convolution of one nonlocal dielectric solution with a commonly used nonlocal kernel function is obtained, along with the reaction parts of these local and nonlocal solutions. To turn these new series solutions into a valuable research tool, they are programed as a free fortran software package, which can input point charge data directly from a protein data bank file. Consequently, different validation tests can be quickly done on different proteins. Finally, a test example for a protein with 488 atomic charges is reported to demonstrate the differences between the local and nonlocal models as well as the importance of using the reaction parts to develop local and nonlocal dielectric solvers.
International Nuclear Information System (INIS)
Lee, Jae Bong; Park, Jae Hak; Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong
2005-01-01
The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained
Energy Technology Data Exchange (ETDEWEB)
Lee, Jae Bong; Park, Jae Hak [Chungbuk National Univ., Cheongju (Korea, Republic of); Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong [Korea Electtric Power Research Institute, Daejeon (Korea, Republic of)
2005-07-01
The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained.
Díaz, Zuleyka; Segovia, María Jesús; Fernández, José
2005-01-01
Prediction of insurance companies insolvency has arisen as an important problem in the field of financial research. Most methods applied in the past to tackle this issue are traditional statistical techniques which use financial ratios as explicative variables. However, these variables often do not satisfy statistical assumptions, which complicates the application of the mentioned methods. In this paper, a comparative study of the performance of two non-parametric machine learning techniques ...
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2011-01-01
Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Directory of Open Access Journals (Sweden)
Wenbo Duan
2017-12-01
Full Text Available Ultrasonic guided waves are widely used to inspect and monitor the structural integrity of plates and plate-like structures, such as ship hulls and large storage-tank floors. Recently, ultrasonic guided waves have also been used to remove ice and fouling from ship hulls, wind-turbine blades and aeroplane wings. In these applications, the strength of the sound source must be high for scanning a large area, or to break the bond between ice, fouling and plate substrate. More than one transducer may be used to achieve maximum sound power output. However, multiple sources can interact with each other, and form a sound field in the structure with local constructive and destructive regions. Destructive regions are weak regions and shall be avoided. When multiple transducers are used it is important that they are arranged in a particular way so that the desired wave modes can be excited to cover the whole structure. The objective of this paper is to provide a theoretical basis for generating particular wave mode patterns in finite-width rectangular plates whose length is assumed to be infinitely long with respect to its width and thickness. The wave modes have displacements in both width and thickness directions, and are thus different from the classical Lamb-type wave modes. A two-dimensional semi-analytical finite element (SAFE method was used to study dispersion characteristics and mode shapes in the plate up to ultrasonic frequencies. The modal analysis provided information on the generation of modes suitable for a particular application. The number of point sources and direction of loading for the excitation of a few representative modes was investigated. Based on the SAFE analysis, a standard finite element modelling package, Abaqus, was used to excite the designed modes in a three-dimensional plate. The generated wave patterns in Abaqus were then compared with mode shapes predicted in the SAFE model. Good agreement was observed between the
Zielke, Olaf; McDougall, Damon; Mai, Martin; Babuska, Ivo
2014-05-01
Seismic, often augmented with geodetic data, are frequently used to invert for the spatio-temporal evolution of slip along a rupture plane. The resulting images of the slip evolution for a single event, inferred by different research teams, often vary distinctly, depending on the adopted inversion approach and rupture model parameterization. This observation raises the question, which of the provided kinematic source inversion solutions is most reliable and most robust, and — more generally — how accurate are fault parameterization and solution predictions? These issues are not included in "standard" source inversion approaches. Here, we present a statistical inversion approach to constrain kinematic rupture parameters from teleseismic body waves. The approach is based a) on a forward-modeling scheme that computes synthetic (body-)waves for a given kinematic rupture model, and b) on the QUESO (Quantification of Uncertainty for Estimation, Simulation, and Optimization) library that uses MCMC algorithms and Bayes theorem for sample selection. We present Bayesian inversions for rupture parameters in synthetic earthquakes (i.e. for which the exact rupture history is known) in an attempt to identify the cross-over at which further model discretization (spatial and temporal resolution of the parameter space) is no longer attributed to a decreasing misfit. Identification of this cross-over is of importance as it reveals the resolution power of the studied data set (i.e. teleseismic body waves), enabling one to constrain kinematic earthquake rupture histories of real earthquakes at a resolution that is supported by data. In addition, the Bayesian approach allows for mapping complete posterior probability density functions of the desired kinematic source parameters, thus enabling us to rigorously assess the uncertainties in earthquake source inversions.
SU-E-T-205: MLC Predictive Maintenance Using Statistical Process Control Analysis.
Able, C; Hampton, C; Baydush, A; Bright, M
2012-06-01
MLC failure increases accelerator downtime and negatively affects the clinic treatment delivery schedule. This study investigates the use of Statistical Process Control (SPC), a modern quality control methodology, to retrospectively evaluate MLC performance data thereby predicting the impending failure of individual MLC leaves. SPC, a methodology which detects exceptional variability in a process, was used to analyze MLC leaf velocity data. A MLC velocity test is performed weekly on all leaves during morning QA. The leaves sweep 15 cm across the radiation field with the gantry pointing down. The leaf speed is analyzed from the generated dynalog file using quality assurance software. MLC leaf speeds in which a known motor failure occurred (8) and those in which no motor replacement was performed (11) were retrospectively evaluated for a 71 week period. SPC individual and moving range (I/MR) charts were used in the analysis. The I/MR chart limits were calculated using the first twenty weeks of data and set at 3 standard deviations from the mean. The MLCs in which a motor failure occurred followed two general trends: (a) no data indicating a change in leaf speed prior to failure (5 of 8) and (b) a series of data points exceeding the limit prior to motor failure (3 of 8). I/MR charts for a high percentage (8 of 11) of the non-replaced MLC motors indicated that only a single point exceeded the limit. These single point excesses were deemed false positives. SPC analysis using MLC performance data may be helpful in detecting a significant percentage of impending failures of MLC motors. The ability to detect MLC failure may depend on the method of failure (i.e. gradual or catastrophic). Further study is needed to determine if increasing the sampling frequency could increase reliability. Project was support by a grant from Varian Medical Systems, Inc. © 2012 American Association of Physicists in Medicine.
2017-09-01
application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , ) (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations
Directory of Open Access Journals (Sweden)
M.M. Mohie El-Din
2011-10-01
Full Text Available In this paper, two sample Bayesian prediction intervals for order statistics (OS are obtained. This prediction is based on a certain class of the inverse exponential-type distributions using a right censored sample. A general class of prior density functions is used and the predictive cumulative function is obtained in the two samples case. The class of the inverse exponential-type distributions includes several important distributions such the inverse Weibull distribution, the inverse Burr distribution, the loglogistic distribution, the inverse Pareto distribution and the inverse paralogistic distribution. Special cases of the inverse Weibull model such as the inverse exponential model and the inverse Rayleigh model are considered.
Bustamante, Javier; Seoane, Javier
2004-01-01
Aim To test the effectiveness of statistical models based on explanatory environmental variables vs. existing distribution information (maps and breeding atlas), for predicting the distribution of four species of raptors (family Accipitridae): common buzzard Buteo buteo (Linnaeus, 1758), short-toed eagle Circaetus gallicus (Gmelin, 1788), booted eagle Hieraaetus pennatus (Gmelin, 1788) and black kite Milvus migrans (Boddaert, 1783). Location Andalusia, southe...
Grosveld, Ferdinand W.
1990-01-01
The feasibility of predicting interior noise due to random acoustic or turbulent boundary layer excitation was investigated in experiments in which a statistical energy analysis model (VAPEPS) was used to analyze measurements of the acceleration response and sound transmission of flat aluminum, lucite, and graphite/epoxy plates exposed to random acoustic or turbulent boundary layer excitation. The noise reduction of the plate, when backed by a shallow cavity and excited by a turbulent boundary layer, was predicted using a simplified theory based on the assumption of adiabatic compression of the fluid in the cavity. The predicted plate acceleration response was used as input in the noise reduction prediction. Reasonable agreement was found between the predictions and the measured noise reduction in the frequency range 315-1000 Hz.
Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu
2015-05-27
Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.
Walz, M. A.; Donat, M.; Leckebusch, G. C.
2017-12-01
As extreme wind speeds are responsible for large socio-economic losses in Europe, a skillful prediction would be of great benefit for disaster prevention as well as for the actuarial community. Here we evaluate patterns of large-scale atmospheric variability and the seasonal predictability of extreme wind speeds (e.g. >95th percentile) in the European domain in the dynamical seasonal forecast system ECMWF System 4, and compare to the predictability based on a statistical prediction model. The dominant patterns of atmospheric variability show distinct differences between reanalysis and ECMWF System 4, with most patterns in System 4 extended downstream in comparison to ERA-Interim. The dissimilar manifestations of the patterns within the two models lead to substantially different drivers associated with the occurrence of extreme winds in the respective model. While the ECMWF System 4 is shown to provide some predictive power over Scandinavia and the eastern Atlantic, only very few grid cells in the European domain have significant correlations for extreme wind speeds in System 4 compared to ERA-Interim. In contrast, a statistical model predicts extreme wind speeds during boreal winter in better agreement with the observations. Our results suggest that System 4 does not seem to capture the potential predictability of extreme winds that exists in the real world, and therefore fails to provide reliable seasonal predictions for lead months 2-4. This is likely related to the unrealistic representation of large-scale patterns of atmospheric variability. Hence our study points to potential improvements of dynamical prediction skill by improving the simulation of large-scale atmospheric dynamics.
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predicted power and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and were comprised of the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base-rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
Karacan, C. Özgen; Olea, Ricardo A.
2013-01-01
Coal seam degasification and its success are important for controlling methane, and thus for the health and safety of coal miners. During the course of degasification, properties of coal seams change. Thus, the changes in coal reservoir conditions and in-place gas content as well as methane emission potential into mines should be evaluated by examining time-dependent changes and the presence of major heterogeneities and geological discontinuities in the field. In this work, time-lapsed reservoir and fluid storage properties of the New Castle coal seam, Mary Lee/Blue Creek seam, and Jagger seam of Black Warrior Basin, Alabama, were determined from gas and water production history matching and production forecasting of vertical degasification wellbores. These properties were combined with isotherm and other important data to compute gas-in-place (GIP) and its change with time at borehole locations. Time-lapsed training images (TIs) of GIP and GIP difference corresponding to each coal and date were generated by using these point-wise data and Voronoi decomposition on the TI grid, which included faults as discontinuities for expansion of Voronoi regions. Filter-based multiple-point geostatistical simulations, which were preferred in this study due to anisotropies and discontinuities in the area, were used to predict time-lapsed GIP distributions within the study area. Performed simulations were used for mapping spatial time-lapsed methane quantities as well as their uncertainties within the study area.
Chemical agnostic hazard prediction: Statistical inference of toxicity pathways - data for Figure 2
U.S. Environmental Protection Agency — This dataset comprises one SigmaPlot 13 file containing measured survival data and survival data predicted from the model coefficients selected by the LASSO...
Basic Mathematics Test Predicts Statistics Achievement and Overall First Year Academic Success
Fonteyne, Lot; De Fruyt, Filip; Dewulf, Nele; Duyck, Wouter; Erauw, Kris; Goeminne, Katy; Lammertyn, Jan; Marchant, Thierry; Moerkerke, Beatrijs; Oosterlinck, Tom; Rosseel, Yves
2015-01-01
In the psychology and educational science programs at Ghent University, only 36.1% of the new incoming students in 2011 and 2012 passed all exams. Despite availability of information, many students underestimate the scientific character of social science programs. Statistics courses are a major obstacle in this matter. Not all enrolling students…
The Rhetoric of Investment Theory : The Story of Statistics and Predictability
T. Pistorius (Thomas)
2016-01-01
markdownabstractUncertainty is a feeling of anxiety and a part of culture since the dawn of civilization. Civilizations have invented numerous ways to cope with uncertainty, statistics is one of those technologies. The rhetoric as the discourse of investment theory uncovers that the theory of
Predicting uncertainty in future marine ice sheet volume using Bayesian statistical methods
Davis, A. D.
2015-12-01
The marine ice instability can trigger rapid retreat of marine ice streams. Recent observations suggest that marine ice systems in West Antarctica have begun retreating. However, unknown ice dynamics, computationally intensive mathematical models, and uncertain parameters in these models make predicting retreat rate and ice volume difficult. In this work, we fuse current observational data with ice stream/shelf models to develop probabilistic predictions of future grounded ice sheet volume. Given observational data (e.g., thickness, surface elevation, and velocity) and a forward model that relates uncertain parameters (e.g., basal friction and basal topography) to these observations, we use a Bayesian framework to define a posterior distribution over the parameters. A stochastic predictive model then propagates uncertainties in these parameters to uncertainty in a particular quantity of interest (QoI)---here, the volume of grounded ice at a specified future time. While the Bayesian approach can in principle characterize the posterior predictive distribution of the QoI, the computational cost of both the forward and predictive models makes this effort prohibitively expensive. To tackle this challenge, we introduce a new Markov chain Monte Carlo method that constructs convergent approximations of the QoI target density in an online fashion, yielding accurate characterizations of future ice sheet volume at significantly reduced computational cost.Our second goal is to attribute uncertainty in these Bayesian predictions to uncertainties in particular parameters. Doing so can help target data collection, for the purpose of constraining the parameters that contribute most strongly to uncertainty in the future volume of grounded ice. For instance, smaller uncertainties in parameters to which the QoI is highly sensitive may account for more variability in the prediction than larger uncertainties in parameters to which the QoI is less sensitive. We use global sensitivity
Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu
2015-09-01
Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to Weibull distribution, we discovered Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment were discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Predicting the lung compliance of mechanically ventilated patients via statistical modeling
International Nuclear Information System (INIS)
Ganzert, Steven; Kramer, Stefan; Guttmann, Josef
2012-01-01
To avoid ventilator associated lung injury (VALI) during mechanical ventilation, the ventilator is adjusted with reference to the volume distensibility or ‘compliance’ of the lung. For lung-protective ventilation, the lung should be inflated at its maximum compliance, i.e. when during inspiration a maximal intrapulmonary volume change is achieved by a minimal change of pressure. To accomplish this, one of the main parameters is the adjusted positive end-expiratory pressure (PEEP). As changing the ventilator settings usually produces an effect on patient's lung mechanics with a considerable time delay, the prediction of the compliance change associated with a planned change of PEEP could assist the physician at the bedside. This study introduces a machine learning approach to predict the nonlinear lung compliance for the individual patient by Gaussian processes, a probabilistic modeling technique. Experiments are based on time series data obtained from patients suffering from acute respiratory distress syndrome (ARDS). With a high hit ratio of up to 93%, the learned models could predict whether an increase/decrease of PEEP would lead to an increase/decrease of the compliance. However, the prediction of the complete pressure–volume relation for an individual patient has to be improved. We conclude that the approach is well suitable for the given problem domain but that an individualized feature selection should be applied for a precise prediction of individual pressure–volume curves. (paper)
DEFF Research Database (Denmark)
Rosthøj, Susanne; Keiding, Niels
2004-01-01
When studying a regression model measures of explained variation are used to assess the degree to which the covariates determine the outcome of interest. Measures of predictive accuracy are used to assess the accuracy of the predictions based on the covariates and the regression model. We give a ...... a detailed and general introduction to the two measures and the estimation procedures. The framework we set up allows for a study of the effect of misspecification on the quantities estimated. We also introduce a generalization to survival analysis....
Kovalevsky, Louis; Langley, Robin S.; Caro, Stephane
2016-05-01
Due to the high cost of experimental EMI measurements significant attention has been focused on numerical simulation. Classical methods such as Method of Moment or Finite Difference Time Domain are not well suited for this type of problem, as they require a fine discretisation of space and failed to take into account uncertainties. In this paper, the authors show that the Statistical Energy Analysis is well suited for this type of application. The SEA is a statistical approach employed to solve high frequency problems of electromagnetically reverberant cavities at a reduced computational cost. The key aspects of this approach are (i) to consider an ensemble of system that share the same gross parameter, and (ii) to avoid solving Maxwell's equations inside the cavity, using the power balance principle. The output is an estimate of the field magnitude distribution in each cavity. The method is applied on a typical aircraft structure.
A statistical framework for evaluating neural networks to predict recurrent events in breast cancer
Gorunescu, Florin; Gorunescu, Marina; El-Darzi, Elia; Gorunescu, Smaranda
2010-07-01
Breast cancer is the second leading cause of cancer deaths in women today. Sometimes, breast cancer can return after primary treatment. A medical diagnosis of recurrent cancer is often a more challenging task than the initial one. In this paper, we investigate the potential contribution of neural networks (NNs) to support health professionals in diagnosing such events. The NN algorithms are tested and applied to two different datasets. An extensive statistical analysis has been performed to verify our experiments. The results show that a simple network structure for both the multi-layer perceptron and radial basis function can produce equally good results, not all attributes are needed to train these algorithms and, finally, the classification performances of all algorithms are statistically robust. Moreover, we have shown that the best performing algorithm will strongly depend on the features of the datasets, and hence, there is not necessarily a single best classifier.
Pokhrel, Prafulla; Wang, Q. J.; Robertson, David E.
2013-10-01
Seasonal streamflow forecasts are valuable for planning and allocation of water resources. In Australia, the Bureau of Meteorology employs a statistical method to forecast seasonal streamflows. The method uses predictors that are related to catchment wetness at the start of a forecast period and to climate during the forecast period. For the latter, a predictor is selected among a number of lagged climate indices as candidates to give the "best" model in terms of model performance in cross validation. This study investigates two strategies for further improvement in seasonal streamflow forecasts. The first is to combine, through Bayesian model averaging, multiple candidate models with different lagged climate indices as predictors, to take advantage of different predictive strengths of the multiple models. The second strategy is to introduce additional candidate models, using rainfall and sea surface temperature predictions from a global climate model as predictors. This is to take advantage of the direct simulations of various dynamic processes. The results show that combining forecasts from multiple statistical models generally yields more skillful forecasts than using only the best model and appears to moderate the worst forecast errors. The use of rainfall predictions from the dynamical climate model marginally improves the streamflow forecasts when viewed over all the study catchments and seasons, but the use of sea surface temperature predictions provide little additional benefit.
Directory of Open Access Journals (Sweden)
Jan Ketil Arnulf
Full Text Available Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60-86% of the variance in the response patterns and allowed remarkably precise prediction of survey responses from humans, except in a personality test. Even the relationships between independent and their purported dependent variables were accurately predicted. This raises concern about the empirical nature of data collected through some surveys if results are already given a priori through the way subjects are being asked. Survey response patterns seem heavily determined by semantics. Language algorithms may suggest these prior to administering a survey. This study suggests that semantic algorithms are becoming new tools for the social sciences, opening perspectives on survey responses that prevalent psychometric theory cannot explain.
Nikolopoulos, E. I.; Destro, E.; Bhuiyan, M. A. E.; Borga, M., Sr.; Anagnostou, E. N.
2017-12-01
Fire disasters affect modern societies at global scale inducing significant economic losses and human casualties. In addition to their direct impacts they have various adverse effects on hydrologic and geomorphologic processes of a region due to the tremendous alteration of the landscape characteristics (vegetation, soil properties etc). As a consequence, wildfires often initiate a cascade of hazards such as flash floods and debris flows that usually follow the occurrence of a wildfire thus magnifying the overall impact in a region. Post-fire debris flows (PFDF) is one such type of hazards frequently occurring in Western United States where wildfires are a common natural disaster. Prediction of PDFD is therefore of high importance in this region and over the last years a number of efforts from United States Geological Survey (USGS) and National Weather Service (NWS) have been focused on the development of early warning systems that will help mitigate PFDF risk. This work proposes a prediction framework that is based on a nonparametric statistical technique (random forests) that allows predicting the occurrence of PFDF at regional scale with a higher degree of accuracy than the commonly used approaches that are based on power-law thresholds and logistic regression procedures. The work presented is based on a recently released database from USGS that reports a total of 1500 storms that triggered and did not trigger PFDF in a number of fire affected catchments in Western United States. The database includes information on storm characteristics (duration, accumulation, max intensity etc) and other auxiliary information of land surface properties (soil erodibility index, local slope etc). Results show that the proposed model is able to achieve a satisfactory prediction accuracy (threat score > 0.6) superior of previously published prediction frameworks highlighting the potential of nonparametric statistical techniques for development of PFDF prediction systems.
Singh, Sarvesh Kumar; Rani, Raj
2015-10-01
The study addresses the identification of multiple point sources, emitting the same tracer, from their limited set of merged concentration measurements. The identification, here, refers to the estimation of locations and strengths of a known number of simultaneous point releases. The source-receptor relationship is described in the framework of adjoint modelling by using an analytical Gaussian dispersion model. A least-squares minimization framework, free from an initialization of the release parameters (locations and strengths), is presented to estimate the release parameters. This utilizes the distributed source information observable from the given monitoring design and number of measurements. The technique leads to an exact retrieval of the true release parameters when measurements are noise free and exactly described by the dispersion model. The inversion algorithm is evaluated using the real data from multiple (two, three and four) releases conducted during Fusion Field Trials in September 2007 at Dugway Proving Ground, Utah. The release locations are retrieved, on average, within 25-45 m of the true sources with the distance from retrieved to true source ranging from 0 to 130 m. The release strengths are also estimated within a factor of three to the true release rates. The average deviations in retrieval of source locations are observed relatively large in two release trials in comparison to three and four release trials.
DEFF Research Database (Denmark)
Kelly, Mark C.; Barlas, Emre; Sogachev, Andrey
2018-01-01
Here we provide statistical low-order characterization of noise propagation from a single wind turbine, as affected by mutually interacting turbine wake and environmental conditions. This is accomplished via a probabilistic model, applied to an ensemble of atmospheric conditions based upon......; the latter solves Reynolds-Averaged Navier-Stokes equations of momentum and temperature, including the effects of stability and the ABL depth, along with the drag due to the wind turbine. Sound levels are found to be highest downwind for modestly stable conditions not atypical of mid-latitude climates...
Statistics-Based Prediction Analysis for Head and Neck Cancer Tumor Deformation
Directory of Open Access Journals (Sweden)
Maryam Azimi
2012-01-01
Full Text Available Most of the current radiation therapy planning systems, which are based on pre-treatment Computer Tomography (CT images, assume that the tumor geometry does not change during the course of treatment. However, tumor geometry is shown to be changing over time. We propose a methodology to monitor and predict daily size changes of head and neck cancer tumors during the entire radiation therapy period. Using collected patients' CT scan data, MATLAB routines are developed to quantify the progressive geometric changes occurring in patients during radiation therapy. Regression analysis is implemented to develop predictive models for tumor size changes through entire period. The generated models are validated using leave-one-out cross validation. The proposed method will increase the accuracy of therapy and improve patient's safety and quality of life by reducing the number of harmful unnecessary CT scans.
Energy Technology Data Exchange (ETDEWEB)
Kamp, Derek van der [University of Victoria, Pacific Climate Impacts Consortium, Victoria, BC (Canada); University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada); Curry, Charles L. [Environment Canada University of Victoria, Canadian Centre for Climate Modelling and Analysis, Victoria, BC (Canada); University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada); Monahan, Adam H. [University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada)
2012-04-15
A regression-based downscaling technique was applied to monthly mean surface wind observations from stations throughout western Canada as well as from buoys in the Northeast Pacific Ocean over the period 1979-2006. A predictor set was developed from principal component analysis of the three wind components at 500 hPa and mean sea-level pressure taken from the NCEP Reanalysis II. Building on the results of a companion paper, Curry et al. (Clim Dyn 2011), the downscaling was applied to both wind speed and wind components, in an effort to evaluate the utility of each type of predictand. Cross-validated prediction skill varied strongly with season, with autumn and summer displaying the highest and lowest skill, respectively. In most cases wind components were predicted with better skill than wind speeds. The predictive ability of wind components was found to be strongly related to their orientation. Wind components with the best predictions were often oriented along topographically significant features such as constricted valleys, mountain ranges or ocean channels. This influence of directionality on predictive ability is most prominent during autumn and winter at inland sites with complex topography. Stations in regions with relatively flat terrain (where topographic steering is minimal) exhibit inter-station consistencies including region-wide seasonal shifts in the direction of the best predicted wind component. The conclusion that wind components can be skillfully predicted only over a limited range of directions at most stations limits the scope of statistically downscaled wind speed predictions. It seems likely that such limitations apply to other regions of complex terrain as well. (orig.)
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network.
Yu, Ying; Wang, Yirui; Gao, Shangce; Tang, Zheng
2017-01-01
With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model) is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model) to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient.
International Nuclear Information System (INIS)
Grotch, S.L.
1991-01-01
This study is a detailed intercomparison of the results produced by four general circulation models (GCMs) that have been used to estimate the climatic consequences of a doubling of the CO 2 concentration. Two variables, surface air temperature and precipitation, annually and seasonally averaged, are compared for both the current climate and for the predicted equilibrium changes after a doubling of the atmospheric CO 2 concentration. The major question considered here is: how well do the predictions from different GCMs agree with each other and with historical climatology over different areal extents, from the global scale down to the range of only several gridpoints? Although the models often agree well when estimating averages over large areas, substantial disagreements become apparent as the spatial scale is reduced. At scales below continental, the correlations observed between different model predictions are often very poor. The implications of this work for investigation of climatic impacts on a regional scale are profound. For these two important variables, at least, the poor agreement between model simulations of the current climate on the regional scale calls into question the ability of these models to quantitatively estimate future climatic change on anything approaching the scale of a few (< 10) gridpoints, which is essential if these results are to be used in meaningful resource-assessment studies. A stronger cooperative effort among the different modeling groups will be necessary to assure that we are getting better agreement for the right reasons, a prerequisite for improving confidence in model projections. 11 refs.; 10 figs
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network
Directory of Open Access Journals (Sweden)
Ying Yu
2017-01-01
Full Text Available With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient.
International Nuclear Information System (INIS)
Grotch, S.L.
1990-01-01
This study is a detailed intercomparison of the results produced by four general circulation models (GCMs) that have been used to estimate the climatic consequences of a doubling of the CO 2 concentration. Two variables, surface air temperature and precipitation, annually and seasonally averaged, are compared for both the current climate and for the predicted equilibrium changes after a doubling of the atmospheric CO 2 concentration. The major question considered here is: how well do the predictions from different GCMs agree with each other and with historical climatology over different areal extents, from the global scale down to the range of only several gridpoints? Although the models often agree well when estimating averages over large areas, substantial disagreements become apparent as the spatial scale is reduced. At scales below continental, the correlations observed between different model predictions are often very poor. The implications of this work for investigation of climatic impacts on a regional scale are profound. For these two important variables, at least, the poor agreement between model simulations of the current climate on the regional scale calls into question the ability of these models to quantitatively estimate future climatic change on anything approaching the scale of a few (< 10) gridpoints, which is essential if these results are to be used in meaningful resource-assessment studies. A stronger cooperative effort among the different modeling groups will be necessary to assure that we are getting better agreement for the right reasons, a prerequisite for improving confidence in model projections
Godoy-Lorite, Antonia; Guimerà, Roger; Sales-Pardo, Marta
2016-01-01
In social networks, individuals constantly drop ties and replace them by new ones in a highly unpredictable fashion. This highly dynamical nature of social ties has important implications for processes such as the spread of information or of epidemics. Several studies have demonstrated the influence of a number of factors on the intricate microscopic process of tie replacement, but the macroscopic long-term effects of such changes remain largely unexplored. Here we investigate whether, despite the inherent randomness at the microscopic level, there are macroscopic statistical regularities in the long-term evolution of social networks. In particular, we analyze the email network of a large organization with over 1,000 individuals throughout four consecutive years. We find that, although the evolution of individual ties is highly unpredictable, the macro-evolution of social communication networks follows well-defined statistical patterns, characterized by exponentially decaying log-variations of the weight of social ties and of individuals' social strength. At the same time, we find that individuals have social signatures and communication strategies that are remarkably stable over the scale of several years.
International Nuclear Information System (INIS)
Heinrich, S.
2006-01-01
Nucleus fission process is a very complex phenomenon and, even nowadays, no realistic models describing the overall process are available. The work presented here deals with a theoretical description of fission fragments distributions in mass, charge, energy and deformation. We have reconsidered and updated the B.D. Wilking Scission Point model. Our purpose was to test if this statistic model applied at the scission point and by introducing new results of modern microscopic calculations allows to describe quantitatively the fission fragments distributions. We calculate the surface energy available at the scission point as a function of the fragments deformations. This surface is obtained from a Hartree Fock Bogoliubov microscopic calculation which guarantee a realistic description of the potential dependence on the deformation for each fragment. The statistic balance is described by the level densities of the fragment. We have tried to avoid as much as possible the input of empirical parameters in the model. Our only parameter, the distance between each fragment at the scission point, is discussed by comparison with scission configuration obtained from full dynamical microscopic calculations. Also, the comparison between our results and experimental data is very satisfying and allow us to discuss the success and limitations of our approach. We finally proposed ideas to improve the model, in particular by applying dynamical corrections. (author)
Energy Technology Data Exchange (ETDEWEB)
Duran-Lobato, Matilde, E-mail: mduran@us.es [Universidad de Sevilla, Dpto. Farmacia y Tecnologia Farmaceutica, Facultad de Farmacia (Espana) (Spain); Enguix-Gonzalez, Alicia [Universidad de Sevilla, Dpto. Estadistica e Investigacion Operativa, Facultad de Matematicas (Espana) (Spain); Fernandez-Arevalo, Mercedes; Martin-Banderas, Lucia [Universidad de Sevilla, Dpto. Farmacia y Tecnologia Farmaceutica, Facultad de Farmacia (Espana) (Spain)
2013-02-15
Lipid nanoparticles (LNPs) are a promising carrier for all administration routes due to their safety, small size, and high loading of lipophilic compounds. Among the LNP production techniques, the easy scale-up, lack of organic solvents, and short production times of the high-pressure homogenization technique (HPH) make this method stand out. In this study, a statistical analysis was applied to the production of LNP by HPH. Spherical LNPs with mean size ranging from 65 nm to 11.623 {mu}m, negative zeta potential under -30 mV, and smooth surface were produced. Manageable equations based on commonly used parameters in the pharmaceutical field were obtained. The lipid to emulsifier ratio (R{sub L/S}) was proved to statistically explain the influence of oil phase and surfactant concentration on final nanoparticles size. Besides, the homogenization pressure was found to ultimately determine LNP size for a given R{sub L/S}, while the number of passes applied mainly determined polydispersion. {alpha}-Tocopherol was used as a model drug to illustrate release properties of LNP as a function of particle size, which was optimized by the regression models. This study is intended as a first step to optimize production conditions prior to LNP production at both laboratory and industrial scale from an eminently practical approach, based on parameters extensively used in formulation.
Energy Technology Data Exchange (ETDEWEB)
Daisuke Onozuka; Akihito Hagihara [Fukuoka Institute of Health and Environmental Sciences, Fukuoka (Japan). Department of Information Science
2007-07-01
Tuberculosis (TB) has reemerged as a global public health epidemic in recent years. Although evaluating local disease clusters leads to effective prevention and control of TB, there are few, if any, spatiotemporal comparisons for epidemic diseases. TB cases among residents in Fukuoka Prefecture between 1999 and 2004 (n = 9,119) were geocoded at the census tract level (n = 109) based on residence at the time of diagnosis. The spatial and space-time scan statistics were then used to identify clusters of census tracts with elevated proportions of TB cases. In the purely spatial analyses, the most likely clusters were in the Chikuho coal mining area (in 1999, 2002, 2003, 2004), the Kita-Kyushu industrial area (in 2000), and the Fukuoka urban area (in 2001). In the space-time analysis, the most likely cluster was the Kita-Kyushu industrial area (in 2000). The north part of Fukuoka Prefecture was the most likely to have a cluster with a significantly high occurrence of TB. The spatial and space-time scan statistics are effective ways of describing circular disease clusters. Since, in reality, infectious diseases might form other cluster types, the effectiveness of the method may be limited under actual practice. The sophistication of the analytical methodology, however, is a topic for future study. 48 refs., 3 figs., 3 tabs.
Directory of Open Access Journals (Sweden)
Onozuka Daisuke
2007-04-01
Full Text Available Abstract Background Tuberculosis (TB has reemerged as a global public health epidemic in recent years. Although evaluating local disease clusters leads to effective prevention and control of TB, there are few, if any, spatiotemporal comparisons for epidemic diseases. Methods TB cases among residents in Fukuoka Prefecture between 1999 and 2004 (n = 9,119 were geocoded at the census tract level (n = 109 based on residence at the time of diagnosis. The spatial and space-time scan statistics were then used to identify clusters of census tracts with elevated proportions of TB cases. Results In the purely spatial analyses, the most likely clusters were in the Chikuho coal mining area (in 1999, 2002, 2003, 2004, the Kita-Kyushu industrial area (in 2000, and the Fukuoka urban area (in 2001. In the space-time analysis, the most likely cluster was the Kita-Kyushu industrial area (in 2000. The north part of Fukuoka Prefecture was the most likely to have a cluster with a significantly high occurrence of TB. Conclusion The spatial and space-time scan statistics are effective ways of describing circular disease clusters. Since, in reality, infectious diseases might form other cluster types, the effectiveness of the method may be limited under actual practice. The sophistication of the analytical methodology, however, is a topic for future study.
International Nuclear Information System (INIS)
Zahedi, Gholamreza; Karami, Zohre; Yaghoobi, Hamed
2009-01-01
In this study, various estimation methods have been reviewed for hydrate formation temperature (HFT) and two procedures have been presented. In the first method, two general correlations have been proposed for HFT. One of the correlations has 11 parameters, and the second one has 18 parameters. In order to obtain constants in proposed equations, 203 experimental data points have been collected from literatures. The Engineering Equation Solver (EES) and Statistical Package for the Social Sciences (SPSS) soft wares have been employed for statistical analysis of the data. Accuracy of the obtained correlations also has been declared by comparison with experimental data and some recent common used correlations. In the second method, HFT is estimated by artificial neural network (ANN) approach. In this case, various architectures have been checked using 70% of experimental data for training of ANN. Among the various architectures multi layer perceptron (MLP) network with trainlm training algorithm was found as the best architecture. Comparing the obtained ANN model results with 30% of unseen data confirms ANN excellent estimation performance. It was found that ANN is more accurate than traditional methods and even our two proposed correlations for HFT estimation.
Breast cancer statistics and prediction methodology: a systematic review and analysis.
Dubey, Ashutosh Kumar; Gupta, Umesh; Jain, Sonal
2015-01-01
Breast cancer is a menacing cancer, primarily affecting women. Continuous research is going on for detecting breast cancer in the early stage as the possibility of cure in early stages is bright. There are two main objectives of this current study, first establish statistics for breast cancer and second to find methodologies which can be helpful in the early stage detection of the breast cancer based on previous studies. The breast cancer statistics for incidence and mortality of the UK, US, India and Egypt were considered for this study. The finding of this study proved that the overall mortality rates of the UK and US have been improved because of awareness, improved medical technology and screening, but in case of India and Egypt the condition is less positive because of lack of awareness. The methodological findings of this study suggest a combined framework based on data mining and evolutionary algorithms. It provides a strong bridge in improving the classification and detection accuracy of breast cancer data.
International Nuclear Information System (INIS)
Durán-Lobato, Matilde; Enguix-González, Alicia; Fernández-Arévalo, Mercedes; Martín-Banderas, Lucía
2013-01-01
Lipid nanoparticles (LNPs) are a promising carrier for all administration routes due to their safety, small size, and high loading of lipophilic compounds. Among the LNP production techniques, the easy scale-up, lack of organic solvents, and short production times of the high-pressure homogenization technique (HPH) make this method stand out. In this study, a statistical analysis was applied to the production of LNP by HPH. Spherical LNPs with mean size ranging from 65 nm to 11.623 μm, negative zeta potential under –30 mV, and smooth surface were produced. Manageable equations based on commonly used parameters in the pharmaceutical field were obtained. The lipid to emulsifier ratio (R L/S ) was proved to statistically explain the influence of oil phase and surfactant concentration on final nanoparticles size. Besides, the homogenization pressure was found to ultimately determine LNP size for a given R L/S , while the number of passes applied mainly determined polydispersion. α-Tocopherol was used as a model drug to illustrate release properties of LNP as a function of particle size, which was optimized by the regression models. This study is intended as a first step to optimize production conditions prior to LNP production at both laboratory and industrial scale from an eminently practical approach, based on parameters extensively used in formulation.
Agur, Zvia; Elishmereni, Moran; Kheifetz, Yuri
2014-01-01
Despite its great promise, personalized oncology still faces many hurdles, and it is increasingly clear that targeted drugs and molecular biomarkers alone yield only modest clinical benefit. One reason is the complex relationships between biomarkers and the patient's response to drugs, obscuring the true weight of the biomarkers in the overall patient's response. This complexity can be disentangled by computational models that integrate the effects of personal biomarkers into a simulator of drug-patient dynamic interactions, for predicting the clinical outcomes. Several computational tools have been developed for personalized oncology, notably evidence-based tools for simulating pharmacokinetics, Bayesian-estimated tools for predicting survival, etc. We describe representative statistical and mathematical tools, and discuss their merits, shortcomings and preliminary clinical validation attesting to their potential. Yet, the individualization power of mathematical models alone, or statistical models alone, is limited. More accurate and versatile personalization tools can be constructed by a new application of the statistical/mathematical nonlinear mixed effects modeling (NLMEM) approach, which until recently has been used only in drug development. Using these advanced tools, clinical data from patient populations can be integrated with mechanistic models of disease and physiology, for generating personal mathematical models. Upon a more substantial validation in the clinic, this approach will hopefully be applied in personalized clinical trials, P-trials, hence aiding the establishment of personalized medicine within the main stream of clinical oncology. © 2014 Wiley Periodicals, Inc.
International Nuclear Information System (INIS)
Sun, Yanan; Dong, Jizhe; Ding, Lijuan
2017-01-01
Highlights: • A day–ahead wind–thermal unit commitment model is presented. • Wind speed transfer matrix is formed to depict the sequential wind features. • Spinning reserve setting considering wind power accuracy and variation is proposed. • Verified study is performed to check the correctness of the program. - Abstract: The increasing penetration of intermittent wind power affects the secure operation of power systems and leads to a requirement of robust and economic generation scheduling. This paper presents an optimal day–ahead wind–thermal generation scheduling method that considers the statistical and predicted features of wind speeds. In this method, the statistical analysis of historical wind data, which represents the local wind regime, is first implemented. Then, according to the statistical results and the predicted wind power, the spinning reserve requirements for the scheduling period are calculated. Based on the calculated spinning reserve requirements, the wind–thermal generation scheduling is finally conducted. To validate the program, a verified study is performed on a test system. Then, numerical studies to demonstrate the effectiveness of the proposed method are conducted.
Energy Technology Data Exchange (ETDEWEB)
Granderson, Jessica [Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Energy Technologies Area Div.; Price, Phillip N. [Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Energy Technologies Area Div.
2014-03-01
This paper documents the development and application of a general statistical methodology to assess the accuracy of baseline energy models, focusing on its application to Measurement and Verification (M&V) of whole-building energy savings. The methodology complements the principles addressed in resources such as ASHRAE Guideline 14 and the International Performance Measurement and Verification Protocol. It requires fitting a baseline model to data from a ``training period’’ and using the model to predict total electricity consumption during a subsequent ``prediction period.’’ We illustrate the methodology by evaluating five baseline models using data from 29 buildings. The training period and prediction period were varied, and model predictions of daily, weekly, and monthly energy consumption were compared to meter data to determine model accuracy. Several metrics were used to characterize the accuracy of the predictions, and in some cases the best-performing model as judged by one metric was not the best performer when judged by another metric.
A statistical model to predict total column ozone in Peninsular Malaysia
Tan, K. C.; Lim, H. S.; Mat Jafri, M. Z.
2016-03-01
This study aims to predict monthly columnar ozone in Peninsular Malaysia based on concentrations of several atmospheric gases. Data pertaining to five atmospheric gases (CO2, O3, CH4, NO2, and H2O vapor) were retrieved by satellite scanning imaging absorption spectrometry for atmospheric chartography from 2003 to 2008 and used to develop a model to predict columnar ozone in Peninsular Malaysia. Analyses of the northeast monsoon (NEM) and the southwest monsoon (SWM) seasons were conducted separately. Based on the Pearson correlation matrices, columnar ozone was negatively correlated with H2O vapor but positively correlated with CO2 and NO2 during both the NEM and SWM seasons from 2003 to 2008. This result was expected because NO2 is a precursor of ozone. Therefore, an increase in columnar ozone concentration is associated with an increase in NO2 but a decrease in H2O vapor. In the NEM season, columnar ozone was negatively correlated with H2O (-0.847), NO2 (0.754), and CO2 (0.477); columnar ozone was also negatively but weakly correlated with CH4 (-0.035). In the SWM season, columnar ozone was highly positively correlated with NO2 (0.855), CO2 (0.572), and CH4 (0.321) and also highly negatively correlated with H2O (-0.832). Both multiple regression and principal component analyses were used to predict the columnar ozone value in Peninsular Malaysia. We obtained the best-fitting regression equations for the columnar ozone data using four independent variables. Our results show approximately the same R value (≈ 0.83) for both the NEM and SWM seasons.
Overview of statistical methods, models and analysis for predicting equipment end of life
Energy Technology Data Exchange (ETDEWEB)
NONE
2009-07-01
Utility equipment can be operated and maintained for many years following installation. However, as the equipment ages, utility operators must decide whether to extend the service life or replace the equipment. Condition assessment modelling is used by many utilities to determine the condition of equipment and to prioritize the maintenance or repair. Several factors are weighted and combined in assessment modelling, which gives a single index number to rate the equipment. There is speculation that this index alone may not be adequate for a business case to rework or replace an asset because it only ranks an asset into a particular category. For that reason, a new methodology was developed to determine the economic end of life of an asset. This paper described the different statistical methods available and their use in determining the remaining service life of electrical equipment. A newly developed Excel-based demonstration computer tool is also an integral part of the deliverables of this project.
Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2
Directory of Open Access Journals (Sweden)
Gutmann Michael
2005-02-01
Full Text Available Abstract Background It has been shown that the classical receptive fields of simple and complex cells in the primary visual cortex emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse or independent. We investigate how to learn features beyond the primary visual cortex from the statistical properties of modelled complex-cell outputs. In previous work, we showed that a new model, non-negative sparse coding, led to the emergence of features which code for contours of a given spatial frequency band. Results We applied ordinary independent component analysis to modelled outputs of complex cells that span different frequency bands. The analysis led to the emergence of features which pool spatially coherent across-frequency activity in the modelled primary visual cortex. Thus, the statistically optimal way of processing complex-cell outputs abandons separate frequency channels, while preserving and even enhancing orientation tuning and spatial localization. As a technical aside, we found that the non-negativity constraint is not necessary: ordinary independent component analysis produces essentially the same results as our previous work. Conclusion We propose that the pooling that emerges allows the features to code for realistic low-level image features related to step edges. Further, the results prove the viability of statistical modelling of natural images as a framework that produces quantitative predictions of visual processing.
Directory of Open Access Journals (Sweden)
Cuesta Eduardo
2015-09-01
Full Text Available This paper presents a multivariate regression predictive model of drift on the Coordinate Measuring Machine (CMM behaviour. Evaluation tests on a CMM with a multi-step gauge were carried out following an extended version of an ISO evaluation procedure with a periodicity of at least once a week and during more than five months. This test procedure consists in measuring the gauge for several range volumes, spatial locations, distances and repetitions. The procedure, environment conditions and even the gauge have been kept invariables, so a massive measurement dataset was collected over time under high repeatability conditions. A multivariate regression analysis has revealed the main parameters that could affect the CMM behaviour, and then detected a trend on the CMM performance drift. A performance model that considers both the size of the measured dimension and the elapsed time since the last CMM calibration has been developed. This model can predict the CMM performance and measurement reliability over time and also can estimate an optimized period between calibrations for a specific measurement length or accuracy level.
DEFF Research Database (Denmark)
Siriwardhana, Chathura; Fang, Rui; Salanti, Ali
2017-01-01
Background Plasmodium falciparum infections are especially severe in pregnant women because infected erythrocytes (IE) express VAR2CSA, a ligand that binds to placental trophoblasts, causing IE to accumulate in the placenta. Resulting inflammation and pathology increases a woman’s risk of anemia...... to 28 malarial antigens and used the data to develop statistical models for predicting if a woman has sufficient immunity to prevent PM. Methods Archival plasma samples from 1377 women were screened in a bead-based multiplex assay for Ab to 17 VAR2CSA-associated antigens (full length VAR2CSA (FV2), DBL...... in the following seven statistical approaches: logistic regression full model, logistic regression reduced model, recursive partitioning, random forests, linear discriminant analysis, quadratic discriminant analysis, and support vector machine. Results The best and simplest model proved to be the logistic...
Energy Technology Data Exchange (ETDEWEB)
Bard, D.; Chang, C.; Kahn, S. M.; Gilmore, K.; Marshall, S. [KIPAC, Stanford University, 452 Lomita Mall, Stanford, CA 94309 (United States); Kratochvil, J. M.; Huffenberger, K. M. [Department of Physics, University of Miami, Coral Gables, FL 33124 (United States); May, M. [Physics Department, Brookhaven National Laboratory, Upton, NY 11973 (United States); AlSayyad, Y.; Connolly, A.; Gibson, R. R.; Jones, L.; Krughoff, S. [Department of Astronomy, University of Washington, Seattle, WA 98195 (United States); Ahmad, Z.; Bankert, J.; Grace, E.; Hannel, M.; Lorenz, S. [Department of Physics, Purdue University, West Lafayette, IN 47907 (United States); Haiman, Z.; Jernigan, J. G., E-mail: djbard@slac.stanford.edu [Department of Astronomy and Astrophysics, Columbia University, New York, NY 10027 (United States); and others
2013-09-01
We study the effect of galaxy shape measurement errors on predicted cosmological constraints from the statistics of shear peak counts with the Large Synoptic Survey Telescope (LSST). We use the LSST Image Simulator in combination with cosmological N-body simulations to model realistic shear maps for different cosmological models. We include both galaxy shape noise and, for the first time, measurement errors on galaxy shapes. We find that the measurement errors considered have relatively little impact on the constraining power of shear peak counts for LSST.
The importance of vegetation change in the prediction of future tropical cyclone flood statistics
Irish, J. L.; Resio, D.; Bilskie, M. V.; Hagen, S. C.; Weiss, R.
2015-12-01
Global sea level rise is a near certainty over the next century (e.g., Stocker et al. 2013 [IPCC] and references therein). With sea level rise, coastal topography and land cover (hereafter "landscape") is expected to change and tropical cyclone flood hazard is expected to accelerate (e.g., Irish et al. 2010 [Ocean Eng], Woodruff et al. 2013 [Nature], Bilskie et al. 2014 [Geophys Res Lett], Ferreira et al. 2014 [Coast Eng], Passeri et al. 2015 [Nat Hazards]). Yet, the relative importance of sea-level rise induced landscape change on future tropical cyclone flood hazard assessment is not known. In this paper, idealized scenarios are used to evaluate the relative impact of one class of landscape change on future tropical cyclone extreme-value statistics in back-barrier regions: sea level rise induced vegetation migration and loss. The joint probability method with optimal sampling (JPM-OS) (Resio et al. 2009 [Nat Hazards]) with idealized surge response functions (e.g., Irish et al. 2009 [Nat Hazards]) is used to quantify the present-day and future flood hazard under various sea level rise scenarios. Results are evaluated in terms of their impact on the flood statistics (a) when projected flood elevations are included directly in the JPM analysis (Figure 1) and (b) when represented as additional uncertainty within the JPM integral (Resio et al. 2013 [Nat Hazards]), i.e., as random error. Findings are expected to aid in determining the level of effort required to reasonably account for future landscape change in hazard assessments, namely in determining when such processes are sufficiently captured by added uncertainty and when sea level rise induced vegetation changes must be considered dynamically, via detailed modeling initiatives. Acknowledgements: This material is based upon work supported by the National Science Foundation under Grant No. CMMI-1206271 and by the National Sea Grant College Program of the U.S. Department of Commerce's National Oceanic and
Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes
Spanò, M.; Lillo, F.; Miccichè, S.; Mantegna, R. N.
2008-10-01
By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of dsDNA group. In fact, we detect evolutionary conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.
Directory of Open Access Journals (Sweden)
ROBSON B. DE LIMA
2017-08-01
Full Text Available ABSTRACT Dry tropical forests are a key component in the global carbon cycle and their biomass estimates depend almost exclusively of fitted equations for multi-species or individual species data. Therefore, a systematic evaluation of statistical models through validation of estimates of aboveground biomass stocks is justifiable. In this study was analyzed the capacity of generic and specific equations obtained from different locations in Mexico and Brazil, to estimate aboveground biomass at multi-species levels and for four different species. Generic equations developed in Mexico and Brazil performed better in estimating tree biomass for multi-species data. For Poincianella bracteosa and Mimosa ophthalmocentra, only the Sampaio and Silva (2005 generic equation was the most recommended. These equations indicate lower tendency and lower bias, and biomass estimates for these equations are similar. For the species Mimosa tenuiflora, Aspidosperma pyrifolium and for the genus Croton the specific regional equations are more recommended, although the generic equation of Sampaio and Silva (2005 is not discarded for biomass estimates. Models considering gender, families, successional groups, climatic variables and wood specific gravity should be adjusted, tested and the resulting equations should be validated at both local and regional levels as well as on the scales of tropics with dry forest dominance.
Lima, Robson B DE; Alves, Francisco T; Oliveira, Cinthia P DE; Silva, José A A DA; Ferreira, Rinaldo L C
2017-01-01
Dry tropical forests are a key component in the global carbon cycle and their biomass estimates depend almost exclusively of fitted equations for multi-species or individual species data. Therefore, a systematic evaluation of statistical models through validation of estimates of aboveground biomass stocks is justifiable. In this study was analyzed the capacity of generic and specific equations obtained from different locations in Mexico and Brazil, to estimate aboveground biomass at multi-species levels and for four different species. Generic equations developed in Mexico and Brazil performed better in estimating tree biomass for multi-species data. For Poincianella bracteosa and Mimosa ophthalmocentra, only the Sampaio and Silva (2005) generic equation was the most recommended. These equations indicate lower tendency and lower bias, and biomass estimates for these equations are similar. For the species Mimosa tenuiflora, Aspidosperma pyrifolium and for the genus Croton the specific regional equations are more recommended, although the generic equation of Sampaio and Silva (2005) is not discarded for biomass estimates. Models considering gender, families, successional groups, climatic variables and wood specific gravity should be adjusted, tested and the resulting equations should be validated at both local and regional levels as well as on the scales of tropics with dry forest dominance.
Schlichting, Margaret L; Guarino, Katharine F; Schapiro, Anna C; Turk-Browne, Nicholas B; Preston, Alison R
2017-01-01
Despite the importance of learning and remembering across the lifespan, little is known about how the episodic memory system develops to support the extraction of associative structure from the environment. Here, we relate individual differences in volumes along the hippocampal long axis to performance on statistical learning and associative inference tasks-both of which require encoding associations that span multiple episodes-in a developmental sample ranging from ages 6 to 30 years. Relating age to volume, we found dissociable patterns across the hippocampal long axis, with opposite nonlinear volume changes in the head and body. These structural differences were paralleled by performance gains across the age range on both tasks, suggesting improvements in the cross-episode binding ability from childhood to adulthood. Controlling for age, we also found that smaller hippocampal heads were associated with superior behavioral performance on both tasks, consistent with this region's hypothesized role in forming generalized codes spanning events. Collectively, these results highlight the importance of examining hippocampal development as a function of position along the hippocampal axis and suggest that the hippocampal head is particularly important in encoding associative structure across development.
Statistical surrogate models for prediction of high-consequence climate change.
Energy Technology Data Exchange (ETDEWEB)
Constantine, Paul; Field, Richard V., Jr.; Boslough, Mark Bruce Elrick
2011-09-01
In safety engineering, performance metrics are defined using probabilistic risk assessments focused on the low-probability, high-consequence tail of the distribution of possible events, as opposed to best estimates based on central tendencies. We frame the climate change problem and its associated risks in a similar manner. To properly explore the tails of the distribution requires extensive sampling, which is not possible with existing coupled atmospheric models due to the high computational cost of each simulation. We therefore propose the use of specialized statistical surrogate models (SSMs) for the purpose of exploring the probability law of various climate variables of interest. A SSM is different than a deterministic surrogate model in that it represents each climate variable of interest as a space/time random field. The SSM can be calibrated to available spatial and temporal data from existing climate databases, e.g., the Program for Climate Model Diagnosis and Intercomparison (PCMDI), or to a collection of outputs from a General Circulation Model (GCM), e.g., the Community Earth System Model (CESM) and its predecessors. Because of its reduced size and complexity, the realization of a large number of independent model outputs from a SSM becomes computationally straightforward, so that quantifying the risk associated with low-probability, high-consequence climate events becomes feasible. A Bayesian framework is developed to provide quantitative measures of confidence, via Bayesian credible intervals, in the use of the proposed approach to assess these risks.
Gregoire, Alexandre David
2011-07-01
The goal of this research was to accurately predict the ultimate compressive load of impact damaged graphite/epoxy coupons using a Kohonen self-organizing map (SOM) neural network and multivariate statistical regression analysis (MSRA). An optimized use of these data treatment tools allowed the generation of a simple, physically understandable equation that predicts the ultimate failure load of an impacted damaged coupon based uniquely on the acoustic emissions it emits at low proof loads. Acoustic emission (AE) data were collected using two 150 kHz resonant transducers which detected and recorded the AE activity given off during compression to failure of thirty-four impacted 24-ply bidirectional woven cloth laminate graphite/epoxy coupons. The AE quantification parameters duration, energy and amplitude for each AE hit were input to the Kohonen self-organizing map (SOM) neural network to accurately classify the material failure mechanisms present in the low proof load data. The number of failure mechanisms from the first 30% of the loading for twenty-four coupons were used to generate a linear prediction equation which yielded a worst case ultimate load prediction error of 16.17%, just outside of the +/-15% B-basis allowables, which was the goal for this research. Particular emphasis was placed upon the noise removal process which was largely responsible for the accuracy of the results.
Mocellin, Simone; Thompson, John F; Pasquali, Sandro; Montesco, Maria C; Pilati, Pierluigi; Nitti, Donato; Saw, Robyn P; Scolyer, Richard A; Stretch, Jonathan R; Rossi, Carlo R
2009-12-01
To improve selection for sentinel node (SN) biopsy (SNB) in patients with cutaneous melanoma using statistical models predicting SN status. About 80% of patients currently undergoing SNB are node negative. In the absence of conclusive evidence of a SNBassociated survival benefit, these patients may be over-treated. Here, we tested the efficiency of 4 different models in predicting SN status. The clinicopathologic data (age, gender, tumor thickness, Clark level, regression, ulceration, histologic subtype, and mitotic index) of 1132 melanoma patients who had undergone SNB at institutions in Italy and Australia were analyzed. Logistic regression, classification tree, random forest, and support vector machine models were fitted to the data. The predictive models were built with the aim of maximizing the negative predictive value (NPV) and reducing the rate of SNB procedures though minimizing the error rate. After cross-validation logistic regression, classification tree, random forest, and support vector machine predictive models obtained clinically relevant NPV (93.6%, 94.0%, 97.1%, and 93.0%, respectively), SNB reduction (27.5%, 29.8%, 18.2%, and 30.1%, respectively), and error rates (1.8%, 1.8%, 0.5%, and 2.1%, respectively). Using commonly available clinicopathologic variables, predictive models can preoperatively identify a proportion of patients ( approximately 25%) who might be spared SNB, with an acceptable (1%-2%) error. If validated in large prospective series, these models might be implemented in the clinical setting for improved patient selection, which ultimately would lead to better quality of life for patients and optimization of resource allocation for the health care system.
Statistical Analysis of Wave Climate Data Using Mixed Distributions and Extreme Wave Prediction
Directory of Open Access Journals (Sweden)
Wei Li
2016-05-01
Full Text Available The investigation of various aspects of the wave climate at a wave energy test site is essential for the development of reliable and efficient wave energy conversion technology. This paper presents studies of the wave climate based on nine years of wave observations from the 2005–2013 period measured with a wave measurement buoy at the Lysekil wave energy test site located off the west coast of Sweden. A detailed analysis of the wave statistics is investigated to reveal the characteristics of the wave climate at this specific test site. The long-term extreme waves are estimated from applying the Peak over Threshold (POT method on the measured wave data. The significant wave height and the maximum wave height at the test site for different return periods are also compared. In this study, a new approach using a mixed-distribution model is proposed to describe the long-term behavior of the significant wave height and it shows an impressive goodness of fit to wave data from the test site. The mixed-distribution model is also applied to measured wave data from four other sites and it provides an illustration of the general applicability of the proposed model. The methodologies used in this paper can be applied to general wave climate analysis of wave energy test sites to estimate extreme waves for the survivability assessment of wave energy converters and characterize the long wave climate to forecast the wave energy resource of the test sites and the energy production of the wave energy converters.
Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics
Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.
2018-03-01
Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2D seismic reflection data processing flow focused on pre - stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching (BHM), to estimate the uncertainties of the depths of key horizons near the borehole DSDP-258 located in the Mentelle Basin, south west of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from DSDP-258 core was in accordance with the ± 2σ posterior credibility intervals and predictions for depths to key horizons were made for the two new drill sites, adjacent the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program (IODP), leg 369.
Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics
Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.
2018-06-01
Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2-D seismic reflection data processing flow focused on pre-stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching, to estimate the uncertainties of the depths of key horizons near the Deep Sea Drilling Project (DSDP) borehole 258 (DSDP-258) located in the Mentelle Basin, southwest of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from DSDP-258 core was in accordance with the ±2σ posterior credibility intervals and predictions for depths to key horizons were made for the two new drill sites, adjacent to the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program, leg 369.
International Nuclear Information System (INIS)
Nam, Cheol; Choi, Byeong Kwon; Jeong, Yong Hwan; Jung, Youn Ho
2001-01-01
During the last decade, the failure behavior of high-burnup fuel rods under RIA has been an extensive concern since observations of fuel rod failures at low enthalpy. Of great importance is placed on failure prediction of fuel rod in the point of licensing criteria and safety in extending burnup achievement. To address the issue, a statistics-based methodology is introduced to predict failure probability of irradiated fuel rods. Based on RIA simulation results in literature, a failure enthalpy correlation for irradiated fuel rod is constructed as a function of oxide thickness, fuel burnup, and pulse width. From the failure enthalpy correlation, a single damage parameter, equivalent enthalpy, is defined to reflect the effects of the three primary factors as well as peak fuel enthalpy. Moreover, the failure distribution function with equivalent enthalpy is derived, applying a two-parameter Weibull statistical model. Using these equations, the sensitivity analysis is carried out to estimate the effects of burnup, corrosion, peak fuel enthalpy, pulse width and cladding materials used
Eloqayli, Haytham; Al-Yousef, Ali; Jaradat, Raid
2018-02-15
Despite the high prevalence of chronic neck pain, there is limited consensus about the primary etiology, risk factors, diagnostic criteria and therapeutic outcome. Here, we aimed to determine if Ferritin and Vitamin D are modifiable risk factors with chronic neck pain using slandered statistics and artificial intelligence neural network (ANN). Fifty-four patients with chronic neck pain treated between February 2016 and August 2016 in King Abdullah University Hospital and 54 patients age matched controls undergoing outpatient or minor procedures were enrolled. Patients and control demographic parameters, height, weight and single measurement of serum vitamin D, Vitamin B12, ferritin, calcium, phosphorus, zinc were obtained. An ANN prediction model was developed. The statistical analysis reveals that patients with chronic neck pain have significantly lower serum Vitamin D and Ferritin (p-value artificial neural network can be of future benefit in classification and prediction models for chronic neck pain. We hope this initial work will encourage a future larger cohort study addressing vitamin D and iron correction as modifiable factors and the application of artificial intelligence models in clinical practice.
Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru
2014-10-15
Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohol and ester. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) had a better performance of prediction accuracy for both calibration set (94.3%) and validation set (96.2%) than other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role of model training when the prediction accuracy of SVM with polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
Selby-Pham, Sophie N B; Howell, Kate S; Dunshea, Frank R; Ludbey, Joel; Lutz, Adrian; Bennett, Louise
2018-04-15
A diet rich in phytochemicals confers benefits for health by reducing the risk of chronic diseases via regulation of oxidative stress and inflammation (OSI). For optimal protective bio-efficacy, the time required for phytochemicals and their metabolites to reach maximal plasma concentrations (T max ) should be synchronised with the time of increased OSI. A statistical model has been reported to predict T max of individual phytochemicals based on molecular mass and lipophilicity. We report the application of the model for predicting the absorption profile of an uncharacterised phytochemical mixture, herein referred to as the 'functional fingerprint'. First, chemical profiles of phytochemical extracts were acquired using liquid chromatography mass spectrometry (LC-MS), then the molecular features for respective components were used to predict their plasma absorption maximum, based on molecular mass and lipophilicity. This method of 'functional fingerprinting' of plant extracts represents a novel tool for understanding and optimising the health efficacy of plant extracts. Copyright © 2017 Elsevier Ltd. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
2018-03-19
R code that performs the analysis of a data set presented in the paper ‘Leveraging Multiple Statistical Methods for Inverse Prediction in Nuclear Forensics Applications’ by Lewis, J., Zhang, A., Anderson-Cook, C. It provides functions for doing inverse predictions in this setting using several different statistical methods. The data set is a publicly available data set from a historical Plutonium production experiment.
Directory of Open Access Journals (Sweden)
Cutler Sean R
2007-06-01
Full Text Available Abstract Background The carboxy termini of proteins are a frequent site of activity for a variety of biologically important functions, ranging from post-translational modification to protein targeting. Several short peptide motifs involved in protein sorting roles and dependent upon their proximity to the C-terminus for proper function have already been characterized. As a limited number of such motifs have been identified, the potential exists for genome-wide statistical analysis and comparative genomics to reveal novel peptide signatures functioning in a C-terminal dependent manner. We have applied a novel methodology to the prediction of C-terminal-anchored peptide motifs involving a simple z-statistic and several techniques for improving the signal-to-noise ratio. Results We examined the statistical over-representation of position-specific C-terminal tripeptides in 7 eukaryotic proteomes. Sequence randomization models and simple-sequence masking were applied to the successful reduction of background noise. Similarly, as C-terminal homology among members of large protein families may artificially inflate tripeptide counts in an irrelevant and obfuscating manner, gene-family clustering was performed prior to the analysis in order to assess tripeptide over-representation across protein families as opposed to across all proteins. Finally, comparative genomics was used to identify tripeptides significantly occurring in multiple species. This approach has been able to predict, to our knowledge, all C-terminally anchored targeting motifs present in the literature. These include the PTS1 peroxisomal targeting signal (SKL*, the ER-retention signal (K/HDEL*, the ER-retrieval signal for membrane bound proteins (KKxx*, the prenylation signal (CC* and the CaaX box prenylation motif. In addition to a high statistical over-representation of these known motifs, a collection of significant tripeptides with a high propensity for biological function exists
Directory of Open Access Journals (Sweden)
Olson James M
2006-04-01
Full Text Available Abstract Background Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip® that uses multiple oligonucleotide probes (i.e. probe set, since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advance of extensive repositories of GeneChip® gene expression array data. Results We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma (27 metastatic and 42 non-metastatic samples and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic
International Nuclear Information System (INIS)
Guan, Dong; Wu, Jiu Hui; Jing, Li
2015-01-01
Highlights: • A random internal morphology and structure generation-growth method, termed as the quartet structure generation set (QSGS), has been utilized based on the stochastic cluster growth theory for numerical generating the various microstructures of porous metal materials. • Effects of different parameters such as thickness and porosity on sound absorption performance of the generated structures are studied by the present method, and the obtained results are validated by an empirical model as well. • This method could be utilized to guide the design and fabrication of the sound-absorption porous metal materials. - Abstract: In this paper, a statistical method for predicting sound absorption properties of porous metal materials is presented. To reflect the stochastic distribution characteristics of the porous metal materials, a random internal morphology and structure generation-growth method, termed as the quartet structure generation set (QSGS), has been utilized based on the stochastic cluster growth theory for numerical generating the various microstructures of porous metal materials. Then by using the transfer-function approach along with the QSGS tool, we investigate the sound absorbing performance of porous metal materials with complex stochastic geometries. The statistical method has been validated by the good agreement among the numerical results for metal rubber from this method and a previous empirical model and the corresponding experimental data. Furthermore, the effects of different parameters such as thickness and porosity on sound absorption performance of the generated structures are studied by the present method, and the obtained results are validated by an empirical model as well. Therefore, the present method is a reliable and robust method for predicting the sound absorption performance of porous metal materials, and could be utilized to guide the design and fabrication of the sound-absorption porous metal materials
Ambler, Graeme K; Gohel, Manjit S; Mitchell, David C; Loftus, Ian M; Boyle, Jonathan R
2015-01-01
Accurate adjustment of surgical outcome data for risk is vital in an era of surgeon-level reporting. Current risk prediction models for abdominal aortic aneurysm (AAA) repair are suboptimal. We aimed to develop a reliable risk model for in-hospital mortality after intervention for AAA, using rigorous contemporary statistical techniques to handle missing data. Using data collected during a 15-month period in the United Kingdom National Vascular Database, we applied multiple imputation methodology together with stepwise model selection to generate preoperative and perioperative models of in-hospital mortality after AAA repair, using two thirds of the available data. Model performance was then assessed on the remaining third of the data by receiver operating characteristic curve analysis and compared with existing risk prediction models. Model calibration was assessed by Hosmer-Lemeshow analysis. A total of 8088 AAA repair operations were recorded in the National Vascular Database during the study period, of which 5870 (72.6%) were elective procedures. Both preoperative and perioperative models showed excellent discrimination, with areas under the receiver operating characteristic curve of .89 and .92, respectively. This was significantly better than any of the existing models (area under the receiver operating characteristic curve for best comparator model, .84 and .88; P AAA repair. These models were carefully developed with rigorous statistical methodology and significantly outperform existing methods for both elective cases and overall AAA mortality. These models will be invaluable for both preoperative patient counseling and accurate risk adjustment of published outcome data. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Predictive statistical mechanics
International Nuclear Information System (INIS)
Jaynes, E.T.
1986-01-01
This paper tries to explain the original motivation in quantum theory, the formalism that evolved from it, and some recent applications. The difficulty in finding a rational interpretation of quantum theory is examined. The paper presents and tries to solve the technical problem of how to use probability theory to help one do plausible reasoning in situations where, because of incomplete information, one cannot use deductive reasoning. Bayesian inference and the mass of Saturn are discussed and a generalized inverse problem is presented. The author examines the maximum entropy principle. Applications are discussed and include: spectrum analysis, image reconstruction, noisy data, and medical tomography
Sel, İlker; Çakmakcı, Mehmet; Özkaya, Bestamin; Suphi Altan, H
2016-10-01
Main objective of this study was to develop a statistical model for easier and faster Biochemical Methane Potential (BMP) prediction of landfilled municipal solid waste by analyzing waste composition of excavated samples from 12 sampling points and three waste depths representing different landfilling ages of closed and active sections of a sanitary landfill site located in İstanbul, Turkey. Results of Principal Component Analysis (PCA) were used as a decision support tool to evaluation and describe the waste composition variables. Four principal component were extracted describing 76% of data set variance. The most effective components were determined as PCB, PO, T, D, W, FM, moisture and BMP for the data set. Multiple Linear Regression (MLR) models were built by original compositional data and transformed data to determine differences. It was observed that even residual plots were better for transformed data the R(2) and Adjusted R(2) values were not improved significantly. The best preliminary BMP prediction models consisted of D, W, T and FM waste fractions for both versions of regressions. Adjusted R(2) values of the raw and transformed models were determined as 0.69 and 0.57, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.
Popescu, M D; Draghici, L; Secheli, I; Secheli, M; Codrescu, M; Draghici, I
2015-01-01
Infantile Hemangiomas (IH) are the most frequent tumors of vascular origin, and the differential diagnosis from vascular malformations is difficult to establish. Specific types of IH due to the location, dimensions and fast evolution, can determine important functional and esthetic sequels. To avoid these unfortunate consequences it is necessary to establish the exact appropriate moment to begin the treatment and decide which the most adequate therapeutic procedure is. Based on clinical data collected by a serial clinical observations correlated with imaging data, and processed by a computer-aided diagnosis system (CAD), the study intended to develop a treatment algorithm to accurately predict the best final results, from the esthetical and functional point of view, for a certain type of lesion. The preliminary database was composed of 75 patients divided into 4 groups according to the treatment management they received: medical therapy, sclerotherapy, surgical excision and no treatment. The serial clinical observation was performed each month and all the data was processed by using CAD. The project goal was to create a software that incorporated advanced methods to accurately measure the specific IH lesions, integrated medical information, statistical methods and computational methods to correlate this information with that obtained from the processing of images. Based on these correlations, a prediction mechanism of the evolution of hemangioma, which helped determine the best method of therapeutic intervention to minimize further complications, was established.
Karacan, C Özgen; Olea, Ricardo A
2013-08-01
Coal seam degasification and its success are important for controlling methane, and thus for the health and safety of coal miners. During the course of degasification, properties of coal seams change. Thus, the changes in coal reservoir conditions and in-place gas content as well as methane emission potential into mines should be evaluated by examining time-dependent changes and the presence of major heterogeneities and geological discontinuities in the field. In this work, time-lapsed reservoir and fluid storage properties of the New Castle coal seam, Mary Lee/Blue Creek seam, and Jagger seam of Black Warrior Basin, Alabama, were determined from gas and water production history matching and production forecasting of vertical degasification wellbores. These properties were combined with isotherm and other important data to compute gas-in-place (GIP) and its change with time at borehole locations. Time-lapsed training images (TIs) of GIP and GIP difference corresponding to each coal and date were generated by using these point-wise data and Voronoi decomposition on the TI grid, which included faults as discontinuities for expansion of Voronoi regions. Filter-based multiple-point geostatistical simulations, which were preferred in this study due to anisotropies and discontinuities in the area, were used to predict time-lapsed GIP distributions within the study area. Performed simulations were used for mapping spatial time-lapsed methane quantities as well as their uncertainties within the study area. The systematic approach presented in this paper is the first time in literature that history matching, TIs of GIPs and filter simulations are used for degasification performance evaluation and for assessing GIP for mining safety. Results from this study showed that using production history matching of coalbed methane wells to determine time-lapsed reservoir data could be used to compute spatial GIP and representative GIP TIs generated through Voronoi decomposition
International Nuclear Information System (INIS)
Pham, Binh T.; Hawkes, Grant L.; Einerson, Jeffrey J.
2014-01-01
As part of the High Temperature Reactors (HTR) R and D program, a series of irradiation tests, designated as Advanced Gas-cooled Reactor (AGR), have been defined to support development and qualification of fuel design, fabrication process, and fuel performance under normal operation and accident conditions. The AGR tests employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule and instrumented with thermocouples (TC) embedded in graphite blocks enabling temperature control. While not possible to obtain by direct measurements in the tests, crucial fuel conditions (e.g., temperature, neutron fast fluence, and burnup) are calculated using core physics and thermal modeling codes. This paper is focused on AGR test fuel temperature predicted by the ABAQUS code's finite element-based thermal models. The work follows up on a previous study, in which several statistical analysis methods were adapted, implemented in the NGNP Data Management and Analysis System (NDMAS), and applied for qualification of AGR-1 thermocouple data. Abnormal trends in measured data revealed by the statistical analysis are traced to either measuring instrument deterioration or physical mechanisms in capsules that may have shifted the system thermal response. The main thrust of this work is to exploit the variety of data obtained in irradiation and post-irradiation examination (PIE) for assessment of modeling assumptions. As an example, the uneven reduction of the control gas gap in Capsule 5 found in the capsule metrology measurements in PIE helps identify mechanisms other than TC drift causing the decrease in TC readings. This suggests a more physics-based modification of the thermal model that leads to a better fit with experimental data, thus reducing model uncertainty and increasing confidence in the calculated fuel temperatures of the AGR-1 test
Energy Technology Data Exchange (ETDEWEB)
Pham, Binh T., E-mail: Binh.Pham@inl.gov [Human Factor, Controls and Statistics Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States); Hawkes, Grant L. [Thermal Science and Safety Analysis Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States); Einerson, Jeffrey J. [Human Factor, Controls and Statistics Department, Nuclear Science and Technology, Idaho National Laboratory, Idaho Falls, ID 83415 (United States)
2014-05-01
As part of the High Temperature Reactors (HTR) R and D program, a series of irradiation tests, designated as Advanced Gas-cooled Reactor (AGR), have been defined to support development and qualification of fuel design, fabrication process, and fuel performance under normal operation and accident conditions. The AGR tests employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule and instrumented with thermocouples (TC) embedded in graphite blocks enabling temperature control. While not possible to obtain by direct measurements in the tests, crucial fuel conditions (e.g., temperature, neutron fast fluence, and burnup) are calculated using core physics and thermal modeling codes. This paper is focused on AGR test fuel temperature predicted by the ABAQUS code's finite element-based thermal models. The work follows up on a previous study, in which several statistical analysis methods were adapted, implemented in the NGNP Data Management and Analysis System (NDMAS), and applied for qualification of AGR-1 thermocouple data. Abnormal trends in measured data revealed by the statistical analysis are traced to either measuring instrument deterioration or physical mechanisms in capsules that may have shifted the system thermal response. The main thrust of this work is to exploit the variety of data obtained in irradiation and post-irradiation examination (PIE) for assessment of modeling assumptions. As an example, the uneven reduction of the control gas gap in Capsule 5 found in the capsule metrology measurements in PIE helps identify mechanisms other than TC drift causing the decrease in TC readings. This suggests a more physics-based modification of the thermal model that leads to a better fit with experimental data, thus reducing model uncertainty and increasing confidence in the calculated fuel temperatures of the AGR-1 test.
International Nuclear Information System (INIS)
Kolluri, Srinivas Sahan; Esfahani, Iman Janghorban; Garikiparthy, Prithvi Sai Nadh; Yoo, Chang Kyoo
2015-01-01
Our aim was to analyze, monitor, and predict the outcomes of processes in a full-scale seawater reverse osmosis (SWRO) desalination plant using multivariate statistical techniques. Multivariate analysis of variance (MANOVA) was used to investigate the performance and efficiencies of two SWRO processes, namely, pore controllable fiber filterreverse osmosis (PCF-SWRO) and sand filtration-ultra filtration-reverse osmosis (SF-UF-SWRO). Principal component analysis (PCA) was applied to monitor the two SWRO processes. PCA monitoring revealed that the SF-UF-SWRO process could be analyzed reliably with a low number of outliers and disturbances. Partial least squares (PLS) analysis was then conducted to predict which of the seven input parameters of feed flow rate, PCF/SF-UF filtrate flow rate, temperature of feed water, turbidity feed, pH, reverse osmosis (RO)flow rate, and pressure had a significant effect on the outcome variables of permeate flow rate and concentration. Root mean squared errors (RMSEs) of the PLS models for permeate flow rates were 31.5 and 28.6 for the PCF-SWRO process and SF-UF-SWRO process, respectively, while RMSEs of permeate concentrations were 350.44 and 289.4, respectively. These results indicate that the SF-UF-SWRO process can be modeled more accurately than the PCF-SWRO process, because the RMSE values of permeate flowrate and concentration obtained using a PLS regression model of the SF-UF-SWRO process were lower than those obtained for the PCF-SWRO process.
Energy Technology Data Exchange (ETDEWEB)
Kolluri, Srinivas Sahan; Esfahani, Iman Janghorban; Garikiparthy, Prithvi Sai Nadh; Yoo, Chang Kyoo [Kyung Hee University, Yongin (Korea, Republic of)
2015-08-15
Our aim was to analyze, monitor, and predict the outcomes of processes in a full-scale seawater reverse osmosis (SWRO) desalination plant using multivariate statistical techniques. Multivariate analysis of variance (MANOVA) was used to investigate the performance and efficiencies of two SWRO processes, namely, pore controllable fiber filterreverse osmosis (PCF-SWRO) and sand filtration-ultra filtration-reverse osmosis (SF-UF-SWRO). Principal component analysis (PCA) was applied to monitor the two SWRO processes. PCA monitoring revealed that the SF-UF-SWRO process could be analyzed reliably with a low number of outliers and disturbances. Partial least squares (PLS) analysis was then conducted to predict which of the seven input parameters of feed flow rate, PCF/SF-UF filtrate flow rate, temperature of feed water, turbidity feed, pH, reverse osmosis (RO)flow rate, and pressure had a significant effect on the outcome variables of permeate flow rate and concentration. Root mean squared errors (RMSEs) of the PLS models for permeate flow rates were 31.5 and 28.6 for the PCF-SWRO process and SF-UF-SWRO process, respectively, while RMSEs of permeate concentrations were 350.44 and 289.4, respectively. These results indicate that the SF-UF-SWRO process can be modeled more accurately than the PCF-SWRO process, because the RMSE values of permeate flowrate and concentration obtained using a PLS regression model of the SF-UF-SWRO process were lower than those obtained for the PCF-SWRO process.
Directory of Open Access Journals (Sweden)
Carl A. Nelson
2012-01-01
Full Text Available This paper presents an analysis of 67 minimally invasive surgical procedures covering 11 different procedure types to determine patterns of tool use. A new graph-theoretic approach was taken to organize and analyze the data. Through grouping surgeries by type, trends of common tool changes were identified. Using the concept of signal/noise ratio, these trends were found to be statistically strong. The tool-use trends were used to generate tool placement patterns for modular (multi-tool, cartridge-type surgical tool systems, and the same 67 surgeries were numerically simulated to determine the optimality of these tool arrangements. The results indicate that aggregated tool-use data (by procedure type can be employed to predict tool-use sequences with good accuracy, and also indicate the potential for artificial intelligence as a means of preoperative and/or intraoperative planning. Furthermore, this suggests that the use of multifunction surgical tools can be optimized to streamline surgical workflow.
Alvarez, Diego A.; Hurtado, Jorge E.; Bedoya-Ruíz, Daniel Alveiro
2012-07-01
Despite technological advances in seismic instrumentation, the assessment of the intensity of an earthquake using an observational scale as given, for example, by the modified Mercalli intensity scale is highly useful for practical purposes. In order to link the qualitative numbers extracted from the acceleration record of an earthquake and other instrumental data such as peak ground velocity, epicentral distance, and moment magnitude on the one hand and the modified Mercalli intensity scale on the other, simple statistical regression has been generally employed. In this paper, we will employ three methods of nonlinear regression, namely support vector regression, multilayer perceptrons, and genetic programming in order to find a functional dependence between the instrumental records and the modified Mercalli intensity scale. The proposed methods predict the intensity of an earthquake while dealing with nonlinearity and the noise inherent to the data. The nonlinear regressions with good estimation results have been performed using the "Did You Feel It?" database of the US Geological Survey and the database of the Center for Engineering Strong Motion Data for the California region.
Evans, Mark
2016-12-01
A new parametric approach, termed the Wilshire equations, offers the realistic potential of being able to accurately lift materials operating at in-service conditions from accelerated test results lasting no more than 5000 hours. The success of this approach can be attributed to a well-defined linear relationship that appears to exist between various creep properties and a log transformation of the normalized stress. However, these linear trends are subject to discontinuities, the number of which appears to differ from material to material. These discontinuities have until now been (1) treated as abrupt in nature and (2) identified by eye from an inspection of simple graphical plots of the data. This article puts forward a statistical test for determining the correct number of discontinuities present within a creep data set and a method for allowing these discontinuities to occur more gradually, so that the methodology is more in line with the accepted view as to how creep mechanisms evolve with changing test conditions. These two developments are fully illustrated using creep data sets on two steel alloys. When these new procedures are applied to these steel alloys, not only do they produce more accurate and realistic looking long-term predictions of the minimum creep rate, but they also lead to different conclusions about the mechanisms determining the rates of creep from those originally put forward by Wilshire.
Ugarelli, Rita; Kristensen, Stig Morten; Røstum, Jon; Saegrov, Sveinung; Di Federico, Vittorio
2009-01-01
Oslo Vann og Avløpsetaten (Oslo VAV)-the water/wastewater utility in the Norwegian capital city of Oslo-is assessing future strategies for selection of most reliable materials for wastewater networks, taking into account not only material technical performance but also material performance, regarding operational condition of the system.The research project undertaken by SINTEF Group, the largest research organisation in Scandinavia, NTNU (Norges Teknisk-Naturvitenskapelige Universitet) and Oslo VAV adopts several approaches to understand reasons for failures that may impact flow capacity, by analysing historical data for blockages in Oslo.The aim of the study was to understand whether there is a relationship between the performance of the pipeline and a number of specific attributes such as age, material, diameter, to name a few. This paper presents the characteristics of the data set available and discusses the results obtained by performing two different approaches: a traditional statistical analysis by segregating the pipes into classes, each of which with the same explanatory variables, and a Evolutionary Polynomial Regression model (EPR), developed by Technical University of Bari and University of Exeter, to identify possible influence of pipe's attributes on the total amount of predicted blockages in a period of time.Starting from a detailed analysis of the available data for the blockage events, the most important variables are identified and a classification scheme is adopted.From the statistical analysis, it can be stated that age, size and function do seem to have a marked influence on the proneness of a pipeline to blockages, but, for the reduced sample available, it is difficult to say which variable it is more influencing. If we look at total number of blockages the oldest class seems to be the most prone to blockages, but looking at blockage rates (number of blockages per km per year), then it is the youngest class showing the highest blockage rate
Melsen, W G; Rovers, M M; Bonten, M J M; Bootsma, M C J|info:eu-repo/dai/nl/304830305
Variance between studies in a meta-analysis will exist. This heterogeneity may be of clinical, methodological or statistical origin. The last of these is quantified by the I(2) -statistic. We investigated, using simulated studies, the accuracy of I(2) in the assessment of heterogeneity and the
Geske, Jenenne A.; Mickelson, William T.; Bandalos, Deborah L.; Jonson, Jessica; Smith, Russell W.
The bulk of experimental research related to reforms in the teaching of statistics concentrates on the effects of alternative teaching methods on statistics achievement. This study expands on that research by including an examination of the effects of instructor and the interaction between instructor and method on achievement as well as attitudes,…
DEFF Research Database (Denmark)
Hansen, Niels Christian; Loui, Psyche; Vuust, Peter
Statistical learning underlies the generation of expectations with different degrees of uncertainty. In music, uncertainty applies to expectations for pitches in a melody. This uncertainty can be quantified by Shannon entropy from distributions of expectedness ratings for multiple continuations o...
Hu, Yijia; Zhong, Zhong; Zhu, Yimin; Ha, Yao
2018-04-01
In this paper, a statistical forecast model using the time-scale decomposition method is established to do the seasonal prediction of the rainfall during flood period (FPR) over the middle and lower reaches of the Yangtze River Valley (MLYRV). This method decomposites the rainfall over the MLYRV into three time-scale components, namely, the interannual component with the period less than 8 years, the interdecadal component with the period from 8 to 30 years, and the interdecadal component with the period larger than 30 years. Then, the predictors are selected for the three time-scale components of FPR through the correlation analysis. At last, a statistical forecast model is established using the multiple linear regression technique to predict the three time-scale components of the FPR, respectively. The results show that this forecast model can capture the interannual and interdecadal variation of FPR. The hindcast of FPR during 14 years from 2001 to 2014 shows that the FPR can be predicted successfully in 11 out of the 14 years. This forecast model performs better than the model using traditional scheme without time-scale decomposition. Therefore, the statistical forecast model using the time-scale decomposition technique has good skills and application value in the operational prediction of FPR over the MLYRV.
Manning, Robert M.
1986-01-01
A rain attenuation prediction model is described for use in calculating satellite communication link availability for any specific location in the world that is characterized by an extended record of rainfall. Such a formalism is necessary for the accurate assessment of such availability predictions in the case of the small user-terminal concept of the Advanced Communication Technology Satellite (ACTS) Project. The model employs the theory of extreme value statistics to generate the necessary statistical rainrate parameters from rain data in the form compiled by the National Weather Service. These location dependent rain statistics are then applied to a rain attenuation model to obtain a yearly prediction of the occurrence of attenuation on any satellite link at that location. The predictions of this model are compared to those of the Crane Two-Component Rain Model and some empirical data and found to be very good. The model is then used to calculate rain attenuation statistics at 59 locations in the United States (including Alaska and Hawaii) for the 20 GHz downlinks and 30 GHz uplinks of the proposed ACTS system. The flexibility of this modeling formalism is such that it allows a complete and unified treatment of the temporal aspects of rain attenuation that leads to the design of an optimum stochastic power control algorithm, the purpose of which is to efficiently counter such rain fades on a satellite link.
Kim, Da-Eun; Yang, Hyeri; Jang, Won-Hee; Jung, Kyoung-Mi; Park, Miyoung; Choi, Jin Kyu; Jung, Mi-Sook; Jeon, Eun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Park, Jung Eun; Sohn, Soo Jung; Kim, Tae Sung; Ahn, Il Young; Jeong, Tae-Cheon; Lim, Kyung-Min; Bae, SeungJin
2016-01-01
In order for a novel test method to be applied for regulatory purposes, its reliability and relevance, i.e., reproducibility and predictive capacity, must be demonstrated. Here, we examine the predictive capacity of a novel non-radioisotopic local lymph node assay, LLNA:BrdU-FCM (5-bromo-2'-deoxyuridine-flow cytometry), with a cutoff approach and inferential statistics as a prediction model. 22 reference substances in OECD TG429 were tested with a concurrent positive control, hexylcinnamaldehyde 25%(PC), and the stimulation index (SI) representing the fold increase in lymph node cells over the vehicle control was obtained. The optimal cutoff SI (2.7≤cutoff <3.5), with respect to predictive capacity, was obtained by a receiver operating characteristic curve, which produced 90.9% accuracy for the 22 substances. To address the inter-test variability in responsiveness, SI values standardized with PC were employed to obtain the optimal percentage cutoff (42.6≤cutoff <57.3% of PC), which produced 86.4% accuracy. A test substance may be diagnosed as a sensitizer if a statistically significant increase in SI is elicited. The parametric one-sided t-test and non-parametric Wilcoxon rank-sum test produced 77.3% accuracy. Similarly, a test substance could be defined as a sensitizer if the SI means of the vehicle control, and of the low, middle, and high concentrations were statistically significantly different, which was tested using ANOVA or Kruskal-Wallis, with post hoc analysis, Dunnett, or DSCF (Dwass-Steel-Critchlow-Fligner), respectively, depending on the equal variance test, producing 81.8% accuracy. The absolute SI-based cutoff approach produced the best predictive capacity, however the discordant decisions between prediction models need to be examined further. Copyright © 2015 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Đurić I.
2010-01-01
Full Text Available This paper presents the results of defining the mathematical model which describes the dependence of leaching degree of Al2O3 in bauxite from the most influential input parameters in industrial conditions of conducting the leaching process in the Bayer technology of alumina production. Mathematical model is defined using the stepwise MLRA method, with R2 = 0.764 and significant statistical reliability - VIF<2 and p<0.05, on the one-year statistical sample. Validation of the acquired model was performed using the data from the following year, collected from the process conducted under industrial conditions, rendering the same statistical reliability, with R2 = 0.759.
International Nuclear Information System (INIS)
Hashemi, Seyyedhossein; Javaherian, Abdolrahim; Ataee-pour, Majid; Khoshdel, Hossein
2014-01-01
Facies models try to explain facies architectures which have a primary control on the subsurface heterogeneities and the fluid flow characteristics of a given reservoir. In the process of facies modeling, geostatistical methods are implemented to integrate different sources of data into a consistent model. The facies models should describe facies interactions; the shape and geometry of the geobodies as they occur in reality. Two distinct categories of geostatistical techniques are two-point and multiple-point (geo) statistics (MPS). In this study, both of the aforementioned categories were applied to generate facies models. A sequential indicator simulation (SIS) and a truncated Gaussian simulation (TGS) represented two-point geostatistical methods, and a single normal equation simulation (SNESIM) selected as an MPS simulation representative. The dataset from an extremely channelized carbonate reservoir located in southwest Iran was applied to these algorithms to analyze their performance in reproducing complex curvilinear geobodies. The SNESIM algorithm needs consistent training images (TI) in which all possible facies architectures that are present in the area are included. The TI model was founded on the data acquired from modern occurrences. These analogies delivered vital information about the possible channel geometries and facies classes that are typically present in those similar environments. The MPS results were conditioned to both soft and hard data. Soft facies probabilities were acquired from a neural network workflow. In this workflow, seismic-derived attributes were implemented as the input data. Furthermore, MPS realizations were conditioned to hard data to guarantee the exact positioning and continuity of the channel bodies. A geobody extraction workflow was implemented to extract the most certain parts of the channel bodies from the seismic data. These extracted parts of the channel bodies were applied to the simulation workflow as hard data
Shabetia, Alexander; Rodichev, Yurii; Veer, F.A.; Soroka, Elena; Louter, Christian; Bos, Freek; Belis, Jan; Veer, Fred; Nijsse, Rob
An analytical approach based on the on the sequential partitioning of the data and Weibull Statistical Distribution for inhomogeneous - defective materials is proposed. It allows assessing the guaranteed strength of glass structures for the low probability of fracture with a higher degree of
Kabeshova, Anastasiia; Launay, Cyrille P; Gromov, Vasilii A; Fantino, Bruno; Levinoff, Elise J; Allali, Gilles; Beauchet, Olivier
2016-01-01
To compare performance criteria (i.e., sensitivity, specificity, positive predictive value, negative predictive value, area under receiver operating characteristic curve and accuracy) of linear and non-linear statistical models for fall risk in older community-dwellers. Participants were recruited in two large population-based studies, "Prévention des Chutes, Réseau 4" (PCR4, n=1760, cross-sectional design, retrospective collection of falls) and "Prévention des Chutes Personnes Agées" (PCPA, n=1765, cohort design, prospective collection of falls). Six linear statistical models (i.e., logistic regression, discriminant analysis, Bayes network algorithm, decision tree, random forest, boosted trees), three non-linear statistical models corresponding to artificial neural networks (multilayer perceptron, genetic algorithm and neuroevolution of augmenting topologies [NEAT]) and the adaptive neuro fuzzy interference system (ANFIS) were used. Falls ≥1 characterizing fallers and falls ≥2 characterizing recurrent fallers were used as outcomes. Data of studies were analyzed separately and together. NEAT and ANFIS had better performance criteria compared to other models. The highest performance criteria were reported with NEAT when using PCR4 database and falls ≥1, and with both NEAT and ANFIS when pooling data together and using falls ≥2. However, sensitivity and specificity were unbalanced. Sensitivity was higher than specificity when identifying fallers, whereas the converse was found when predicting recurrent fallers. Our results showed that NEAT and ANFIS were non-linear statistical models with the best performance criteria for the prediction of falls but their sensitivity and specificity were unbalanced, underscoring that models should be used respectively for the screening of fallers and the diagnosis of recurrent fallers. Copyright © 2015 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
Understanding Statistics - Cancer Statistics
Annual reports of U.S. cancer statistics including new cases, deaths, trends, survival, prevalence, lifetime risk, and progress toward Healthy People targets, plus statistical summaries for a number of common cancer types.
Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D
2017-01-01
If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
2012-01-01
Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence
Czech Academy of Sciences Publication Activity Database
Brabec, Marek; Eben, Kryštof; Konár, Ondřej; Krč, Pavel; Pelikán, Emil; Juruš, Pavel
2014-01-01
Roč. 11, - (2014), EMS2014-429 [EMS Annual Meeting /14./ & European Conference on Applied Climatology (ECAC) /10./. 06.10.2014-10.10.2014, Prague] Institutional support: RVO:67985807 Keywords : prediction * mathematical modeling * renewable energies Subject RIV: JE - Non-nuclear Energetics, Energy Consumption ; Use
Directory of Open Access Journals (Sweden)
Jean-Baptiste Durand
2017-06-01
Full Text Available Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistics methods. Sequences of vegetative vs. floral annual shoots (AS were observed along axes in trees belonging to five apple related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI with an autoregressive coefficient (γg efficiently predicted and classified the genotype behaviors, despite few misclassifications. Four QTLs common to BBIs and γg and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.
Durand, Jean-Baptiste; Allard, Alix; Guitton, Baptiste; van de Weg, Eric; Bink, Marco C A M; Costes, Evelyne
2017-01-01
Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistics methods. Sequences of vegetative vs. floral annual shoots (AS) were observed along axes in trees belonging to five apple related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI) with an autoregressive coefficient (γ g ) efficiently predicted and classified the genotype behaviors, despite few misclassifications. Four QTLs common to BBIs and γ g and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.
Directory of Open Access Journals (Sweden)
Shashank Vyas
2016-01-01
Full Text Available Integration of solar photovoltaic (PV generation with power distribution networks leads to many operational challenges and complexities. Unintentional islanding is one of them which is of rising concern given the steady increase in grid-connected PV power. This paper builds up on an exploratory study of unintentional islanding on a modeled radial feeder having large PV penetration. Dynamic simulations, also run in real time, resulted in exploration of unique potential causes of creation of accidental islands. The resulting voltage and current data underwent dimensionality reduction using principal component analysis (PCA which formed the basis for the application of Q statistic control charts for detecting the anomalous currents that could island the system. For reducing the false alarm rate of anomaly detection, Kullback-Leibler (K-L divergence was applied on the principal component projections which concluded that Q statistic based approach alone is not reliable for detection of the symptoms liable to cause unintentional islanding. The obtained data was labeled and a K-nearest neighbor (K-NN binomial classifier was then trained for identification and classification of potential islanding precursors from other power system transients. The three-phase short-circuit fault case was successfully identified as statistically different from islanding symptoms.
DEFF Research Database (Denmark)
Græsbøll, Kaare; Kirkeby, Carsten Thure; Nielsen, Søren Saxmose
2017-01-01
The future value of an individual dairy cow depends greatly on its projected milk yield. In developed countries with developed dairy industry infrastructures, facilities exist to record individual cow production and reproduction outcomes consistently and accurately. Accurate prediction of the fut...
Perai, A H; Nassiri Moghaddam, H; Asadpour, S; Bahrampour, J; Mansoori, Gh
2010-07-01
There has been a considerable and continuous interest to develop equations for rapid and accurate prediction of the ME of meat and bone meal. In this study, an artificial neural network (ANN), a partial least squares (PLS), and a multiple linear regression (MLR) statistical method were used to predict the TME(n) of meat and bone meal based on its CP, ether extract, and ash content. The accuracy of the models was calculated by R(2) value, MS error, mean absolute percentage error, mean absolute deviation, bias, and Theil's U. The predictive ability of an ANN was compared with a PLS and a MLR model using the same training data sets. The squared regression coefficients of prediction for the MLR, PLS, and ANN models were 0.38, 0.36, and 0.94, respectively. The results revealed that ANN produced more accurate predictions of TME(n) as compared with PLS and MLR methods. Based on the results of this study, ANN could be used as a promising approach for rapid prediction of nutritive value of meat and bone meal.
Petrovecki, Mladen; Rahelić, Dario; Bilić-Zulle, Lidija; Jelec, Vjekoslav
2003-02-01
To investigate whether and to what extent various parameters, such as individual characteristics, computer habits, situational factors, and pseudoscientific variables, influence Medical Informatics examination grade, and how inadequate statistical analysis can lead to wrong conclusions. The study included a total of 382 second-year undergraduate students at the Rijeka University School of Medicine in the period from 1996/97 to 2000/01 academic year. After passing the Medical Informatics exam, students filled out an anonymous questionnaire about their attitude toward learning medical informatics. They were asked to grade the course organization and curriculum content, and provide their date of birth; sex; study year; high school grades; Medical Informatics examination grade, type, and term; and describe their computer habits. From these data, we determined their zodiac signs and biorhythm. Data were compared by the use of t-test, one-way ANOVA with Tukey's honest significance difference test, and randomized complete block design ANOVA. Out of 21 variables analyzed, only 10 correlated with the average grade. Students taking Medical Informatics examination in the 1998/99 academic year earned lower average grade than any other generation. Significantly higher Medical Informatics exam grade was earned by students who finished a grammar high school; owned and regularly used a computer, Internet, and e-mail (pzodiac sign, zodiac sign quality, or biorhythm cycles, except when intentionally inadequate statistics was used for data analysis. Medical Informatics examination grades correlated with general learning capacity and computer habits of students, but showed no relation to other investigated parameters, such as examination term or pseudoscientific parameters. Inadequate statistical analysis can always confirm false conclusions.
DEFF Research Database (Denmark)
Græsbøll, Kaare; Kirkeby, Carsten Thure; Nielsen, Søren Saxmose
2017-01-01
of the future value of a dairy cow requires further detailed knowledge of the costs associated with feed, management practices, production systems, and disease. Here, we present a method to predict the future value of the milk production of a dairy cow based on herd recording data only. The method consists......The future value of an individual dairy cow depends greatly on its projected milk yield. In developed countries with developed dairy industry infrastructures, facilities exist to record individual cow production and reproduction outcomes consistently and accurately. Accurate prediction...... of somatic cell count. We conclude that estimates of future average production can be used on a day-to-day basis to rank cows for culling, or can be implemented in simulation models of within-herd disease spread to make operational decisions, such as culling versus treatment. An advantage of the approach...
International Nuclear Information System (INIS)
Alberts, A.C.
1995-01-01
The ability to noninvasively estimate clutch size and predict oviposition date in reptiles can be useful not only to veterinary clinicians but also to managers of captive collections and field researchers. Measurements of egg size and shape, as well as position of the clutch within the coelomic cavity, were taken from diagnostic radiographs of 20 female Cuban rock iguanas, Cyclura nubila, 81 to 18 days prior to laying. Combined with data on maternal body size, these variables were entered into multiple regression models to predict clutch size and timing of egg laying. The model for clutch size was accurate to 0.53 ± 0.08 eggs, while the model for oviposition date was accurate to 6.22 ± 0.81 days. Equations were generated that should be applicable to this and other large Cyclura species. © 1995 Wiley-Liss, Inc
Ashrafi, Mahnaz; Bahmanabadi, Akram; Akhond, Mohammad Reza; Arabipoor, Arezoo
2015-11-01
To evaluate demographic, medical history and clinical cycle characteristics of infertile non-polycystic ovary syndrome (NPCOS) women with the purpose of investigating their associations with the prevalence of moderate-to-severe OHSS. In this retrospective study, among 7073 in vitro fertilization and/or intracytoplasmic sperm injection (IVF/ICSI) cycles, 86 cases of NPCO patients who developed moderate-to-severe OHSS while being treated with IVF/ICSI cycles were analyzed during the period of January 2008 to December 2010 at Royan Institute. To review the OHSS risk factors, 172 NPCOS patients without developing OHSS, treated at the same period of time, were selected randomly by computer as control group. We used multiple logistic regression in a backward manner to build a prediction model. The regression analysis revealed that the variables, including age [odds ratio (OR) 0.9, confidence interval (CI) 0.81-0.99], antral follicles count (OR 4.3, CI 2.7-6.9), infertility cause (tubal factor, OR 11.5, CI 1.1-51.3), hypothyroidism (OR 3.8, CI 1.5-9.4) and positive history of ovarian surgery (OR 0.2, CI 0.05-0.9) were the most important predictors of OHSS. The regression model had an area under curve of 0.94, presenting an allowable discriminative performance that was equal with two strong predictive variables, including the number of follicles and serum estradiol level on human chorionic gonadotropin day. The predictive regression model based on primary characteristics of NPCOS patients had equal specificity in comparison with two mentioned strong predictive variables. Therefore, it may be beneficial to apply this model before the beginning of ovarian stimulation protocol.
Cai, Y.
2017-12-01
Accurately forecasting crop yields has broad implications for economic trading, food production monitoring, and global food security. However, the variation of environmental variables presents challenges to model yields accurately, especially when the lack of highly accurate measurements creates difficulties in creating models that can succeed across space and time. In 2016, we developed a sequence of machine-learning based models forecasting end-of-season corn yields for the US at both the county and national levels. We combined machine learning algorithms in a hierarchical way, and used an understanding of physiological processes in temporal feature selection, to achieve high precision in our intra-season forecasts, including in very anomalous seasons. During the live run, we predicted the national corn yield within 1.40% of the final USDA number as early as August. In the backtesting of the 2000-2015 period, our model predicts national yield within 2.69% of the actual yield on average already by mid-August. At the county level, our model predicts 77% of the variation in final yield using data through the beginning of August and improves to 80% by the beginning of October, with the percentage of counties predicted within 10% of the average yield increasing from 68% to 73%. Further, the lowest errors are in the most significant producing regions, resulting in very high precision national-level forecasts. In addition, we identify the changes of important variables throughout the season, specifically early-season land surface temperature, and mid-season land surface temperature and vegetation index. For the 2017 season, we feed 2016 data to the training set, together with additional geospatial data sources, aiming to make the current model even more precise. We will show how our 2017 US corn yield forecasts converges in time, which factors affect the yield the most, as well as present our plans for 2018 model adjustments.
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines were established with higher sugar and acid contents using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm(-1), 1300-1500 cm(-1), and 1500-1700 cm(-1). Principal component analysis (PCA) and subsequent partial least square-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least square regression algorithms from FT-IR spectra. The regression coefficients (R(2)) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be early detected with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Yao, Zhigang; Xue, Zuo; He, Ruoying; Bao, Xianwen; Song, Jun
2016-08-01
A multivariate statistical downscaling method is developed to produce regional, high-resolution, coastal surface wind fields based on the IPCC global model predictions for the U.S. east coastal ocean, the Gulf of Mexico (GOM), and the Caribbean Sea. The statistical relationship is built upon linear regressions between the empirical orthogonal function (EOF) spaces of a cross- calibrated, multi-platform, multi-instrument ocean surface wind velocity dataset (predictand) and the global NCEP wind reanalysis (predictor) over a 10 year period from 2000 to 2009. The statistical relationship is validated before applications and its effectiveness is confirmed by the good agreement between downscaled wind fields based on the NCEP reanalysis and in-situ surface wind measured at 16 National Data Buoy Center (NDBC) buoys in the U.S. east coastal ocean and the GOM during 1992-1999. The predictand-predictor relationship is applied to IPCC GFDL model output (2.0°×2.5°) of downscaled coastal wind at 0.25°×0.25° resolution. The temporal and spatial variability of future predicted wind speeds and wind energy potential over the study region are further quantified. It is shown that wind speed and power would significantly be reduced in the high CO2 climate scenario offshore of the mid-Atlantic and northeast U.S., with the speed falling to one quarter of its original value.
Directory of Open Access Journals (Sweden)
J. Soete
2017-01-01
Full Text Available Microcomputed tomography (μCT and Lattice Boltzmann Method (LBM simulations were applied to continental carbonates to quantify fluid flow. Fluid flow characteristics in these complex carbonates with multiscale pore networks are unique and the applied method allows studying their heterogeneity and anisotropy. 3D pore network models were introduced to single-phase flow simulations in Palabos, a software tool for particle-based modelling of classic computational fluid dynamics. In addition, permeability simulations were also performed on rock models generated with multiple-point geostatistics (MPS. This allowed assessing the applicability of MPS in upscaling high-resolution porosity patterns into large rock models that exceed the volume limitations of the μCT. Porosity and tortuosity control fluid flow in these porous media. Micro- and mesopores influence flow properties at larger scales in continental carbonates. Upscaling with MPS is therefore necessary to overcome volume-resolution problems of CT scanning equipment. The presented LBM-MPS workflow is applicable to other lithologies, comprising different pore types, shapes, and pore networks altogether. The lack of straightforward porosity-permeability relationships in complex carbonates highlights the necessity for a 3D approach. 3D fluid flow studies provide the best understanding of flow through porous media, which is of crucial importance in reservoir modelling.
Possemiers, Mathias; Huysmans, Marijke; Batelaan, Okke
2015-08-01
Adequate aquifer characterization and simulation using heat transport models are indispensible for determining the optimal design for aquifer thermal energy storage (ATES) systems and wells. Recent model studies indicate that meter-scale heterogeneities in the hydraulic conductivity field introduce a considerable uncertainty in the distribution of thermal energy around an ATES system and can lead to a reduction in the thermal recoverability. In a study site in Bierbeek, Belgium, the influence of centimeter-scale clay drapes on the efficiency of a doublet ATES system and the distribution of the thermal energy around the ATES wells are quantified. Multiple-point geostatistical simulation of edge properties is used to incorporate the clay drapes in the models. The results show that clay drapes have an influence both on the distribution of thermal energy in the subsurface and on the efficiency of the ATES system. The distribution of the thermal energy is determined by the strike of the clay drapes, with the major axis of anisotropy parallel to the clay drape strike. The clay drapes have a negative impact (3.3-3.6 %) on the energy output in the models without a hydraulic gradient. In the models with a hydraulic gradient, however, the presence of clay drapes has a positive influence (1.6-10.2 %) on the energy output of the ATES system. It is concluded that it is important to incorporate small-scale heterogeneities in heat transport models to get a better estimate on ATES efficiency and distribution of thermal energy.
Possemiers, Mathias; Huysmans, Marijke; Batelaan, Okke
2015-04-01
Adequate aquifer characterization and simulation using heat transport models are indispensible for determining the optimal design for Aquifer Thermal Energy Storage (ATES) systems and wells. Recent model studies indicate that meter scale heterogeneities in the hydraulic conductivity field introduce a considerable uncertainty in the distribution of thermal energy around an ATES system and can lead to a reduction in the thermal recoverability. In this paper, the influence of centimeter scale clay drapes on the efficiency of a doublet ATES system and the distribution of the thermal energy around the ATES wells are quantified. Multiple-point geostatistical simulation of edge properties is used to incorporate the clay drapes in the models. The results show that clay drapes have an influence both on the distribution of thermal energy in the subsurface and on the efficiency of the ATES system. The distribution of the thermal energy is determined by the strike of the clay drapes, with the major axis of anisotropy parallel to the clay drape strike. The clay drapes have a negative impact (3.3 - 3.6%) on the energy output in the models without a hydraulic gradient. In the models with a hydraulic gradient, however, the presence of clay drapes has a positive influence (1.6 - 10.2%) on the energy output of the ATES system. It is concluded that it is important to incorporate small scale heterogeneities in heat transport models to get a better estimate on ATES efficiency and distribution of thermal energy.
International Nuclear Information System (INIS)
Silva Junior, H.C. da.
1978-12-01
Reactor fuel elements generally consist of rod bundles with the coolant flowing axially through the region between the rods. The confiability of the thermohydraulic design of such elements is related to a detailed description of the velocity field. A two-equation statistical model (K-epsilon) of turbulence is applied to compute main and secondary flow fields, wall shear stress distributions and friction factors of steady, fully developed turbulent flows, with incompressible, temperature independent fluid flowing axially through triangular or square arrays of rod bundles. The numerical procedure uses the vorticity and the stream function to describe the velocity field. Comparison with experimental and analytical data of several investigators is presented. Results are in good agreement. (Author) [pt
Directory of Open Access Journals (Sweden)
Lapierre FabianD
2010-01-01
Full Text Available Abstract For locating maritime vessels longer than 45 meters, such vessels are required to set up an Automatic Identification System (AIS used by vessel traffic services. However, when a boat is shutting down its AIS, there are no means to detect it in open sea. In this paper, we use Electro-Optical (EO imagers for noncooperative vessel detection when the AIS is not operational. As compared to radar sensors, EO sensors have lower cost, lower payload, and better computational processing load. EO sensors are mounted on LEO microsatellites. We propose a real-time statistical methodology to estimate sensor Receiver Operating Characteristic (ROC curves. It does not require the computation of the entire image received at the sensor. We then illustrate the use of this methodology to design a simple simulator that can help sensor manufacturers in optimizing the design of EO sensors for maritime applications.
Energy Technology Data Exchange (ETDEWEB)
Grimm, Lars J., E-mail: Lars.grimm@duke.edu; Ghate, Sujata V.; Yoon, Sora C.; Kim, Connie [Department of Radiology, Duke University Medical Center, Box 3808, Durham, North Carolina 27710 (United States); Kuzmiak, Cherie M. [Department of Radiology, University of North Carolina School of Medicine, 2006 Old Clinic, CB No. 7510, Chapel Hill, North Carolina 27599 (United States); Mazurowski, Maciej A. [Duke University Medical Center, Box 2731 Medical Center, Durham, North Carolina 27710 (United States)
2014-03-15
Purpose: The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Methods: Ten radiology trainees and three expert breast imagers reviewed 100 mammograms comprised of bilateral medial lateral oblique and craniocaudal views on a research workstation. The cases consisted of normal and biopsy proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using area under the receive operating characteristic curves (AUC). Results: Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502–0.739, 95% Confidence Interval: 0.543–0.680,p < 0.002). Conclusions: Patterns in detection errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees.
Grimm, Lars J; Ghate, Sujata V; Yoon, Sora C; Kuzmiak, Cherie M; Kim, Connie; Mazurowski, Maciej A
2014-03-01
The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Ten radiology trainees and three expert breast imagers reviewed 100 mammograms comprised of bilateral medial lateral oblique and craniocaudal views on a research workstation. The cases consisted of normal and biopsy proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using area under the receive operating characteristic curves (AUC). Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502-0.739, 95% Confidence Interval: 0.543-0.680,p errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees.
International Nuclear Information System (INIS)
Grimm, Lars J.; Ghate, Sujata V.; Yoon, Sora C.; Kim, Connie; Kuzmiak, Cherie M.; Mazurowski, Maciej A.
2014-01-01
Purpose: The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Methods: Ten radiology trainees and three expert breast imagers reviewed 100 mammograms comprised of bilateral medial lateral oblique and craniocaudal views on a research workstation. The cases consisted of normal and biopsy proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using area under the receive operating characteristic curves (AUC). Results: Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502–0.739, 95% Confidence Interval: 0.543–0.680,p < 0.002). Conclusions: Patterns in detection errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees
Directory of Open Access Journals (Sweden)
Mónica F. Díaz
2012-12-01
Full Text Available Volatile organic compounds (VOCs are contained in a variety of chemicals that can be found in household products and may have undesirable effects on health. Thereby, it is important to model blood-to-liver partition coefficients (log Pliver for VOCs in a fast and inexpensive way. In this paper, we present two new quantitative structure-property relationship (QSPR models for the prediction of log Pliver, where we also propose a hybrid approach for the selection of the descriptors. This hybrid methodology combines a machine learning method with a manual selection based on expert knowledge. This allows obtaining a set of descriptors that is interpretable in physicochemical terms. Our regression models were trained using decision trees and neural networks and validated using an external test set. Results show high prediction accuracy compared to previous log Pliver models, and the descriptor selection approach provides a means to get a small set of descriptors that is in agreement with theoretical understanding of the target property.
Directory of Open Access Journals (Sweden)
Kathleen M O'Reilly
2011-10-01
Full Text Available Outbreaks of poliomyelitis in African countries that were previously free of wild-type poliovirus cost the Global Polio Eradication Initiative US$850 million during 2003-2009, and have limited the ability of the program to focus on endemic countries. A quantitative understanding of the factors that predict the distribution and timing of outbreaks will enable their prevention and facilitate the completion of global eradication.Children with poliomyelitis in Africa from 1 January 2003 to 31 December 2010 were identified through routine surveillance of cases of acute flaccid paralysis, and separate outbreaks associated with importation of wild-type poliovirus were defined using the genetic relatedness of these viruses in the VP1/2A region. Potential explanatory variables were examined for their association with the number, size, and duration of poliomyelitis outbreaks in 6-mo periods using multivariable regression analysis. The predictive ability of 6-mo-ahead forecasts of poliomyelitis outbreaks in each country based on the regression model was assessed. A total of 142 genetically distinct outbreaks of poliomyelitis were recorded in 25 African countries, resulting in 1-228 cases (median of two cases. The estimated number of people arriving from infected countries and <5-y childhood mortality were independently associated with the number of outbreaks. Immunisation coverage based on the reported vaccination history of children with non-polio acute flaccid paralysis was associated with the duration and size of each outbreak, as well as the number of outbreaks. Six-month-ahead forecasts of the number of outbreaks in a country or region changed over time and had a predictive ability of 82%.Outbreaks of poliomyelitis resulted primarily from continued transmission in Nigeria and the poor immunisation status of populations in neighbouring countries. From 1 January 2010 to 30 June 2011, reduced transmission in Nigeria and increased incidence in reinfected
Myung-Hee, Y. Kim; Shaowen, Hu; Cucinotta, Francis A.
2009-01-01
Large solar particle events (SPEs) present significant acute radiation risks to the crew members during extra-vehicular activities (EVAs) or in lightly shielded space vehicles for space missions beyond the protection of the Earth's magnetic field. Acute radiation sickness (ARS) can impair performance and result in failure of the mission. Improved forecasting capability and/or early-warning systems and proper shielding solutions are required to stay within NASA's short-term dose limits. Exactly how to make use of observations of SPEs for predicting occurrence and size is a great challenge, because SPE occurrences themselves are random in nature even though the expected frequency of SPEs is strongly influenced by the time position within the solar activity cycle. Therefore, we developed a probabilistic model approach, where a cumulative expected occurrence curve of SPEs for a typical solar cycle was formed from a non-homogeneous Poisson process model fitted to a database of proton fluence measurements of SPEs that occurred during the past 5 solar cycles (19 - 23) and those of large SPEs identified from impulsive nitrate enhancements in polar ice. From the fitted model, the expected frequency of SPEs was estimated at any given proton fluence threshold (Phi(sub E)) with energy (E) >30 MeV during a defined space mission period. Corresponding Phi(sub E) (E=30, 60, and 100 MeV) fluence distributions were simulated with a random draw from a gamma distribution, and applied for SPE ARS risk analysis for a specific mission period. It has been found that the accurate prediction of deep-seated organ doses was more precisely predicted at high energies, Phi(sub 100), than at lower energies such as Phi(sub 30) or Phi(sub 60), because of the high penetration depth of high energy protons. Estimates of ARS are then described for 90th and 95th percentile events for several mission lengths and for several likely organ dose-rates. The ability to accurately measure high energy protons
International Nuclear Information System (INIS)
Jauregui Ch, V.
2013-01-01
In this work the obtained results for a statistical analysis are shown, with the purpose of studying the performance of the fuel lattice, taking into account the frequency of the pins that were used. For this objective, different statistical distributions were used; one approximately to normal, another type X 2 but in an inverse form and a random distribution. Also, the prediction of some parameters of the nuclear reactor in a fuel reload was made through a neuronal network, which was trained. The statistical analysis was made using the parameters of the fuel lattice, which was generated through three heuristic techniques: Ant Colony Optimization System, Neuronal Networks and a hybrid among Scatter Search and Path Re linking. The behavior of the local power peak factor was revised in the fuel lattice with the use of different frequencies of enrichment uranium pines, using the three techniques mentioned before, in the same way the infinite multiplication factor of neutrons was analyzed (k..), to determine within what range this factor in the reactor is. Taking into account all the information, which was obtained through the statistical analysis, a neuronal network was trained; that will help to predict the behavior of some parameters of the nuclear reactor, considering a fixed fuel reload with their respective control rods pattern. In the same way, the quality of the training was evaluated using different fuel lattices. The neuronal network learned to predict the next parameters: Shutdown Margin (SDM), the pin burn peaks for two different fuel batches, Thermal Limits and the Effective Neutron Multiplication Factor (k eff ). The results show that the fuel lattices in which the frequency, which the inverted form of the X 2 distribution, was used revealed the best values of local power peak factor. Additionally it is shown that the performance of a fuel lattice could be enhanced controlling the frequency of the uranium enrichment rods and the variety of the gadolinium
Barman, S.; Bhattacharjya, R. K.
2017-12-01
The River Subansiri is the major north bank tributary of river Brahmaputra. It originates from the range of Himalayas beyond the Great Himalayan range at an altitude of approximately 5340m. Subansiri basin extends from tropical to temperate zones and hence exhibits a great diversity in rainfall characteristics. In the Northern and Central Himalayan tracts, precipitation is scarce on account of high altitudes. On the other hand, Southeast part of the Subansiri basin comprising the sub-Himalayan and the plain tract in Arunachal Pradesh and Assam, lies in the tropics. Due to Northeast as well as Southwest monsoon, precipitation occurs in this region in abundant quantities. Particularly, Southwest monsoon causes very heavy precipitation in the entire Subansiri basin during May to October. In this study, the rainfall over Subansiri basin has been studied at 24 different locations by multiple linear and non-linear regression based statistical downscaling techniques and by Artificial Neural Network based model. APHRODITE's gridded rainfall data of 0.25˚ x 0.25˚ resolutions and climatic parameters of HadCM3 GCM of resolution 2.5˚ x 3.75˚ (latitude by longitude) have been used in this study. It has been found that multiple non-linear regression based statistical downscaling technique outperformed the other techniques. Using this method, the future rainfall pattern over the Subansiri basin has been analyzed up to the year 2099 for four different time periods, viz., 2020-39, 2040-59, 2060-79, and 2080-99 at all the 24 locations. On the basis of historical rainfall, the months have been categorized as wet months, months with moderate rainfall and dry months. The spatial changes in rainfall patterns for all these three types of months have also been analyzed over the basin. Potential decrease of rainfall in the wet months and months with moderate rainfall and increase of rainfall in the dry months are observed for the future rainfall pattern of the Subansiri basin.
Chakraborty, Prithviraj; Parcha, Versha; Chakraborty, Debarupa D; Ghosh, Amitava
2016-01-01
Buccoadhesive wafer dosage form containing Loratadine is formulated utilizing Formulation by Design (FbD) approach incorporating sodium alginate and lactose monohydrate as independent variable employing solvent casting method. The wafers were statistically optimized using Response Surface Methodology (RSM) and Artificial Neural Network algorithm (ANN) for predicting physicochemical and physico-mechanical properties of the wafers as responses. Morphologically wafers were tested using SEM. Quick disintegration of the samples was examined employing Optical Contact Angle (OCA). The comparison of the predictability of RSM and ANN showed a high prognostic capacity of RSM model over ANN model in forecasting mechanical and physicochemical properties of the wafers. The in vivo assessment of the optimized buccoadhesive wafer exhibits marked increase in bioavailability justifying the administration of Loratadine through buccal route, bypassing hepatic first pass metabolism.
Energy Technology Data Exchange (ETDEWEB)
Tran, A; Yu, V; Nguyen, D; Woods, K; Low, D; Sheng, K [UCLA, Los Angeles, CA (United States)
2015-06-15
Purpose: Knowledge learned from previous plans can be used to guide future treatment planning. Existing knowledge-based treatment planning methods study the correlation between organ geometry and dose volume histogram (DVH), which is a lossy representation of the complete dose distribution. A statistical voxel dose learning (SVDL) model was developed that includes the complete dose volume information. Its accuracy of predicting volumetric-modulated arc therapy (VMAT) and non-coplanar 4π radiotherapy was quantified. SVDL provided more isotropic dose gradients and may improve knowledge-based planning. Methods: 12 prostate SBRT patients originally treated using two full-arc VMAT techniques were re-planned with 4π using 20 intensity-modulated non-coplanar fields to a prescription dose of 40 Gy. The bladder and rectum voxels were binned based on their distances to the PTV. The dose distribution in each bin was resampled by convolving to a Gaussian kernel, resulting in 1000 data points in each bin that predicted the statistical dose information of a voxel with unknown dose in a new patient without triaging information that may be collectively important to a particular patient. We used this method to predict the DVHs, mean and max doses in a leave-one-out cross validation (LOOCV) test and compared its performance against lossy estimators including mean, median, mode, Poisson and Rayleigh of the voxelized dose distributions. Results: SVDL predicted the bladder and rectum doses more accurately than other estimators, giving mean percentile errors ranging from 13.35–19.46%, 4.81–19.47%, 22.49–28.69%, 23.35–30.5%, 21.05–53.93% for predicting mean, max dose, V20, V35, and V40 respectively, to OARs in both planning techniques. The prediction errors were generally lower for 4π than VMAT. Conclusion: By employing all dose volume information in the SVDL model, the OAR doses were more accurately predicted. 4π plans are better suited for knowledge-based planning than
International Nuclear Information System (INIS)
Salmahaminati; Yahya, Amri; Husnaqilati, Atina
2017-01-01
Trash management is one of the society participation to have a good hygiene for each area or nationally. Trash is known as the remainder of regular consumption that should be disposed to do waste processing which will be beneficial and improve the hygiene. The way to do is by sorting plastic which is processed into goods in accordance with the waste. In this study, we will know what are the factors that affect the desire of citizens to process the waste. The factors would have the identity and the state of being of each resident, having known of these factors will be the education about waste management, so it can be compared how the results of the extension by using preliminary data prior to the extension and the final data after extension. The analysis uses multiple logistic regression is the identify factors that influence people’s to desire the waste while the comparison results using t analysis. Data is derived from statistical instrument in the form of a questionnaire. (paper)
Kwon, Hyun-Han; Lall, Upmanu; Engel, Vic
2011-09-01
The ability to map relationships between ecological outcomes and hydrologic conditions in the Everglades National Park (ENP) is a key building block for their restoration program, a primary goal of which is to improve conditions for wading birds. This paper presents a model linking wading bird foraging numbers to hydrologic conditions in the ENP. Seasonal hydrologic statistics derived from a single water level recorder are well correlated with water depths throughout most areas of the ENP, and are effective as predictors of wading bird numbers when using a nonlinear hierarchical Bayesian model to estimate the conditional distribution of bird populations. Model parameters are estimated using a Markov chain Monte Carlo (MCMC) procedure. Parameter and model uncertainty is assessed as a byproduct of the estimation process. Water depths at the beginning of the nesting season, the average dry season water level, and the numbers of reversals from the dry season recession are identified as significant predictors, consistent with the hydrologic conditions considered important in the production and concentration of prey organisms in this system. Long-term hydrologic records at the index location allow for a retrospective analysis (1952-2006) of foraging bird numbers showing low frequency oscillations in response to decadal fluctuations in hydroclimatic conditions. Simulations of water levels at the index location used in the Bayesian model under alternative water management scenarios allow the posterior probability distributions of the number of foraging birds to be compared, thus providing a mechanism for linking management schemes to seasonal rainfall forecasts.
International Nuclear Information System (INIS)
Choi, K.Y.; Yoon, Y.K.; Chang, S.H.
1991-01-01
This paper reports on a new statistical fuel failure model developed to take into account the effects of damaging environmental conditions and the overall operating history of the fuel elements. The degradation of material properties and damage resistance of the fuel cladding is mainly caused by the combined effects of accumulated dynamic stresses, neutron irradiation, and chemical and stress corrosion at operating temperature. Since the degradation of material properties due to these effects can be considered as a stochastic process, a dynamic reliability function is derived based on the Markov process. Four damage parameters, namely, dynamic stresses, magnitude of power increase from the preceding power level and with ramp rate, and fatigue cycles, are used to build this model. The dynamic reliability function and damage parameters are used to obtain effective damage parameters. The entropy maximization principle is used to generate a probability density function of the effective damage parameters. The entropy minimization principle is applied to determine weighting factors for amalgamation of the failure probabilities due to the respective failure modes. In this way, the effects of operating history, damaging environmental conditions, and damage sequence are taken into account
Energy Technology Data Exchange (ETDEWEB)
Mishra, Srikanta; Schuetter, Jared
2014-11-01
We compare two approaches for building a statistical proxy model (metamodel) for CO₂ geologic sequestration from the results of full-physics compositional simulations. The first approach involves a classical Box-Behnken or Augmented Pairs experimental design with a quadratic polynomial response surface. The second approach used a space-filling maxmin Latin Hypercube sampling or maximum entropy design with the choice of five different meta-modeling techniques: quadratic polynomial, kriging with constant and quadratic trend terms, multivariate adaptive regression spline (MARS) and additivity and variance stabilization (AVAS). Simulations results for CO₂ injection into a reservoir-caprock system with 9 design variables (and 97 samples) were used to generate the data for developing the proxy models. The fitted models were validated with using an independent data set and a cross-validation approach for three different performance metrics: total storage efficiency, CO₂ plume radius and average reservoir pressure. The Box-Behnken–quadratic polynomial metamodel performed the best, followed closely by the maximin LHS–kriging metamodel.
Stergiopoulos, Ch.; Stavrakas, I.; Triantis, D.; Vallianatos, F.; Stonham, J.
2015-02-01
Weak electric signals termed as 'Pressure Stimulated Currents, PSC' are generated and detected while cement based materials are found under mechanical load, related to the creation of cracks and the consequent evolution of cracks' network in the bulk of the specimen. During the experiment a set of cement mortar beams of rectangular cross-section were subjected to Three-Point Bending (3PB). For each one of the specimens an abrupt mechanical load step was applied, increased from the low load level (Lo) to a high final value (Lh) , where Lh was different for each specimen and it was maintained constant for long time. The temporal behavior of the recorded PSC show that during the load increase a spike-like PSC emission was recorded and consequently a relaxation of the PSC, after reaching its final value, follows. The relaxation process of the PSC was studied using non-extensive statistical physics (NESP) based on Tsallis entropy equation. The behavior of the Tsallis q parameter was studied in relaxation PSCs in order to investigate its potential use as an index for monitoring the crack evolution process with a potential use in non-destructive laboratory testing of cement-based specimens of unknown internal damage level. The dependence of the q-parameter on the Lh (when Lh <0.8Lf), where Lf represents the 3PB strength of the specimen, shows an increase on the q value when the specimens are subjected to gradually higher bending loadings and reaches a maximum value close to 1.4 when the applied Lh becomes higher than 0.8Lf. While the applied Lh becomes higher than 0.9Lf the value of the q-parameter gradually decreases. This analysis of the experimental data manifests that the value of the entropic index q obtains a characteristic decrease while reaching the ultimate strength of the specimen, and thus could be used as a forerunner of the expected failure.
Directory of Open Access Journals (Sweden)
Gambo Anthony VICTOR
2017-06-01
Full Text Available A model to predict the dry sliding wear behaviour of Aluminium-Jute bast ash particulate composites produced by double stir-casting method was developed in terms of weight fraction of jute bast ash (JBA. Experiments were designed on the basis of the Design of Experiments (DOE technique. A 2k factorial, where k is the number of variables, with central composite second-order rotatable design was used to improve the reliability of results and to reduce the size of experimentation without loss of accuracy. The factors considered in this study were sliding velocity, sliding distance, normal load and mass fraction of JBA reinforcement in the matrix. The developed regression model was validated by statistical software MINITAB-R14 and statistical tool such as analysis of variance (ANOVA. It was found that the developed regression model could be effectively used to predict the wear rate at 95% confidence level. The wear rate of cast Al-JBAp composite decreased with an increase in the mass fraction of JBA and increased with an increase of the sliding velocity, sliding distance and normal load acting on the composite specimen.
Farmer, William H.; Knight, Rodney R.; Eash, David A.; Kasey J. Hutchinson,; Linhart, S. Mike; Christiansen, Daniel E.; Archfield, Stacey A.; Over, Thomas M.; Kiang, Julie E.
2015-08-24
Daily records of streamflow are essential to understanding hydrologic systems and managing the interactions between human and natural systems. Many watersheds and locations lack streamgages to provide accurate and reliable records of daily streamflow. In such ungaged watersheds, statistical tools and rainfall-runoff models are used to estimate daily streamflow. Previous work compared 19 different techniques for predicting daily streamflow records in the southeastern United States. Here, five of the better-performing methods are compared in a different hydroclimatic region of the United States, in Iowa. The methods fall into three classes: (1) drainage-area ratio methods, (2) nonlinear spatial interpolations using flow duration curves, and (3) mechanistic rainfall-runoff models. The first two classes are each applied with nearest-neighbor and map-correlated index streamgages. Using a threefold validation and robust rank-based evaluation, the methods are assessed for overall goodness of fit of the hydrograph of daily streamflow, the ability to reproduce a daily, no-fail storage-yield curve, and the ability to reproduce key streamflow statistics. As in the Southeast study, a nonlinear spatial interpolation of daily streamflow using flow duration curves is found to be a method with the best predictive accuracy. Comparisons with previous work in Iowa show that the accuracy of mechanistic models with at-site calibration is substantially degraded in the ungaged framework.
Energy Technology Data Exchange (ETDEWEB)
Curry, Charles L. [Canadian Centre for Climate Modelling and Analysis, Environment Canada, University of Victoria, Victoria, BC (Canada); School of Earth and Ocean Sciences, University of Victoria, Victoria, BC (Canada); Kamp, Derek van der [School of Earth and Ocean Sciences, University of Victoria, Victoria, BC (Canada); Pacific Climate Impacts Consortium, University of Victoria, Victoria, BC (Canada); Monahan, Adam H. [School of Earth and Ocean Sciences, University of Victoria, Victoria, BC (Canada)
2012-04-15
Surface wind speed is a key climatic variable of interest in many applications, including assessments of storm-related infrastructure damage and feasibility studies of wind power generation. In this work and a companion paper (van der Kamp et al. 2011), the relationship between local surface wind and large-scale climate variables was studied using multiple regression analysis. The analysis was performed using monthly mean station data from British Columbia, Canada and large-scale climate variables (predictors) from the NCEP-2 reanalysis over the period 1979-2006. Two regression-based methodologies were compared. The first relates the annual cycle of station wind speed to that of the large-scale predictors at the closest grid box to the station. It is shown that the relatively high correlation coefficients obtained with this method are attributable to the dominant influence of region-wide seasonality, and thus contain minimal information about local wind behaviour at the stations. The second method uses interannually varying data for individual months, aggregated into seasons, and is demonstrated to contain intrinsically local information about the surface winds. The dependence of local wind speed upon large-scale predictors over a much larger region surrounding the station was also explored, resulting in 2D maps of spatial correlations. The cross-validated explained variance using the interannual method was highest in autumn and winter, ranging from 30 to 70% at about a dozen stations in the region. Reasons for the limited predictive skill of the regressions and directions for future progress are reviewed. (orig.)
Dėdelė, Audrius; Miškinytė, Auksė
2015-09-01
In many countries, road traffic is one of the main sources of air pollution associated with adverse effects on human health and environment. Nitrogen dioxide (NO2) is considered to be a measure of traffic-related air pollution, with concentrations tending to be higher near highways, along busy roads, and in the city centers, and the exceedances are mainly observed at measurement stations located close to traffic. In order to assess the air quality in the city and the air pollution impact on public health, air quality models are used. However, firstly, before the model can be used for these purposes, it is important to evaluate the accuracy of the dispersion modelling as one of the most widely used method. The monitoring and dispersion modelling are two components of air quality monitoring system (AQMS), in which statistical comparison was made in this research. The evaluation of the Atmospheric Dispersion Modelling System (ADMS-Urban) was made by comparing monthly modelled NO2 concentrations with the data of continuous air quality monitoring stations in Kaunas city. The statistical measures of model performance were calculated for annual and monthly concentrations of NO2 for each monitoring station site. The spatial analysis was made using geographic information systems (GIS). The calculation of statistical parameters indicated a good ADMS-Urban model performance for the prediction of NO2. The results of this study showed that the agreement of modelled values and observations was better for traffic monitoring stations compared to the background and residential stations.
Energy Technology Data Exchange (ETDEWEB)
Guo, Boyun [Univ. of Louisiana, Lafayette, LA (United States); Duguid, Andrew [Battelle, Columbus, OH (United States); Nygaard, Ronar [Missouri Univ. of Science and Technology, Rolla, MO (United States)
2017-08-05
The objective of this project is to develop a computerized statistical model with the Integrated Neural-Genetic Algorithm (INGA) for predicting the probability of long-term leak of wells in CO_{2} sequestration operations. This object has been accomplished by conducting research in three phases: 1) data mining of CO_{2}-explosed wells, 2) INGA computer model development, and 3) evaluation of the predictive performance of the computer model with data from field tests. Data mining was conducted for 510 wells in two CO_{2} sequestration projects in the Texas Gulf Coast region. They are the Hasting West field and Oyster Bayou field in the Southern Texas. Missing wellbore integrity data were estimated using an analytical and Finite Element Method (FEM) model. The INGA was first tested for performances of convergence and computing efficiency with the obtained data set of high dimension. It was concluded that the INGA can handle the gathered data set with good accuracy and reasonable computing time after a reduction of dimension with a grouping mechanism. A computerized statistical model with the INGA was then developed based on data pre-processing and grouping. Comprehensive training and testing of the model were carried out to ensure that the model is accurate and efficient enough for predicting the probability of long-term leak of wells in CO_{2} sequestration operations. The Cranfield in the southern Mississippi was select as the test site. Observation wells CFU31F2 and CFU31F3 were used for pressure-testing, formation-logging, and cement-sampling. Tools run in the wells include Isolation Scanner, Slim Cement Mapping Tool (SCMT), Cased Hole Formation Dynamics Tester (CHDT), and Mechanical Sidewall Coring Tool (MSCT). Analyses of the obtained data indicate no leak of CO_{2} cross the cap zone while it is evident that the well cement sheath was invaded by the CO_{2} from the storage zone. This observation is consistent
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
2016-08-31
Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computional expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale {\\em sequential} data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.
Karaismailoğlu, Eda; Dikmen, Zeliha Günnur; Akbıyık, Filiz; Karaağaoğlu, Ahmet Ergun
2018-04-30
Background/aim: Myoglobin, cardiac troponin T, B-type natriuretic peptide (BNP), and creatine kinase isoenzyme MB (CK-MB) are frequently used biomarkers for evaluating risk of patients admitted to an emergency department with chest pain. Recently, time- dependent receiver operating characteristic (ROC) analysis has been used to evaluate the predictive power of biomarkers where disease status can change over time. We aimed to determine the best set of biomarkers that estimate cardiac death during follow-up time. We also obtained optimal cut-off values of these biomarkers, which differentiates between patients with and without risk of death. A web tool was developed to estimate time intervals in risk. Materials and methods: A total of 410 patients admitted to the emergency department with chest pain and shortness of breath were included. Cox regression analysis was used to determine an optimal set of biomarkers that can be used for estimating cardiac death and to combine the significant biomarkers. Time-dependent ROC analysis was performed for evaluating performances of significant biomarkers and a combined biomarker during 240 h. The bootstrap method was used to compare statistical significance and the Youden index was used to determine optimal cut-off values. Results : Myoglobin and BNP were significant by multivariate Cox regression analysis. Areas under the time-dependent ROC curves of myoglobin and BNP were about 0.80 during 240 h, and that of the combined biomarker (myoglobin + BNP) increased to 0.90 during the first 180 h. Conclusion: Although myoglobin is not clinically specific to a cardiac event, in our study both myoglobin and BNP were found to be statistically significant for estimating cardiac death. Using this combined biomarker may increase the power of prediction. Our web tool can be useful for evaluating the risk status of new patients and helping clinicians in making decisions.
International Nuclear Information System (INIS)
Lim, Gyeong Hui
2008-03-01
This book consists of 15 chapters, which are basic conception and meaning of statistical thermodynamics, Maxwell-Boltzmann's statistics, ensemble, thermodynamics function and fluctuation, statistical dynamics with independent particle system, ideal molecular system, chemical equilibrium and chemical reaction rate in ideal gas mixture, classical statistical thermodynamics, ideal lattice model, lattice statistics and nonideal lattice model, imperfect gas theory on liquid, theory on solution, statistical thermodynamics of interface, statistical thermodynamics of a high molecule system and quantum statistics
Energy Technology Data Exchange (ETDEWEB)
Alfonso, L. [Universidad Autonoma de la Ciudad de Mexico, Mexico D.F. 09790 (Mexico); Caleyo, F.; Hallen, J. M.; Araujo, J. [ESIQIE, Instituto Politecnico Nacional, Mexico D.F. (Mexico)
2010-07-01
Pitting corrosion entails serious risks in industrial plants, since a perforation resulting from a single pit can cause the failure of in-service components like water pipes, heat exchangers or oil tanks. A number of statistical methods have been suggested to estimate the maximum pit depth. Over the years, a successful application of extreme value analysis has been found in the application of the Gumbel distribution to predict the maximum pit depth from a smaller number of samples with small area. There is a lack of studies devoted to the applicability of the Gumbel method to the prediction of maximum pitting-corrosion depth. The aim of the work presented in this paper is to introduce a new strategy for the application of the Gumbel method in real pipelines. The methodology proposed is based on the fact that the clustered pattern of the pit depth distribution is less pronounced when the analysis is restricted to sections of the pipeline that exhibits similar characteristics.
Kafle, Gopi Krishna; Chen, Lide
2016-02-01
There is a lack of literature reporting the methane potential of several livestock manures under the same anaerobic digestion conditions (same inoculum, temperature, time, and size of the digester). To the best of our knowledge, no previous study has reported biochemical methane potential (BMP) predicting models developed and evaluated by solely using at least five different livestock manure tests results. The goal of this study was to evaluate the BMP of five different livestock manures (dairy manure (DM), horse manure (HM), goat manure (GM), chicken manure (CM) and swine manure (SM)) and to predict the BMP using different statistical models. Nutrients of the digested different manures were also monitored. The BMP tests were conducted under mesophilic temperatures with a manure loading factor of 3.5g volatile solids (VS)/L and a feed to inoculum ratio (F/I) of 0.5. Single variable and multiple variable regression models were developed using manure total carbohydrate (TC), crude protein (CP), total fat (TF), lignin (LIG) and acid detergent fiber (ADF), and measured BMP data. Three different kinetic models (first order kinetic model, modified Gompertz model and Chen and Hashimoto model) were evaluated for BMP predictions. The BMPs of DM, HM, GM, CM and SM were measured to be 204, 155, 159, 259, and 323mL/g VS, respectively and the VS removals were calculated to be 58.6%, 52.9%, 46.4%, 81.4%, 81.4%, respectively. The technical digestion time (T80-90, time required to produce 80-90% of total biogas production) for DM, HM, GM, CM and SM was calculated to be in the ranges of 19-28, 27-37, 31-44, 13-18, 12-17days, respectively. The effluents from the HM showed the lowest nitrogen, phosphorus and potassium concentrations. The effluents from the CM digesters showed highest nitrogen and phosphorus concentrations and digested SM showed highest potassium concentration. Based on the results of the regression analysis, the model using the variable of LIG showed the best (R(2
Kourgialas, Nektarios N; Dokou, Zoi; Karatzas, George P
2015-05-01
The purpose of this study was to create a modeling management tool for the simulation of extreme flow events under current and future climatic conditions. This tool is a combination of different components and can be applied in complex hydrogeological river basins, where frequent flood and drought phenomena occur. The first component is the statistical analysis of the available hydro-meteorological data. Specifically, principal components analysis was performed in order to quantify the importance of the hydro-meteorological parameters that affect the generation of extreme events. The second component is a prediction-forecasting artificial neural network (ANN) model that simulates, accurately and efficiently, river flow on an hourly basis. This model is based on a methodology that attempts to resolve a very difficult problem related to the accurate estimation of extreme flows. For this purpose, the available measurements (5 years of hourly data) were divided in two subsets: one for the dry and one for the wet periods of the hydrological year. This way, two ANNs were created, trained, tested and validated for a complex Mediterranean river basin in Crete, Greece. As part of the second management component a statistical downscaling tool was used for the creation of meteorological data according to the higher and lower emission climate change scenarios A2 and B1. These data are used as input in the ANN for the forecasting of river flow for the next two decades. The final component is the application of a meteorological index on the measured and forecasted precipitation and flow data, in order to assess the severity and duration of extreme events. Copyright © 2015 Elsevier Ltd. All rights reserved.
Energy Technology Data Exchange (ETDEWEB)
Werth, D.; Chen, K. F.
2013-08-22
The ability of water managers to maintain adequate supplies in coming decades depends, in part, on future weather conditions, as climate change has the potential to alter river flows from their current values, possibly rendering them unable to meet demand. Reliable climate projections are therefore critical to predicting the future water supply for the United States. These projections cannot be provided solely by global climate models (GCMs), however, as their resolution is too coarse to resolve the small-scale climate changes that can affect hydrology, and hence water supply, at regional to local scales. A process is needed to ‘downscale’ the GCM results to the smaller scales and feed this into a surface hydrology model to help determine the ability of rivers to provide adequate flow to meet future needs. We apply a statistical downscaling to GCM projections of precipitation and temperature through the use of a scaling method. This technique involves the correction of the cumulative distribution functions (CDFs) of the GCM-derived temperature and precipitation results for the 20{sup th} century, and the application of the same correction to 21{sup st} century GCM projections. This is done for three meteorological stations located within the Coosa River basin in northern Georgia, and is used to calculate future river flow statistics for the upper Coosa River. Results are compared to the historical Coosa River flow upstream from Georgia Power Company’s Hammond coal-fired power plant and to flows calculated with the original, unscaled GCM results to determine the impact of potential changes in meteorology on future flows.
... What Is Cancer? Cancer Statistics Cancer Disparities Cancer Statistics Cancer has a major impact on society in ... success of efforts to control and manage cancer. Statistics at a Glance: The Burden of Cancer in ...
Directory of Open Access Journals (Sweden)
Mohammed M. Kamal
2012-09-01
Full Text Available MODerate Resolution Imaging Spectroradiometer (MODIS aerosol retrievals over the North Atlantic spanning seven hurricane seasons are combined with the Statistical Hurricane Intensity Prediction Scheme (SHIPS parameters. The difference between the current and future intensity changes were selected as response variables. For 24 major hurricanes (category 3, 4 and 5 between 2003 and 2009, eight lead time response variables were determined to be between 6 and 48 h. By combining MODIS and SHIPS data, 56 variables were compiled and selected as predictors for this study. Variable reduction from 56 to 31 was performed in two steps; the first step was via correlation coefficients (cc followed by Principal Component Analysis (PCA extraction techniques. The PCA reduced 31 variables to 20. Five categories were established based on the PCA group variables exhibiting similar physical phenomena. Average aerosol retrievals from MODIS Level 2 data in the vicinity of UTC 1,200 and 1,800 h were mapped to the SHIPS parameters to perform Multiple Linear Regression (MLR between each response variable against six sets of predictors of 31, 30, 28, 27, 23 and 20 variables. The deviation among the predictors Root Mean Square Error (RMSE varied between 0.01 through 0.05 and, therefore, implied that reducing the number of variables did not change the core physical information. Even when the parameters are reduced from 56 to 20, the correlation values exhibit a stronger relationship between the response and predictors. Therefore, the same phenomena can be explained by the reduction of variables.
Kiss, I.; Cioată, V. G.; Ratiu, S. A.; Rackov, M.; Penčić, M.
2018-01-01
Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. This article focuses on expressing the multiple linear regression model related to the hardness assurance by the chemical composition of the phosphorous cast irons destined to the brake shoes, having in view that the regression coefficients will illustrate the unrelated contributions of each independent variable towards predicting the dependent variable. In order to settle the multiple correlations between the hardness of the cast-iron brake shoes, and their chemical compositions several regression equations has been proposed. Is searched a mathematical solution which can determine the optimum chemical composition for the hardness desirable values. Starting from the above-mentioned affirmations two new statistical experiments are effectuated related to the values of Phosphorus [P], Manganese [Mn] and Silicon [Si]. Therefore, the regression equations, which describe the mathematical dependency between the above-mentioned elements and the hardness, are determined. As result, several correlation charts will be revealed.
International Nuclear Information System (INIS)
Siddiqui, A.R.; Khan, M.M.A.; Ismail, B.M.
1999-01-01
Oxygen is blown in Converter process to oxidize hot metal. This introduces dissolved oxygen in the metal, which may cause embrittlement, voids, inclusion and other undesirable properties in steel. The steel bath at the time of tapping contains 400 to 800 ppm oxygen. Deoxidation is carried out during tapping by adding into the tap ladle appropriate amounts of ferromanganese, ferrosilicon and/or aluminum or other special deoxidizers. In the research aluminum killed grade steel which are casted at the slab caster of Pakistan Steel were investigated. Amount of aluminum added is very critical because if we add lesser amount of aluminum then the required quantity then there will be an incomplete killing of oxygen which results uncleanness in steel. Addition of larger amount of aluminum not only increases the cost of the production but also results as higher amount of alumina, which results in nozzle clogging and increase, loses. The purpose of the research is to develop a statistical model which would predict correct amount of aluminum addition for complete deoxidation of aluminum killed grade casted at slab continuous caster of Pakistan Steel. In the model aluminum added is taken as dependent variable while tapping temperature, turn down carbon composition, turndown manganese composition and oxygen content in steel would be the independent variable. This work is based on operational practice on 130 tons Basic Oxygen furnace. (author)
Apel, Heiko; Baimaganbetov, Azamat; Kalashnikova, Olga; Gavrilenko, Nadejda; Abdykerimova, Zharkinay; Agalhanova, Marina; Gerlitz, Lars; Unger-Shayesteh, Katy; Vorogushyn, Sergiy; Gafurov, Abror
2017-04-01
The semi-arid regions of Central Asia crucially depend on the water resources supplied by the mountainous areas of the Tien-Shan and Pamirs. During the summer months the snow and glacier melt dominated river discharge originating in the mountains provides the main water resource available for agricultural production, but also for storage in reservoirs for energy generation during the winter months. Thus a reliable seasonal forecast of the water resources is crucial for a sustainable management and planning of water resources. In fact, seasonal forecasts are mandatory tasks of all national hydro-meteorological services in the region. In order to support the operational seasonal forecast procedures of hydromet services, this study aims at the development of a generic tool for deriving statistical forecast models of seasonal river discharge. The generic model is kept as simple as possible in order to be driven by available hydrological and meteorological data, and be applicable for all catchments with their often limited data availability in the region. As snowmelt dominates summer runoff, the main meteorological predictors for the forecast models are monthly values of winter precipitation and temperature as recorded by climatological stations in the catchments. These data sets are accompanied by snow cover predictors derived from the operational ModSnow tool, which provides cloud free snow cover data for the selected catchments based on MODIS satellite images. In addition to the meteorological data antecedent streamflow is used as a predictor variable. This basic predictor set was further extended by multi-monthly means of the individual predictors, as well as composites of the predictors. Forecast models are derived based on these predictors as linear combinations of up to 3 or 4 predictors. A user selectable number of best models according to pre-defined performance criteria is extracted automatically by the developed model fitting algorithm, which includes a test
Generalized interpolative quantum statistics
International Nuclear Information System (INIS)
Ramanathan, R.
1992-01-01
A generalized interpolative quantum statistics is presented by conjecturing a certain reordering of phase space due to the presence of possible exotic objects other than bosons and fermions. Such an interpolation achieved through a Bose-counting strategy predicts the existence of an infinite quantum Boltzmann-Gibbs statistics akin to the one discovered by Greenberg recently
International Nuclear Information System (INIS)
Goldflam, R.; Kouri, D.J.
1976-01-01
New methods for predicting the full matrix of integral cross sections are developed by combining the surprisal analysis of Bernstein and Levine with the j/sub z/-conserving coupled states method (j/sub z/CCS) of McGuire, Kouri, and Pack and with the statistical j/sub z/ approximation (Sj/sub z/) of Kouri, Shimoni, and Heil. A variety of approaches is possible and only three are studied in the present work. These are (a) a surprisal fit of the j=0→j' column of the j/sub z/CCS cross section matrix (thereby requiring only a solution of the lambda=0 set of j/sub z/CCS equations), (b) a surprisal fit of the lambda-bar=0 Sj/sub z/ cross section matrix (again requiring solution of the lambda=0 set of j/sub z/CCS equations only), and (c) a surprisal fit of a lambda-bar not equal to 0 Sj/sub z/ submatrix (involving input cross sections for j,j'> or =lambda-bar transitions only). The last approach requires the solution of the lambda=lambda-bar set of j/sub z/CCS equations only, which requires less computation effort than the effective potential method. We explore three different choices for the prior and two-parameter (i.e., linear) and three-parameter (i.e., parabolic) fits as applied to Ar--N 2 collisions. The results are in general very encouraging and for one choice of prior give results which are within 20% of the exact j/sub z/CCS results
Statistical learning and prejudice.
Madison, Guy; Ullén, Fredrik
2012-12-01
Human behavior is guided by evolutionarily shaped brain mechanisms that make statistical predictions based on limited information. Such mechanisms are important for facilitating interpersonal relationships, avoiding dangers, and seizing opportunities in social interaction. We thus suggest that it is essential for analyses of prejudice and prejudice reduction to take the predictive accuracy and adaptivity of the studied prejudices into account.
... this page: https://medlineplus.gov/usestatistics.html MedlinePlus Statistics To use the sharing features on this page, ... By Quarter View image full size Quarterly User Statistics Quarter Page Views Unique Visitors Oct-Dec-98 ...
Pestman, Wiebe R
2009-01-01
This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.
Whole Frog Project and Virtual Frog Dissection Statistics wwwstats output for January 1 through duplicate or extraneous accesses. For example, in these statistics, while a POST requesting an image is as well. Note that this under-represents the bytes requested. Starting date for following statistics
Porter, Kristin E.
2016-01-01
In education research and in many other fields, researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple…
Porter, Kristin E.
2018-01-01
Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…
DEFF Research Database (Denmark)
Breinholt, Anders; Møller, Jan Kloppenborg; Madsen, Henrik
2012-01-01
While there seems to be consensus that hydrological model outputs should be accompanied with an uncertainty estimate the appropriate method for uncertainty estimation is not agreed upon and a debate is ongoing between advocators of formal statistical methods who consider errors as stochastic...... and GLUE advocators who consider errors as epistemic, arguing that the basis of formal statistical approaches that requires the residuals to be stationary and conform to a statistical distribution is unrealistic. In this paper we take a formal frequentist approach to parameter estimation and uncertainty...... necessary but the statistical assumptions were nevertheless not 100% justified. The residual analysis showed that significant autocorrelation was present for all simulation models. We believe users of formal approaches to uncertainty evaluation within hydrology and within environmental modelling in general...
Sadovskii, Michael V
2012-01-01
This volume provides a compact presentation of modern statistical physics at an advanced level. Beginning with questions on the foundations of statistical mechanics all important aspects of statistical physics are included, such as applications to ideal gases, the theory of quantum liquids and superconductivity and the modern theory of critical phenomena. Beyond that attention is given to new approaches, such as quantum field theory methods and non-equilibrium problems.
Goodman, Joseph W
2015-01-01
This book discusses statistical methods that are useful for treating problems in modern optics, and the application of these methods to solving a variety of such problems This book covers a variety of statistical problems in optics, including both theory and applications. The text covers the necessary background in statistics, statistical properties of light waves of various types, the theory of partial coherence and its applications, imaging with partially coherent light, atmospheric degradations of images, and noise limitations in the detection of light. New topics have been introduced i
Energy Technology Data Exchange (ETDEWEB)
Eliazar, Iddo, E-mail: eliazar@post.tau.ac.il
2017-05-15
The exponential, the normal, and the Poisson statistical laws are of major importance due to their universality. Harmonic statistics are as universal as the three aforementioned laws, but yet they fall short in their ‘public relations’ for the following reason: the full scope of harmonic statistics cannot be described in terms of a statistical law. In this paper we describe harmonic statistics, in their full scope, via an object termed harmonic Poisson process: a Poisson process, over the positive half-line, with a harmonic intensity. The paper reviews the harmonic Poisson process, investigates its properties, and presents the connections of this object to an assortment of topics: uniform statistics, scale invariance, random multiplicative perturbations, Pareto and inverse-Pareto statistics, exponential growth and exponential decay, power-law renormalization, convergence and domains of attraction, the Langevin equation, diffusions, Benford’s law, and 1/f noise. - Highlights: • Harmonic statistics are described and reviewed in detail. • Connections to various statistical laws are established. • Connections to perturbation, renormalization and dynamics are established.
International Nuclear Information System (INIS)
Eliazar, Iddo
2017-01-01
The exponential, the normal, and the Poisson statistical laws are of major importance due to their universality. Harmonic statistics are as universal as the three aforementioned laws, but yet they fall short in their ‘public relations’ for the following reason: the full scope of harmonic statistics cannot be described in terms of a statistical law. In this paper we describe harmonic statistics, in their full scope, via an object termed harmonic Poisson process: a Poisson process, over the positive half-line, with a harmonic intensity. The paper reviews the harmonic Poisson process, investigates its properties, and presents the connections of this object to an assortment of topics: uniform statistics, scale invariance, random multiplicative perturbations, Pareto and inverse-Pareto statistics, exponential growth and exponential decay, power-law renormalization, convergence and domains of attraction, the Langevin equation, diffusions, Benford’s law, and 1/f noise. - Highlights: • Harmonic statistics are described and reviewed in detail. • Connections to various statistical laws are established. • Connections to perturbation, renormalization and dynamics are established.
Szulc, Stefan
1965-01-01
Statistical Methods provides a discussion of the principles of the organization and technique of research, with emphasis on its application to the problems in social statistics. This book discusses branch statistics, which aims to develop practical ways of collecting and processing numerical data and to adapt general statistical methods to the objectives in a given field.Organized into five parts encompassing 22 chapters, this book begins with an overview of how to organize the collection of such information on individual units, primarily as accomplished by government agencies. This text then
... Testing Treatment & Outcomes Health Professionals Statistics More Resources Candidiasis Candida infections of the mouth, throat, and esophagus Vaginal candidiasis Invasive candidiasis Definition Symptoms Risk & Prevention Sources Diagnosis ...
Petocz, Peter; Sowey, Eric
2012-01-01
The term "data snooping" refers to the practice of choosing which statistical analyses to apply to a set of data after having first looked at those data. Data snooping contradicts a fundamental precept of applied statistics, that the scheme of analysis is to be planned in advance. In this column, the authors shall elucidate the…
Petocz, Peter; Sowey, Eric
2008-01-01
In this article, the authors focus on hypothesis testing--that peculiarly statistical way of deciding things. Statistical methods for testing hypotheses were developed in the 1920s and 1930s by some of the most famous statisticians, in particular Ronald Fisher, Jerzy Neyman and Egon Pearson, who laid the foundations of almost all modern methods of…
Glaz, Joseph
2009-01-01
Suitable for graduate students and researchers in applied probability and statistics, as well as for scientists in biology, computer science, pharmaceutical science and medicine, this title brings together a collection of chapters illustrating the depth and diversity of theory, methods and applications in the area of scan statistics.
Lyons, L.
2016-01-01
Accelerators and detectors are expensive, both in terms of money and human effort. It is thus important to invest effort in performing a good statistical anal- ysis of the data, in order to extract the best information from it. This series of five lectures deals with practical aspects of statistical issues that arise in typical High Energy Physics analyses.
Nick, Todd G
2007-01-01
Statistics is defined by the Medical Subject Headings (MeSH) thesaurus as the science and art of collecting, summarizing, and analyzing data that are subject to random variation. The two broad categories of summarizing and analyzing data are referred to as descriptive and inferential statistics. This chapter considers the science and art of summarizing data where descriptive statistics and graphics are used to display data. In this chapter, we discuss the fundamentals of descriptive statistics, including describing qualitative and quantitative variables. For describing quantitative variables, measures of location and spread, for example the standard deviation, are presented along with graphical presentations. We also discuss distributions of statistics, for example the variance, as well as the use of transformations. The concepts in this chapter are useful for uncovering patterns within the data and for effectively presenting the results of a project.
2016-05-31
Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale ): Number of graduating undergraduates funded by a DoD funded
Blakemore, J S
1962-01-01
Semiconductor Statistics presents statistics aimed at complementing existing books on the relationships between carrier densities and transport effects. The book is divided into two parts. Part I provides introductory material on the electron theory of solids, and then discusses carrier statistics for semiconductors in thermal equilibrium. Of course a solid cannot be in true thermodynamic equilibrium if any electrical current is passed; but when currents are reasonably small the distribution function is but little perturbed, and the carrier distribution for such a """"quasi-equilibrium"""" co
Wannier, Gregory Hugh
1966-01-01
Until recently, the field of statistical physics was traditionally taught as three separate subjects: thermodynamics, statistical mechanics, and kinetic theory. This text, a forerunner in its field and now a classic, was the first to recognize the outdated reasons for their separation and to combine the essentials of the three subjects into one unified presentation of thermal physics. It has been widely adopted in graduate and advanced undergraduate courses, and is recommended throughout the field as an indispensable aid to the independent study and research of statistical physics.Designed for
Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James
2014-01-01
Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.
Energy Technology Data Exchange (ETDEWEB)
Wendelberger, Laura Jean [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-08
In large datasets, it is time consuming or even impossible to pick out interesting images. Our proposed solution is to find statistics to quantify the information in each image and use those to identify and pick out images of interest.
Department of Homeland Security — Accident statistics available on the Coast Guard’s website by state, year, and one variable to obtain tables and/or graphs. Data from reports has been loaded for...
U.S. Department of Health & Human Services — The CMS Center for Strategic Planning produces an annual CMS Statistics reference booklet that provides a quick reference for summary information about health...
Allegheny County / City of Pittsburgh / Western PA Regional Data Center — Data about the usage of the WPRDC site and its various datasets, obtained by combining Google Analytics statistics with information from the WPRDC's data portal.
Serdobolskii, Vadim Ivanovich
2007-01-01
This monograph presents mathematical theory of statistical models described by the essentially large number of unknown parameters, comparable with sample size but can also be much larger. In this meaning, the proposed theory can be called "essentially multiparametric". It is developed on the basis of the Kolmogorov asymptotic approach in which sample size increases along with the number of unknown parameters.This theory opens a way for solution of central problems of multivariate statistics, which up until now have not been solved. Traditional statistical methods based on the idea of an infinite sampling often break down in the solution of real problems, and, dependent on data, can be inefficient, unstable and even not applicable. In this situation, practical statisticians are forced to use various heuristic methods in the hope the will find a satisfactory solution.Mathematical theory developed in this book presents a regular technique for implementing new, more efficient versions of statistical procedures. ...
... Search Form Controls Cancel Submit Search the CDC Gonorrhea Note: Javascript is disabled or is not supported ... Twitter STD on Facebook Sexually Transmitted Diseases (STDs) Gonorrhea Statistics Recommend on Facebook Tweet Share Compartir Gonorrhea ...
DEFF Research Database (Denmark)
Tryggestad, Kjell
2004-01-01
The study aims is to describe how the inclusion and exclusion of materials and calculative devices construct the boundaries and distinctions between statistical facts and artifacts in economics. My methodological approach is inspired by John Graunt's (1667) Political arithmetic and more recent work...... within constructivism and the field of Science and Technology Studies (STS). The result of this approach is here termed reversible statistics, reconstructing the findings of a statistical study within economics in three different ways. It is argued that all three accounts are quite normal, albeit...... in different ways. The presence and absence of diverse materials, both natural and political, is what distinguishes them from each other. Arguments are presented for a more symmetric relation between the scientific statistical text and the reader. I will argue that a more symmetric relation can be achieved...
MacKenzie, Dana
2004-01-01
The drawbacks of using 19th-century mathematics in physics and astronomy are illustrated. To continue with the expansion of the knowledge about the cosmos, the scientists will have to come in terms with modern statistics. Some researchers have deliberately started importing techniques that are used in medical research. However, the physicists need to identify the brand of statistics that will be suitable for them, and make a choice between the Bayesian and the frequentists approach. (Edited abstract).
Manning, Robert M.
1987-01-01
A dynamic rain attenuation prediction model is developed for use in obtaining the temporal characteristics, on time scales of minutes or hours, of satellite communication link availability. Analagous to the associated static rain attenuation model, which yields yearly attenuation predictions, this dynamic model is applicable at any location in the world that is characterized by the static rain attenuation statistics peculiar to the geometry of the satellite link and the rain statistics of the location. Such statistics are calculated by employing the formalism of Part I of this report. In fact, the dynamic model presented here is an extension of the static model and reduces to the static model in the appropriate limit. By assuming that rain attenuation is dynamically described by a first-order stochastic differential equation in time and that this random attenuation process is a Markov process, an expression for the associated transition probability is obtained by solving the related forward Kolmogorov equation. This transition probability is then used to obtain such temporal rain attenuation statistics as attenuation durations and allowable attenuation margins versus control system delay.
Statistics Anxiety and Instructor Immediacy
Williams, Amanda S.
2010-01-01
The purpose of this study was to investigate the relationship between instructor immediacy and statistics anxiety. It was predicted that students receiving immediacy would report lower levels of statistics anxiety. Using a pretest-posttest-control group design, immediacy was measured using the Instructor Immediacy scale. Statistics anxiety was…
Ning, D.; Zhang, M.; Ren, S.; Hou, Y.; Yu, L.; Meng, Z.
2017-01-01
Forest plays an important role in hydrological cycle, and forest changes will inevitably affect runoff across multiple spatial scales. The selection of a suitable indicator for forest changes is essential for predicting forest-related hydrological response. This study used the Meijiang River, one of the headwaters of the Poyang Lake as an example to identify the best indicator of forest changes for predicting forest change-induced hydrological responses. Correlation analysis was conducted first to detect the relationships between monthly runoff and its predictive variables including antecedent monthly precipitation and indicators for forest changes (forest coverage, vegetation indices including EVI, NDVI, and NDWI), and by use of the identified predictive variables that were most correlated with monthly runoff, multiple linear regression models were then developed. The model with best performance identified in this study included two independent variables -antecedent monthly precipitation and NDWI. It indicates that NDWI is the best indicator of forest change in hydrological prediction while forest coverage, the most commonly used indicator of forest change is insignificantly related to monthly runoff. This highlights the use of vegetation index such as NDWI to indicate forest changes in hydrological studies. This study will provide us with an efficient way to quantify the hydrological impact of large-scale forest changes in the Meijiang River watershed, which is crucial for downstream water resource management and ecological protection in the Poyang Lake basin.
Energy Technology Data Exchange (ETDEWEB)
Lee, Gwang-Se; Cheong, Cheolung, E-mail: ccheong@pusan.ac.kr [School of Mechanical Engineering, Pusan National University, Busan, 609-745, Rep. of Korea (Korea, Republic of)
2014-12-15
Despite increasing concern about low-frequency noise of modern large horizontal-axis wind turbines (HAWTs), few studies have focused on its origin or its prediction methods. In this paper, infra- and low-frequency (the ILF) wind turbine noise are closely examined and an efficient method is developed for its prediction. Although most previous studies have assumed that the ILF noise consists primarily of blade passing frequency (BPF) noise components, these tonal noise components are seldom identified in the measured noise spectrum, except for the case of downwind wind turbines. In reality, since modern HAWTs are very large, during rotation, a single blade of the turbine experiences inflow with variation in wind speed in time as well as in space, breaking periodic perturbations of the BPF. Consequently, this transforms acoustic contributions at the BPF harmonics into broadband noise components. In this study, the ILF noise of wind turbines is predicted by combining Lowson’s acoustic analogy with the stochastic wind model, which is employed to reproduce realistic wind speed conditions. In order to predict the effects of these wind conditions on pressure variation on the blade surface, unsteadiness in the incident wind speed is incorporated into the XFOIL code by varying incident flow velocities on each blade section, which depend on the azimuthal locations of the rotating blade. The calculated surface pressure distribution is subsequently used to predict acoustic pressure at an observing location by using Lowson’s analogy. These predictions are compared with measured data, which ensures that the present method can reproduce the broadband characteristics of the measured low-frequency noise spectrum. Further investigations are carried out to characterize the IFL noise in terms of pressure loading on blade surface, narrow-band noise spectrum and noise maps around the turbine.
Grudinin, Sergei; Kadukova, Maria; Eisenbarth, Andreas; Marillet, Simon; Cazals, Frédéric
2016-09-01
The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked 3-9 for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset the ranks are very dispersed and equal to 2-35, depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We make a thorough analysis of our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy, and a non pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set, however, failed on the two test sets. We explain the drawback and pitfalls of our model, in particular in terms of relative coverage of the test set by the training set and missed dynamical properties from crystal structures, and discuss different routes to improve it.
Directory of Open Access Journals (Sweden)
Gwang-Se Lee
2014-12-01
Full Text Available Despite increasing concern about low-frequency noise of modern large horizontal-axis wind turbines (HAWTs, few studies have focused on its origin or its prediction methods. In this paper, infra- and low-frequency (the ILF wind turbine noise are closely examined and an efficient method is developed for its prediction. Although most previous studies have assumed that the ILF noise consists primarily of blade passing frequency (BPF noise components, these tonal noise components are seldom identified in the measured noise spectrum, except for the case of downwind wind turbines. In reality, since modern HAWTs are very large, during rotation, a single blade of the turbine experiences inflow with variation in wind speed in time as well as in space, breaking periodic perturbations of the BPF. Consequently, this transforms acoustic contributions at the BPF harmonics into broadband noise components. In this study, the ILF noise of wind turbines is predicted by combining Lowson’s acoustic analogy with the stochastic wind model, which is employed to reproduce realistic wind speed conditions. In order to predict the effects of these wind conditions on pressure variation on the blade surface, unsteadiness in the incident wind speed is incorporated into the XFOIL code by varying incident flow velocities on each blade section, which depend on the azimuthal locations of the rotating blade. The calculated surface pressure distribution is subsequently used to predict acoustic pressure at an observing location by using Lowson’s analogy. These predictions are compared with measured data, which ensures that the present method can reproduce the broadband characteristics of the measured low-frequency noise spectrum. Further investigations are carried out to characterize the IFL noise in terms of pressure loading on blade surface, narrow-band noise spectrum and noise maps around the turbine.
Manning, Robert M.
1991-01-01
The dynamic and composite nature of propagation impairments that are incurred on Earth-space communications links at frequencies in and above 30/20 GHz Ka band, i.e., rain attenuation, cloud and/or clear air scintillation, etc., combined with the need to counter such degradations after the small link margins have been exceeded, necessitate the use of dynamic statistical identification and prediction processing of the fading signal in order to optimally estimate and predict the levels of each of the deleterious attenuation components. Such requirements are being met in NASA's Advanced Communications Technology Satellite (ACTS) Project by the implementation of optimal processing schemes derived through the use of the Rain Attenuation Prediction Model and nonlinear Markov filtering theory.
Montoya, Isaac D.
2008-01-01
Three classification techniques (Chi-square Automatic Interaction Detection [CHAID], Classification and Regression Tree [CART], and discriminant analysis) were tested to determine their accuracy in predicting Temporary Assistance for Needy Families program recipients' future employment. Technique evaluation was based on proportion of correctly…
Goodman, J. W.
This book is based on the thesis that some training in the area of statistical optics should be included as a standard part of any advanced optics curriculum. Random variables are discussed, taking into account definitions of probability and random variables, distribution functions and density functions, an extension to two or more random variables, statistical averages, transformations of random variables, sums of real random variables, Gaussian random variables, complex-valued random variables, and random phasor sums. Other subjects examined are related to random processes, some first-order properties of light waves, the coherence of optical waves, some problems involving high-order coherence, effects of partial coherence on imaging systems, imaging in the presence of randomly inhomogeneous media, and fundamental limits in photoelectric detection of light. Attention is given to deterministic versus statistical phenomena and models, the Fourier transform, and the fourth-order moment of the spectrum of a detected speckle image.
Schwabl, Franz
2006-01-01
The completely revised new edition of the classical book on Statistical Mechanics covers the basic concepts of equilibrium and non-equilibrium statistical physics. In addition to a deductive approach to equilibrium statistics and thermodynamics based on a single hypothesis - the form of the microcanonical density matrix - this book treats the most important elements of non-equilibrium phenomena. Intermediate calculations are presented in complete detail. Problems at the end of each chapter help students to consolidate their understanding of the material. Beyond the fundamentals, this text demonstrates the breadth of the field and its great variety of applications. Modern areas such as renormalization group theory, percolation, stochastic equations of motion and their applications to critical dynamics, kinetic theories, as well as fundamental considerations of irreversibility, are discussed. The text will be useful for advanced students of physics and other natural sciences; a basic knowledge of quantum mechan...
Jana, Madhusudan
2015-01-01
Statistical mechanics is self sufficient, written in a lucid manner, keeping in mind the exam system of the universities. Need of study this subject and its relation to Thermodynamics is discussed in detail. Starting from Liouville theorem gradually, the Statistical Mechanics is developed thoroughly. All three types of Statistical distribution functions are derived separately with their periphery of applications and limitations. Non-interacting ideal Bose gas and Fermi gas are discussed thoroughly. Properties of Liquid He-II and the corresponding models have been depicted. White dwarfs and condensed matter physics, transport phenomenon - thermal and electrical conductivity, Hall effect, Magneto resistance, viscosity, diffusion, etc. are discussed. Basic understanding of Ising model is given to explain the phase transition. The book ends with a detailed coverage to the method of ensembles (namely Microcanonical, canonical and grand canonical) and their applications. Various numerical and conceptual problems ar...
Guénault, Tony
2007-01-01
In this revised and enlarged second edition of an established text Tony Guénault provides a clear and refreshingly readable introduction to statistical physics, an essential component of any first degree in physics. The treatment itself is self-contained and concentrates on an understanding of the physical ideas, without requiring a high level of mathematical sophistication. A straightforward quantum approach to statistical averaging is adopted from the outset (easier, the author believes, than the classical approach). The initial part of the book is geared towards explaining the equilibrium properties of a simple isolated assembly of particles. Thus, several important topics, for example an ideal spin-½ solid, can be discussed at an early stage. The treatment of gases gives full coverage to Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein statistics. Towards the end of the book the student is introduced to a wider viewpoint and new chapters are included on chemical thermodynamics, interactions in, for exam...
Mandl, Franz
1988-01-01
The Manchester Physics Series General Editors: D. J. Sandiford; F. Mandl; A. C. Phillips Department of Physics and Astronomy, University of Manchester Properties of Matter B. H. Flowers and E. Mendoza Optics Second Edition F. G. Smith and J. H. Thomson Statistical Physics Second Edition E. Mandl Electromagnetism Second Edition I. S. Grant and W. R. Phillips Statistics R. J. Barlow Solid State Physics Second Edition J. R. Hook and H. E. Hall Quantum Mechanics F. Mandl Particle Physics Second Edition B. R. Martin and G. Shaw The Physics of Stars Second Edition A. C. Phillips Computing for Scient
Rohatgi, Vijay K
2003-01-01
Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
Levine-Wissing, Robin
2012-01-01
All Access for the AP® Statistics Exam Book + Web + Mobile Everything you need to prepare for the Advanced Placement® exam, in a study system built around you! There are many different ways to prepare for an Advanced Placement® exam. What's best for you depends on how much time you have to study and how comfortable you are with the subject matter. To score your highest, you need a system that can be customized to fit you: your schedule, your learning style, and your current level of knowledge. This book, and the online tools that come with it, will help you personalize your AP® Statistics prep
Davidson, Norman
2003-01-01
Clear and readable, this fine text assists students in achieving a grasp of the techniques and limitations of statistical mechanics. The treatment follows a logical progression from elementary to advanced theories, with careful attention to detail and mathematical development, and is sufficiently rigorous for introductory or intermediate graduate courses.Beginning with a study of the statistical mechanics of ideal gases and other systems of non-interacting particles, the text develops the theory in detail and applies it to the study of chemical equilibrium and the calculation of the thermody
Shan, HuanYuan; Liu, Xiangkun; Hildebrandt, Hendrik; Pan, Chuzhong; Martinet, Nicolas; Fan, Zuhui; Schneider, Peter; Asgari, Marika; Harnois-Déraps, Joachim; Hoekstra, Henk; Wright, Angus; Dietrich, Jörg P.; Erben, Thomas; Getman, Fedor; Grado, Aniello; Heymans, Catherine; Klaes, Dominik; Kuijken, Konrad; Merten, Julian; Puddu, Emanuella; Radovich, Mario; Wang, Qiao
2018-02-01
This paper is the first of a series of papers constraining cosmological parameters with weak lensing peak statistics using ˜ 450 deg2 of imaging data from the Kilo Degree Survey (KiDS-450). We measure high signal-to-noise ratio (SNR: ν) weak lensing convergence peaks in the range of 3 < ν < 5, and employ theoretical models to derive expected values. These models are validated using a suite of simulations. We take into account two major systematic effects, the boost factor and the effect of baryons on the mass-concentration relation of dark matter haloes. In addition, we investigate the impacts of other potential astrophysical systematics including the projection effects of large-scale structures, intrinsic galaxy alignments, as well as residual measurement uncertainties in the shear and redshift calibration. Assuming a flat Λ cold dark matter model, we find constraints for S_8=σ _8(Ω _m/0.3)^{0.5}=0.746^{+0.046}_{-0.107} according to the degeneracy direction of the cosmic shear analysis and Σ _8=σ _8(Ω _m/0.3)^{0.38}=0.696^{+0.048}_{-0.050} based on the derived degeneracy direction of our high-SNR peak statistics. The difference between the power index of S8 and in Σ8 indicates that combining cosmic shear with peak statistics has the potential to break the degeneracy in σ8 and Ωm. Our results are consistent with the cosmic shear tomographic correlation analysis of the same data set and ˜2σ lower than the Planck 2016 results.
International Nuclear Information System (INIS)
Takahashi, W.; Nguyen-Cong, V.; Kawaguchi, S.; Minamiyama, M.; Ninomiya, S.
2000-01-01
Various multivariate regression models were examined with ten-fold cross-validation to develop efficient, accurate models to predict dry weight and nitrogen accumulation of rice crops (cultivars Koshihikari, Hanaechizen, Nipponbare, and IR-36) from the maximum tiller number stage to the meiosis stage, using plant-canopy reflectance of hyper-spectra within the 400-1100 nm domain without any variable selection. The results showed that the principal component regression using hyper-spectra gave better fits and predictability than that using specific wavelengths. On the other hand, partial least squares regression was the most useful among the models tested; this method avoided overfitting and multicollinearity by using all wavelength information without variable selection and by inclusion of both x and y variations in its latent variables. (author)
Indian Academy of Sciences (India)
inference and finite population sampling. Sudhakar Kunte. Elements of statistical computing are discussed in this series. ... which captain gets an option to decide whether to field first or bat first ... may of course not be fair, in the sense that the team which wins ... describe two methods of drawing a random number between 0.
Schrödinger, Erwin
1952-01-01
Nobel Laureate's brilliant attempt to develop a simple, unified standard method of dealing with all cases of statistical thermodynamics - classical, quantum, Bose-Einstein, Fermi-Dirac, and more.The work also includes discussions of Nernst theorem, Planck's oscillator, fluctuations, the n-particle problem, problem of radiation, much more.
Directory of Open Access Journals (Sweden)
Galal A. Ali
1998-12-01
Full Text Available Traffic accidents are among the major causes of death in the Sultanate of Oman This is particularly the case in the age group of I6 to 25. Studies indicate that, in spite of Oman's high population-per-vehicle ratio, its fatality rate per l0,000 vehicles is one of the highest in the world. This alarming Situation underlines the importance of analyzing traffic accident data and predicting accident casualties. Such steps will lead to understanding the underlying causes of traffic accidents, and thereby to devise appropriate measures to reduce the number of car accidents and enhance safety standards. In this paper, a comparative study of car accident casualties in Oman was undertaken. Artificial Neural Networks (ANNs were used to analyze the data and make predictions of the number of accident casualties. The results were compared with those obtained from the analysis and predictions by regression techniques. Both approaches attempted to model accident casualties using historical data on related factors, such as population, number of cars on the road and so on, covering the period from I976 to 1994. Forecasts for the years 1995 to 2000 were made using ANNs and regression equations. The results from ANNs provided the best fit for the data. However, it was found that ANNs gave lower forecasts relative to those obtained by the regression methods used, indicating that ANNs are suitable for interpolation but their use for extrapolation may be limited. Nevertheless, the study showed that ANNs provide a potentially powerful tool in analyzing and forecasting traffic accidents and casualties.
Paulinelli, Regis R; Oliveira, Luis-Fernando P; Freitas-Junior, Ruffo; Soares, Leonardo R
2016-01-01
The objective of the present study was to compare the accuracy of SONOBREAST for the prediction of malignancy in solid breast nodules detected at ultrasonography with that of the BI-RADS system and to assess the agreement between these two methods. This prospective study included 274 women and evaluated 500 breast nodules detected at ultrasonography. The probability of malignancy was calculated based on the SONOBREAST model, available at www.sonobreast.com.br, and on the BI-RADS system, with results being compared with the anatomopathology report. The lesions were considered suspect in 171 cases (34.20%), according to both SONOBREAST and BI-RADS. Agreement between the methods was perfect, as shown by a Kappa coefficient of 1 (pBI-RADS proved identical insofar as sensitivity (95.40%), specificity (78.69%), positive predictive value (48.54%), negative predictive value (98.78%) and accuracy (81.60%) are concerned. With respect to the categorical variables (BI-RADS categories 3, 4 and 5), the area under the receiver operating characteristic (ROC) curve was 94.41 for SONOBREAST (range 92.20-96.62) and 89.99 for BI-RADS (range 86.60-93.37). The accuracy of the SONOBREAST model is identical to that found with BI-RADS when the same parameters are used with respect to the cut-off point at which malignancy is suspected. Regarding the continuous probability of malignancy with BI-RADS categories 3, 4 and 5, SONOBREAST permits a more precise and individualized evaluation. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
DEFF Research Database (Denmark)
Belter, Klaus; Engsted, Tom; Tanggaard, Carsten
2005-01-01
is given. In the second part of the paper we analyze the time-series properties of daily, weekly, and monthly returns, and we present evidence on predictability of multi-period returns. We also compare stock returns with the returns on long-term bonds and short-term money market instruments (that is......We present a new dividend-adjusted blue chip index for the Danish stock market covering the period 1985-2002. In contrast to other indices on the Danish stock market, the index is calculated on a daily basis. In the first part of the paper a detailed description of the construction of the index...
International Nuclear Information System (INIS)
Yahya, Noorazrul; Ebert, Martin A.; Bulsara, Max; House, Michael J.; Kennedy, Angel; Joseph, David J.; Denham, James W.
2016-01-01
Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6. Conclusions
Energy Technology Data Exchange (ETDEWEB)
Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au [School of Physics, University of Western Australia, Western Australia 6009, Australia and School of Health Sciences, National University of Malaysia, Bangi 43600 (Malaysia); Ebert, Martin A. [School of Physics, University of Western Australia, Western Australia 6009, Australia and Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Bulsara, Max [Institute for Health Research, University of Notre Dame, Fremantle, Western Australia 6959 (Australia); House, Michael J. [School of Physics, University of Western Australia, Western Australia 6009 (Australia); Kennedy, Angel [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Joseph, David J. [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008, Australia and School of Surgery, University of Western Australia, Western Australia 6009 (Australia); Denham, James W. [School of Medicine and Public Health, University of Newcastle, New South Wales 2308 (Australia)
2016-05-15
Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6. Conclusions
International Nuclear Information System (INIS)
Anon.
1994-01-01
For the years 1992 and 1993, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period. The tables and figures shown in this publication are: Changes in the volume of GNP and energy consumption; Coal consumption; Natural gas consumption; Peat consumption; Domestic oil deliveries; Import prices of oil; Price development of principal oil products; Fuel prices for power production; Total energy consumption by source; Electricity supply; Energy imports by country of origin in 1993; Energy exports by recipient country in 1993; Consumer prices of liquid fuels; Consumer prices of hard coal and natural gas, prices of indigenous fuels; Average electricity price by type of consumer; Price of district heating by type of consumer and Excise taxes and turnover taxes included in consumer prices of some energy sources
Goodman, Joseph W.
2000-07-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I RIchard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume 2 W. Edwards Deming Sample Design in Business Research
Pivato, Marcus
2013-01-01
We show that, in a sufficiently large population satisfying certain statistical regularities, it is often possible to accurately estimate the utilitarian social welfare function, even if we only have very noisy data about individual utility functions and interpersonal utility comparisons. In particular, we show that it is often possible to identify an optimal or close-to-optimal utilitarian social choice using voting rules such as the Borda rule, approval voting, relative utilitarianism, or a...
Natrella, Mary Gibbons
1963-01-01
Formulated to assist scientists and engineers engaged in army ordnance research and development programs, this well-known and highly regarded handbook is a ready reference for advanced undergraduate and graduate students as well as for professionals seeking engineering information and quantitative data for designing, developing, constructing, and testing equipment. Topics include characterizing and comparing the measured performance of a material, product, or process; general considerations in planning experiments; statistical techniques for analyzing extreme-value data; use of transformations
Scoring function to predict solubility mutagenesis
Directory of Open Access Journals (Sweden)
Deutsch Christopher
2010-10-01
Full Text Available Abstract Background Mutagenesis is commonly used to engineer proteins with desirable properties not present in the wild type (WT protein, such as increased or decreased stability, reactivity, or solubility. Experimentalists often have to choose a small subset of mutations from a large number of candidates to obtain the desired change, and computational techniques are invaluable to make the choices. While several such methods have been proposed to predict stability and reactivity mutagenesis, solubility has not received much attention. Results We use concepts from computational geometry to define a three body scoring function that predicts the change in protein solubility due to mutations. The scoring function captures both sequence and structure information. By exploring the literature, we have assembled a substantial database of 137 single- and multiple-point solubility mutations. Our database is the largest such collection with structural information known so far. We optimize the scoring function using linear programming (LP methods to derive its weights based on training. Starting with default values of 1, we find weights in the range [0,2] so that predictions of increase or decrease in solubility are optimized. We compare the LP method to the standard machine learning techniques of support vector machines (SVM and the Lasso. Using statistics for leave-one-out (LOO, 10-fold, and 3-fold cross validations (CV for training and prediction, we demonstrate that the LP method performs the best overall. For the LOOCV, the LP method has an overall accuracy of 81%. Availability Executables of programs, tables of weights, and datasets of mutants are available from the following web page: http://www.wsu.edu/~kbala/OptSolMut.html.
Energy Technology Data Exchange (ETDEWEB)
McKenna-Lawlor, S.M.P. [National Univ. of Ireland, Maynooth, Co. Kildare (Ireland). Space Technology Ireland; Fry, C.D. [Exploration Physics International, Inc., Huntsville, AL (United States); Dryer, M. [Exploration Physics International, Inc., Huntsville, AL (United States); NOAA Space Environment Center, Boulder, CO (United States); Heynderickx, D. [D-H Consultancy, Leuven (Belgium); Kecskemety, K. [KFKI Research Institute for Particle and Nuclear Physics, Budapest (Hungary); Kudela, K. [Institute of Experimental Physics, Kosice (Slovakia); Balaz, J. [National Univ. of Ireland, Maynooth, Co. Kildare (Ireland). Space Technology Ireland; Institute of Experimental Physics, Kosice (Slovakia)
2012-07-01
The performance of the Hakamada Akasofu-Fry, version 2 (HAFv.2) numerical model, which provides predictions of solar shock arrival times at Earth, was subjected to a statistical study to investigate those solar/interplanetary circumstances under which the model performed well/poorly during key phases (rise/maximum/decay) of solar cycle 23. In addition to analyzing elements of the overall data set (584 selected events) associated with particular cycle phases, subsets were formed such that those events making up a particular sub-set showed common characteristics. The statistical significance of the results obtained using the various sets/subsets was generally very low and these results were not significant as compared with the hit by chance rate (50 %). This implies a low level of confidence in the predictions of the model with no compelling result encouraging its use. However, the data suggested that the success rates of HAFv.2 were higher when the background solar wind speed at the time of shock initiation was relatively fast. Thus, in scenarios where the background solar wind speed is elevated and the calculated success rate significantly exceeds the rate by chance, the forecasts could provide potential value to the customer. With the composite statistics available for solar cycle 23, the calculated success rate at high solar wind speed, although clearly above 50 %, was indicative rather than conclusive. The RMS error estimated for shock arrival times for every cycle phase and for the composite sample was in each case significantly better than would be expected for a random data set. Also, the parameter ''Probability of Detection, yes'' (PODy) which presents the Proportion of Yes observations that were correctly forecast (i.e. the ratio between the shocks correctly predicted and all the shocks observed), yielded values for the rise/maximum/decay phases of the cycle and using the composite sample of 0.85, 0.64, 0.79 and 0.77, respectively. The
Directory of Open Access Journals (Sweden)
S. M. P. McKenna-Lawlor
2012-02-01
Full Text Available The performance of the Hakamada Akasofu-Fry, version 2 (HAFv.2 numerical model, which provides predictions of solar shock arrival times at Earth, was subjected to a statistical study to investigate those solar/interplanetary circumstances under which the model performed well/poorly during key phases (rise/maximum/decay of solar cycle 23. In addition to analyzing elements of the overall data set (584 selected events associated with particular cycle phases, subsets were formed such that those events making up a particular sub-set showed common characteristics. The statistical significance of the results obtained using the various sets/subsets was generally very low and these results were not significant as compared with the hit by chance rate (50%. This implies a low level of confidence in the predictions of the model with no compelling result encouraging its use. However, the data suggested that the success rates of HAFv.2 were higher when the background solar wind speed at the time of shock initiation was relatively fast. Thus, in scenarios where the background solar wind speed is elevated and the calculated success rate significantly exceeds the rate by chance, the forecasts could provide potential value to the customer. With the composite statistics available for solar cycle 23, the calculated success rate at high solar wind speed, although clearly above 50%, was indicative rather than conclusive. The RMS error estimated for shock arrival times for every cycle phase and for the composite sample was in each case significantly better than would be expected for a random data set. Also, the parameter "Probability of Detection, yes" (PODy which presents the Proportion of Yes observations that were correctly forecast (i.e. the ratio between the shocks correctly predicted and all the shocks observed, yielded values for the rise/maximum/decay phases of the cycle and using the composite sample of 0.85, 0.64, 0.79 and 0.77, respectively. The statistical
Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio
2016-01-01
The prompt identification of chemical molecules with potential effects on liver may help in drug discovery and in raising the levels of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR) model for evaluating hepatotoxicity. After compiling a data set of 950 compounds using data from the literature, we randomly split it into training (80%) and test sets (20%). We also compiled an external validation set (101 compounds) for evaluating the performance of the model. To extract structural alerts (SAs) related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach for manually identifying other SAs. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC) on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81, 63, and 68%, specificity 89, 33, and 33%, sensitivity 93, 88, and 80% and MCC 0.63, 0.27, and 0.13. Since it is preferable to overestimate hepatotoxicity rather than not to recognize unsafe compounds, the model's architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform. PMID:27920722
Directory of Open Access Journals (Sweden)
Fabiola Pizzo
2016-11-01
Full Text Available The prompt identification of chemical molecules with potential effects on liver may help in drug discovery and in raising the levels of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR model for evaluating hepatotoxicity. After compiling a data set of 950 compounds using data from the literature, we randomly split it into training (80% and test sets (20%. We also compiled an external validation set (101 compounds for evaluating the performance of the model. To extract structural alerts (SAs related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach for manually identifying other SAs. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81%, 63% and 68%, specificity 89%, 33% and 33%, sensitivity 93%, 88% and 80% and MCC 0.63, 0.27 and 0.13. Since it is preferable to overestimate hepatotoxicity rather than not to recognize unsafe compounds, the model’s architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform.
International Nuclear Information System (INIS)
Anon.
1989-01-01
World data from the United Nation's latest Energy Statistics Yearbook, first published in our last issue, are completed here. The 1984-86 data were revised and 1987 data added for world commercial energy production and consumption, world natural gas plant liquids production, world LP-gas production, imports, exports, and consumption, world residual fuel oil production, imports, exports, and consumption, world lignite production, imports, exports, and consumption, world peat production and consumption, world electricity production, imports, exports, and consumption (Table 80), and world nuclear electric power production
Rixen, M.; Ferreira-Coelho, E.; Signell, R.
2008-01-01
Despite numerous and regular improvements in underlying models, surface drift prediction in the ocean remains a challenging task because of our yet limited understanding of all processes involved. Hence, deterministic approaches to the problem are often limited by empirical assumptions on underlying physics. Multi-model hyper-ensemble forecasts, which exploit the power of an optimal local combination of available information including ocean, atmospheric and wave models, may show superior forecasting skills when compared to individual models because they allow for local correction and/or bias removal. In this work, we explore in greater detail the potential and limitations of the hyper-ensemble method in the Adriatic Sea, using a comprehensive surface drifter database. The performance of the hyper-ensembles and the individual models are discussed by analyzing associated uncertainties and probability distribution maps. Results suggest that the stochastic method may reduce position errors significantly for 12 to 72??h forecasts and hence compete with pure deterministic approaches. ?? 2007 NATO Undersea Research Centre (NURC).
de Sousa Zanotti Stagliorio Coêlho, Micheline; Luiz Teixeira Gonçalves, Fabio; do Rosário Dias de Oliveira Latorre, Maria
2010-01-01
This study is aimed at creating a stochastic model, named Brazilian Climate and Health Model (BCHM), through Poisson regression, in order to predict the occurrence of hospital respiratory admissions (for children under thirteen years of age) as a function of air pollutants, meteorological variables, and thermal comfort indices (effective temperatures, ET). The data used in this study were obtained from the city of São Paulo, Brazil, between 1997 and 2000. The respiratory tract diseases were divided into three categories: URI (Upper Respiratory tract diseases), LRI (Lower Respiratory tract diseases), and IP (Influenza and Pneumonia). The overall results of URI, LRI, and IP show clear correlation with SO₂ and CO, PM₁₀ and O₃, and PM₁₀, respectively, and the ETw4 (Effective Temperature) for all the three disease groups. It is extremely important to warn the government of the most populated city in Brazil about the outcome of this study, providing it with valuable information in order to help it better manage its resources on behalf of the whole population of the city of Sao Paulo, especially those with low incomes.
National Statistical Commission and Indian Official Statistics*
Indian Academy of Sciences (India)
IAS Admin
a good collection of official statistics of that time. With more .... statistical agencies and institutions to provide details of statistical activities .... ing several training programmes. .... ful completion of Indian Statistical Service examinations, the.
Statistical decay of giant resonances
International Nuclear Information System (INIS)
Dias, H.; Teruya, N.; Wolynec, E.
1986-01-01
Statistical calculations to predict the neutron spectrum resulting from the decay of Giant Resonances are discussed. The dependence of the resutls on the optical potential parametrization and on the level density of the residual nucleus is assessed. A Hauser-Feshbach calculation is performed for the decay of the monople giant resonance in 208 Pb using the experimental levels of 207 Pb from a recent compilation. The calculated statistical decay is in excelent agreement with recent experimental data, showing that the decay of this resonance is dominantly statistical, as predicted by continuum RPA calculations. (Author) [pt
Statistical decay of giant resonances
International Nuclear Information System (INIS)
Dias, H.; Teruya, N.; Wolynec, E.
1986-02-01
Statistical calculations to predict the neutron spectrum resulting from the decay of Giant Resonances are discussed. The dependence of the results on the optical potential parametrization and on the level density of the residual nucleus is assessed. A Hauser-Feshbach calculation is performed for the decay of the monopole giant resonance in 208 Pb using the experimental levels of 207 Pb from a recent compilation. The calculated statistical decay is in excellent agreement with recent experimental data, showing that decay of this resonance is dominantly statistical, as predicted by continuum RPA calculations. (Author) [pt
Tellinghuisen, Joel
2008-01-01
The method of least squares is probably the most powerful data analysis tool available to scientists. Toward a fuller appreciation of that power, this work begins with an elementary review of statistics fundamentals, and then progressively increases in sophistication as the coverage is extended to the theory and practice of linear and nonlinear least squares. The results are illustrated in application to data analysis problems important in the life sciences. The review of fundamentals includes the role of sampling and its connection to probability distributions, the Central Limit Theorem, and the importance of finite variance. Linear least squares are presented using matrix notation, and the significance of the key probability distributions-Gaussian, chi-square, and t-is illustrated with Monte Carlo calculations. The meaning of correlation is discussed, including its role in the propagation of error. When the data themselves are correlated, special methods are needed for the fitting, as they are also when fitting with constraints. Nonlinear fitting gives rise to nonnormal parameter distributions, but the 10% Rule of Thumb suggests that such problems will be insignificant when the parameter is sufficiently well determined. Illustrations include calibration with linear and nonlinear response functions, the dangers inherent in fitting inverted data (e.g., Lineweaver-Burk equation), an analysis of the reliability of the van't Hoff analysis, the problem of correlated data in the Guggenheim method, and the optimization of isothermal titration calorimetry procedures using the variance-covariance matrix for experiment design. The work concludes with illustrations on assessing and presenting results.
Directory of Open Access Journals (Sweden)
Hélène eColineaux
2015-10-01
Full Text Available INTRODUCTION. The use of genetic predictive markers in medical practice does not necessarily bear the same kind of medical and ethical consequences than that of genes directly involved in monogenic diseases. However, the French bioethics law framed in the same way the production and use of any genetic information. It seems therefore necessary to explore the practical and ethical context of the actual use of predictive markers in order to highlight their specific stakes. In this study, we document the uses of HLA-B*27, which are an interesting example of the multiple features of genetic predictive marker in general medical practice.MATERIAL & METHODS. The aims of this monocentric and qualitative study were to identify concrete and ethical issues of using the HLA-B*27 marker and the interests and limits of the legal framework as perceived by prescribers. In this regard, a thematic and descriptive analysis of five rheumatologists’ semi-structured and face-to-face interviews was performed.RESULTS. According to most of the interviewees, HLA-B*27 is an overframed test because they considered that this test is not really genetic or at least does not have the same nature as classical genetic tests; HLA-B*27 is not concerned by the ethical challenges of genetic test; the major ethics stake of this marker is not linked to its genetic nature but rather to the complexity of the probabilistic information. This study allows also showing that HLA-B*27, validated for a certain usage, may be used in different ways in practice.DISCUSSION. This marker and its clinical uses underline the challenges of translating both statistical concepts and unifying legal framework in clinical practice. This study allows identifying some new aspects and stakes of genetics in medicine and shows the need of additional studies about the use of predictive genetic markers, in order to provide a better basis for decisions and legal framework regarding these practices.
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Larsen, Gunner Chr.; Hansen, Kurt Schaldemose
2004-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...... are consistent, given the inevitabel uncertainties associated with model as well as with the extreme value data analysis. Keywords: Statistical model, extreme wind conditions, statistical analysis, turbulence, wind loading, statistical analysis, turbulence, wind loading, wind shear, wind turbines....
Statistical Prediction of Laminar-turbulent Transition
National Research Council Canada - National Science Library
Rubinstein, Robert
2000-01-01
... on representative stability theories including the resonant triad model and the parabolized stability equations. The first type of model can describe the effect of initial phase differences among disturbance modes on transition location...
Predicting retail banking consumer behaviour using statistics
Directory of Open Access Journals (Sweden)
Konstantinos Agaliotis
2015-04-01
Full Text Available The рurрοѕe οf any eсοnοmiс-based aсtivitу is the creation of needs. As the financial activities are not an exception to this rule, understanding clients’ necessities and their satisfaction is of primary concern for all financial institutions. Being conversant with the exact details that constitute client behaviour and the processes that lead to particular decisions, has become an advantage for financial institutions investing resources in it. Finally, it will not only pay off by satisfying the clients’ needs, but it will also secure a long-standing loyalty and relationships with them. As all relationships, the one between the client and the bank requires support and mutual understanding. Given the Serbian retail banking market, we may conclude the following: firstly, there is still potential for doing business in this filed; secondly, the particular segments of customers would accept new products; and thirdly, banks have to focus on the highest ranking clients concerning their credit worthiness. As regards the client behaviour over different product offerings, we can conclude that cash loans and credit card holders are not price sensitive, and that subsequently, the existing holders intend to increase their credit exposure.
Applied statistical thermodynamics
Lucas, Klaus
1991-01-01
The book guides the reader from the foundations of statisti- cal thermodynamics including the theory of intermolecular forces to modern computer-aided applications in chemical en- gineering and physical chemistry. The approach is new. The foundations of quantum and statistical mechanics are presen- ted in a simple way and their applications to the prediction of fluid phase behavior of real systems are demonstrated. A particular effort is made to introduce the reader to expli- cit formulations of intermolecular interaction models and to show how these models influence the properties of fluid sy- stems. The established methods of statistical mechanics - computer simulation, perturbation theory, and numerical in- tegration - are discussed in a style appropriate for newcom- ers and are extensively applied. Numerous worked examples illustrate how practical calculations should be carried out.
Statistics for experimentalists
Cooper, B E
2014-01-01
Statistics for Experimentalists aims to provide experimental scientists with a working knowledge of statistical methods and search approaches to the analysis of data. The book first elaborates on probability and continuous probability distributions. Discussions focus on properties of continuous random variables and normal variables, independence of two random variables, central moments of a continuous distribution, prediction from a normal distribution, binomial probabilities, and multiplication of probabilities and independence. The text then examines estimation and tests of significance. Topics include estimators and estimates, expected values, minimum variance linear unbiased estimators, sufficient estimators, methods of maximum likelihood and least squares, and the test of significance method. The manuscript ponders on distribution-free tests, Poisson process and counting problems, correlation and function fitting, balanced incomplete randomized block designs and the analysis of covariance, and experiment...
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Hansen, Kurt Schaldemose; Larsen, Gunner Chr.
2005-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
... Watchdog Ratings Feedback Contact Select Page Childhood Cancer Statistics Home > Cancer Resources > Childhood Cancer Statistics Childhood Cancer Statistics – Graphs and Infographics Number of Diagnoses Incidence Rates ...
Statistical and theoretical research
International Nuclear Information System (INIS)
Anon.
1983-01-01
Significant accomplishments include the creation of field designs to detect population impacts, new census procedures for small mammals, and methods for designing studies to determine where and how much of a contaminant is extent over certain landscapes. A book describing these statistical methods is currently being written and will apply to a variety of environmental contaminants, including radionuclides. PNL scientists also have devised an analytical method for predicting the success of field eexperiments on wild populations. Two highlights of current research are the discoveries that population of free-roaming horse herds can double in four years and that grizzly bear populations may be substantially smaller than once thought. As stray horses become a public nuisance at DOE and other large Federal sites, it is important to determine their number. Similar statistical theory can be readily applied to other situations where wild animals are a problem of concern to other government agencies. Another book, on statistical aspects of radionuclide studies, is written specifically for researchers in radioecology
... Standards Act and Program MQSA Insights MQSA National Statistics Share Tweet Linkedin Pin it More sharing options ... but should level off with time. Archived Scorecard Statistics 2018 Scorecard Statistics 2017 Scorecard Statistics 2016 Scorecard ...
State Transportation Statistics 2014
2014-12-15
The Bureau of Transportation Statistics (BTS) presents State Transportation Statistics 2014, a statistical profile of transportation in the 50 states and the District of Columbia. This is the 12th annual edition of State Transportation Statistics, a ...
Renyi statistics in equilibrium statistical mechanics
International Nuclear Information System (INIS)
Parvan, A.S.; Biro, T.S.
2010-01-01
The Renyi statistics in the canonical and microcanonical ensembles is examined both in general and in particular for the ideal gas. In the microcanonical ensemble the Renyi statistics is equivalent to the Boltzmann-Gibbs statistics. By the exact analytical results for the ideal gas, it is shown that in the canonical ensemble, taking the thermodynamic limit, the Renyi statistics is also equivalent to the Boltzmann-Gibbs statistics. Furthermore it satisfies the requirements of the equilibrium thermodynamics, i.e. the thermodynamical potential of the statistical ensemble is a homogeneous function of first degree of its extensive variables of state. We conclude that the Renyi statistics arrives at the same thermodynamical relations, as those stemming from the Boltzmann-Gibbs statistics in this limit.
Sampling, Probability Models and Statistical Reasoning Statistical
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
A Simple Statistical Thermodynamics Experiment
LoPresto, Michael C.
2010-01-01
Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
Graphene Statistical Mechanics
Bowick, Mark; Kosmrlj, Andrej; Nelson, David; Sknepnek, Rastko
2015-03-01
Graphene provides an ideal system to test the statistical mechanics of thermally fluctuating elastic membranes. The high Young's modulus of graphene means that thermal fluctuations over even small length scales significantly stiffen the renormalized bending rigidity. We study the effect of thermal fluctuations on graphene ribbons of width W and length L, pinned at one end, via coarse-grained Molecular Dynamics simulations and compare with analytic predictions of the scaling of width-averaged root-mean-squared height fluctuations as a function of distance along the ribbon. Scaling collapse as a function of W and L also allows us to extract the scaling exponent eta governing the long-wavelength stiffening of the bending rigidity. A full understanding of the geometry-dependent mechanical properties of graphene, including arrays of cuts, may allow the design of a variety of modular elements with desired mechanical properties starting from pure graphene alone. Supported by NSF grant DMR-1435794
A perceptual space of local image statistics.
Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M
2015-12-01
Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.
Statistical ecology comes of age
Gimenez, Olivier; Buckland, Stephen T.; Morgan, Byron J. T.; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M.; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M.; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric
2014-01-01
The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1–4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data. PMID:25540151
Lies, damn lies and statistics
International Nuclear Information System (INIS)
Jones, M.D.
2001-01-01
Statistics are widely employed within archaeological research. This is becoming increasingly so as user friendly statistical packages make increasingly sophisticated analyses available to non statisticians. However, all statistical techniques are based on underlying assumptions of which the end user may be unaware. If statistical analyses are applied in ignorance of the underlying assumptions there is the potential for highly erroneous inferences to be drawn. This does happen within archaeology and here this is illustrated with the example of 'date pooling', a technique that has been widely misused in archaeological research. This misuse may have given rise to an inevitable and predictable misinterpretation of New Zealand's archaeological record. (author). 10 refs., 6 figs., 1 tab
Statistical ecology comes of age.
Gimenez, Olivier; Buckland, Stephen T; Morgan, Byron J T; Bez, Nicolas; Bertrand, Sophie; Choquet, Rémi; Dray, Stéphane; Etienne, Marie-Pierre; Fewster, Rachel; Gosselin, Frédéric; Mérigot, Bastien; Monestiez, Pascal; Morales, Juan M; Mortier, Frédéric; Munoz, François; Ovaskainen, Otso; Pavoine, Sandrine; Pradel, Roger; Schurr, Frank M; Thomas, Len; Thuiller, Wilfried; Trenkel, Verena; de Valpine, Perry; Rexstad, Eric
2014-12-01
The desire to predict the consequences of global environmental change has been the driver towards more realistic models embracing the variability and uncertainties inherent in ecology. Statistical ecology has gelled over the past decade as a discipline that moves away from describing patterns towards modelling the ecological processes that generate these patterns. Following the fourth International Statistical Ecology Conference (1-4 July 2014) in Montpellier, France, we analyse current trends in statistical ecology. Important advances in the analysis of individual movement, and in the modelling of population dynamics and species distributions, are made possible by the increasing use of hierarchical and hidden process models. Exciting research perspectives include the development of methods to interpret citizen science data and of efficient, flexible computational algorithms for model fitting. Statistical ecology has come of age: it now provides a general and mathematically rigorous framework linking ecological theory and empirical data.
Savage, Leonard J
1972-01-01
Classic analysis of the foundations of statistics and development of personal probability, one of the greatest controversies in modern statistical thought. Revised edition. Calculus, probability, statistics, and Boolean algebra are recommended.
State Transportation Statistics 2010
2011-09-14
The Bureau of Transportation Statistics (BTS), a part of DOTs Research and Innovative Technology Administration (RITA), presents State Transportation Statistics 2010, a statistical profile of transportation in the 50 states and the District of Col...
State Transportation Statistics 2012
2013-08-15
The Bureau of Transportation Statistics (BTS), a part of the U.S. Department of Transportation's (USDOT) Research and Innovative Technology Administration (RITA), presents State Transportation Statistics 2012, a statistical profile of transportation ...
Adrenal Gland Tumors: Statistics
... Gland Tumor: Statistics Request Permissions Adrenal Gland Tumor: Statistics Approved by the Cancer.Net Editorial Board , 03/ ... primary adrenal gland tumor is very uncommon. Exact statistics are not available for this type of tumor ...
State transportation statistics 2009
2009-01-01
The Bureau of Transportation Statistics (BTS), a part of DOTs Research and : Innovative Technology Administration (RITA), presents State Transportation : Statistics 2009, a statistical profile of transportation in the 50 states and the : District ...
State Transportation Statistics 2011
2012-08-08
The Bureau of Transportation Statistics (BTS), a part of DOTs Research and Innovative Technology Administration (RITA), presents State Transportation Statistics 2011, a statistical profile of transportation in the 50 states and the District of Col...
Neuroendocrine Tumor: Statistics
... Tumor > Neuroendocrine Tumor: Statistics Request Permissions Neuroendocrine Tumor: Statistics Approved by the Cancer.Net Editorial Board , 01/ ... the body. It is important to remember that statistics on the survival rates for people with a ...
State Transportation Statistics 2013
2014-09-19
The Bureau of Transportation Statistics (BTS), a part of the U.S. Department of Transportations (USDOT) Research and Innovative Technology Administration (RITA), presents State Transportation Statistics 2013, a statistical profile of transportatio...
BTS statistical standards manual
2005-10-01
The Bureau of Transportation Statistics (BTS), like other federal statistical agencies, establishes professional standards to guide the methods and procedures for the collection, processing, storage, and presentation of statistical data. Standards an...
Information Statistics in Schools Educate your students about the value and everyday use of statistics. The Statistics in Schools program provides resources for teaching and learning with real life data. Explore the site for standards-aligned, classroom-ready activities. Statistics in Schools Math Activities History
Transport Statistics - Transport - UNECE
Sustainable Energy Statistics Trade Transport Themes UNECE and the SDGs Climate Change Gender Ideas 4 Change UNECE Weekly Videos UNECE Transport Areas of Work Transport Statistics Transport Transport Statistics About us Terms of Reference Meetings and Events Meetings Working Party on Transport Statistics (WP.6
Generalized quantum statistics
International Nuclear Information System (INIS)
Chou, C.
1992-01-01
In the paper, a non-anyonic generalization of quantum statistics is presented, in which Fermi-Dirac statistics (FDS) and Bose-Einstein statistics (BES) appear as two special cases. The new quantum statistics, which is characterized by the dimension of its single particle Fock space, contains three consistent parts, namely the generalized bilinear quantization, the generalized quantum mechanical description and the corresponding statistical mechanics
Kim, E.; Newton, A. P.
2012-04-01
One major problem in dynamo theory is the multi-scale nature of the MHD turbulence, which requires statistical theory in terms of probability distribution functions. In this contribution, we present the statistical theory of magnetic fields in a simplified mean field α-Ω dynamo model by varying the statistical property of alpha, including marginal stability and intermittency, and then utilize observational data of solar activity to fine-tune the mean field dynamo model. Specifically, we first present a comprehensive investigation into the effect of the stochastic parameters in a simplified α-Ω dynamo model. Through considering the manifold of marginal stability (the region of parameter space where the mean growth rate is zero), we show that stochastic fluctuations are conductive to dynamo. Furthermore, by considering the cases of fluctuating alpha that are periodic and Gaussian coloured random noise with identical characteristic time-scales and fluctuating amplitudes, we show that the transition to dynamo is significantly facilitated for stochastic alpha with random noise. Furthermore, we show that probability density functions (PDFs) of the growth-rate, magnetic field and magnetic energy can provide a wealth of useful information regarding the dynamo behaviour/intermittency. Finally, the precise statistical property of the dynamo such as temporal correlation and fluctuating amplitude is found to be dependent on the distribution the fluctuations of stochastic parameters. We then use observations of solar activity to constrain parameters relating to the effect in stochastic α-Ω nonlinear dynamo models. This is achieved through performing a comprehensive statistical comparison by computing PDFs of solar activity from observations and from our simulation of mean field dynamo model. The observational data that are used are the time history of solar activity inferred for C14 data in the past 11000 years on a long time scale and direct observations of the sun spot
Statistical sampling approaches for soil monitoring
Brus, D.J.
2014-01-01
This paper describes three statistical sampling approaches for regional soil monitoring, a design-based, a model-based and a hybrid approach. In the model-based approach a space-time model is exploited to predict global statistical parameters of interest such as the space-time mean. In the hybrid
Shnirelman peak in the level spacing statistics
International Nuclear Information System (INIS)
Chirikov, B.V.; Shepelyanskij, D.L.
1994-01-01
The first results on the statistical properties of the quantum quasidegeneracy are presented. A physical interpretation of the Shnirelman theorem predicted the bulk quasidegeneracy is given. The conditions for the strong impact of the degeneracy on the quantum level statistics are formulated which allows to extend the application of the Shnirelman theorem into a broad class of quantum systems. 14 refs., 3 figs
National Statistical Commission and Indian Official Statistics
Indian Academy of Sciences (India)
Author Affiliations. T J Rao1. C. R. Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS) University of Hyderabad Campus Central University Post Office, Prof. C. R. Rao Road Hyderabad 500 046, AP, India.
Rumsey, Deborah
2011-01-01
The fun and easy way to get down to business with statistics Stymied by statistics? No fear ? this friendly guide offers clear, practical explanations of statistical ideas, techniques, formulas, and calculations, with lots of examples that show you how these concepts apply to your everyday life. Statistics For Dummies shows you how to interpret and critique graphs and charts, determine the odds with probability, guesstimate with confidence using confidence intervals, set up and carry out a hypothesis test, compute statistical formulas, and more.Tracks to a typical first semester statistics cou
Industrial statistics with Minitab
Cintas, Pere Grima; Llabres, Xavier Tort-Martorell
2012-01-01
Industrial Statistics with MINITAB demonstrates the use of MINITAB as a tool for performing statistical analysis in an industrial context. This book covers introductory industrial statistics, exploring the most commonly used techniques alongside those that serve to give an overview of more complex issues. A plethora of examples in MINITAB are featured along with case studies for each of the statistical techniques presented. Industrial Statistics with MINITAB: Provides comprehensive coverage of user-friendly practical guidance to the essential statistical methods applied in industry.Explores
Lattice topology dictates photon statistics.
Kondakci, H Esat; Abouraddy, Ayman F; Saleh, Bahaa E A
2017-08-21
Propagation of coherent light through a disordered network is accompanied by randomization and possible conversion into thermal light. Here, we show that network topology plays a decisive role in determining the statistics of the emerging field if the underlying lattice is endowed with chiral symmetry. In such lattices, eigenmode pairs come in skew-symmetric pairs with oppositely signed eigenvalues. By examining one-dimensional arrays of randomly coupled waveguides arranged on linear and ring topologies, we are led to a remarkable prediction: the field circularity and the photon statistics in ring lattices are dictated by its parity while the same quantities are insensitive to the parity of a linear lattice. For a ring lattice, adding or subtracting a single lattice site can switch the photon statistics from super-thermal to sub-thermal, or vice versa. This behavior is understood by examining the real and imaginary fields on a lattice exhibiting chiral symmetry, which form two strands that interleave along the lattice sites. These strands can be fully braided around an even-sited ring lattice thereby producing super-thermal photon statistics, while an odd-sited lattice is incommensurate with such an arrangement and the statistics become sub-thermal.
Recreational Boating Statistics 2012
Department of Homeland Security — Every year, the USCG compiles statistics on reported recreational boating accidents. These statistics are derived from accident reports that are filed by the owners...
Recreational Boating Statistics 2013
Department of Homeland Security — Every year, the USCG compiles statistics on reported recreational boating accidents. These statistics are derived from accident reports that are filed by the owners...
Statistical data analysis handbook
National Research Council Canada - National Science Library
Wall, Francis J
1986-01-01
It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...
U.S. Department of Health & Human Services — The CMS Office of Enterprise Data and Analytics has developed CMS Program Statistics, which includes detailed summary statistics on national health care, Medicare...
Recreational Boating Statistics 2011
Department of Homeland Security — Every year, the USCG compiles statistics on reported recreational boating accidents. These statistics are derived from accident reports that are filed by the owners...
... Doing AMIGAS Stay Informed Cancer Home Uterine Cancer Statistics Language: English (US) Español (Spanish) Recommend on Facebook ... the most commonly diagnosed gynecologic cancer. U.S. Cancer Statistics Data Visualizations Tool The Data Visualizations tool makes ...
Tuberculosis Data and Statistics
... Advisory Groups Federal TB Task Force Data and Statistics Language: English (US) Español (Spanish) Recommend on Facebook ... Set) Mortality and Morbidity Weekly Reports Data and Statistics Decrease in Reported Tuberculosis Cases MMWR 2010; 59 ( ...
National transportation statistics 2011
2011-04-01
Compiled and published by the U.S. Department of Transportation's Bureau of Transportation Statistics : (BTS), National Transportation Statistics presents information on the U.S. transportation system, including : its physical components, safety reco...
National Transportation Statistics 2008
2009-01-08
Compiled and published by the U.S. Department of Transportations Bureau of Transportation Statistics (BTS), National Transportation Statistics presents information on the U.S. transportation system, including its physical components, safety record...
... News & Events About Us Home > Health Information Share Statistics Research shows that mental illnesses are common in ... of mental illnesses, such as suicide and disability. Statistics Top ı cs Mental Illness Any Anxiety Disorder ...
School Violence: Data & Statistics
... Social Media Publications Injury Center School Violence: Data & Statistics Recommend on Facebook Tweet Share Compartir The first ... Vehicle Safety Traumatic Brain Injury Injury Response Data & Statistics (WISQARS) Funded Programs Press Room Social Media Publications ...
Caregiver Statistics: Demographics
... You are here Home Selected Long-Term Care Statistics Order this publication Printer-friendly version What is ... needs and services are wide-ranging and complex, statistics may vary from study to study. Sources for ...
... Summary Coverdell Program 2012-2015 State Summaries Data & Statistics Fact Sheets Heart Disease and Stroke Fact Sheets ... Roadmap for State Planning Other Data Resources Other Statistic Resources Grantee Information Cross-Program Information Online Tools ...
... Standard Drink? Drinking Levels Defined Alcohol Facts and Statistics Print version Alcohol Use in the United States: ... 1238–1245, 2004. PMID: 15010446 National Center for Statistics and Analysis. 2014 Crash Data Key Findings (Traffic ...
National Transportation Statistics 2009
2010-01-21
Compiled and published by the U.S. Department of Transportation's Bureau of Transportation Statistics (BTS), National Transportation Statistics presents information on the U.S. transportation system, including its physical components, safety record, ...
National transportation statistics 2010
2010-01-01
National Transportation Statistics presents statistics on the U.S. transportation system, including its physical components, safety record, economic performance, the human and natural environment, and national security. This is a large online documen...
DEFF Research Database (Denmark)
Lindström, Erik; Madsen, Henrik; Nielsen, Jan Nygaard
Statistics for Finance develops students’ professional skills in statistics with applications in finance. Developed from the authors’ courses at the Technical University of Denmark and Lund University, the text bridges the gap between classical, rigorous treatments of financial mathematics...
Principles of applied statistics
National Research Council Canada - National Science Library
Cox, D. R; Donnelly, Christl A
2011-01-01
.... David Cox and Christl Donnelly distil decades of scientific experience into usable principles for the successful application of statistics, showing how good statistical strategy shapes every stage of an investigation...
Quantum local asymptotic normality and other questions of quantum statistics
Kahn, Jonas
2008-01-01
This thesis is entitled Quantum Local Asymptotic Normality and other questions of Quantum Statistics ,. Quantum statistics are statistics on quantum objects. In classical statistics, we usually start from the data. Indeed, if we want to predict the weather, and can measure the wind or the
Climate prediction and predictability
Allen, Myles
2010-05-01
Climate prediction is generally accepted to be one of the grand challenges of the Geophysical Sciences. What is less widely acknowledged is that fundamental issues have yet to be resolved concerning the nature of the challenge, even after decades of research in this area. How do we verify or falsify a probabilistic forecast of a singular event such as anthropogenic warming over the 21st century? How do we determine the information content of a climate forecast? What does it mean for a modelling system to be "good enough" to forecast a particular variable? How will we know when models and forecasting systems are "good enough" to provide detailed forecasts of weather at specific locations or, for example, the risks associated with global geo-engineering schemes. This talk will provide an overview of these questions in the light of recent developments in multi-decade climate forecasting, drawing on concepts from information theory, machine learning and statistics. I will draw extensively but not exclusively from the experience of the climateprediction.net project, running multiple versions of climate models on personal computers.
Applying contemporary statistical techniques
Wilcox, Rand R
2003-01-01
Applying Contemporary Statistical Techniques explains why traditional statistical methods are often inadequate or outdated when applied to modern problems. Wilcox demonstrates how new and more powerful techniques address these problems far more effectively, making these modern robust methods understandable, practical, and easily accessible.* Assumes no previous training in statistics * Explains how and why modern statistical methods provide more accurate results than conventional methods* Covers the latest developments on multiple comparisons * Includes recent advanc
Interactive statistics with ILLMO
Martens, J.B.O.S.
2014-01-01
Progress in empirical research relies on adequate statistical analysis and reporting. This article proposes an alternative approach to statistical modeling that is based on an old but mostly forgotten idea, namely Thurstone modeling. Traditional statistical methods assume that either the measured
Lenard, Christopher; McCarthy, Sally; Mills, Terence
2014-01-01
There are many different aspects of statistics. Statistics involves mathematics, computing, and applications to almost every field of endeavour. Each aspect provides an opportunity to spark someone's interest in the subject. In this paper we discuss some ethical aspects of statistics, and describe how an introduction to ethics has been…
Youth Sports Safety Statistics
... 6):794-799. 31 American Heart Association. CPR statistics. www.heart.org/HEARTORG/CPRAndECC/WhatisCPR/CPRFactsandStats/CPRpercent20Statistics_ ... Mental Health Services Administration, Center for Behavioral Health Statistics and Quality. (January 10, 2013). The DAWN Report: ...
Dowdy, Shirley; Chilko, Daniel
2011-01-01
Praise for the Second Edition "Statistics for Research has other fine qualities besides superior organization. The examples and the statistical methods are laid out with unusual clarity by the simple device of using special formats for each. The book was written with great care and is extremely user-friendly."-The UMAP Journal Although the goals and procedures of statistical research have changed little since the Second Edition of Statistics for Research was published, the almost universal availability of personal computers and statistical computing application packages have made it possible f
Boslaugh, Sarah
2013-01-01
Need to learn statistics for your job? Want help passing a statistics course? Statistics in a Nutshell is a clear and concise introduction and reference for anyone new to the subject. Thoroughly revised and expanded, this edition helps you gain a solid understanding of statistics without the numbing complexity of many college texts. Each chapter presents easy-to-follow descriptions, along with graphics, formulas, solved examples, and hands-on exercises. If you want to perform common statistical analyses and learn a wide range of techniques without getting in over your head, this is your book.
Statistics & probaility for dummies
Rumsey, Deborah J
2013-01-01
Two complete eBooks for one low price! Created and compiled by the publisher, this Statistics I and Statistics II bundle brings together two math titles in one, e-only bundle. With this special bundle, you'll get the complete text of the following two titles: Statistics For Dummies, 2nd Edition Statistics For Dummies shows you how to interpret and critique graphs and charts, determine the odds with probability, guesstimate with confidence using confidence intervals, set up and carry out a hypothesis test, compute statistical formulas, and more. Tra
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
Business statistics for dummies
Anderson, Alan
2013-01-01
Score higher in your business statistics course? Easy. Business statistics is a common course for business majors and MBA candidates. It examines common data sets and the proper way to use such information when conducting research and producing informational reports such as profit and loss statements, customer satisfaction surveys, and peer comparisons. Business Statistics For Dummies tracks to a typical business statistics course offered at the undergraduate and graduate levels and provides clear, practical explanations of business statistical ideas, techniques, formulas, and calculations, w
Griffiths, Dawn
2009-01-01
Wouldn't it be great if there were a statistics book that made histograms, probability distributions, and chi square analysis more enjoyable than going to the dentist? Head First Statistics brings this typically dry subject to life, teaching you everything you want and need to know about statistics through engaging, interactive, and thought-provoking material, full of puzzles, stories, quizzes, visual aids, and real-world examples. Whether you're a student, a professional, or just curious about statistical analysis, Head First's brain-friendly formula helps you get a firm grasp of statistics
Bootstrap prediction and Bayesian prediction under misspecified models
Fushiki, Tadayoshi
2005-01-01
We consider a statistical prediction problem under misspecified models. In a sense, Bayesian prediction is an optimal prediction method when an assumed model is true. Bootstrap prediction is obtained by applying Breiman's `bagging' method to a plug-in prediction. Bootstrap prediction can be considered to be an approximation to the Bayesian prediction under the assumption that the model is true. However, in applications, there are frequently deviations from the assumed model. In this paper, bo...
Bourne, Victoria J.
2014-01-01
Research methods and statistical analysis is typically the least liked and most anxiety provoking aspect of a psychology undergraduate degree, in large part due to the mathematical component of the content. In this first cycle of a piece of action research, students' mathematical ability is examined in relation to their performance across…
Levy, R.; Mcginness, H.
1976-01-01
Investigations were performed to predict the power available from the wind at the Goldstone, California, antenna site complex. The background for power prediction was derived from a statistical evaluation of available wind speed data records at this location and at nearby locations similarly situated within the Mojave desert. In addition to a model for power prediction over relatively long periods of time, an interim simulation model that produces sample wind speeds is described. The interim model furnishes uncorrelated sample speeds at hourly intervals that reproduce the statistical wind distribution at Goldstone. A stochastic simulation model to provide speed samples representative of both the statistical speed distributions and correlations is also discussed.
Lectures on algebraic statistics
Drton, Mathias; Sullivant, Seth
2009-01-01
How does an algebraic geometer studying secant varieties further the understanding of hypothesis tests in statistics? Why would a statistician working on factor analysis raise open problems about determinantal varieties? Connections of this type are at the heart of the new field of "algebraic statistics". In this field, mathematicians and statisticians come together to solve statistical inference problems using concepts from algebraic geometry as well as related computational and combinatorial techniques. The goal of these lectures is to introduce newcomers from the different camps to algebraic statistics. The introduction will be centered around the following three observations: many important statistical models correspond to algebraic or semi-algebraic sets of parameters; the geometry of these parameter spaces determines the behaviour of widely used statistical inference procedures; computational algebraic geometry can be used to study parameter spaces and other features of statistical models.
Naghshpour, Shahdad
2012-01-01
Statistics is the branch of mathematics that deals with real-life problems. As such, it is an essential tool for economists. Unfortunately, the way you and many other economists learn the concept of statistics is not compatible with the way economists think and learn. The problem is worsened by the use of mathematical jargon and complex derivations. Here's a book that proves none of this is necessary. All the examples and exercises in this book are constructed within the field of economics, thus eliminating the difficulty of learning statistics with examples from fields that have no relation to business, politics, or policy. Statistics is, in fact, not more difficult than economics. Anyone who can comprehend economics can understand and use statistics successfully within this field, including you! This book utilizes Microsoft Excel to obtain statistical results, as well as to perform additional necessary computations. Microsoft Excel is not the software of choice for performing sophisticated statistical analy...
Baseline Statistics of Linked Statistical Data
Scharnhorst, Andrea; Meroño-Peñuela, Albert; Guéret, Christophe
2014-01-01
We are surrounded by an ever increasing ocean of information, everybody will agree to that. We build sophisticated strategies to govern this information: design data models, develop infrastructures for data sharing, building tool for data analysis. Statistical datasets curated by National
Relativistic beaming and quasar statistics
International Nuclear Information System (INIS)
Orr, M.J.L.; Browne, I.W.A.
1982-01-01
The statistical predictions of a unified scheme for the radio emission from quasars are explored. This scheme attributes the observed differences between flat- and steep-spectrum quasars to projection and the effects of relativistic beaming of the emission from the nuclear components. We use a simple quasar model consisting of a compact relativistically beamed core with spectral index zero and unbeamed lobes, spectral index - 1, to predict the proportion of flat-spectrum sources in flux-limited samples selected at different frequencies. In our model this fraction depends on the core Lorentz factor, γ and we find that a value of approximately 5 gives satisfactory agreement with observation. In a similar way the model is used to construct the expected number/flux density counts for flat-spectrum quasars from the observed steep-spectrum counts. Again, good agreement with the observations is obtained if the average core Lorentz factor is about 5. Independent estimates of γ from observations of superluminal motion in quasars are of the same order of magnitude. We conclude that the statistical properties of quasars are entirely consistent with the predictions of simple relativistic-beam models. (author)
Statistical Physics An Introduction
Yoshioka, Daijiro
2007-01-01
This book provides a comprehensive presentation of the basics of statistical physics. The first part explains the essence of statistical physics and how it provides a bridge between microscopic and macroscopic phenomena, allowing one to derive quantities such as entropy. Here the author avoids going into details such as Liouville’s theorem or the ergodic theorem, which are difficult for beginners and unnecessary for the actual application of the statistical mechanics. In the second part, statistical mechanics is applied to various systems which, although they look different, share the same mathematical structure. In this way readers can deepen their understanding of statistical physics. The book also features applications to quantum dynamics, thermodynamics, the Ising model and the statistical dynamics of free spins.
Statistical symmetries in physics
International Nuclear Information System (INIS)
Green, H.S.; Adelaide Univ., SA
1994-01-01
Every law of physics is invariant under some group of transformations and is therefore the expression of some type of symmetry. Symmetries are classified as geometrical, dynamical or statistical. At the most fundamental level, statistical symmetries are expressed in the field theories of the elementary particles. This paper traces some of the developments from the discovery of Bose statistics, one of the two fundamental symmetries of physics. A series of generalizations of Bose statistics is described. A supersymmetric generalization accommodates fermions as well as bosons, and further generalizations, including parastatistics, modular statistics and graded statistics, accommodate particles with properties such as 'colour'. A factorization of elements of ggl(n b ,n f ) can be used to define truncated boson operators. A general construction is given for q-deformed boson operators, and explicit constructions of the same type are given for various 'deformed' algebras. A summary is given of some of the applications and potential applications. 39 refs., 2 figs
The statistical stability phenomenon
Gorban, Igor I
2017-01-01
This monograph investigates violations of statistical stability of physical events, variables, and processes and develops a new physical-mathematical theory taking into consideration such violations – the theory of hyper-random phenomena. There are five parts. The first describes the phenomenon of statistical stability and its features, and develops methods for detecting violations of statistical stability, in particular when data is limited. The second part presents several examples of real processes of different physical nature and demonstrates the violation of statistical stability over broad observation intervals. The third part outlines the mathematical foundations of the theory of hyper-random phenomena, while the fourth develops the foundations of the mathematical analysis of divergent and many-valued functions. The fifth part contains theoretical and experimental studies of statistical laws where there is violation of statistical stability. The monograph should be of particular interest to engineers...
Lagrangian statistics in weakly forced two-dimensional turbulence.
Rivera, Michael K; Ecke, Robert E
2016-01-01
Measurements of Lagrangian single-point and multiple-point statistics in a quasi-two-dimensional stratified layer system are reported. The system consists of a layer of salt water over an immiscible layer of Fluorinert and is forced electromagnetically so that mean-squared vorticity is injected at a well-defined spatial scale ri. Simultaneous cascades develop in which enstrophy flows predominately to small scales whereas energy cascades, on average, to larger scales. Lagrangian correlations and one- and two-point displacements are measured for random initial conditions and for initial positions within topological centers and saddles. Some of the behavior of these quantities can be understood in terms of the trapping characteristics of long-lived centers, the slow motion near strong saddles, and the rapid fluctuations outside of either centers or saddles. We also present statistics of Lagrangian velocity fluctuations using energy spectra in frequency space and structure functions in real space. We compare with complementary Eulerian velocity statistics. We find that simultaneous inverse energy and enstrophy ranges present in spectra are not directly echoed in real-space moments of velocity difference. Nevertheless, the spectral ranges line up well with features of moment ratios, indicating that although the moments are not exhibiting unambiguous scaling, the behavior of the probability distribution functions is changing over short ranges of length scales. Implications for understanding weakly forced 2D turbulence with simultaneous inverse and direct cascades are discussed.
Equilibrium statistical mechanics
Jackson, E Atlee
2000-01-01
Ideal as an elementary introduction to equilibrium statistical mechanics, this volume covers both classical and quantum methodology for open and closed systems. Introductory chapters familiarize readers with probability and microscopic models of systems, while additional chapters describe the general derivation of the fundamental statistical mechanics relationships. The final chapter contains 16 sections, each dealing with a different application, ordered according to complexity, from classical through degenerate quantum statistical mechanics. Key features include an elementary introduction t
Applied statistics for economists
Lewis, Margaret
2012-01-01
This book is an undergraduate text that introduces students to commonly-used statistical methods in economics. Using examples based on contemporary economic issues and readily-available data, it not only explains the mechanics of the various methods, it also guides students to connect statistical results to detailed economic interpretations. Because the goal is for students to be able to apply the statistical methods presented, online sources for economic data and directions for performing each task in Excel are also included.
Mineral industry statistics 1975
Energy Technology Data Exchange (ETDEWEB)
1978-01-01
Production, consumption and marketing statistics are given for solid fuels (coal, peat), liquid fuels and gases (oil, natural gas), iron ore, bauxite and other minerals quarried in France, in 1975. Also accident statistics are included. Production statistics are presented of the Overseas Departments and territories (French Guiana, New Caledonia, New Hebrides). An account of modifications in the mining field in 1975 is given. Concessions, exploitation permits, and permits solely for prospecting for mineral products are discussed. (In French)
Lectures on statistical mechanics
Bowler, M G
1982-01-01
Anyone dissatisfied with the almost ritual dullness of many 'standard' texts in statistical mechanics will be grateful for the lucid explanation and generally reassuring tone. Aimed at securing firm foundations for equilibrium statistical mechanics, topics of great subtlety are presented transparently and enthusiastically. Very little mathematical preparation is required beyond elementary calculus and prerequisites in physics are limited to some elementary classical thermodynamics. Suitable as a basis for a first course in statistical mechanics, the book is an ideal supplement to more convent
Directory of Open Access Journals (Sweden)
Mirjam Nielen
2017-01-01
Full Text Available Always wondered why research papers often present rather complicated statistical analyses? Or wondered how to properly analyse the results of a pragmatic trial from your own practice? This talk will give an overview of basic statistical principles and focus on the why of statistics, rather than on the how.This is a podcast of Mirjam's talk at the Veterinary Evidence Today conference, Edinburgh November 2, 2016.
Equilibrium statistical mechanics
Mayer, J E
1968-01-01
The International Encyclopedia of Physical Chemistry and Chemical Physics, Volume 1: Equilibrium Statistical Mechanics covers the fundamental principles and the development of theoretical aspects of equilibrium statistical mechanics. Statistical mechanical is the study of the connection between the macroscopic behavior of bulk matter and the microscopic properties of its constituent atoms and molecules. This book contains eight chapters, and begins with a presentation of the master equation used for the calculation of the fundamental thermodynamic functions. The succeeding chapters highlight t
Mahalanobis, P C
1965-01-01
Contributions to Statistics focuses on the processes, methodologies, and approaches involved in statistics. The book is presented to Professor P. C. Mahalanobis on the occasion of his 70th birthday. The selection first offers information on the recovery of ancillary information and combinatorial properties of partially balanced designs and association schemes. Discussions focus on combinatorial applications of the algebra of association matrices, sample size analogy, association matrices and the algebra of association schemes, and conceptual statistical experiments. The book then examines latt
Simultaneous Tracking of Multiple Points Using a Wiimote
Skeffington, Alex; Scully, Kyle
2012-01-01
This paper reviews the construction of an inexpensive motion tracking and data logging system, which can be used for a wide variety of teaching experiments ranging from entry-level physics courses to advanced courses. The system utilizes an affordable infrared camera found in a Nintendo Wiimote to track IR LEDs mounted to the objects to be…
Boslaugh, Sarah
2008-01-01
Need to learn statistics as part of your job, or want some help passing a statistics course? Statistics in a Nutshell is a clear and concise introduction and reference that's perfect for anyone with no previous background in the subject. This book gives you a solid understanding of statistics without being too simple, yet without the numbing complexity of most college texts. You get a firm grasp of the fundamentals and a hands-on understanding of how to apply them before moving on to the more advanced material that follows. Each chapter presents you with easy-to-follow descriptions illustrat
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Annual Statistical Supplement, 2002
Social Security Administration — The Annual Statistical Supplement, 2002 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2010
Social Security Administration — The Annual Statistical Supplement, 2010 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2007
Social Security Administration — The Annual Statistical Supplement, 2007 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2001
Social Security Administration — The Annual Statistical Supplement, 2001 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2016
Social Security Administration — The Annual Statistical Supplement, 2016 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2011
Social Security Administration — The Annual Statistical Supplement, 2011 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2005
Social Security Administration — The Annual Statistical Supplement, 2005 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2015
Social Security Administration — The Annual Statistical Supplement, 2015 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2003
Social Security Administration — The Annual Statistical Supplement, 2003 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2017
Social Security Administration — The Annual Statistical Supplement, 2017 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...
Annual Statistical Supplement, 2008
Social Security Administration — The Annual Statistical Supplement, 2008 includes the most comprehensive data available on the Social Security and Supplemental Security Income programs. More than...