WorldWideScience

Sample records for multiple-point statistical prediction

  1. Multiple-point statistical prediction on fracture networks at Yucca Mountain

    International Nuclear Information System (INIS)

    Liu, X.Y.; Zhang, C.Y.; Liu, Q.S.; Birkholzer, J.T.

    2009-01-01

    In many underground nuclear waste repository systems, such as Yucca Mountain, the water flow rate and the amount of water seepage into the waste emplacement drifts are mainly determined by the hydrological properties of the fracture network in the surrounding rock mass. A natural fracture network system is not easy to describe, especially with respect to its connectivity, which is critically important for simulating the water flow field. In this paper, we introduce a new method for fracture network description and prediction, termed multiple-point statistics (MPS). The MPS method records multiple-point statistics concerning the connectivity patterns of a fracture network from a known fracture map, and reproduces multiple-scale training fracture patterns in a stochastic manner, implicitly and directly. It is applied to fracture data to study flow field behavior at the Yucca Mountain waste repository system. First, the MPS method is used to create a fracture network from an original fracture training image from the Yucca Mountain dataset. After adopting a combined harmonic and arithmetic averaging method to upscale the permeability to a coarse grid, a coupled thermal-hydrological-mechanical (THM) simulation is carried out to study near-field water flow around the waste emplacement drifts. Our study shows that the connectivity, or patterns, of fracture networks can be grasped and reconstructed by MPS methods. In theory, this will lead to better prediction of fracture system characteristics and flow behavior. Meanwhile, we can obtain the variance of the flow field, which gives us a way to quantify model uncertainty even in complicated coupled THM simulations. This indicates that MPS can potentially characterize and reconstruct natural fracture networks in a fractured rock mass, with the advantages of quantifying the connectivity of the fracture system and its simulation uncertainty simultaneously.
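
    The harmonic and arithmetic averaging used in the upscaling step can be made concrete with a short sketch. The following is a minimal numpy illustration, not the authors' code: assuming flow along x, cells in series within each row of a coarse block are combined with a harmonic mean, and the rows of the block in parallel with an arithmetic mean; the block size and the lognormal test field are assumptions for demonstration.

    ```python
    import numpy as np

    def upscale_permeability(k_fine, block=(5, 5)):
        """Upscale a 2-D permeability field to a coarse grid, for flow along x:
        harmonic mean along each row of a block (cells in series), then
        arithmetic mean across the rows (rows in parallel)."""
        by, bx = block
        ny, nx = k_fine.shape
        k_coarse = np.empty((ny // by, nx // bx))
        for i in range(ny // by):
            for j in range(nx // bx):
                blk = k_fine[i * by:(i + 1) * by, j * bx:(j + 1) * bx]
                k_rows = bx / np.sum(1.0 / blk, axis=1)  # harmonic mean per row
                k_coarse[i, j] = k_rows.mean()           # arithmetic mean of rows
        return k_coarse

    # lognormal stand-in for a fine-scale fractured-rock permeability field
    k_fine = np.exp(np.random.default_rng(0).normal(size=(100, 100)))
    print(upscale_permeability(k_fine).shape)  # -> (20, 20)
    ```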

  2. History Matching Through a Smooth Formulation of Multiple-Point Statistics

    DEFF Research Database (Denmark)

    Melnikova, Yulia; Zunino, Andrea; Lange, Katrine

    2014-01-01

    We propose a smooth formulation of multiple-point statistics that enables us to solve inverse problems using gradient-based optimization techniques. We introduce a differentiable function that quantifies the mismatch between multiple-point statistics of a training image and of a given model. We show that, by minimizing this function, any continuous image can be gradually transformed into an image that honors the multiple-point statistics of the discrete training image. The solution to an inverse problem is then found by minimizing the sum of two mismatches: the mismatch with data and the mismatch with multiple-point statistics. As a result, in the framework of the Bayesian approach, such a solution belongs to a high posterior region. The methodology, while applicable to any inverse problem with a training-image-based prior, is especially beneficial for problems which require expensive...
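
    The structure of the objective, a data mismatch plus a differentiable pattern-statistics mismatch, can be sketched on a toy problem. In the sketch below, the paper's smooth multiple-point mismatch is replaced by smooth two-point pattern frequencies, and the linear forward operator and all problem sizes are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    ti = (rng.random((32, 32)) < 0.3).astype(float)    # stand-in training image

    def pair_freqs(img):
        """Smooth frequencies of the four horizontal 2-point patterns; a toy,
        differentiable stand-in for the full multiple-point statistics."""
        a, b = img[:, :-1], img[:, 1:]
        return np.array([(a * b).mean(), (a * (1 - b)).mean(),
                         ((1 - a) * b).mean(), ((1 - a) * (1 - b)).mean()])

    f_ti = pair_freqs(ti)                              # target pattern statistics
    G = rng.random((20, 64))                           # toy linear forward operator
    d_obs = G @ (rng.random(64) < 0.3).astype(float)   # synthetic observed data

    def objective(m, alpha=10.0):
        data_mismatch = np.sum((G @ m - d_obs) ** 2)
        mps_mismatch = np.sum((pair_freqs(m.reshape(8, 8)) - f_ti) ** 2)
        return data_mismatch + alpha * mps_mismatch    # sum of the two mismatches

    res = minimize(objective, np.full(64, 0.5), method="L-BFGS-B",
                   bounds=[(0.0, 1.0)] * 64)
    print(res.fun)
    ```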

  3. Variogram-based and Multiple-Point Statistical simulation of shallow aquifer structures in the Upper Salzach valley, Austria

    Science.gov (United States)

    Jandrisevits, Carmen; Marschallinger, Robert

    2014-05-01

    Quaternary sediments in overdeepened alpine valleys and basins in the Eastern Alps bear substantial groundwater resources. The associated aquifer systems are generally geometrically complex, with highly variable hydraulic properties. 3D geological models provide predictions of both the geometry and the properties of the subsurface required for subsequent modelling of groundwater flow and transport. In hydrology, geostatistical Kriging and Kriging-based conditional simulations are widely used to predict the spatial distribution of hydrofacies. In the course of investigating the shallow aquifer structures in the Zell basin in the Upper Salzach valley (Salzburg, Austria), a benchmark of available geostatistical modelling and simulation methods was performed: traditional variogram-based geostatistical methods, i.e. Indicator Kriging, Sequential Indicator Simulation and Sequential Indicator Co-Simulation, were used as well as Multiple-Point Statistics. The ~6 km² investigation area is sampled by 56 drillings with depths of 5 to 50 m; in addition, there are two geophysical sections with lengths of 2 km and depths of 50 m. Due to clustered drilling sites, Indicator Kriging models failed to consistently model the spatial variability of hydrofacies. Using classical variogram-based geostatistical simulation (SIS), equally probable realizations were generated, with differences among the realizations providing an uncertainty measure. The resulting models are unstructured from a geological point of view; they do not portray the shapes and lateral extents of the associated sedimentary units. Since variograms consider only two-point spatial correlations, they are unable to capture the spatial variability of complex geological structures. The Multiple-Point Statistics approach overcomes these limitations of two-point statistics, as it uses a training image instead of variograms. The 3D training image can be seen as a reference facies model in which geological knowledge about depositional...
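
    To make the "two-point" limitation of variograms concrete, here is a minimal sketch of an experimental indicator variogram, the statistic underlying Indicator Kriging and SIS; the sample locations and the toy facies field are assumptions for demonstration, not data from the study.

    ```python
    import numpy as np

    def indicator_variogram(coords, facies, code, lags, tol=0.5):
        """Experimental indicator variogram for one hydrofacies code:
        gamma(h) = half the mean squared indicator difference at lag h.
        This is exactly a two-point spatial statistic."""
        ind = (facies == code).astype(float)
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        gamma = []
        for h in lags:
            i, j = np.where((np.abs(d - h) < tol) & (d > 0))
            gamma.append(0.5 * np.mean((ind[i] - ind[j]) ** 2))
        return np.array(gamma)

    rng = np.random.default_rng(0)
    coords = rng.random((200, 2)) * 100.0            # e.g. borehole sample sites
    facies = (coords[:, 0] // 20).astype(int) % 2    # striped toy facies field
    print(indicator_variogram(coords, facies, 1, lags=[5, 10, 20, 40]))
    ```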

  4. Pilot points method for conditioning multiple-point statistical facies simulation on flow data

    Science.gov (United States)

    Ma, Wei; Jafarpour, Behnam

    2018-05-01

    We propose a new pilot points method for conditioning discrete multiple-point statistical (MPS) facies simulation on dynamic flow data. While conditioning MPS simulation on static hard data is straightforward, their calibration against nonlinear flow data is nontrivial. The proposed method generates conditional models from a conceptual model of geologic connectivity, known as a training image (TI), by strategically placing and estimating pilot points. To place pilot points, a score map is generated based on three sources of information: (i) the uncertainty in facies distribution, (ii) the model response sensitivity information, and (iii) the observed flow data. Once the pilot points are placed, the facies values at these points are inferred from production data and then are used, along with available hard data at well locations, to simulate a new set of conditional facies realizations. While facies estimation at the pilot points can be performed using different inversion algorithms, in this study the ensemble smoother (ES) is adopted to update permeability maps from production data, which are then used to statistically infer facies types at the pilot point locations. The developed method combines the information in the flow data and the TI by using the former to infer facies values at selected locations away from the wells, and the latter to ensure consistent facies structure and connectivity away from measurement locations. Several numerical experiments are used to evaluate the performance of the developed method and to discuss its important properties.
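
    A minimal sketch of the score-map idea, assuming min-max normalisation and an equal weighting of the three information sources (the paper does not necessarily combine them this way):

    ```python
    import numpy as np

    def pilot_point_scores(facies_prob, sensitivity, data_mismatch, w=(1.0, 1.0, 1.0)):
        """Combine the three information sources into one placement score map;
        normalisation and weights here are illustrative choices."""
        def norm(x):
            x = np.asarray(x, float)
            return (x - x.min()) / (np.ptp(x) + 1e-12)
        uncertainty = 1.0 - np.abs(2.0 * np.asarray(facies_prob) - 1.0)  # max at p=0.5
        return (w[0] * norm(uncertainty) + w[1] * norm(sensitivity)
                + w[2] * norm(data_mismatch))

    def place_pilot_points(score, k=5):
        """Return grid indices of the k highest-scoring cells."""
        flat = np.argsort(score.ravel())[::-1][:k]
        return np.column_stack(np.unravel_index(flat, score.shape))

    rng = np.random.default_rng(0)
    score = pilot_point_scores(rng.random((20, 20)), rng.random((20, 20)),
                               rng.random((20, 20)))
    print(place_pilot_points(score))
    ```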

  5. Multiple point statistical simulation using uncertain (soft) conditional data

    Science.gov (United States)

    Hansen, Thomas Mejer; Vu, Le Thanh; Mosegaard, Klaus; Cordua, Knud Skou

    2018-05-01

    Geostatistical simulation methods have been used to quantify spatial variability of reservoir models since the 1980s. In the last two decades, state-of-the-art simulation methods have changed from being based on covariance-based two-point statistics to multiple-point statistics (MPS), which allow simulation of more realistic Earth structures. In addition, increasing amounts of geo-information (geophysical, geological, etc.) from multiple sources are being collected. This poses the problem of integrating these different sources of information, such that decisions related to reservoir models can be taken on as informed a basis as possible. In principle, though difficult in practice, this can be achieved using computationally expensive Monte Carlo methods. Here we investigate the use of sequential-simulation-based MPS methods conditional to uncertain (soft) data as a computationally efficient alternative. First, it is demonstrated that current implementations of sequential simulation based on MPS (e.g. SNESIM, ENESIM and Direct Sampling) do not account properly for uncertain conditional information, due to a combination of using only co-located information and a random simulation path. Then, we suggest two approaches that better account for the available uncertain information. The first makes use of a preferential simulation path, where more informed model parameters are visited preferentially to less informed ones. The second approach involves using non-co-located uncertain information. For different types of available data, these approaches are demonstrated to produce simulation results similar to those obtained by the general Monte Carlo based approach. These methods allow MPS simulation to condition properly to uncertain (soft) data, and hence provide a computationally attractive approach for integrating information about a reservoir model.
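
    The preferential-path idea can be sketched by ranking grid nodes by the entropy of their soft facies probabilities, so the best-informed nodes are simulated first; this is a minimal illustration, not the authors' implementation.

    ```python
    import numpy as np

    def preferential_path(soft_probs):
        """Order grid nodes for sequential simulation so that the most informed
        nodes (lowest entropy of the soft facies probabilities) are visited
        first, instead of following a fully random path."""
        p = np.clip(soft_probs, 1e-12, 1.0)
        entropy = -np.sum(p * np.log(p), axis=-1)  # one value per grid node
        return np.argsort(entropy.ravel())         # most informed first

    # two facies; node 1 carries strong soft information, node 0 none
    soft = np.array([[0.5, 0.5], [0.95, 0.05], [0.7, 0.3]])
    print(preferential_path(soft))                 # -> [1 2 0]
    ```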

  6. Pseudo-dynamic source modelling with 1-point and 2-point statistics of earthquake source parameters

    KAUST Repository

    Song, S. G.

    2013-12-24

    Ground motion prediction is an essential element in seismic hazard and risk analysis. Empirical ground motion prediction approaches have been widely used in the community, but efficient simulation-based ground motion prediction methods are needed to complement empirical approaches, especially in regions with limited data constraints. Recently, dynamic rupture modelling has been successfully adopted in physics-based source and ground motion modelling, but it is still computationally demanding and many input parameters are not well constrained by observational data. Pseudo-dynamic source modelling keeps the form of kinematic modelling, with its computational efficiency, but also tries to emulate the physics of the source process. In this paper, we develop a statistical framework that governs the finite-fault rupture process with 1-point and 2-point statistics of source parameters in order to quantify the variability of finite source models for future scenario events. We test this method by extracting 1-point and 2-point statistics from dynamically derived source models and simulating a number of rupture scenarios, given target 1-point and 2-point statistics. We propose a new rupture model generator for stochastic source modelling with the covariance matrix constructed from target 2-point statistics, that is, auto- and cross-correlations. Our sensitivity analysis of near-source ground motions to 1-point and 2-point statistics of source parameters provides insights into the relations between statistical rupture properties and ground motions. We observe that larger standard deviation and stronger correlation produce stronger peak ground motions in general. The proposed source modelling approach will contribute to understanding the effect of the earthquake source on near-source ground motion characteristics in a more quantitative and systematic way.
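
    A minimal sketch of the covariance-based generator idea: two correlated Gaussian profiles (say, slip and rupture velocity) drawn with a target auto-correlation and cross-correlation via a Cholesky factor. The exponential correlation function and all parameter values are illustrative assumptions, not the paper's calibrated statistics.

    ```python
    import numpy as np

    def correlated_source_fields(n=200, dx=0.5, a=10.0, rho=0.6, seed=1):
        """Draw two along-strike source-parameter profiles with exponential
        auto-correlation of length a and point-wise cross-correlation rho,
        via a Cholesky factor of the joint covariance matrix."""
        x = np.arange(n) * dx
        C = np.exp(-np.abs(x[:, None] - x[None, :]) / a)  # target auto-correlation
        K = np.block([[C, rho * C], [rho * C, C]])        # joint (cross-)covariance
        L = np.linalg.cholesky(K + 1e-10 * np.eye(2 * n))
        z = L @ np.random.default_rng(seed).standard_normal(2 * n)
        return z[:n], z[n:]

    slip, vr = correlated_source_fields()
    print(np.corrcoef(slip, vr)[0, 1])  # close to rho for long profiles
    ```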

  7. Common pitfalls in statistical analysis: The perils of multiple testing

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2016-01-01

    Multiple testing refers to situations where a dataset is subjected to statistical testing multiple times, either at multiple time-points, through multiple subgroups, or for multiple end-points. This amplifies the probability of a false-positive finding. In this article, we look at the consequences of multiple testing and explore various methods to deal with this issue. PMID:27141478
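
    One standard way to deal with the issue, shown here purely for illustration (the article surveys several such procedures), is Holm's step-down version of the Bonferroni correction:

    ```python
    def holm_bonferroni(pvals, alpha=0.05):
        """Holm's step-down procedure: controls the family-wise error rate
        across multiple tests and is uniformly more powerful than the
        plain Bonferroni correction."""
        m = len(pvals)
        order = sorted(range(m), key=lambda i: pvals[i])
        reject = [False] * m
        for rank, i in enumerate(order):
            if pvals[i] <= alpha / (m - rank):  # alpha/m, alpha/(m-1), ...
                reject[i] = True
            else:
                break                           # all larger p-values also fail
        return reject

    # four end-points tested on the same data set
    print(holm_bonferroni([0.001, 0.010, 0.040, 0.300]))  # [True, True, False, False]
    ```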

  8. A MOSUM procedure for the estimation of multiple random change points

    OpenAIRE

    Eichinger, Birte; Kirch, Claudia

    2018-01-01

    In this work, we investigate statistical properties of change point estimators based on moving sum statistics. We extend results for testing in a classical situation with multiple deterministic change points by allowing for random exogenous change points that arise in Hidden Markov or regime switching models among others. To this end, we consider a multiple mean change model with possible time series errors and prove that the number and location of change points are estimated consistently by ...
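
    A minimal sketch of a moving-sum change point statistic of the kind the paper builds on; the bandwidth G, the noise-level estimate and the toy data are assumptions for demonstration.

    ```python
    import numpy as np

    def mosum_statistic(x, G):
        """Moving-sum statistic: difference between the sums of the G points
        after and before each time k, standardized; pronounced local maxima
        indicate change points in the mean."""
        x = np.asarray(x, float)
        s = np.concatenate(([0.0], np.cumsum(x)))
        n = len(x)
        k = np.arange(G, n - G + 1)
        diff = (s[k + G] - s[k]) - (s[k] - s[k - G])
        sigma = np.std(np.diff(x)) / np.sqrt(2)   # rough noise-level estimate
        return k, np.abs(diff) / (sigma * np.sqrt(2 * G))

    x = np.r_[np.zeros(100), np.ones(100)] + \
        np.random.default_rng(0).normal(0.0, 0.5, 200)
    k, T = mosum_statistic(x, G=30)
    print(k[T.argmax()])                          # estimated change point, near 100
    ```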

  9. Analysis Code - Data Analysis in 'Leveraging Multiple Statistical Methods for Inverse Prediction in Nuclear Forensics Applications' (LMSMIPNFA) v. 1.0

    Energy Technology Data Exchange (ETDEWEB)

    2018-03-19

    R code that performs the analysis of a data set presented in the paper ‘Leveraging Multiple Statistical Methods for Inverse Prediction in Nuclear Forensics Applications’ by Lewis, J., Zhang, A., and Anderson-Cook, C. It provides functions for doing inverse predictions in this setting using several different statistical methods. The data set is a publicly available data set from a historical plutonium production experiment.

  10. Accelerating simulation for the multiple-point statistics algorithm using vector quantization

    Science.gov (United States)

    Zuo, Chen; Pan, Zhibin; Liang, Hao

    2018-03-01

    Multiple-point statistics (MPS) is a prominent algorithm for simulating categorical variables based on a sequential simulation procedure. Taking training images (TIs) as prior conceptual models, MPS extracts patterns from the TIs using a template and records their occurrences in a database. However, complex patterns increase the size of the database and require considerable time to retrieve the desired elements. In order to speed up simulation and improve simulation quality over state-of-the-art MPS methods, we propose an accelerated MPS simulation method using vector quantization (VQ), called VQ-MPS. First, a variable representation is presented to make categorical variables applicable for vector quantization. Second, we adopt a tree-structured VQ to compress the database so that stationary simulations are realized. Finally, a transformed template and classified VQ are used to address nonstationarity. A two-dimensional (2D) stationary channelized reservoir image is used to validate the proposed VQ-MPS. In comparison with several existing MPS programs, our method exhibits significantly better performance in terms of computational time, pattern reproduction, and spatial uncertainty. Further demonstrations consist of a 2D four-facies simulation, two 2D nonstationary channel simulations, and a three-dimensional (3D) rock simulation. The results reveal that the proposed method is also capable of handling multi-facies, nonstationary, and 3D simulations based on 2D TIs.
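
    The compression idea can be sketched with an off-the-shelf clusterer standing in for the paper's tree-structured VQ; the flat k-means, the 3x3 template and the toy training image below are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    # Toy pattern database: all 3x3 data events of a binary training image
    rng = np.random.default_rng(0)
    ti = (rng.random((100, 100)) < 0.3).astype(float)
    patterns = np.array([ti[i:i + 3, j:j + 3].ravel()
                         for i in range(98) for j in range(98)])

    # Vector quantization: the ~10^4-entry database is replaced by a small
    # codebook, so retrieval scans 64 prototypes instead of every pattern.
    codebook = KMeans(n_clusters=64, n_init=4, random_state=0).fit(patterns)

    def nearest_prototype(data_event):
        """Best-matching codebook prototype for a flattened 3x3 data event."""
        d2 = ((codebook.cluster_centers_ - data_event) ** 2).sum(axis=1)
        return codebook.cluster_centers_[np.argmin(d2)]

    print(nearest_prototype(patterns[0]).round(2))
    ```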

  11. An efficient method for the prediction of deleterious multiple-point mutations in the secondary structure of RNAs using suboptimal folding solutions

    Directory of Open Access Journals (Sweden)

    Barash Danny

    2008-04-01

    Background: RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single-point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single-point mutations to the general case of multiple-point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. Analyzing multiple-point mutations, a process that requires traversing all possible mutations, becomes highly expensive, since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformationally rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results: Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nt and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion: A highly efficient addition to RNAmute that is as user friendly as the original application, but that facilitates the practical analysis of multiple-point mutations, is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary...
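
    The combinatorial explosion that motivates the method is easy to make concrete; the count below (position subsets times three alternative bases per position) illustrates the O(n^m) growth.

    ```python
    from math import comb

    def n_m_point_mutants(n, m):
        """Number of exhaustive m-point mutants of a length-n RNA sequence:
        C(n, m) position subsets times 3 alternative bases per position."""
        return comb(n, m) * 3 ** m

    for m in (1, 2, 3):
        print(m, n_m_point_mutants(100, m))
    # 1 -> 300, 2 -> 44550, 3 -> 4365900 structures to fold without filtering
    ```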

  12. Modelling a real-world buried valley system with vertical non-stationarity using multiple-point statistics

    DEFF Research Database (Denmark)

    He, Xiulan; Sonnenborg, Torben; Jørgensen, Flemming

    2017-01-01

    Stationarity has traditionally been a requirement of geostatistical simulations. A common way to deal with non-stationarity is to divide the system into stationary sub-regions and subsequently merge the realizations for each region. Recently, the so-called partition approach, which has the flexibility to model non-stationary systems directly, was developed for multiple-point statistics simulation (MPS). The objective of this study is to apply the MPS partition method, with conventional borehole logs and high-resolution airborne electromagnetic (AEM) data, to the simulation of a real-world non-stationary geological system characterized by a network of connected buried valleys that incise deeply into layered Miocene sediments (case study in Denmark). The results show that, based on fragmented information of the formation boundaries, the MPS partition method is able to simulate a non-stationary system...

  13. Improving the Pattern Reproducibility of Multiple-Point-Based Prior Models Using Frequency Matching

    DEFF Research Database (Denmark)

    Cordua, Knud Skou; Hansen, Thomas Mejer; Mosegaard, Klaus

    2014-01-01

    Some multiple-point-based sampling algorithms, such as the snesim algorithm, rely on sequential simulation. The conditional probability distributions that are used for the simulation are based on statistics of multiple-point data events obtained from a training image. During the simulation, data events with zero probability in the training image statistics may occur. This is handled by pruning the set of conditioning data until an event with non-zero probability is found. The resulting probability distribution sampled by such algorithms is a pruned mixture model. The pruning strategy leads to a probability distribution that lacks some of the information provided by the multiple-point statistics from the training image, which reduces the reproducibility of the training image patterns in the outcome realizations. When pruned mixture models are used as prior models for inverse problems, local re...

  14. Investigating lithological and geophysical relationships with applications to geological uncertainty analysis using Multiple-Point Statistical methods

    DEFF Research Database (Denmark)

    Barfod, Adrian

    The PhD thesis presents a new method for analyzing the relationship between resistivity and lithology, as well as a method for quantifying the hydrostratigraphic modeling uncertainty related to Multiple-Point Statistical (MPS) methods. Three-dimensional (3D) geological models are im... ...is to improve analysis and research of the resistivity-lithology relationship and ensemble geological/hydrostratigraphic modeling. The groundwater mapping campaign in Denmark, beginning in the 1990s, has resulted in the collection of large amounts of borehole and geophysical data. The data have been compiled in two publicly available databases, the JUPITER and GERDA databases, which contain borehole and geophysical data, respectively. The large amounts of available data provided a unique opportunity for studying the resistivity-lithology relationship. The method for analyzing the resistivity...

  15. The statistics of the points where nodal lines intersect a reference curve

    International Nuclear Information System (INIS)

    Aronovitch, Amit; Smilansky, Uzy

    2007-01-01

    We study the intersection points of a fixed planar curve Γ with the nodal set of a translationally invariant and isotropic Gaussian random field Ψ(r) and the zeros of its normal derivative across the curve. The intersection points form a discrete random process which is the object of this study. The field probability distribution function is completely specified by the correlation function G(|r - r'|) = ⟨Ψ(r)Ψ(r')⟩. Given an arbitrary G(|r - r'|), we compute the two-point correlation function of the point process on the line, and derive other statistical measures (repulsion, rigidity) which characterize the short- and long-range correlations of the intersection points. We use these statistical measures to quantitatively characterize the complex patterns displayed by various kinds of nodal networks. We apply these statistics in particular to nodal patterns of random waves and of eigenfunctions of chaotic billiards. Of special interest is the observation that for monochromatic random waves, the number variance of the intersections with long straight segments grows like L ln L, as opposed to the linear growth predicted by the percolation model, which was successfully used to predict other long-range nodal properties of that field.

  16. A location-based multiple point statistics method: modelling the reservoir with non-stationary characteristics

    Directory of Open Access Journals (Sweden)

    Yin Yanshu

    2017-12-01

    In this paper, a location-based multiple-point statistics method is developed to model a non-stationary reservoir. The proposed method characterizes the relationship between the sedimentary pattern and the deposit location using a relative central-position distance function, which alleviates the requirement that the training image and the simulated grid have the same dimensions. The weights in every direction of the distance function can be changed to characterize the reservoir heterogeneity in various directions. Local integral replacements of data events, a structured random path, a distance tolerance and a multi-grid strategy are applied to reproduce the sedimentary patterns and obtain a more realistic result. The method is compared with the traditional Snesim method using a synthesized 3-D training image of Poyang Lake and a reservoir model of the Shengli Oilfield in China. The results indicate that the new method can reproduce non-stationary characteristics better than the traditional method and is more suitable for the simulation of delta-front deposits. These results show that the new method is a powerful tool for modelling a reservoir with non-stationary characteristics.

  17. Statistical Power in Evaluations That Investigate Effects on Multiple Outcomes: A Guide for Researchers

    Science.gov (United States)

    Porter, Kristin E.

    2018-01-01

    Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…

  18. Pairwise contact energy statistical potentials can help to find probability of point mutations.

    Science.gov (United States)

    Saravanan, K M; Suvaithenamudhan, S; Parthasarathy, S; Selvaraj, S

    2017-01-01

    To adopt a particular fold, a protein requires several interactions between its amino acid residues. The energetic contribution of these residue-residue interactions can be approximated by extracting statistical potentials from known high-resolution structures. Several methods based on statistical potentials extracted from unrelated proteins are found to make a better prediction of the probability of point mutations. We postulate that statistical potentials extracted from known structures of similar folds with varying sequence identity can be a powerful tool for examining the probability of point mutation. With this in mind, we have derived pairwise residue and atomic contact energy potentials for the different functional families that adopt the (α/β)8 TIM-barrel fold. We carried out computational point mutations at various conserved residue positions in the yeast triosephosphate isomerase enzyme, for which experimental results have already been reported. We have also performed molecular dynamics simulations on a subset of point mutants to make a comparative study. The difference in pairwise residue and atomic contact energy between the wild type and various point mutations reveals the probability of mutation at a particular position. Interestingly, we found that our computational prediction agrees with the experimental studies of Silverman et al. (Proc Natl Acad Sci 2001;98:3092-3097) and performs better than I-Mutant and the Cologne University Protein Stability Analysis Tool. The present work thus suggests that deriving pairwise contact energy potentials and molecular dynamics simulations of functionally important folds could help us to predict the probability of point mutations, which may ultimately reduce the time and cost of mutation experiments. Proteins 2016; 85:54-64. © 2016 Wiley Periodicals, Inc.
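
    A generic sketch of the underlying bookkeeping: summing pairwise contact energies around a mutated position and comparing mutant to wild type. The contact map and energy values below are made up for demonstration; the paper derives fold-specific potentials from TIM-barrel structures.

    ```python
    import numpy as np

    def mutation_contact_energy_change(contact_map, pair_energy, seq, pos, new_aa):
        """Change in summed pairwise contact energy when residue `pos` is
        mutated to `new_aa`; a positive (destabilising) difference suggests
        a less probable mutation. `pair_energy` is a symmetric lookup of
        contact energies per residue pair (hypothetical values here)."""
        partners = np.where(contact_map[pos])[0]
        e_wild = sum(pair_energy[(seq[pos], seq[j])] for j in partners if j != pos)
        e_mut = sum(pair_energy[(new_aa, seq[j])] for j in partners if j != pos)
        return e_mut - e_wild

    # toy 4-residue example with made-up, illustrative energies
    contact_map = np.array([[0, 1, 0, 1], [1, 0, 1, 0],
                            [0, 1, 0, 1], [1, 0, 1, 0]], bool)
    e = {("A", "L"): -0.8, ("L", "A"): -0.8, ("A", "K"): 0.2, ("K", "A"): 0.2,
         ("G", "L"): -0.1, ("L", "G"): -0.1, ("G", "K"): 0.3, ("K", "G"): 0.3}
    print(mutation_contact_energy_change(contact_map, e, "ALGK", 0, "G"))  # 0.8
    ```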

  19. Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

    Science.gov (United States)

    Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D

    2017-01-01

    If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
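
    A minimal sketch of the recommended practice: pool C-statistics on the logit scale with a DerSimonian-Laird random-effects model. The delta-method standard errors and the input values below are illustrative.

    ```python
    import numpy as np

    def meta_c_statistic(c, se_c):
        """Random-effects (DerSimonian-Laird) meta-analysis of C-statistics,
        pooled on the logit scale as the paper recommends."""
        c, se_c = np.asarray(c, float), np.asarray(se_c, float)
        theta = np.log(c / (1 - c))                  # logit-transformed C-statistics
        se = se_c / (c * (1 - c))                    # delta-method SE on logit scale
        w = 1.0 / se**2
        theta_fe = np.sum(w * theta) / np.sum(w)     # fixed-effect pooled value
        Q = np.sum(w * (theta - theta_fe)**2)
        k = len(c)
        tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
        w_re = 1.0 / (se**2 + tau2)
        theta_re = np.sum(w_re * theta) / np.sum(w_re)
        return 1.0 / (1.0 + np.exp(-theta_re)), tau2  # pooled C, heterogeneity

    print(meta_c_statistic([0.72, 0.68, 0.80, 0.75], [0.02, 0.03, 0.04, 0.025]))
    ```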

  20. FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants.

    Science.gov (United States)

    Bednar, David; Beerens, Koen; Sebestova, Eva; Bendl, Jaroslav; Khare, Sagar; Chaloupkova, Radka; Prokop, Zbynek; Brezovsky, Jan; Baker, David; Damborsky, Jiri

    2015-11-01

    There is great interest in increasing proteins' stability to enhance their utility as biocatalysts, therapeutics, diagnostics and nanomaterials. Directed evolution is a powerful, but experimentally strenuous approach. Computational methods offer attractive alternatives. However, due to the limited reliability of predictions and potentially antagonistic effects of substitutions, only single-point mutations are usually predicted in silico, experimentally verified and then recombined in multiple-point mutants. Thus, substantial screening is still required. Here we present FireProt, a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. FireProt's reliability and applicability was demonstrated by validating its predictions against 656 mutations from the ProTherm database. We demonstrate that thermostability of the model enzymes haloalkane dehalogenase DhaA and γ-hexachlorocyclohexane dehydrochlorinase LinA can be substantially increased (ΔTm = 24°C and 21°C) by constructing and characterizing only a handful of multiple-point mutants. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available, and will facilitate the rapid development of robust proteins for biomedical and biotechnological applications.

  2. Determination of shell correction energies at saddle point using pre-scission neutron multiplicities

    International Nuclear Information System (INIS)

    Golda, K.S.; Saxena, A.; Mittal, V.K.; Mahata, K.; Sugathan, P.; Jhingan, A.; Singh, V.; Sandal, R.; Goyal, S.; Gehlot, J.; Dhal, A.; Behera, B.R.; Bhowmik, R.K.; Kailas, S.

    2013-01-01

    Pre-scission neutron multiplicities have been measured for the ¹²C + ¹⁹⁴,¹⁹⁸Pt systems at matching excitation energies in the near-Coulomb-barrier region. A statistical model analysis with a modified fission barrier and level density prescription has been carried out to fit the measured pre-scission neutron multiplicities and the available evaporation residue and fission cross sections simultaneously, in order to constrain the statistical model parameters. Simultaneous fitting of the pre-scission neutron multiplicities and the cross section data requires a shell correction at the saddle point.

  3. Statistical Power in Evaluations That Investigate Effects on Multiple Outcomes: A Guide for Researchers

    Science.gov (United States)

    Porter, Kristin E.

    2016-01-01

    In education research and in many other fields, researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple…

  4. Multiplicative point process as a model of trading activity

    Science.gov (United States)

    Gontis, V.; Kaulakys, B.

    2004-11-01

    Signals consisting of a sequence of pulses show that the inherent origin of 1/f noise is a Brownian fluctuation of the average interevent time between subsequent pulses of the pulse sequence. In this paper, we generalize the model of interevent time to reproduce a variety of self-affine time series exhibiting a power spectral density S(f) scaling as a power of the frequency f. Furthermore, we analyze the relation between the power-law correlations and the origin of the power-law probability distribution of the signal intensity. We introduce a stochastic multiplicative model for the time intervals between point events and analyze the statistical properties of the signal analytically and numerically. Such a model system exhibits a power-law spectral density S(f) ~ 1/f^β for various values of β, including β = 1/2, 1 and 3/2. Explicit expressions for the power spectra in the low-frequency limit and for the distribution density of the interevent time are obtained. The counting statistics of the events are analyzed analytically and numerically, as well. The specific interest of our analysis is related to financial markets, where long-range correlations of price fluctuations largely depend on the number of transactions. We analyze the spectral density and counting statistics of the number of transactions. The model reproduces the spectral properties of real markets and explains the mechanism of the power-law distribution of trading activity. The study provides evidence that the statistical properties of financial markets are enclosed in the statistics of the time interval between trades. A multiplicative point process serves as a consistent model generating these statistics.
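
    A sketch of a multiplicative interevent-time iteration of this general type; the specific update rule, the reflecting bounds and all parameter values below are illustrative assumptions rather than the paper's exact model.

    ```python
    import numpy as np

    def multiplicative_interevent_times(n=100_000, gamma=1e-4, sigma=0.02, mu=0.5,
                                        tau_min=1e-3, tau_max=1.0, seed=0):
        """Iterate a multiplicative stochastic model for the interevent time:
        small drift plus multiplicative noise, kept inside [tau_min, tau_max]."""
        rng = np.random.default_rng(seed)
        tau = np.empty(n)
        t = 0.1
        for k in range(n):
            t = t + gamma * t**(2 * mu - 1) + sigma * t**mu * rng.standard_normal()
            t = min(max(t, tau_min), tau_max)    # crude reflecting bounds
            tau[k] = t
        return tau

    tau = multiplicative_interevent_times()
    events = np.cumsum(tau)                      # event times of the point process
    counts, _ = np.histogram(events, bins=np.arange(0.0, events[-1], 1.0))
    print(counts.mean(), counts.var())           # counting statistics per window
    ```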

  5. THE GROWTH POINTS OF STATISTICAL METHODS

    OpenAIRE

    Orlov A. I.

    2014-01-01

    On the basis of a new paradigm of applied mathematical statistics, data analysis and economic-mathematical methods are identified. We also discuss five topical areas in which modern applied statistics is developing, i.e. five 'growth points': nonparametric statistics, robustness, computer-statistical methods, statistics of interval data, and statistics of non-numeric data.

  6. Prediction of lacking control power in power plants using statistical models

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Mataji, B.; Stoustrup, Jakob

    2007-01-01

    Prediction of the performance of plants like power plants is of interest, since the plant operator can use these predictions to optimize the plant production. In this paper, the focus is on a special case where a combination of high coal moisture content and a high load limits the possible plant load, meaning that the requested plant load cannot be met. The available models are in this case uncertain. Instead, statistical methods are used to predict upper and lower uncertainty bounds on the prediction. Two different methods are used: the first relies on statistics of recent prediction errors; the second uses operating-point-dependent statistics of prediction errors. Using these methods on the previously mentioned case, it can be concluded that the second method can be used to predict the power plant performance, while the first method has problems predicting the uncertain performance...
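
    The second method can be sketched as conditioning the empirical error quantiles on the current operating point; the quantile binning scheme, the quantile levels and the synthetic data below are illustrative assumptions.

    ```python
    import numpy as np

    def error_bounds_by_operating_point(load, err, load_now, q=(0.05, 0.95), nbins=10):
        """Upper/lower uncertainty bounds from past prediction errors,
        conditioned on the current operating point (here: plant load)."""
        load, err = np.asarray(load, float), np.asarray(err, float)
        edges = np.quantile(load, np.linspace(0.0, 1.0, nbins + 1))
        b = np.clip(np.searchsorted(edges, load_now) - 1, 0, nbins - 1)
        in_bin = (load >= edges[b]) & (load <= edges[b + 1])
        return np.quantile(err[in_bin], q)       # lower/upper error quantiles

    rng = np.random.default_rng(0)
    load = rng.uniform(40.0, 100.0, 5000)        # historical operating points
    err = rng.normal(0.0, load / 50.0)           # errors grow with load (toy)
    print(error_bounds_by_operating_point(load, err, load_now=90.0))
    ```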

  7. An explicit statistical model of learning lexical segmentation using multiple cues

    NARCIS (Netherlands)

    Çöltekin, Ça ̆grı; Nerbonne, John; Lenci, Alessandro; Padró, Muntsa; Poibeau, Thierry; Villavicencio, Aline

    2014-01-01

    This paper presents an unsupervised and incremental model of learning lexical segmentation that combines multiple cues whose use by children and adults was attested by experimental studies. The cues we exploit in this study are predictability statistics, phonotactics, lexical stress and partial lexical...

  8. Improving multiple-point-based a priori models for inverse problems by combining Sequential Simulation with the Frequency Matching Method

    DEFF Research Database (Denmark)

    Cordua, Knud Skou; Hansen, Thomas Mejer; Lange, Katrine

    In order to move beyond simplified covariance-based a priori models, which are typically used for inverse problems, more complex multiple-point-based a priori models have to be considered. By means of marginal probability distributions ‘learned’ from a training image, sequential simulation has proven to be an efficient way of obtaining multiple realizations that honor the same multiple-point statistics as the training image. The frequency matching method provides an alternative way of formulating multiple-point-based a priori models. In this strategy the pattern frequency distributions (i.e. marginals) of the training image and a subsurface model are matched in order to obtain a solution with the same multiple-point statistics as the training image. Sequential Gibbs sampling is a simulation strategy that provides an efficient way of applying sequential simulation based algorithms as a priori...

  9. Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems

    Science.gov (United States)

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José C.; Mota-Sanchez, David; Estrada-González, Fermín; Gillberg, Jussi; Singh, Ravi; Mondal, Suchismita; Juliana, Philomin

    2018-01-01

    In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment–trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets. PMID:29097376
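
    A minimal sketch of IBCF in this setting, where rows would be lines ("users") and columns environment-trait combinations ("items"); the cosine similarity, neighbourhood size and toy ratings matrix are illustrative choices, not the paper's exact implementation.

    ```python
    import numpy as np

    def ibcf_predict(R, user, item, k=5):
        """Item-based collaborative filtering: predict R[user, item] from the
        user's known scores on the k items most similar to `item` (cosine
        similarity over columns; NaN marks unobserved entries)."""
        X = np.nan_to_num(R)                      # zero-fill for similarity calc
        norms = np.linalg.norm(X, axis=0) + 1e-12
        sim = (X.T @ X) / np.outer(norms, norms)  # item-item cosine similarity
        rated = np.where(~np.isnan(R[user]))[0]
        rated = rated[rated != item]
        top = rated[np.argsort(sim[item, rated])[::-1][:k]]
        w = sim[item, top]
        return np.sum(w * R[user, top]) / (np.abs(w).sum() + 1e-12)

    R = np.array([[4.0, np.nan, 3.0, 5.0],        # rows: lines ("users")
                  [3.0, 2.0, np.nan, 4.0],        # cols: environment-trait
                  [1.0, 5.0, 4.0, np.nan]])       #       combinations ("items")
    print(ibcf_predict(R, user=0, item=1))
    ```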

  12. New England observed and predicted median July stream/river temperature points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted median July stream/river temperatures in New England based on a spatial statistical network...

  13. New England observed and predicted median August stream/river temperature points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted median August stream/river temperatures in New England based on a spatial statistical network...

  14. Modeling spatial variability of sand-lenses in clay till settings using transition probability and multiple-point geostatistics

    DEFF Research Database (Denmark)

    Kessler, Timo Christian; Nilsson, Bertel; Klint, Knud Erik

    2010-01-01

    ...of sand-lenses in clay till. Sand-lenses mainly account for horizontal transport and are prioritised in this study. Based on field observations, the distribution has been modeled using two different geostatistical approaches. One method uses a Markov chain model calculating the transition probabilities (TPROGS) of alternating geological facies. The second method, multiple-point statistics, uses training images to estimate the conditional probability of sand-lenses at a certain location. Both methods respect field observations such as local stratigraphy; however, only the multiple-point statistics can...

  15. Statistical aspects of determinantal point processes

    DEFF Research Database (Denmark)

    Lavancier, Frédéric; Møller, Jesper; Rubak, Ege

    The statistical aspects of determinantal point processes (DPPs) seem largely unexplored. We review the appealing properties of DPPs, demonstrate that they are useful models for repulsiveness, detail a simulation procedure, and provide freely available software for simulation and statistical infer...

  16. New England observed and predicted August stream/river temperature daily range points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted August stream/river temperature daily ranges in New England based on a spatial statistical...

  17. New England observed and predicted growing season maximum stream/river temperature points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted growing season maximum stream/river temperatures in New England based on a spatial statistical...

  18. A new statistical scission-point model fed with microscopic ingredients to predict fission fragments distributions

    International Nuclear Information System (INIS)

    Heinrich, S.

    2006-01-01

    The nuclear fission process is a very complex phenomenon and, even nowadays, no realistic models describing the overall process are available. The work presented here deals with a theoretical description of fission fragment distributions in mass, charge, energy and deformation. We have reconsidered and updated the B.D. Wilkins scission-point model. Our purpose was to test whether this statistical model, applied at the scission point and fed with the results of modern microscopic calculations, allows a quantitative description of the fission fragment distributions. We calculate the surface energy available at the scission point as a function of the fragment deformations. This surface is obtained from a Hartree-Fock-Bogoliubov microscopic calculation, which guarantees a realistic description of the dependence of the potential on the deformation of each fragment. The statistical balance is described by the level densities of the fragments. We have tried to avoid as much as possible the input of empirical parameters in the model. Our only parameter, the distance between the fragments at the scission point, is discussed by comparison with scission configurations obtained from fully dynamical microscopic calculations. The comparison between our results and experimental data is very satisfying and allows us to discuss the successes and limitations of our approach. We finally propose ideas to improve the model, in particular by applying dynamical corrections. (author)

  19. New England observed and predicted July stream/river temperature daily range points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted July stream/river temperature daily ranges in New England based on a spatial statistical network...

  20. Multiplicity of pre-scission charged particle emission by a statistical model

    International Nuclear Information System (INIS)

    Matsuse, Takehiro

    1996-01-01

    By introducing a cut-off energy (E_cut-off) so as not to excite all statistically permitted scission configurations in the phase integral at the scission point, we try to reproduce the multiplicity of pre-scission charged-particle emission for ⁸⁶Kr (E_lab = 890 MeV) + ²⁷Al with a cascade calculation of the extended Hauser-Feshbach method (EHM). The physical picture is explained from the point of view of the lifetime of the compound nucleus in the statistical model. When the E_cut-off parameter is about 80 MeV, the scission cross section and the pre-scission charged-particle emission appear to be reproduced. The average pre-scission time is about 1.7 × 10⁻²⁰ s. The essential problem of the lifetime of compound nuclei is discussed. (S.Y.)

  2. Statistical aspects of determinantal point processes

    DEFF Research Database (Denmark)

    Lavancier, Frédéric; Møller, Jesper; Rubak, Ege Holger

    The statistical aspects of determinantal point processes (DPPs) seem largely unexplored. We review the appealing properties of DPPs, demonstrate that they are useful models for repulsiveness, detail a simulation procedure, and provide freely available software for simulation and statistical inference. We pay special attention to stationary DPPs, where we give a simple condition ensuring their existence, construct parametric models, describe how they can be well approximated so that the likelihood can be evaluated and realizations can be simulated, and discuss how statistical inference...

  3. Multiple Kernel Learning with Random Effects for Predicting Longitudinal Outcomes and Data Integration

    Science.gov (United States)

    Chen, Tianle; Zeng, Donglin

    2015-01-01

    Predicting disease risk and progression is one of the main goals in many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years, and data are collected at multiple visits. Although kernel-based statistical learning methods are proven to be powerful for a wide range of disease prediction problems, these methods are only well studied for independent data, not for longitudinal data. It is thus important to develop time-sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject-specific short-term and long-term latent effects through a designed kernel to account for within-subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use different kernels for each data source, taking advantage of the distinctive features of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's disease (Alzheimer's Disease Neuroimaging Initiative, ADNI), where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data. PMID:26177419

  4. Parametric statistical change point analysis

    CERN Document Server

    Chen, Jie

    2000-01-01

    This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models. Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several. The exposition is clear and systematic, with a great deal of introductory material included. Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature. Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models. Extensive examples throughout the text emphasize key concepts, and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches. A comprehensive bibliography and two indices complete the study.

  5. Analysis of relationship between registration performance of point cloud statistical model and generation method of corresponding points

    International Nuclear Information System (INIS)

    Yamaoka, Naoto; Watanabe, Wataru; Hontani, Hidekata

    2010-01-01

    When constructing a statistical point cloud model, we usually need to calculate corresponding points. The constructed statistical model will differ depending on the method used to calculate the corresponding points. This article examines the effect on statistical models of human organs of different methods for calculating corresponding points. We validated the performance of the statistical models by registering an organ surface in a 3D medical image. We compare two methods for calculating corresponding points. The first, 'Generalized Multi-Dimensional Scaling (GMDS)', determines the corresponding points from the shapes of two curved surfaces. The second, the 'Entropy-based Particle system', chooses corresponding points by statistical calculation over a number of curved surfaces. With these methods we constructed the statistical models, and using these models we conducted registration with the medical image. For the estimation, we use non-parametric belief propagation; this method estimates not only the position of the organ but also the probability density of the organ position. We evaluate how the two different methods for calculating corresponding points affect the statistical model through the change in the probability density of each point. (author)

  6. Statistical models for expert judgement and wear prediction

    International Nuclear Information System (INIS)

    Pulkkinen, U.

    1994-01-01

    This thesis studies the statistical analysis of expert judgements and the prediction of wear. The point of view adopted is that of information theory and Bayesian statistics. A general Bayesian framework for analyzing both expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied to analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models to optimizing operational strategies for inspected components is studied. Monte Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)

  7. Calorimetry end-point predictions

    International Nuclear Information System (INIS)

    Fox, M.A.

    1981-01-01

    This paper describes a portion of the work presently in progress at Rocky Flats in the field of calorimetry. In particular, calorimetry end-point predictions are outlined. The problems associated with end-point predictions and the progress made in overcoming these obstacles are discussed. The two major problems, noise and an accurate description of the heat function, are dealt with to obtain the most accurate results. Data are taken from an actual calorimeter and are processed by means of three different noise reduction techniques. The processed data are then utilized by one to four algorithms, depending on the accuracy desired, to determine the end-point.
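
    A minimal sketch of the end-point prediction idea: fit an assumed heat function to the early part of a run and extrapolate its equilibrium value. The single-exponential model and the synthetic run below are assumptions for illustration, not the Rocky Flats algorithms.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def heat_model(t, w_eq, b, tau):
        """Assumed heat-function form: exponential approach to equilibrium."""
        return w_eq - b * np.exp(-t / tau)

    def predict_end_point(t, w):
        """Fit the early part of a calorimeter run and extrapolate the
        equilibrium (end-point) power w_eq."""
        p0 = (w[-1], w[-1] - w[0], t[-1] / 2)      # crude starting guesses
        (w_eq, b, tau), _ = curve_fit(heat_model, t, w, p0=p0)
        return w_eq

    t = np.linspace(0.0, 60.0, 120)                # minutes
    w = heat_model(t, 5.0, 4.0, 25.0) + \
        np.random.default_rng(0).normal(0.0, 0.01, t.size)
    print(predict_end_point(t[:60], w[:60]))       # ~5.0 from the first half
    ```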

  8. Capturing rogue waves by multi-point statistics

    International Nuclear Information System (INIS)

    Hadjihosseini, A; Wächter, Matthias; Peinke, J; Hoffmann, N P

    2016-01-01

    As an example of a complex system with extreme events, we investigate ocean wave states exhibiting rogue waves. We present a statistical method of data analysis based on multi-point statistics which for the first time allows the grasping of extreme rogue wave events in a highly satisfactory statistical manner. The key to the success of the approach is mapping the complexity of multi-point data onto the statistics of hierarchically ordered height increments for different time scales, for which we can show that a stochastic cascade process with Markov properties is governed by a Fokker–Planck equation. Conditional probabilities as well as the Fokker–Planck equation itself can be estimated directly from the available observational data. With this stochastic description surrogate data sets can in turn be generated, which makes it possible to work out arbitrary statistical features of the complex sea state in general, and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics. (paper)
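
    The cascade analysis sketched above can be illustrated in a few lines: conditional statistics of height increments at nested time scales are estimated directly from a series, the same raw material from which drift and diffusion coefficients of the Fokker-Planck equation are usually extracted. The surrogate series, scales and bin counts below are placeholders, not the authors' ocean data.

```python
# Hedged sketch: conditional increment statistics at two nested time scales,
# the first step of the multi-point / Markov-cascade analysis described above.
import numpy as np

rng = np.random.default_rng(0)
eta = np.cumsum(rng.normal(size=100_000))       # surrogate surface elevation
eta -= eta.mean()

def increments(x, tau):
    """Height increments x(t + tau) - x(t) for lag tau (in samples)."""
    return x[tau:] - x[:-tau]

tau1, tau2 = 10, 40                             # nested scales, tau1 < tau2
d2 = increments(eta, tau2)
d1 = increments(eta, tau1)[: len(d2)]           # align sample counts

# Joint histogram and (unnormalized) conditional table p(d1 | d2): from such
# tables conditional PDFs and Fokker-Planck coefficients are estimated.
H, xe, ye = np.histogram2d(d1, d2, bins=50, density=True)
marginal_d2 = H.sum(axis=0, keepdims=True)
p_cond = np.divide(H, marginal_d2, out=np.zeros_like(H), where=marginal_d2 > 0)
print("conditional table shape:", p_cond.shape)
```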

  9. Impact of Genomics Platform and Statistical Filtering on Transcriptional Benchmark Doses (BMD) and Multiple Approaches for Selection of Chemical Point of Departure (PoD).

    Directory of Open Access Journals (Sweden)

    A Francina Webster

    Full Text Available Many regulatory agencies are exploring ways to integrate toxicogenomic data into their chemical risk assessments. The major challenge lies in determining how to distill the complex data produced by high-content, multi-dose gene expression studies into quantitative information. It has been proposed that benchmark dose (BMD) values derived from toxicogenomics data be used as point of departure (PoD) values in chemical risk assessments. However, there is limited information regarding which genomics platforms are most suitable and how to select appropriate PoD values. In this study, we compared BMD values modeled from RNA sequencing-, microarray-, and qPCR-derived gene expression data from a single study, and explored multiple approaches for selecting a single PoD from these data. The strategies evaluated include several that do not require prior mechanistic knowledge of the compound for selection of the PoD, thus providing approaches for assessing data-poor chemicals. We used RNA extracted from the livers of female mice exposed to non-carcinogenic (0 and 2 mg/kg/day; mkd) and carcinogenic (4 and 8 mkd) doses of furan for 21 days. We show that transcriptional BMD values were consistent across technologies and highly predictive of the two-year cancer bioassay-based PoD. We also demonstrate that filtering data based on statistically significant changes in gene expression prior to BMD modelling creates more conservative BMD values. Taken together, this case study on mice exposed to furan demonstrates that high-content toxicogenomics studies produce robust data for BMD modelling that are minimally affected by inter-technology variability and highly predictive of cancer-based PoD doses.

  10. BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

    Science.gov (United States)

    Kaur, Harpreet; Raghava, G P S

    2002-03-01

    beta-turns play an important role from a structural and functional point of view. beta-turns are the most common type of non-repetitive structures in proteins and comprise, on average, 25% of the residues. In the past numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-TURNS in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows the user to predict different types of beta-TURNS, e.g. type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-TURNS in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/

  11. Statistical predictions from anarchic field theory landscapes

    International Nuclear Information System (INIS)

    Balasubramanian, Vijay; Boer, Jan de; Naqvi, Asad

    2010-01-01

    Consistent coupling of effective field theories with a quantum theory of gravity appears to require bounds on the rank of the gauge group and the amount of matter. We consider landscapes of field theories subject to such boundedness constraints. We argue that appropriately 'coarse-grained' aspects of the randomly chosen field theory in such landscapes, such as the fraction of gauge groups with ranks in a given range, can be statistically predictable. To illustrate our point we show how the uniform measures on simple classes of N=1 quiver gauge theories localize in the vicinity of theories with certain typical structures. Generically, this approach would predict a high energy theory with very many gauge factors, with the high rank factors largely decoupled from the low rank factors if we require asymptotic freedom for the latter.

  12. The Statistical point of view of Quality: the Lean Six Sigma methodology.

    Science.gov (United States)

    Bertolaccini, Luca; Viti, Andrea; Terzi, Alberto

    2015-04-01

    Six Sigma and Lean are two quality improvement methodologies. The Lean Six Sigma methodology is applicable to repetitive procedures; therefore, its use in the health-care arena has focused mainly on areas of business operations, throughput, and case management, and on efficiency outcomes. After reviewing the methodology, the paper presents a brief clinical example of the use of Lean Six Sigma as a quality improvement method in the reduction of complications during and after lobectomies. Using the Lean Six Sigma methodology, multidisciplinary teams could identify multiple modifiable points across the surgical process. These process improvements could be applied to different surgical specialties and could result in a measurement, from a statistical point of view, of surgical quality.

  13. Spatial statistics for predicting flow through a rock fracture

    International Nuclear Information System (INIS)

    Coakley, K.J.

    1989-03-01

    Fluid flow through a single rock fracture depends on the shape of the space between the upper and lower pieces of rock which define the fracture. In this thesis, the normalized flow through a fracture, i.e. the equivalent permeability of a fracture, is predicted in terms of spatial statistics computed from the arrangement of voids, i.e. open spaces, and contact areas within the fracture. Patterns of voids and contact areas, with complexity typical of experimental data, are simulated by clipping a correlated Gaussian process defined on a N by N pixel square region. The voids have constant aperture; the distance between the upper and lower surfaces which define the fracture is either zero or a constant. Local flow is assumed to be proportional to local aperture cubed times local pressure gradient. The flow through a pattern of voids and contact areas is solved using a finite-difference method. After solving for the flow through simulated 10 by 10 by 30 pixel patterns of voids and contact areas, a model to predict equivalent permeability is developed. The first model is for patterns with 80% voids where all voids have the same aperture. The equivalent permeability of a pattern is predicted in terms of spatial statistics computed from the arrangement of voids and contact areas within the pattern. Four spatial statistics are examined. The change point statistic measures how often adjacent pixels alternate from void to contact area (or vice versa) in the rows of the patterns which are parallel to the overall flow direction. 37 refs., 66 figs., 41 tabs
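
    The change point statistic described above is simple enough to compute in a few lines. The sketch below assumes a binary void (1) / contact (0) pattern with flow along the rows; the random pattern is only a placeholder for the clipped-Gaussian simulations of the thesis.

```python
# Hedged sketch of the change point statistic: the fraction of horizontally
# adjacent pixel pairs that alternate between void and contact area.
import numpy as np

rng = np.random.default_rng(1)
pattern = (rng.random((10, 30)) < 0.8).astype(int)   # ~80% voids, flow along rows

def change_point_statistic(p):
    """Fraction of adjacent pixel pairs (along rows) that differ."""
    flips = np.abs(np.diff(p, axis=1))               # 1 where neighbours alternate
    return flips.mean()

print(f"change point statistic: {change_point_statistic(pattern):.3f}")
```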

  14. Statistical model of hadrons multiple production in space of total angular momentum and isotopic spin

    International Nuclear Information System (INIS)

    Gridneva, S.A.; Rus'kin, V.I.

    1980-01-01

    Basic features of the statistical model of multiple hadron production based on the microcanonical distribution and taking into account the laws of conservation of total angular momentum, isotopic spin, P-, G- and C-parity, and the requirements of Bose-Einstein statistics are given. The model predictions are compared with experimental data on anti-NN annihilation at rest and e+e- annihilation into hadrons at total annihilation energies from 2 to 3 GeV [ru]

  15. Statistical experiments using the multiple regression research for prediction of proper hardness in areas of phosphorus cast-iron brake shoes manufacturing

    Science.gov (United States)

    Kiss, I.; Cioată, V. G.; Ratiu, S. A.; Rackov, M.; Penčić, M.

    2018-01-01

    Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact with each other simultaneously. This article formulates a multiple linear regression model relating the hardness of the phosphorous cast irons destined for brake shoes to their chemical composition, the regression coefficients illustrating the independent contribution of each predictor variable towards the dependent variable. In order to settle the multiple correlations between the hardness of the cast-iron brake shoes and their chemical compositions, several regression equations have been proposed, the aim being a mathematical solution that determines the optimum chemical composition for the desired hardness values. Starting from these considerations, two new statistical experiments were carried out on the contents of phosphorus [P], manganese [Mn] and silicon [Si], and the regression equations describing the mathematical dependency between these elements and the hardness were determined. As a result, several correlation charts are revealed.
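
    A minimal sketch of such a regression is given below; the compositions and hardness values are hypothetical placeholders, not the study's measurements, so the fitted coefficients are purely illustrative.

```python
# Least-squares fit of HB = b0 + b1*P + b2*Mn + b3*Si (illustrative data).
import numpy as np

# columns: [P, Mn, Si] in wt.% -- hypothetical sample compositions
X = np.array([[0.45, 0.80, 1.9],
              [0.55, 0.95, 2.1],
              [0.60, 0.70, 2.0],
              [0.50, 1.10, 2.3],
              [0.65, 0.90, 1.8]])
y = np.array([197.0, 205.0, 210.0, 201.0, 215.0])   # hardness, HB

A = np.column_stack([np.ones(len(y)), X])            # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("intercept, b_P, b_Mn, b_Si:", coef.round(2))
print("fitted hardness:", (A @ coef).round(1))
```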

  16. Statistical Analysis of CFD Solutions From the Fifth AIAA Drag Prediction Workshop

    Science.gov (United States)

    Morrison, Joseph H.

    2013-01-01

    A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from North America, Europe, Asia, and South America using a common grid sequence and multiple turbulence models for the June 2012 fifth Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was the Common Research Model subsonic transport wing-body previously used for the fourth Drag Prediction Workshop. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with previous workshops.

  17. Discussion of "Modern statistics for spatial point processes"

    DEFF Research Database (Denmark)

    Jensen, Eva Bjørn Vedel; Prokesová, Michaela; Hellmund, Gunnar

    2007-01-01

    The paper ‘Modern statistics for spatial point processes’ by Jesper Møller and Rasmus P. Waagepetersen is based on a special invited lecture given by the authors at the 21st Nordic Conference on Mathematical Statistics, held at Rebild, Denmark, in June 2006. At the conference, Antti...

  18. Investigating the Variation of Volatile Compound Composition in Maotai-Flavoured Liquor During Its Multiple Fermentation Steps Using Statistical Methods

    Directory of Open Access Journals (Sweden)

    Zheng-Yun Wu

    2016-01-01

    Full Text Available The use of multiple fermentations is one of the most specific characteristics of Maotai-flavoured liquor production. In this research, the variation of the volatile composition of Maotai-flavoured liquor during its multiple fermentations is investigated using statistical approaches. Cluster analysis shows that the obtained samples are grouped mainly according to the fermentation steps rather than the distillery they originate from, and the samples from the first two fermentation steps show the greatest difference, suggesting that the multiple fermentation and distillation steps ultimately result in a similar volatile composition of the liquor. Back-propagation neural network (BNN) models were developed that satisfactorily predict the number of fermentation steps and the organoleptic evaluation scores of liquor samples from their volatile compositions. Mean impact value (MIV) analysis shows that ethyl lactate, furfural and some high-boiling-point acids play important roles, while pyrazine contributes much less to the improvement of the flavour and taste of Maotai-flavoured liquor during its production. This study contributes to further understanding of the mechanisms of Maotai-flavoured liquor production.

  19. Some properties of point processes in statistical optics

    International Nuclear Information System (INIS)

    Picinbono, B.; Bendjaballah, C.

    2010-01-01

    The analysis of the statistical properties of the point process (PP) of photon detection times can be used to determine whether or not an optical field is classical, in the sense that its statistical description does not require the methods of quantum optics. This determination is, however, more difficult than ordinarily admitted and the first aim of this paper is to illustrate this point by using some results of the PP theory. For example, it is well known that the analysis of the photodetection of classical fields exhibits the so-called bunching effect. But this property alone cannot be used to decide the nature of a given optical field. Indeed, we have presented examples of point processes for which a bunching effect appears and yet they cannot be obtained from a classical field. These examples are illustrated by computer simulations. Similarly, it is often admitted that for fields with very low light intensity the bunching or antibunching can be described by using the statistical properties of the distance between successive events of the point process, which simplifies the experimental procedure. We have shown that, while this property is valid for classical PPs, it has no reason to be true for nonclassical PPs, and we have presented some examples of this situation also illustrated by computer simulations.

  20. Simultaneous reconstruction of multiple depth images without off-focus points in integral imaging using a graphics processing unit.

    Science.gov (United States)

    Yi, Faliu; Lee, Jieun; Moon, Inkyu

    2014-05-01

    The reconstruction of multiple depth images with a ray back-propagation algorithm in three-dimensional (3D) computational integral imaging is computationally burdensome. Further, a reconstructed depth image consists of a focus and an off-focus area. Focus areas are 3D points on the surface of an object that are located at the reconstructed depth, while off-focus areas include 3D points in free-space that do not belong to any object surface in 3D space. Generally, without being removed, the presence of an off-focus area would adversely affect the high-level analysis of a 3D object, including its classification, recognition, and tracking. Here, we use a graphics processing unit (GPU) that supports parallel processing with multiple processors to simultaneously reconstruct multiple depth images using a lookup table containing the shifted values along the x and y directions for each elemental image in a given depth range. Moreover, each 3D point on a depth image can be measured by analyzing its statistical variance with its corresponding samples, which are captured by the two-dimensional (2D) elemental images. These statistical variances can be used to classify depth image pixels as either focus or off-focus points. At this stage, the measurement of focus and off-focus points in multiple depth images is also implemented in parallel on a GPU. Our proposed method is conducted based on the assumption that there is no occlusion of the 3D object during the capture stage of the integral imaging process. Experimental results have demonstrated that this method is capable of removing off-focus points in the reconstructed depth image. The results also showed that using a GPU to remove the off-focus points could greatly improve the overall computational speed compared with using a CPU.
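
    The variance test described above reduces to a per-pixel computation. The sketch below assumes a stack of samples gathered from the elemental images for each depth pixel; the arrays and the threshold are illustrative, and the paper's GPU lookup-table machinery is omitted.

```python
# Hedged sketch: classify depth-image pixels as focus / off-focus by the
# statistical variance of their samples across the 2-D elemental images.
import numpy as np

rng = np.random.default_rng(2)
# samples[k, i, j]: intensity contributed by elemental image k to pixel (i, j)
samples = rng.normal(loc=100.0, scale=5.0, size=(50, 64, 64))
samples[:, :32, :] += rng.normal(scale=40.0, size=(50, 32, 64))  # off-focus half

variance = samples.var(axis=0)       # per-pixel variance across elemental images
threshold = 200.0                    # placeholder cut-off
focus_mask = variance < threshold    # low variance -> point lies on a surface

depth_image = samples.mean(axis=0)
depth_image[~focus_mask] = 0.0       # remove off-focus points
print("focus pixels:", int(focus_mask.sum()), "of", focus_mask.size)
```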

  1. Statistical representation of a spray as a point process

    International Nuclear Information System (INIS)

    Subramaniam, S.

    2000-01-01

    The statistical representation of a spray as a finite point process is investigated. One objective is to develop a better understanding of how single-point statistical information contained in descriptions such as the droplet distribution function (ddf), relates to the probability density functions (pdfs) associated with the droplets themselves. Single-point statistical information contained in the droplet distribution function (ddf) is shown to be related to a sequence of single surrogate-droplet pdfs, which are in general different from the physical single-droplet pdfs. It is shown that the ddf contains less information than the fundamental single-point statistical representation of the spray, which is also described. The analysis shows which events associated with the ensemble of spray droplets can be characterized by the ddf, and which cannot. The implications of these findings for the ddf approach to spray modeling are discussed. The results of this study also have important consequences for the initialization and evolution of direct numerical simulations (DNS) of multiphase flows, which are usually initialized on the basis of single-point statistics such as the droplet number density in physical space. If multiphase DNS are initialized in this way, this implies that even the initial representation contains certain implicit assumptions concerning the complete ensemble of realizations, which are invalid for general multiphase flows. Also the evolution of a DNS initialized in this manner is shown to be valid only if an as yet unproven commutation hypothesis holds true. Therefore, it is questionable to what extent DNS that are initialized in this manner constitute a direct simulation of the physical droplets. Implications of these findings for large eddy simulations of multiphase flows are also discussed. (c) 2000 American Institute of Physics

  2. Critical point prediction device

    International Nuclear Information System (INIS)

    Matsumura, Kazuhiko; Kariyama, Koji.

    1996-01-01

    An operation for predicting a critical point by using the existing reverse multiplication method has been complicated, and the effective multiplication factor could not be plotted directly, which degraded the prediction accuracy. The present invention comprises a detector counting memory section for memorizing the counts sent from a power detector which monitors the reactor power, a reverse multiplication factor calculation section for calculating the reverse multiplication factor based on initial counts and current counts of the power detector, and a critical point prediction section for predicting criticality by the reverse multiplication method relative to effective multiplication factors corresponding to the state of the reactor core, previously determined depending on the case. In addition, a reactor core characteristic calculation section is added for analyzing the effective multiplication factor depending on the state of the reactor core. Then, if the margin up to criticality falls below a predetermined value during critical operation, an alarm is generated, and the critical operation is stopped when generation of a period exceeding a predetermined value is predicted for the succeeding critical operation. With such procedures, the critical point can be easily predicted during critical operation, greatly mitigating the operator's burden and improving handling of the operation. (N.H.)
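
    The reverse (inverse) multiplication method that such a device builds on has a compact numerical core: the inverse count ratio 1/M = C0/C is tracked as the reactor approaches criticality and extrapolated to zero. The counts and loading steps below are synthetic placeholders, not the invention's actual procedure.

```python
# Hedged sketch of the classical inverse multiplication extrapolation.
import numpy as np

steps = np.array([0.0, 1.0, 2.0, 3.0, 4.0])              # e.g. loading steps
counts = np.array([120.0, 160.0, 240.0, 480.0, 1400.0])  # detector counts

inv_m = counts[0] / counts                   # 1/M relative to initial count C0

# linear extrapolation of the last few points to 1/M = 0 (criticality)
slope, intercept = np.polyfit(steps[-3:], inv_m[-3:], 1)
critical_step = -intercept / slope
print(f"predicted critical point near step {critical_step:.2f}")
```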

  3. Statistical Approaches for Spatiotemporal Prediction of Low Flows

    Science.gov (United States)

    Fangmann, A.; Haberlandt, U.

    2017-12-01

    An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. The basis for all of them are multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be

  4. Predicting Harmonic Distortion of Multiple Converters in a Power System

    Directory of Open Access Journals (Sweden)

    P. M. Ivry

    2017-01-01

    Full Text Available Various uncertainties arise in the operation and management of power systems containing Renewable Energy Sources (RES) that affect the systems' power quality. These uncertainties may arise due to system parameter changes or design parameter choice. In this work, the impact of uncertainties on the prediction of harmonics in a power system containing multiple Voltage Source Converters (VSCs) is investigated. The study focuses on the prediction of the harmonic distortion level in multiple VSCs when some system or design parameters are only known within certain constraints. The Univariate Dimension Reduction (UDR) method was utilized in this study as an efficient predictive tool for the level of harmonic distortion of the VSCs measured at the Point of Common Coupling (PCC) to the grid. Two case studies were considered and the UDR technique was also experimentally validated. The obtained results were compared with the Monte Carlo Simulation (MCS) results.
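
    The UDR idea itself fits in a short sketch: moments of a response of several uncertain inputs are approximated from one-dimensional integrals only. The response function and input statistics below are placeholders standing in for the paper's VSC harmonic model, with Gauss-Hermite quadrature assumed for normally distributed inputs.

```python
# Hedged sketch of univariate dimension reduction (UDR) for the mean of
# g(X1,...,XN): E[g] ~= sum_i E[g(mu_1,...,X_i,...,mu_N)] - (N-1) * g(mu).
import numpy as np

def g(x):
    """Placeholder response, e.g. a harmonic distortion level."""
    return x[0] ** 2 + 0.5 * x[0] * x[1] + np.sin(x[2])

mu = np.array([1.0, 2.0, 0.5])        # input means
sigma = np.array([0.1, 0.3, 0.2])     # input standard deviations

# probabilists' Gauss-Hermite nodes/weights (weight function exp(-z^2/2))
nodes, weights = np.polynomial.hermite_e.hermegauss(7)

def univariate_mean(i):
    """E[g(mu_1,...,X_i,...,mu_N)] with X_i ~ N(mu_i, sigma_i^2)."""
    total = 0.0
    for z, w in zip(nodes, weights):
        x = mu.copy()
        x[i] = mu[i] + sigma[i] * z
        total += w * g(x)
    return total / np.sqrt(2.0 * np.pi)   # the weights sum to sqrt(2*pi)

n = len(mu)
mean_udr = sum(univariate_mean(i) for i in range(n)) - (n - 1) * g(mu)
print("UDR estimate of E[g]:", round(mean_udr, 4))
```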

  5. Supporting Multiple Pointing Devices in Microsoft Windows

    DEFF Research Database (Denmark)

    Westergaard, Michael

    2002-01-01

    In this paper the implementation of a Microsoft Windows driver including APIs supporting multiple pointing devices is presented. Microsoft Windows does not natively support multiple pointing devices controlling independent cursors, and a number of solutions to this have been implemented by us and others. Here we motivate and describe a general solution, and how user applications can use it by means of a framework. The device driver and the supporting APIs will be made available free of charge. Interested parties can contact the author for more information.

  6. Statistical properties of several models of fractional random point processes

    Science.gov (United States)

    Bendjaballah, C.

    2011-08-01

    Statistical properties of several models of fractional random point processes have been analyzed from the counting and time interval statistics points of view. Based on the criterion of the reduced variance, it is seen that such processes exhibit nonclassical properties. The conditions for these processes to be treated as conditional Poisson processes are examined. Numerical simulations illustrate part of the theoretical calculations.

  7. Multiple-point statistical simulation for hydrogeological models: 3-D training image development and conditioning strategies

    Science.gov (United States)

    Høyer, Anne-Sophie; Vignoli, Giulio; Mejer Hansen, Thomas; Thanh Vu, Le; Keefer, Donald A.; Jørgensen, Flemming

    2017-12-01

    Most studies on the application of geostatistical simulations based on multiple-point statistics (MPS) to hydrogeological modelling focus on relatively fine-scale models and concentrate on the estimation of facies-level structural uncertainty. Much less attention is paid to the use of input data and optimal construction of training images. For instance, even though the training image should capture a set of spatial geological characteristics to guide the simulations, the majority of the research still relies on 2-D or quasi-3-D training images. In the present study, we demonstrate a novel strategy for 3-D MPS modelling characterized by (i) realistic 3-D training images and (ii) an effective workflow for incorporating a diverse group of geological and geophysical data sets. The study covers an area of 2810 km2 in the southern part of Denmark. MPS simulations are performed on a subset of the geological succession (the lower to middle Miocene sediments) which is characterized by relatively uniform structures and dominated by sand and clay. The simulated domain is large and each of the geostatistical realizations contains approximately 45 million voxels with size 100 m × 100 m × 5 m. Data used for the modelling include water well logs, high-resolution seismic data, and a previously published 3-D geological model. We apply a series of different strategies for the simulations based on data quality, and develop a novel method to effectively create observed spatial trends. The training image is constructed as a relatively small 3-D voxel model covering an area of 90 km2. We use an iterative training image development strategy and find that even slight modifications in the training image create significant changes in simulations. Thus, this study shows how to include both the geological environment and the type and quality of input information in order to achieve optimal results from MPS modelling. We present a practical workflow to build the training image and
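
    The multiple-point principle behind such simulations can be reduced to a toy sketch: each unknown cell is assigned by finding a spot in the training image whose local neighbourhood matches the already-simulated values. This is only a crude, direct-sampling-flavoured illustration; real MPS engines add multigrids, search structures, trends and the conditioning strategies discussed above.

```python
# Hedged toy MPS simulation: copy values from training-image locations whose
# (tiny, causal) neighbourhood matches the values simulated so far.
import numpy as np

rng = np.random.default_rng(3)
ti = (rng.random((60, 60)) < 0.4).astype(int)      # placeholder training image
ti[20:40, :] = 1                                   # give it a layered structure

sim = np.full((30, 30), -1, dtype=int)             # -1 marks unknown cells
offsets = [(-1, 0), (0, -1), (-1, -1)]             # causal neighbourhood

for i in range(sim.shape[0]):
    for j in range(sim.shape[1]):
        known = [(di, dj, sim[i + di, j + dj]) for di, dj in offsets
                 if i + di >= 0 and j + dj >= 0 and sim[i + di, j + dj] >= 0]
        for _ in range(500):                       # random scan of the TI
            ti_i = int(rng.integers(1, ti.shape[0]))
            ti_j = int(rng.integers(1, ti.shape[1]))
            if all(ti[ti_i + di, ti_j + dj] == v for di, dj, v in known):
                break                              # neighbourhood matches
        sim[i, j] = ti[ti_i, ti_j]                 # copy the matched value

print("simulated facies proportions:", np.bincount(sim.ravel()))
```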

  8. Statistical prediction of Late Miocene climate

    Digital Repository Service at National Institute of Oceanography (India)

    Fernandes, A.A; Gupta, S.M.

    by making certain simplifying assumptions; for example, in modelling ocean currents, the geostrophic approximation is made. In the case of statistical prediction no such a priori assumption need be made. Statistical prediction comprises using observed data ... the number of equations. In this case the equations are overdetermined, and therefore one has to look for a solution that best fits the sample data in a least squares sense. To this end we express the sample data as follows: y = ȳ + Σᵢ cᵢ(xᵢ ... (2.1)

  9. Multiple commodities in statistical microeconomics: Model and market

    Science.gov (United States)

    Baaquie, Belal E.; Yu, Miao; Du, Xin

    2016-11-01

    A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, independent of the mainstream formulation of microeconomics.

  10. Generation of a statistical shape model with probabilistic point correspondences and the expectation maximization- iterative closest point algorithm

    International Nuclear Information System (INIS)

    Hufnagel, Heike; Pennec, Xavier; Ayache, Nicholas; Ehrhardt, Jan; Handels, Heinz

    2008-01-01

    Identification of point correspondences between shapes is required for statistical analysis of organ shape differences. Since manual identification of landmarks is not a feasible option in 3D, several methods were developed to automatically find one-to-one correspondences on shape surfaces. For unstructured point sets, however, one-to-one correspondences do not exist, but correspondence probabilities can be determined. A method was developed to compute a statistical shape model based on shapes which are represented by unstructured point sets with arbitrary point numbers. A fundamental problem when computing statistical shape models is the determination of correspondences between the points of the shape observations of the training data set. In the absence of landmarks, exact correspondences can only be determined between continuous surfaces, not between unstructured point sets. To overcome this problem, we introduce correspondence probabilities instead of exact correspondences. The correspondence probabilities are found by aligning the observation shapes with the affine expectation maximization-iterative closest point (EM-ICP) registration algorithm. In a second step, the correspondence probabilities are used as input to compute a mean shape (represented once again by an unstructured point set). Both steps are unified in a single optimization criterion which depends on the two parameters 'registration transformation' and 'mean shape'. In a final step, a variability model which best represents the variability in the training data set is computed. Experiments on synthetic data sets and in vivo brain structure data sets (MRI) were then designed to evaluate the performance of our algorithm. The new method was applied to brain MRI data sets, and the estimated point correspondences were compared to a statistical shape model built on exact correspondences. Based on established measures of 'generalization ability' and 'specificity', the estimates were very satisfactory

  11. Statistical Model Predictions for p+p and Pb+Pb Collisions at LHC

    CERN Document Server

    Kraus, I; Oeschler, H; Redlich, K; Wheaton, S

    2009-01-01

    Particle production in p+p and central Pb+Pb collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in detail, and some of them, which are particularly appropriate to determine the chemical freeze-out point experimentally, are indicated. Considering elementary interactions on the other hand, we focus on strangeness production and its possible suppression. Extrapolating the thermal parameters to LHC energy, we present predictions of the statistical model for particle yields in p+p collisions. We quantify the strangeness suppression by the correlation volume parameter and discuss its influence on particle production. We propose observables that can provide deeper insight into the mechanism of strangeness production and suppression at LHC.

  12. Monte Carlo full-waveform inversion of crosshole GPR data using multiple-point geostatistical a priori information

    DEFF Research Database (Denmark)

    Cordua, Knud Skou; Hansen, Thomas Mejer; Mosegaard, Klaus

    2012-01-01

    We present a general Monte Carlo full-waveform inversion strategy that integrates a priori information described by geostatistical algorithms with Bayesian inverse problem theory. The extended Metropolis algorithm can be used to sample the a posteriori probability density of highly nonlinear inverse problems, such as full-waveform inversion. Sequential Gibbs sampling is a method that allows efficient sampling of a priori probability densities described by geostatistical algorithms based on either two-point (e.g., Gaussian) or multiple-point statistics. We outline the theoretical framework... (2) Based on a posteriori realizations, complicated statistical questions can be answered, such as the probability of connectivity across a layer. (3) Complex a priori information can be included through geostatistical algorithms. These benefits, however, require more computing resources than traditional...
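
    The extended Metropolis rule at the heart of this strategy is compact: proposals are drawn from the prior sampler (below, a stand-in for sequential Gibbs resimulation) and accepted on the likelihood ratio alone, so the prior never has to be evaluated explicitly. The forward model and noise level are placeholders, not a waveform simulator.

```python
# Hedged sketch of the extended Metropolis algorithm with a prior sampler.
import numpy as np

rng = np.random.default_rng(4)
d_obs = np.array([1.0, 0.8, 1.2])                  # "observed" data

def forward(m):
    """Placeholder forward model standing in for waveform modelling."""
    return np.array([m.sum(), m[0] - m[1], m.mean()])

def log_likelihood(m, sigma=0.1):
    r = forward(m) - d_obs
    return -0.5 * np.sum((r / sigma) ** 2)

def prior_perturb(m):
    """Stand-in for sequential Gibbs: resimulate one cell from the prior."""
    m_new = m.copy()
    k = rng.integers(0, m.size)
    m_new[k] = rng.normal()                        # prior here is simply N(0, 1)
    return m_new

m = rng.normal(size=3)
ll = log_likelihood(m)
for _ in range(5000):
    m_prop = prior_perturb(m)
    ll_prop = log_likelihood(m_prop)
    if np.log(rng.random()) < ll_prop - ll:        # accept with min(1, L'/L)
        m, ll = m_prop, ll_prop
print("posterior sample:", m.round(3), "log-likelihood:", round(ll, 2))
```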

  13. Are abrupt climate changes predictable?

    Science.gov (United States)

    Ditlevsen, Peter

    2013-04-01

    It is taken for granted that the limited predictability in the initial value problem, the weather prediction, and the predictability of the statistics are two distinct problems. Lorenz (1975) dubbed this predictability of the first and the second kind, respectively. Predictability of the first kind in a chaotic dynamical system is limited due to the well-known critical dependence on initial conditions. Predictability of the second kind is possible in an ergodic system, where either the dynamics is known and the phase space attractor can be characterized by simulation, or the system can be observed for such long times that the statistics can be obtained from temporal averaging, assuming that the attractor does not change in time. For the climate system the distinction between predictability of the first and the second kind is fuzzy. This difficulty in distinguishing between predictability of the first and of the second kind is related to the lack of scale separation between fast and slow components of the climate system. The non-linear nature of the problem furthermore opens the possibility of multiple attractors, or multiple quasi-steady states. As the ice-core records show, the climate has been jumping between different quasi-stationary climates, stadials and interstadials, through the Dansgaard-Oeschger events. Such a jump happens very fast when a critical tipping point has been reached. The question is: Can such a tipping point be predicted? This is a new kind of predictability: the third kind. If the tipping point is reached through a bifurcation, where the stability of the system is governed by some control parameter changing in a predictable way to a critical value, the tipping is predictable. If the sudden jump occurs because internal chaotic fluctuations, noise, push the system across a barrier, the tipping is as unpredictable as the triggering noise. In order to hint at an answer to this question, a careful analysis of the high temporal resolution NGRIP isotope

  14. Learning Predictive Statistics: Strategies and Brain Mechanisms.

    Science.gov (United States)

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

    2017-08-30

    When immersed in a new environment, we are challenged to decipher initially incomprehensible streams of sensory information. However, quite rapidly, the brain finds structure and meaning in these incoming signals, helping us to predict and prepare ourselves for future actions. This skill relies on extracting the statistics of event streams in the environment that contain regularities of variable complexity from simple repetitive patterns to complex probabilistic combinations. Here, we test the brain mechanisms that mediate our ability to adapt to the environment's statistics and predict upcoming events. By combining behavioral training and multisession fMRI in human participants (male and female), we track the corticostriatal mechanisms that mediate learning of temporal sequences as they change in structure complexity. We show that learning of predictive structures relates to individual decision strategy; that is, selecting the most probable outcome in a given context (maximizing) versus matching the exact sequence statistics. These strategies engage distinct human brain regions: maximizing engages dorsolateral prefrontal, cingulate, sensory-motor regions, and basal ganglia (dorsal caudate, putamen), whereas matching engages occipitotemporal regions (including the hippocampus) and basal ganglia (ventral caudate). Our findings provide evidence for distinct corticostriatal mechanisms that facilitate our ability to extract behaviorally relevant statistics to make predictions. SIGNIFICANCE STATEMENT Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. Past work has studied how humans identify repetitive patterns and associative pairings. However, the natural environment contains regularities that vary in complexity from simple repetition to complex probabilistic combinations. Here, we combine behavior and multisession fMRI to track the brain mechanisms that mediate our ability to adapt to

  15. Downscaling remotely sensed imagery using area-to-point cokriging and multiple-point geostatistical simulation

    Science.gov (United States)

    Tang, Yunwei; Atkinson, Peter M.; Zhang, Jingxiong

    2015-03-01

    A cross-scale data integration method was developed and tested based on the theory of geostatistics and multiple-point geostatistics (MPG). The goal was to downscale remotely sensed images while retaining spatial structure by integrating images at different spatial resolutions. During the process of downscaling, a rich spatial correlation model in the form of a training image was incorporated to facilitate reproduction of similar local patterns in the simulated images. Area-to-point cokriging (ATPCK) was used as locally varying mean (LVM) (i.e., soft data) to deal with the change of support problem (COSP) for cross-scale integration, which MPG cannot achieve alone. Several pairs of spectral bands of remotely sensed images were tested for integration within different cross-scale case studies. The experiment shows that MPG can restore the spatial structure of the image at a fine spatial resolution given the training image and conditioning data. The super-resolution image can be predicted using the proposed method, which cannot be realised using most data integration methods. The results show that the ATPCK-MPG approach can achieve greater accuracy than methods which do not account for the change of support issue.

  16. Multi-lane detection based on multiple vanishing points detection

    Science.gov (United States)

    Li, Chuanxiang; Nie, Yiming; Dai, Bin; Wu, Tao

    2015-03-01

    Lane detection plays a significant role in Advanced Driver Assistance Systems (ADAS) for intelligent vehicles. In this paper we present a multi-lane detection method based on multiple vanishing points detection. A new multi-lane model assumes that a single lane, which has two approximately parallel boundaries, may not be parallel to others on the road plane. Non-parallel lanes are associated with different vanishing points. A biologically plausible model is used to detect multiple vanishing points and fit the lane model. Experimental results show that the proposed method can detect both parallel lanes and non-parallel lanes.

  17. Comparing Weighted and Unweighted Grade Point Averages in Predicting College Success of Diverse and Low-Income College Students

    Science.gov (United States)

    Warne, Russell T.; Nagaishi, Chanel; Slade, Michael K.; Hermesmeyer, Paul; Peck, Elizabeth Kimberli

    2014-01-01

    While research has shown the statistical significance of high school grade point averages (HSGPAs) in predicting future academic outcomes, the systems with which HSGPAs are calculated vary drastically across schools. Some schools employ unweighted grades that carry the same point value regardless of the course in which they are earned; other…

  18. Publicly available models to predict normal boiling point of organic compounds

    International Nuclear Information System (INIS)

    Oprisiu, Ioana; Marcou, Gilles; Horvath, Dragos; Brunel, Damien Bernard; Rivollet, Fabien; Varnek, Alexandre

    2013-01-01

    Quantitative structure–property models to predict the normal boiling point (Tb) of organic compounds were developed using non-linear ASNNs (associative neural networks) as well as multiple linear regression – ISIDA-MLR and SQS (stochastic QSAR sampler). Models were built on a diverse set of 2098 organic compounds with Tb varying in the range of 185–491 K. In ISIDA-MLR and ASNN calculations, fragment descriptors were used, whereas fragment, FPT (fuzzy pharmacophore triplet), and ChemAxon descriptors were employed in SQS models. Prediction quality of the models has been assessed in 5-fold cross validation. Obtained models were implemented in the on-line ISIDA predictor at http://infochim.u-strasbg.fr/webserv/VSEngine.html

  19. Hourly predictive Levenberg-Marquardt ANN and multi linear regression models for predicting of dew point temperature

    Science.gov (United States)

    Zounemat-Kermani, Mohammad

    2012-08-01

    In this study, the ability of two models, multiple linear regression (MLR) and a Levenberg-Marquardt (LM) feed-forward neural network, was examined to estimate the hourly dew point temperature. Dew point temperature is the temperature at which water vapor in the air condenses into liquid. This temperature can be useful in estimating meteorological variables such as fog, rain, snow, dew, and evapotranspiration and in investigating agronomical issues such as stomatal closure in plants. The availability of hourly records of climatic data (air temperature, relative humidity and pressure) which could be used to predict dew point temperature initiated the practice of modeling. Additionally, the wind vector (wind speed magnitude and direction) and a conceptual input of weather condition were employed as other input variables. Three quantitative standard statistical performance evaluation measures, i.e. the root mean squared error, mean absolute error, and absolute logarithmic Nash-Sutcliffe efficiency coefficient (|Log(NS)|), were employed to evaluate the performances of the developed models. The results showed that applying the wind vector and weather condition as input vectors along with meteorological variables could slightly increase the ANN and MLR predictive accuracy. The results also revealed that LM-NN was superior to the MLR model and the best performance was obtained by considering all potential input variables in terms of different evaluation criteria.
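
    The MLR side of such a comparison fits in a short sketch. Here the "observed" dew points are generated with a standard Magnus-type approximation, standing in for the study's hourly records; the coefficients and accuracy are therefore illustrative only.

```python
# Hedged sketch: fit a linear model for dew point from temperature and
# relative humidity, with Magnus-formula values as surrogate observations.
import numpy as np

rng = np.random.default_rng(5)
T = rng.uniform(0.0, 35.0, 500)        # air temperature, deg C
RH = rng.uniform(20.0, 100.0, 500)     # relative humidity, %

a, b = 17.27, 237.7                    # Magnus coefficients
gamma = np.log(RH / 100.0) + a * T / (b + T)
Td = b * gamma / (a - gamma)           # surrogate "observed" dew point, deg C

A = np.column_stack([np.ones_like(T), T, RH])
coef, *_ = np.linalg.lstsq(A, Td, rcond=None)
rmse = np.sqrt(np.mean((A @ coef - Td) ** 2))
print("MLR coefficients:", coef.round(3), "| RMSE (deg C):", round(rmse, 2))
```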

  20. Evaluation of multiple emission point facilities

    International Nuclear Information System (INIS)

    Miltenberger, R.P.; Hull, A.P.; Strachan, S.; Tichler, J.

    1988-01-01

    In 1970, the New York State Department of Environmental Conservation (NYSDEC) assumed responsibility for the environmental aspect of the state's regulatory program for by-product, source, and special nuclear material. The major objective of this study was to provide consultation to NYSDEC and the US NRC to assist NYSDEC in determining if broad-based licensed facilities with multiple emission points were in compliance with NYCRR Part 380. Under this contract, BNL would evaluate a multiple emission point facility, identified by NYSDEC, as a case study. The review would be a nonbinding evaluation of the facility to determine likely dispersion characteristics, compliance with specified release limits, and implementation of the ALARA philosophy regarding effluent release practices. From the data collected, guidance as to areas of future investigation and the impact of new federal regulations were to be developed. Reported here is the case study for the University of Rochester, Strong Memorial Medical Center and Riverside Campus

  2. Estimating Predictive Variance for Statistical Gas Distribution Modelling

    International Nuclear Information System (INIS)

    Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo

    2009-01-01

    Recent publications in statistical gas distribution modelling have proposed algorithms that model the mean and variance of a distribution. This paper argues that estimating the predictive concentration variance represents not merely a gradual improvement but rather a significant step to advance the field. This is, first, because such models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows one to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
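
    The evaluation idea advocated here is easy to make concrete: once a model reports a predictive mean and a predictive variance at each location, held-out measurements can be scored by their likelihood. The Gaussian predictive form and the numbers below are illustrative assumptions.

```python
# Hedged sketch: score a gas-distribution model by the log-likelihood of
# held-out observations under Gaussian predictive distributions.
import numpy as np

mean = np.array([0.8, 1.2, 0.5, 2.0])      # predicted mean concentrations
var = np.array([0.10, 0.30, 0.05, 0.50])   # predicted concentration variances
obs = np.array([0.9, 1.0, 0.6, 2.6])       # held-out measurements

log_lik = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (obs - mean) ** 2 / var)
print("predictive log-likelihood:", round(log_lik, 3))
```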

  3. Prediction and Migration of Surface-related Resonant Multiples

    KAUST Repository

    Guo, Bowen

    2015-08-19

    Surface-related resonant multiples can be migrated to achieve better resolution than migrating primary reflections. We now derive the formula for migrating surface-related resonant multiples, and show its super-resolution characteristics. Moreover, a method is proposed to predict surface-related resonant multiples with zero-offset primary reflections. The prediction can be used to identify and extract the true resonant multiple from other events. Both synthetic and field data are used to validate this prediction.

  4. Encryption of covert information into multiple statistical distributions

    International Nuclear Information System (INIS)

    Venkatesan, R.C.

    2007-01-01

    A novel strategy to encrypt covert information (code) via unitary projections into the null spaces of ill-conditioned eigenstructures of multiple host statistical distributions, inferred from incomplete constraints, is presented. The host pdf's are inferred using the maximum entropy principle. The projection of the covert information is dependent upon the pdf's of the host statistical distributions. The security of the encryption/decryption strategy is based on the extreme instability of the encoding process. A self-consistent procedure to derive keys for both symmetric and asymmetric cryptography is presented. The advantages of using a multiple pdf model to achieve encryption of covert information are briefly highlighted. Numerical simulations exemplify the efficacy of the model
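
    The geometric core of such a scheme can be sketched briefly: a covert vector is embedded in the null space of a host-derived operator, so that applying the operator to the embedded vector yields (numerically) nothing. The random host matrix and code vector below are placeholders for the inferred-pdf eigenstructures of the paper.

```python
# Hedged sketch: hide a code vector in the null space of a host matrix
# via the SVD, and recover it with the same null-space basis as the key.
import numpy as np

rng = np.random.default_rng(6)
H = rng.normal(size=(4, 8))              # placeholder host operator (rank 4)

U, s, Vt = np.linalg.svd(H)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]                   # orthonormal rows spanning ker(H)

code = rng.normal(size=null_basis.shape[0])   # covert coefficients
hidden = null_basis.T @ code                  # embedded vector: H @ hidden ~ 0

print("max |H @ hidden|:", float(np.max(np.abs(H @ hidden))))
recovered = null_basis @ hidden               # decode with the key basis
print("recovered == code:", np.allclose(recovered, code))
```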

  5. The value of model averaging and dynamical climate model predictions for improving statistical seasonal streamflow forecasts over Australia

    Science.gov (United States)

    Pokhrel, Prafulla; Wang, Q. J.; Robertson, David E.

    2013-10-01

    Seasonal streamflow forecasts are valuable for planning and allocation of water resources. In Australia, the Bureau of Meteorology employs a statistical method to forecast seasonal streamflows. The method uses predictors that are related to catchment wetness at the start of a forecast period and to climate during the forecast period. For the latter, a predictor is selected among a number of lagged climate indices as candidates to give the "best" model in terms of model performance in cross validation. This study investigates two strategies for further improvement in seasonal streamflow forecasts. The first is to combine, through Bayesian model averaging, multiple candidate models with different lagged climate indices as predictors, to take advantage of different predictive strengths of the multiple models. The second strategy is to introduce additional candidate models, using rainfall and sea surface temperature predictions from a global climate model as predictors. This is to take advantage of the direct simulations of various dynamic processes. The results show that combining forecasts from multiple statistical models generally yields more skillful forecasts than using only the best model and appears to moderate the worst forecast errors. The use of rainfall predictions from the dynamical climate model marginally improves the streamflow forecasts when viewed over all the study catchments and seasons, but the use of sea surface temperature predictions provide little additional benefit.
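
    The averaging strategy can be made concrete with a small sketch: each candidate model (one per lagged climate index) gets a weight reflecting its cross-validation skill, and the forecast is the weighted mean. The forecasts, error sums and the BIC-style weighting below are illustrative stand-ins for the paper's Bayesian model averaging.

```python
# Hedged sketch: combine candidate seasonal forecasts with skill-based weights.
import numpy as np

forecasts = np.array([110.0, 95.0, 130.0])   # flow forecasts from 3 candidates
cv_sse = np.array([250.0, 280.0, 320.0])     # cross-validation squared errors
n = 30                                       # cross-validation sample size

bic = n * np.log(cv_sse / n)                 # crude information criterion
w = np.exp(-0.5 * (bic - bic.min()))         # BMA-style relative weights
w /= w.sum()

print("model weights:", w.round(3))
print("combined forecast:", round(float(forecasts @ w), 1))
```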

  6. Joint probability of statistical success of multiple phase III trials.

    Science.gov (United States)

    Zhang, Jianliang; Zhang, Jenny J

    2013-01-01

    In drug development, after completion of phase II proof-of-concept trials, the sponsor needs to make a go/no-go decision to start expensive phase III trials. The probability of statistical success (PoSS) of the phase III trials based on data from earlier studies is an important factor in that decision-making process. Instead of statistical power, the predictive power of a phase III trial, which takes into account the uncertainty in the estimation of treatment effect from earlier studies, has been proposed to evaluate the PoSS of a single trial. However, regulatory authorities generally require statistical significance in two (or more) trials for marketing licensure. We show that the predictive statistics of two future trials are statistically correlated through use of the common observed data from earlier studies. Thus, the joint predictive power should not be evaluated as a simplistic product of the predictive powers of the individual trials. We develop the relevant formulae for the appropriate evaluation of the joint predictive power and provide numerical examples. Our methodology is further extended to the more complex phase III development scenario comprising more than two (K > 2) trials, that is, the evaluation of the PoSS of at least k₀ (k₀≤ K) trials from a program of K total trials. Copyright © 2013 John Wiley & Sons, Ltd.
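
    The correlation argument can be checked by simulation: both future trials share the same posterior uncertainty about the true effect, so they are independent only given that effect, and the joint probability of success exceeds the naive product of the marginal predictive powers. The effect estimate and standard errors below are illustrative.

```python
# Hedged sketch: joint probability of statistical success of two phase III
# trials sharing one phase II posterior for the treatment effect.
import numpy as np

rng = np.random.default_rng(7)
est, se = 0.30, 0.12          # phase II effect estimate and standard error
se3 = 0.10                    # standard error of each phase III estimate
z_crit = 1.96                 # two-sided 5% significance
n_sim = 100_000

theta = rng.normal(est, se, n_sim)   # draw "true" effect from the posterior
z1 = rng.normal(theta / se3, 1.0)    # trial z-statistics: independent only
z2 = rng.normal(theta / se3, 1.0)    # GIVEN theta

p1 = np.mean(z1 > z_crit)
joint = np.mean((z1 > z_crit) & (z2 > z_crit))
print(f"single-trial PoSS: {p1:.3f}  naive product: {p1**2:.3f}  joint: {joint:.3f}")
```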

  7. Predicting future protection of respirator users: Statistical approaches and practical implications.

    Science.gov (United States)

    Hu, Chengcheng; Harber, Philip; Su, Jing

    2016-01-01

    The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide reasonable likelihood that a worker will be adequately protected in the future; and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.

  8. Exploring Foundation Concepts in Introductory Statistics Using Dynamic Data Points

    Science.gov (United States)

    Ekol, George

    2015-01-01

    This paper analyses introductory statistics students' verbal and gestural expressions as they interacted with a dynamic sketch (DS) designed using "Sketchpad" software. The DS involved numeric data points built on the number line whose values changed as the points were dragged along the number line. The study is framed on aggregate…

  9. Improving Gastric Cancer Outcome Prediction Using Single Time-Point Artificial Neural Network Models

    Science.gov (United States)

    Nilsaz-Dezfouli, Hamid; Abu-Bakar, Mohd Rizam; Arasan, Jayanthi; Adam, Mohd Bakri; Pourhoseingholi, Mohamad Amin

    2017-01-01

    In cancer studies, the prediction of cancer outcome based on a set of prognostic variables has been a long-standing topic of interest. Current statistical methods for survival analysis offer the possibility of modelling cancer survivability but require unrealistic assumptions about the survival time distribution or proportionality of hazard. Therefore, attention must be paid in developing nonlinear models with less restrictive assumptions. Artificial neural network (ANN) models are primarily useful in prediction when nonlinear approaches are required to sift through the plethora of available information. The applications of ANN models for prognostic and diagnostic classification in medicine have attracted a lot of interest. The applications of ANN models in modelling the survival of patients with gastric cancer have been discussed in some studies without completely considering the censored data. This study proposes an ANN model for predicting gastric cancer survivability, considering the censored data. Five separate single time-point ANN models were developed to predict the outcome of patients after 1, 2, 3, 4, and 5 years. The performance of ANN model in predicting the probabilities of death is consistently high for all time points according to the accuracy and the area under the receiver operating characteristic curve. PMID:28469384

  10. MULTIPLE LINEAR REGRESSION ANALYSIS FOR PREDICTION OF BOILER LOSSES AND BOILER EFFICIENCY

    OpenAIRE

    Chayalakshmi C.L

    2018-01-01

    Calculation of boiler efficiency is essential if its parameters need to be controlled for either maintaining or enhancing its efficiency. However, determination of boiler efficiency using the conventional method is time consuming and very expensive, so frequent direct measurement is impractical. The work presented in this paper deals with establishing the statistical mo...
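
    A minimal sketch of such a multiple linear regression on synthetic flue-gas data (the predictor variables and coefficients below are invented for illustration; the paper's actual regressors are not listed in this excerpt):

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(1)
        n = 200
        flue_gas_temp = rng.uniform(120, 220, n)   # deg C
        o2_percent = rng.uniform(2, 8, n)          # excess oxygen in flue gas
        ambient_temp = rng.uniform(15, 35, n)      # deg C
        # Synthetic efficiency: losses grow with flue-gas temperature and excess air.
        efficiency = (92 - 0.03 * (flue_gas_temp - ambient_temp)
                      - 0.4 * o2_percent + rng.normal(0, 0.3, n))

        X = np.column_stack([flue_gas_temp, o2_percent, ambient_temp])
        model = LinearRegression().fit(X, efficiency)
        print("coefficients:", model.coef_, "intercept:", model.intercept_)
        # Instant efficiency estimate without a full conventional loss calculation:
        print("predicted efficiency:", model.predict([[160.0, 4.0, 25.0]]))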

  11. Prediction of slant path rain attenuation statistics at various locations

    Science.gov (United States)

    Goldhirsh, J.

    1977-01-01

    The paper describes a method for predicting slant path attenuation statistics at arbitrary locations for variable frequencies and path elevation angles. The method involves the use of median reflectivity factor-height profiles measured with radar as well as the use of long-term point rain rate data and assumed or measured drop size distributions. The attenuation coefficient due to cloud liquid water in the presence of rain is also considered. Absolute probability fade distributions are compared for eight cases: Maryland (15 GHz), Texas (30 GHz), Slough, England (19 and 37 GHz), Fayetteville, North Carolina (13 and 18 GHz), and Cambridge, Massachusetts (13 and 18 GHz).

  12. The collapsed cone algorithm for (192)Ir dosimetry using phantom-size adaptive multiple-scatter point kernels.

    Science.gov (United States)

    Tedgren, Åsa Carlsson; Plamondon, Mathieu; Beaulieu, Luc

    2015-07-07

    The aim of this work was to investigate how dose distributions calculated with the collapsed cone (CC) algorithm depend on the size of the water phantom used in deriving the point kernel for multiple scatter. A research version of the CC algorithm equipped with a set of selectable point kernels for multiple-scatter dose that had initially been derived in water phantoms of various dimensions was used. The new point kernels were generated using EGSnrc in spherical water phantoms of radii 5 cm, 7.5 cm, 10 cm, 15 cm, 20 cm, 30 cm and 50 cm. Dose distributions derived with CC in water phantoms of different dimensions and in a CT-based clinical breast geometry were compared to Monte Carlo (MC) simulations using the Geant4-based brachytherapy specific MC code Algebra. Agreement with MC within 1% was obtained when the dimensions of the phantom used to derive the multiple-scatter kernel were similar to those of the calculation phantom. Doses are overestimated at phantom edges when kernels are derived in larger phantoms and underestimated when derived in smaller phantoms (by around 2% to 7% depending on distance from source and phantom dimensions). CC agrees well with MC in the high dose region of a breast implant and is superior to TG43 in determining skin doses for all multiple-scatter point kernel sizes. Increased agreement between CC and MC is achieved when the point kernel is comparable to breast dimensions. The investigated approximation in multiple scatter dose depends on the choice of point kernel in relation to phantom size and yields a significant fraction of the total dose only at distances of several centimeters from a source/implant which correspond to volumes of low doses. The current implementation of the CC algorithm utilizes a point kernel derived in a comparatively large (radius 20 cm) water phantom. A fixed point kernel leads to predictable behaviour of the algorithm with the worst case being a source/implant located well within a patient

  13. Learning predictive statistics from temporal sequences: Dynamics and strategies.

    Science.gov (United States)

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

    2017-10-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
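
    The contrast between maximizing and probability matching on context-based (first-order Markov) statistics can be sketched as follows; the alphabet, sequence generator and Laplace smoothing are illustrative assumptions, not the study's stimuli.

        import numpy as np

        rng = np.random.default_rng(2)
        k = 4                                     # four symbols
        P = rng.dirichlet(np.ones(k), size=k)     # next-symbol probabilities per context
        seq = [0]
        for _ in range(5000):
            seq.append(int(rng.choice(k, p=P[seq[-1]])))

        counts = np.ones((k, k))                  # Laplace-smoothed transition counts
        correct_max = correct_match = 0
        for prev, nxt in zip(seq, seq[1:]):
            est = counts[prev] / counts[prev].sum()
            correct_max += int(np.argmax(est) == nxt)            # maximizing
            correct_match += int(rng.choice(k, p=est) == nxt)    # probability matching
            counts[prev, nxt] += 1                               # learn incrementally

        n = len(seq) - 1
        print(f"maximizing: {correct_max / n:.3f}, matching: {correct_match / n:.3f}")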

  14. Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction.

    Science.gov (United States)

    He, Dan; Kuhn, David; Parida, Laxmi

    2016-06-15

    Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models. In many cases, for the same set of samples and markers, multiple traits are observed. Some of these traits might be correlated with each other. Therefore, modeling all the multiple traits together may improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least square error of the prediction from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits. The programs we used are either public or directly from the referred authors, such as MALSAR (http://www.public.asu.edu/~jye02/Software/MALSAR/) package. The Avocado data set has not been published yet and is available upon request. dhe@us.ibm.com. © The Author 2016. Published by Oxford University Press.
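
    One concrete instance of the multitask idea, under the assumption that correlated traits share causal markers, is joint row-sparse regression; the sketch below uses scikit-learn's MultiTaskLasso on synthetic genotypes rather than the MALSAR algorithms referenced in the paper.

        import numpy as np
        from sklearn.linear_model import Lasso, MultiTaskLasso
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(3)
        n, p, t = 200, 500, 3                              # samples, markers, traits
        X = rng.integers(0, 3, size=(n, p)).astype(float)  # SNP genotypes coded 0/1/2
        causal = rng.choice(p, 10, replace=False)          # markers shared by all traits
        B = np.zeros((p, t))
        B[causal] = rng.normal(0, 1.0, size=(10, t))
        Y = X @ B + rng.normal(0, 1.0, size=(n, t))        # correlated traits

        Xtr, Xte, Ytr, Yte = train_test_split(X, Y, random_state=0)
        multi = MultiTaskLasso(alpha=0.5).fit(Xtr, Ytr)    # all traits jointly
        solo = Lasso(alpha=0.5).fit(Xtr, Ytr[:, 0])        # one trait in isolation
        print("multitask R^2 (all traits):", multi.score(Xte, Yte))
        print("single-trait R^2 (trait 0):", solo.score(Xte, Yte[:, 0]))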

  15. Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution.

    Science.gov (United States)

    Gangnon, Ronald E

    2012-03-01

    The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, whereas rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset. © 2011, The International Biometric Society.
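
    The Gumbel approximation to the null distribution of a local maximum can be sketched with Monte Carlo; the chi-square stand-in for the local likelihood-ratio statistics and the cluster counts below are assumptions for illustration, not the paper's formulation.

        import numpy as np
        from scipy.stats import gumbel_r

        rng = np.random.default_rng(4)

        def local_scan_max(n_potential_clusters):
            # Stand-in for the maximum log likelihood ratio over one locality's
            # potential clusters under the null hypothesis.
            return rng.chisquare(1, size=n_potential_clusters).max() / 2

        # Null replicates of the local maximum: an urban locality with many
        # overlapping potential clusters vs a rural locality with few.
        null_urban = [local_scan_max(200) for _ in range(999)]
        null_rural = [local_scan_max(5) for _ in range(999)]

        # Fit a Gumbel distribution to each simulated null distribution.
        loc_u, scale_u = gumbel_r.fit(null_urban)
        loc_r, scale_r = gumbel_r.fit(null_rural)

        observed = 6.0   # hypothetical local scan statistic
        # The same observed value is far less surprising where multiplicity is high:
        print("urban-adjusted p:", gumbel_r.sf(observed, loc_u, scale_u))
        print("rural-adjusted p:", gumbel_r.sf(observed, loc_r, scale_r))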

  16. Tripled Fixed Point in Ordered Multiplicative Metric Spaces

    Directory of Open Access Journals (Sweden)

    Laishram Shanjit

    2017-06-01

    Full Text Available In this paper, we present some tripled fixed point theorems in partially ordered multiplicative metric spaces that depend on another function. Our results generalise the results of [6] and [5].

  17. Predicting acid dew point with a semi-empirical model

    International Nuclear Information System (INIS)

    Xiang, Baixiang; Tang, Bin; Wu, Yuxin; Yang, Hairui; Zhang, Man; Lu, Junfu

    2016-01-01

    Highlights: • The previous semi-empirical models are systematically studied. • An improved thermodynamic correlation is derived. • A semi-empirical prediction model is proposed. • The proposed semi-empirical model is validated. - Abstract: Decreasing the temperature of exhaust flue gas in boilers is one of the most effective ways to further improve the thermal efficiency, electrostatic precipitator efficiency and to decrease the water consumption of desulfurization tower, while, when this temperature is below the acid dew point, the fouling and corrosion will occur on the heating surfaces in the second pass of boilers. So, the knowledge on accurately predicting the acid dew point is essential. By investigating the previous models on acid dew point prediction, an improved thermodynamic correlation formula between the acid dew point and its influencing factors is derived first. And then, a semi-empirical prediction model is proposed, which is validated with the data both in field test and experiment, and comparing with the previous models.

  18. Reduction of bias in neutron multiplicity assay using a weighted point model

    Energy Technology Data Exchange (ETDEWEB)

    Geist, W. H. (William H.); Krick, M. S. (Merlyn S.); Mayo, D. R. (Douglas R.)

    2004-01-01

    Accurate assay of most common plutonium samples was the development goal for the nondestructive assay technique of neutron multiplicity counting. Over the past 20 years the technique has been proven for relatively pure oxides and small metal items. Unfortunately, the technique results in large biases when assaying large metal items. Limiting assumptions, such as uniform multiplication, in the point model used to derive the multiplicity equations cause these biases for large, dense items. A weighted point model has been developed to overcome some of the limitations in the standard point model. Weighting factors are determined from Monte Carlo calculations using the MCNPX code. Monte Carlo calculations give the dependence of the weighting factors on sample mass and geometry, and simulated assays using Monte Carlo give the theoretical accuracy of the weighted-point-model assay. Measured multiplicity data evaluated with both the standard and weighted point models are compared to reference values to give the experimental accuracy of the assay. Initial results show significant promise for the weighted point model in reducing or eliminating biases in the neutron multiplicity assay of metal items. The negative biases observed in the assay of plutonium metal samples are caused by variations in the neutron multiplication for neutrons originating in various locations in the sample. The bias depends on the mass and shape of the sample and depends on the amount and energy distribution of the (α,n) neutrons in the sample. When the standard point model is used, this variable-multiplication bias overestimates the multiplication and alpha values of the sample, and underestimates the plutonium mass. The weighted point model potentially can provide assay accuracy of ≈2% (1σ) for cylindrical plutonium metal samples < 4 kg with α < 1 without knowing the exact shape of the samples, provided that the (α,n) source is uniformly distributed throughout the

  19. Statistical point of view on nucleus excited states and fluctuations of differential polarization of particles emitted during nuclear reactions

    International Nuclear Information System (INIS)

    Dumazet, Gerard

    1965-01-01

    As previous works, notably those of Ericson, pointed out that the compound nucleus model results in fluctuations of cross sections about their average values, and that these fluctuations are not at all negligible, contrary to what had previously been assumed, this research thesis aims at establishing theoretical predictions and at showing that Ericson's predictions can be extended to polarization. After qualitatively and quantitatively recalling the concepts underlying the compound nucleus and direct interaction models, the author shows the relevance of a statistical point of view on nuclei, which must not be confused with the statistical model itself. Then, after recalling results obtained by Ericson, the author reports the study of the fluctuations of differential polarization, addresses the experimental aspect of fluctuations, and shows which are the main factors in this kind of study. [fr]

  20. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances.

    Science.gov (United States)

    Abut, Fatih; Akay, Mehmet Fatih

    2015-01-01

    Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.

  1. Multiplicity of the statistical γ-rays following (HI,xn) reactions

    International Nuclear Information System (INIS)

    Sie, S.H.; Newton, J.O.; Leigh, J.R.; Diamond, R.M.

    1981-02-01

    The multiplicity M_S of the statistical γ-rays following several (¹⁶O,xn) reactions has been measured. A correlation between the total γ-multiplicity M_T and M_S was found. This effect is attributed to an increase in the mean excitation energy above the yrast line with increasing angular momentum input

  2. Statistical Basis for Predicting Technological Progress

    Science.gov (United States)

    Nagy, Béla; Farmer, J. Doyne; Bui, Quan M.; Trancik, Jessika E.

    2013-01-01

    Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation. PMID:23468837
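
    The hindcasting comparison of Wright's law and Moore's law reduces to two log-linear fits; the synthetic series below (exponentially growing production, power-law cost decline) is invented to reproduce the regime described above in which the two laws become nearly indistinguishable.

        import numpy as np

        rng = np.random.default_rng(5)
        years = np.arange(1980, 2011)
        production = 100 * np.exp(0.08 * (years - years[0]))   # exponential growth
        cum_prod = np.cumsum(production)
        cost = 50 * cum_prod**-0.3 * np.exp(rng.normal(0, 0.05, len(years)))

        train, test = slice(0, 20), slice(20, None)            # hindcast split

        # Wright's law: log cost is linear in log cumulative production.
        w = np.polyfit(np.log(cum_prod[train]), np.log(cost[train]), 1)
        wright_pred = np.polyval(w, np.log(cum_prod[test]))

        # Moore's law (generalized): log cost is linear in time.
        m = np.polyfit(years[train], np.log(cost[train]), 1)
        moore_pred = np.polyval(m, years[test])

        rms = lambda pred: np.sqrt(np.mean((pred - np.log(cost[test])) ** 2))
        print("Wright RMS log error:", rms(wright_pred))
        print("Moore  RMS log error:", rms(moore_pred))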

  3. Statistical basis for predicting technological progress.

    Directory of Open Access Journals (Sweden)

    Béla Nagy

    Full Text Available Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.

  4. Event-based stochastic point rainfall resampling for statistical replication and climate projection of historical rainfall series

    DEFF Research Database (Denmark)

    Thorndahl, Søren; Korup Andersen, Aske; Larsen, Anders Badsberg

    2017-01-01

    Continuous and long rainfall series are a necessity in rural and urban hydrology for analysis and design purposes. Local historical point rainfall series often cover several decades, which makes it possible to estimate rainfall means at different timescales, and to assess return periods of extreme...... includes climate changes projected to a specific future period. This paper presents a framework for resampling of historical point rainfall series in order to generate synthetic rainfall series, which has the same statistical properties as an original series. Using a number of key target predictions...... for the future climate, such as winter and summer precipitation, and representation of extreme events, the resampled historical series are projected to represent rainfall properties in a future climate. Climate-projected rainfall series are simulated by brute force randomization of model parameters, which leads...

  5. Model output statistics applied to wind power prediction

    Energy Technology Data Exchange (ETDEWEB)

    Joensen, A; Giebel, G; Landberg, L [Risoe National Lab., Roskilde (Denmark); Madsen, H; Nielsen, H A [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)

    1999-03-01

    Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as a better ability to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper, prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approach is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared: Extended Kalman filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
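
    Of the adaptive estimators mentioned, recursive least squares with a forgetting factor is straightforward to sketch; the linear MOS correction below (slope and offset applied to the NWP wind forecast) and all data are illustrative assumptions, not the implementation described in the paper.

        import numpy as np

        rng = np.random.default_rng(6)
        T = 500
        nwp = 8 + 2 * np.sin(np.arange(T) / 30) + rng.normal(0, 1, T)  # NWP forecast
        obs = 0.8 * nwp + 1.5 + rng.normal(0, 0.5, T)                  # local measurement

        lam = 0.99                    # forgetting factor: tracks slow NWP model changes
        theta = np.zeros(2)           # [slope, offset] of the MOS correction
        P = np.eye(2) * 1000.0
        errors = []
        for t in range(T):
            x = np.array([nwp[t], 1.0])
            pred = theta @ x          # MOS-corrected forecast
            errors.append(obs[t] - pred)
            # Recursive least squares update with exponential forgetting
            k = P @ x / (lam + x @ P @ x)
            theta = theta + k * (obs[t] - pred)
            P = (P - np.outer(k, x) @ P) / lam

        print("estimated [slope, offset]:", theta)
        print("RMSE, last 100 steps:", np.sqrt(np.mean(np.array(errors[-100:]) ** 2)))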

  6. Single-electron multiplication statistics as a combination of Poissonian pulse height distributions using constraint regression methods

    International Nuclear Information System (INIS)

    Ballini, J.-P.; Cazes, P.; Turpin, P.-Y.

    1976-01-01

    Analysing the histogram of anode pulse amplitudes allows a discussion of the hypotheses that have been proposed to account for the statistical processes of secondary multiplication in a photomultiplier. In an earlier work, good agreement was obtained between experimental and reconstructed spectra, assuming a first-dynode distribution composed of two Poisson distributions with distinct mean values. This first approximation led to a search for a method which could give the weights of several Poisson distributions with distinct mean values. Three methods are briefly described: classical linear regression, constraint regression (d'Esopo's method), and regression on variables subject to error. The use of these methods gives an approximation of the frequency function which represents the dispersion of the local mean gain around the overall first-dynode mean gain value. Comparison between this function and the one employed in the Polya distribution supports the statement that the latter is inadequate to describe the statistical process of secondary multiplication. Numerous spectra obtained with two kinds of photomultiplier working under different physical conditions have been analysed. Two points are then discussed: does the frequency function represent the dynode structure and the interdynode collection process, and is the model (in which the multiplication process of all dynodes but the first is Poissonian) valid whatever the photomultiplier and the operating conditions. (Auth.)

  7. Risk prediction model: Statistical and artificial neural network approach

    Science.gov (United States)

    Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

    2017-04-01

    Prediction models are increasingly gaining popularity and have been used in numerous areas of study to complement and support clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand the suitable approach, development and validation process of a risk prediction model. A qualitative review of the aims, methods and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields was done. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models were highlighted. According to the studies reviewed, artificial neural network approaches to developing prediction models were more accurate than statistical approaches. However, currently only limited published literature discusses which approach is more accurate for risk prediction model development.

  8. A new statistical scission-point model fed with microscopic ingredients to predict fission fragment distributions [Development of a new scission-point model based on microscopic ingredients]

    Energy Technology Data Exchange (ETDEWEB)

    Heinrich, S

    2006-07-01

    The nuclear fission process is a very complex phenomenon and, even nowadays, no realistic models describing the overall process are available. The work presented here deals with a theoretical description of fission fragment distributions in mass, charge, energy and deformation. We have reconsidered and updated the B.D. Wilkins scission-point model. Our purpose was to test whether this statistical model, applied at the scission point and fed with new results of modern microscopic calculations, allows a quantitative description of the fission fragment distributions. We calculate the surface energy available at the scission point as a function of the fragment deformations. This surface is obtained from a Hartree-Fock-Bogoliubov microscopic calculation, which guarantees a realistic description of the dependence of the potential on the deformation of each fragment. The statistical balance is described by the level densities of the fragments. We have tried to avoid as much as possible the input of empirical parameters in the model. Our only parameter, the distance between the fragments at the scission point, is discussed by comparison with scission configurations obtained from full dynamical microscopic calculations. The comparison between our results and experimental data is very satisfying and allows us to discuss the successes and limitations of our approach. We finally propose ideas to improve the model, in particular by applying dynamical corrections. (author)

  9. Fatigue crack initiation and growth life prediction with statistical consideration

    International Nuclear Information System (INIS)

    Kwon, J.D.; Choi, S.H.; Kwak, S.G.; Chun, K.O.

    1991-01-01

    Prediction of the life or residual life of structures and machines is one of the most widely needed capabilities, particularly in a slowly growing economy that follows a period of rapid development. For the purpose of statistical life prediction, fatigue tests were conducted at 3 stress levels, with 20 specimens used for each stress level. The statistical properties of the crack growth parameters m and C in the fatigue crack growth law da/dN = C(ΔK)^m, the relationship between m and C, and the statistical distribution patterns of the fatigue crack initiation, growth and fracture lives can be obtained from the experimental results
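
    The statistical treatment of the Paris-law parameters can be sketched by fitting log10(da/dN) = log10(C) + m·log10(ΔK) per specimen and examining the spread and correlation of the estimates; the 20 synthetic specimens below stand in for the experimental data set.

        import numpy as np

        rng = np.random.default_rng(7)
        m_true, logC_true = 3.0, -11.0
        specimens = []
        for _ in range(20):
            dK = np.linspace(10, 40, 15)                 # stress intensity range
            dadN = (10 ** (logC_true + rng.normal(0, 0.3))
                    * dK ** (m_true + rng.normal(0, 0.1)))
            dadN *= np.exp(rng.normal(0, 0.1, dK.size))  # measurement scatter
            specimens.append((dK, dadN))

        # Per-specimen fit: log10(da/dN) = m*log10(dK) + log10(C)
        params = np.array([np.polyfit(np.log10(dK), np.log10(dadN), 1)
                           for dK, dadN in specimens])   # columns: [m, log10(C)]
        m_est, logC_est = params[:, 0], params[:, 1]
        print("m: mean %.2f, sd %.2f" % (m_est.mean(), m_est.std()))
        print("log10(C): mean %.2f, sd %.2f" % (logC_est.mean(), logC_est.std()))
        print("corr(m, log10(C)): %.2f" % np.corrcoef(m_est, logC_est)[0, 1])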

  10. Statistical methods for change-point detection in surface temperature records

    Science.gov (United States)

    Pintar, A. L.; Possolo, A.; Zhang, N. F.

    2013-09-01

    We describe several statistical methods to detect possible change-points in a time series of values of surface temperature measured at a meteorological station, and to assess the statistical significance of such changes, taking into account the natural variability of the measured values, and the autocorrelations between them. These methods serve to determine whether the record may suffer from biases unrelated to the climate signal, hence whether there may be a need for adjustments as considered by M. J. Menne and C. N. Williams (2009) "Homogenization of Temperature Series via Pairwise Comparisons", Journal of Climate 22 (7), 1700-1717. We also review methods to characterize patterns of seasonality (seasonal decomposition using monthly medians or robust local regression), and explain the role they play in the imputation of missing values, and in enabling robust decompositions of the measured values into a seasonal component, a possible climate signal, and a station-specific remainder. The methods for change-point detection that we describe include statistical process control, wavelet multi-resolution analysis, adaptive weights smoothing, and a Bayesian procedure, all of which are applicable to single station records.
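
    Among the methods listed, a CUSUM-type statistical process control chart is the simplest to sketch; the baseline window, drift and threshold below are illustrative choices, and this sketch ignores the seasonality and autocorrelation handling that the abstract emphasizes.

        import numpy as np

        def cusum_changepoint(x, threshold=5.0, drift=0.5):
            """Two-sided CUSUM on standardized values; returns first alarm index."""
            z = (x - x[:50].mean()) / x[:50].std()   # standardize vs early baseline
            pos = neg = 0.0
            for i, zi in enumerate(z):
                pos = max(0.0, pos + zi - drift)     # accumulates upward shifts
                neg = max(0.0, neg - zi - drift)     # accumulates downward shifts
                if pos > threshold or neg > threshold:
                    return i
            return None

        rng = np.random.default_rng(8)
        series = np.concatenate([rng.normal(10.0, 0.5, 300),    # homogeneous record
                                 rng.normal(10.8, 0.5, 200)])   # step bias at t = 300
        print("alarm at index:", cusum_changepoint(series))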

  11. Statistical Analysis of Reactor Pressure Vessel Fluence Calculation Benchmark Data Using Multiple Regression Techniques

    International Nuclear Information System (INIS)

    Carew, John F.; Finch, Stephen J.; Lois, Lambros

    2003-01-01

    The calculated >1-MeV pressure vessel fluence is used to determine the fracture toughness and integrity of the reactor pressure vessel. It is therefore of the utmost importance to ensure that the fluence prediction is accurate and unbiased. In practice, this assurance is provided by comparing the predictions of the calculational methodology with an extensive set of accurate benchmarks. A benchmarking database is used to provide an estimate of the overall average measurement-to-calculation (M/C) bias in the calculations. This average is used as an ad-hoc multiplicative adjustment to the calculations to correct for the observed calculational bias. However, this average only provides a well-defined and valid adjustment of the fluence if the M/C data are homogeneous; i.e., the data are statistically independent and there is no correlation between subsets of M/C data. Typically, the identification of correlations between the errors in the database M/C values is difficult because the correlation is of the same magnitude as the random errors in the M/C data and varies substantially over the database. In this paper, an evaluation of a reactor dosimetry benchmark database is performed to determine the statistical validity of the adjustment to the calculated pressure vessel fluence. Physical mechanisms that could potentially introduce a correlation between the subsets of M/C ratios are identified and included in a multiple regression analysis of the M/C data. Rigorous statistical criteria are used to evaluate the homogeneity of the M/C data and determine the validity of the adjustment. For the database evaluated, the M/C data are found to be strongly correlated with dosimeter response threshold energy and dosimeter location (e.g., cavity versus in-vessel). It is shown that because of the inhomogeneity in the M/C data, for this database, the benchmark data do not provide a valid basis for adjusting the pressure vessel fluence. The statistical criteria and methods employed in

  12. Parallel point-multiplication architecture using combined group operations for high-speed cryptographic applications.

    Directory of Open Access Journals (Sweden)

    Md Selim Hossain

    Full Text Available In this paper, we propose a novel parallel architecture for fast hardware implementation of elliptic curve point multiplication (ECPM), which is the key operation of an elliptic curve cryptography processor. The point multiplication over binary fields is synthesized on both FPGA and ASIC technology by designing fast elliptic curve group operations in Jacobian projective coordinates. A novel combined point doubling and point addition (PDPA) architecture is proposed for group operations to achieve high speed and low hardware requirements for ECPM. It has been implemented over the binary field which is recommended by the National Institute of Standards and Technology (NIST). The proposed ECPM supports two Koblitz and random curves for the key sizes 233 and 163 bits. For group operations, a finite-field arithmetic operation, e.g. multiplication, is designed on a polynomial basis. The delay of a 233-bit point multiplication is only 3.05 and 3.56 μs, in a Xilinx Virtex-7 FPGA, for Koblitz and random curves, respectively, and 0.81 μs in an ASIC 65-nm technology, which are the fastest hardware implementation results reported in the literature to date. In addition, a 163-bit point multiplication is also implemented in FPGA and ASIC for fair comparison, which takes around 0.33 and 0.46 μs, respectively. The area-time product of the proposed point multiplication is very low compared to similar designs. The performance ([Formula: see text]) and Area × Time × Energy (ATE) product of the proposed design are far better than those of the most significant studies found in the literature.
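
    The group-operation logic that such hardware fuses can be sketched in software as textbook double-and-add; for readability the sketch below uses affine coordinates over a small prime field rather than the paper's binary-field Jacobian coordinates, and the toy curve parameters are invented.

        # Toy curve y^2 = x^3 + a*x + b over F_p (prime field for readability).
        p, a, b = 97, 2, 3
        G = (3, 6)   # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97)

        def inv(x):
            return pow(x, p - 2, p)          # modular inverse via Fermat

        def add(P, Q):
            """Point addition/doubling; None represents the point at infinity."""
            if P is None: return Q
            if Q is None: return P
            (x1, y1), (x2, y2) = P, Q
            if x1 == x2 and (y1 + y2) % p == 0:
                return None                  # P + (-P) = infinity
            if P == Q:
                lam = (3 * x1 * x1 + a) * inv(2 * y1) % p   # tangent (doubling)
            else:
                lam = (y2 - y1) * inv(x2 - x1) % p          # chord (addition)
            x3 = (lam * lam - x1 - x2) % p
            return (x3, (lam * (x1 - x3) - y1) % p)

        def point_multiply(k, P):
            """Left-to-right double-and-add; the steps a PDPA-style design combines."""
            R = None
            for bit in bin(k)[2:]:
                R = add(R, R)                # point doubling every bit
                if bit == '1':
                    R = add(R, P)            # point addition on set bits
            return R

        print(point_multiply(20, G))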

  13. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances

    Directory of Open Access Journals (Sweden)

    Abut F

    2015-08-01

    Full Text Available Fatih Abut, Mehmet Fatih Akay, Department of Computer Engineering, Çukurova University, Adana, Turkey. Abstract: Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.

  14. Saccadic gain adaptation is predicted by the statistics of natural fluctuations in oculomotor function

    Directory of Open Access Journals (Sweden)

    Mark V Albert

    2012-12-01

    Full Text Available Due to multiple factors such as fatigue, muscle strengthening, and neural plasticity, the responsiveness of the motor apparatus to neural commands changes over time. To enable precise movements the nervous system must adapt to compensate for these changes. Recent models of motor adaptation derive from assumptions about the way the motor apparatus changes. Characterizing these changes is difficult because motor adaptation happens at the same time, masking most of the effects of ongoing changes. Here, we analyze eye movements of monkeys with lesions to the posterior cerebellar vermis that impair adaptation. Their fluctuations better reveal the underlying changes of the motor system over time. When these measured, unadapted changes are used to derive optimal motor adaptation rules the prediction precision significantly improves. Among three models that similarly fit single-day adaptation results, the model that also matches the temporal correlations of the nonadapting saccades most accurately predicts multiple day adaptation. Saccadic gain adaptation is well matched to the natural statistics of fluctuations of the oculomotor plant.

  15. Statistical short-term earthquake prediction.

    Science.gov (United States)

    Kagan, Y Y; Knopoff, L

    1987-06-19

    A statistical procedure, derived from a theoretical model of fracture growth, is used to identify a foreshock sequence while it is in progress. As a predictor, the procedure reduces the average uncertainty in the rate of occurrence for a future strong earthquake by a factor of more than 1000 when compared with the Poisson rate of occurrence. About one-third of all main shocks with local magnitude greater than or equal to 4.0 in central California can be predicted in this way, starting from a 7-year database that has a lower magnitude cut off of 1.5. The time scale of such predictions is of the order of a few hours to a few days for foreshocks in the magnitude range from 2.0 to 5.0.

  16. BPP: a sequence-based algorithm for branch point prediction.

    Science.gov (United States)

    Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing

    2017-10-15

    Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  17. Statistical Methods in Integrative Genomics

    Science.gov (United States)

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  18. Waste generated in high-rise buildings construction: a quantification model based on statistical multiple regression.

    Science.gov (United States)

    Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

    2015-05-01

    Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data of eighteen residential buildings. The resulting statistical model produced dependent (i.e. amount of waste generated) and independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted R² value of 0.694, which means that it predicts approximately 69% of the factors involved in the generation of waste in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. The use of machine learning and nonlinear statistical tools for ADME prediction.

    Science.gov (United States)

    Sakiyama, Yojiro

    2009-02-01

    Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.

  20. Statistics and methodology of multiple cell upset characterization under heavy ion irradiation

    International Nuclear Information System (INIS)

    Zebrev, G.I.; Gorbunov, M.S.; Useinov, R.G.; Emeliyanov, V.V.; Ozerov, A.I.; Anashin, V.S.; Kozyukov, A.E.; Zemtsov, K.S.

    2015-01-01

    Mean and partial cross-section concepts and their connections to multiplicity and statistics of multiple cell upsets (MCUs) in highly-scaled digital memories are introduced and discussed. The important role of the experimental determination of the upset statistics is emphasized. It was found that MCU may lead to quasi-linear dependence of cross-sections on linear energy transfer (LET). A new form of function for interpolation of mean cross-section dependences on LET has been proposed

  1. Case study on prediction of remaining methane potential of landfilled municipal solid waste by statistical analysis of waste composition data.

    Science.gov (United States)

    Sel, İlker; Çakmakcı, Mehmet; Özkaya, Bestamin; Suphi Altan, H

    2016-10-01

    The main objective of this study was to develop a statistical model for easier and faster Biochemical Methane Potential (BMP) prediction of landfilled municipal solid waste by analyzing the waste composition of excavated samples from 12 sampling points and three waste depths, representing different landfilling ages of the closed and active sections of a sanitary landfill site located in İstanbul, Turkey. Results of Principal Component Analysis (PCA) were used as a decision support tool to evaluate and describe the waste composition variables. Four principal components were extracted, describing 76% of the data set variance. The most effective components were determined as PCB, PO, T, D, W, FM, moisture and BMP for the data set. Multiple Linear Regression (MLR) models were built from the original compositional data and from transformed data to determine differences. It was observed that even though the residual plots were better for the transformed data, the R² and adjusted R² values were not improved significantly. The best preliminary BMP prediction models consisted of the D, W, T and FM waste fractions for both versions of the regressions. Adjusted R² values of the raw and transformed models were determined as 0.69 and 0.57, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.
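
    The PCA-then-MLR workflow can be sketched as follows; the composition variables, sample size and coefficients are synthetic stand-ins (the fraction codes only mimic the paper's), and adjusted R² is computed from the ordinary R² as in the abstract.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(9)
        n = 36   # e.g. 12 sampling points x 3 waste depths
        comp = rng.dirichlet(np.ones(5), size=n)   # five waste fractions (synthetic)
        moisture = rng.uniform(0.2, 0.6, n)
        X = np.column_stack([comp, moisture])
        bmp = 120 * comp[:, 0] + 80 * comp[:, 3] - 50 * moisture + rng.normal(0, 5, n)

        # PCA as a decision support tool: inspect explained variance and loadings.
        Xs = StandardScaler().fit_transform(X)
        pca = PCA(n_components=4).fit(Xs)
        print("variance explained:", pca.explained_variance_ratio_.sum())

        # Reduced-variable MLR on the fractions flagged by the PCA step.
        cols = [0, 3, 5]
        reg = LinearRegression().fit(X[:, cols], bmp)
        r2 = reg.score(X[:, cols], bmp)
        k = len(cols)
        adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # adjusted R^2
        print("R^2: %.2f, adjusted R^2: %.2f" % (r2, adj_r2))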

  2. Accurate corresponding point search using sphere-attribute-image for statistical bone model generation

    International Nuclear Information System (INIS)

    Saito, Toki; Nakajima, Yoshikazu; Sugita, Naohiko; Mitsuishi, Mamoru; Hashizume, Hiroyuki; Kuramoto, Kouichi; Nakashima, Yosio

    2011-01-01

    Statistical-deformable-model-based two-dimensional/three-dimensional (2-D/3-D) registration is a promising method for estimating the position and shape of patient bone in the surgical space. Since its accuracy depends on the statistical model capacity, we propose a method for accurately generating a statistical bone model from a CT volume. Our method employs the Sphere-Attribute-Image (SAI) and has improved the accuracy of corresponding point search in statistical model generation. First, target bone surfaces are extracted as SAIs from the CT volume. Then the textures of the SAIs are classified into regions using the Maximally Stable Extremal Regions (MSER) method. Next, corresponding regions are determined using normalized cross-correlation (NCC). Finally, corresponding points in each corresponding region are determined using NCC. Our method was applied to femur bone models and worked well in the experiments. (author)

  3. Flash-Point prediction for binary partially miscible aqueous-organic mixtures

    OpenAIRE

    Liaw, Horng-Jang; Chen, Chien Tsun; Gerbaud, Vincent

    2008-01-01

    Flash point is the most important variable used to characterize the fire and explosion hazard of liquids. Partially miscible mixtures are relevant within the context of liquid-liquid extraction processes and heterogeneous distillation processes. This paper describes the development of a model for predicting the flash point of binary partially miscible mixtures of aqueous-organic systems. To confirm the predictive efficiency of the derived flash points, the model was verified by comparing the ...

  4. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  5. The validity of multiphase DNS initialized on the basis of single--point statistics

    Science.gov (United States)

    Subramaniam, Shankar

    1999-11-01

    A study of the point-process statistical representation of a spray reveals that single-point statistical information contained in the droplet distribution function (ddf) is related to a sequence of single surrogate-droplet pdf's, which are in general different from the physical single-droplet pdf's. The results of this study have important consequences for the initialization and evolution of direct numerical simulations (DNS) of multiphase flows, which are usually initialized on the basis of single-point statistics such as the average number density in physical space. If multiphase DNS are initialized in this way, this implies that even the initial representation contains certain implicit assumptions concerning the complete ensemble of realizations, which are invalid for general multiphase flows. Also the evolution of a DNS initialized in this manner is shown to be valid only if an as yet unproven commutation hypothesis holds true. Therefore, it is questionable to what extent DNS that are initialized in this manner constitute a direct simulation of the physical droplets.

  6. The Statistical Value Chain - a Benchmarking Checklist for Decision Makers to Evaluate Decision Support Seen from a Statistical Point-Of-View

    DEFF Research Database (Denmark)

    Herrmann, Ivan Tengbjerg; Henningsen, Geraldine; Wood, Christian D.

    2013-01-01

    quantitative methods exist for evaluating uncertainty—for example, Monte Carlo simulation—and such methods work very well when the AN is in full control of the data collection and model-building processes. In many cases, however, the AN is not in control of these processes. In this article we develop a simple...... method that a DM can employ in order to evaluate the process of decision support from a statistical point-of-view. We call this approach the “Statistical Value Chain” (SVC): a consecutive benchmarking checklist with eight steps that can be used to evaluate decision support seen from a statistical point-of-view....

  7. Gaussian point count statistics for families of curves over a fixed finite field

    OpenAIRE

    Kurlberg, Par; Wigman, Igor

    2010-01-01

    We produce a collection of families of curves, whose point count statistics over F_p becomes Gaussian for p fixed. In particular, the average number of F_p points on curves in these families tends to infinity.

  8. Factors predicting work outcome in Japanese patients with schizophrenia: role of multiple functioning levels.

    Science.gov (United States)

    Sumiyoshi, Chika; Harvey, Philip D; Takaki, Manabu; Okahisa, Yuko; Sato, Taku; Sora, Ichiro; Nuechterlein, Keith H; Subotnik, Kenneth L; Sumiyoshi, Tomiki

    2015-09-01

    Functional outcomes in individuals with schizophrenia suggest recovery of cognitive, everyday, and social functioning. Specifically, improvement of work status is considered to be most important for their independent living and self-efficacy. The main purposes of the present study were 1) to identify which outcome factors predict occupational functioning, quantified as work hours, and 2) to provide cut-offs on the scales for those factors to attain better work status. Forty-five Japanese patients with schizophrenia and 111 healthy controls entered the study. Cognition, capacity for everyday activities, and social functioning were assessed by the Japanese versions of the MATRICS Cognitive Consensus Battery (MCCB), the UCSD Performance-based Skills Assessment-Brief (UPSA-B), and the Social Functioning Scale Individuals' version modified for the MATRICS-PASS (Modified SFS for PASS), respectively. Potential factors for work outcome were estimated by multiple linear regression analyses (predicting work hours directly) and a multiple logistic regression analysis (predicting dichotomized work status based on work hours). ROC curve analyses were performed to determine cut-off points for differentiating between better and poor work status. The results showed that a cognitive component, comprising visual/verbal learning and emotional management, and a social functioning component, comprising independent living and vocational functioning, were potential factors for predicting work hours/status. Cut-off points obtained in ROC analyses indicated that scores of 60-70% on the measures of those factors were expected to maintain better work status. Our findings suggest that improvements in specific aspects of cognitive and social functioning are important for work outcome in patients with schizophrenia.

  9. The role of point-of-care assessment of platelet function in predicting postoperative bleeding and transfusion requirements after coronary artery bypass grafting.

    Science.gov (United States)

    Mishra, Pankaj Kumar; Thekkudan, Joyce; Sahajanandan, Raj; Gravenor, Mike; Lakshmanan, Suresh; Fayaz, Khazi Mohammed; Luckraz, Heyman

    2015-01-01

    OBJECTIVE: Platelet function assessment after cardiac surgery can predict postoperative blood loss, guide transfusion requirements and discriminate the need for surgical re-exploration. We conducted this study to assess the predictive value of point-of-care testing of platelet function using the Multiplate® device. Patients undergoing isolated coronary artery bypass grafting were prospectively recruited (n = 84). Group A (n = 42) patients were on anti-platelet therapy until surgery; patients in Group B (n = 42) stopped anti-platelet treatment at least 5 days preoperatively. Multiplate® and thromboelastography (TEG) tests were performed in the perioperative period. The primary end-point was excessive bleeding (>2.5 ml/kg/h) within the first 3 h postoperative. Secondary end-points included transfusion requirements, re-exploration rates, intensive care unit and in-hospital stays. Patients in Group A had excessive bleeding (59% vs. 33%, P = 0.02) and higher re-exploration rates (14% vs. 0%, P < …). Platelet function testing was the most significant predictor of excessive bleeding (odds ratio [OR]: 2.3, P = 0.08) and need for blood (OR: 5.5, P < …). Platelet functional assessment with Multiplate® was the strongest predictor for bleeding and transfusion requirements in patients on anti-platelet therapy until the time of surgery.

  10. Statistical tests for equal predictive ability across multiple forecasting methods

    DEFF Research Database (Denmark)

    Borup, Daniel; Thyrsgaard, Martin

    We develop a multivariate generalization of the Giacomini-White tests for equal conditional predictive ability. The tests are applicable to a mixture of nested and non-nested models, incorporate estimation uncertainty explicitly, and allow for misspecification of the forecasting model as well as ...

  11. Burst Pressure Prediction of Multiple Cracks in Pipelines

    International Nuclear Information System (INIS)

    Razak, N A; Alang, N A; Murad, M A

    2013-01-01

    Available industrial codes such as ASME B31G, modified ASME B31G and DNV RP-F101 for assessing pipeline defects appear more conservative for multiple crack-like defects than for single crack-like defects. Thus, this paper presents burst pressure predictions for pipes with multiple crack-like defects. A finite element model was developed and the predicted burst pressure was compared with the available codes. The model was used to investigate the effect of the distance between the cracks and of the crack length. A coalescence diagram was also developed to evaluate the burst pressure of multiple cracks. It was found that as the distance between the cracks increases, the interaction effect fades away and multiple cracks behave like independent single cracks

  12. A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

    Science.gov (United States)

    Kosinski, Andrzej S

    2013-03-15

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.

  13. Quantitative structure-property relationships for prediction of boiling point, vapor pressure, and melting point.

    Science.gov (United States)

    Dearden, John C

    2003-08-01

    Boiling point, vapor pressure, and melting point are important physicochemical properties in the modeling of the distribution and fate of chemicals in the environment. However, such data often are not available, and therefore must be estimated. Over the years, many attempts have been made to calculate boiling points, vapor pressures, and melting points by using quantitative structure-property relationships, and this review examines and discusses the work published in this area, and concentrates particularly on recent studies. A number of software programs are commercially available for the calculation of boiling point, vapor pressure, and melting point, and these have been tested for their predictive ability with a test set of 100 organic chemicals.

  14. Statistical Surface Recovery: A Study on Ear Canals

    DEFF Research Database (Denmark)

    Jensen, Rasmus Ramsbøl; Olesen, Oline Vinter; Paulsen, Rasmus Reinhold

    2012-01-01

    We present a method for surface recovery in partial surface scans based on a statistical model. The framework is based on multivariate point prediction, where the distribution of the points is learned from an annotated data set. The training set consists of surfaces with dense correspondence...... that are Procrustes aligned. The average shape and point covariances can be estimated from this set. It is shown how missing data in a new given shape can be predicted using the learned statistics. The method is evaluated on a data set of 29 scans of ear canal impressions. By using a leave-one-out approach we...

  15. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2004-01-01

    Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric predictive inference.

  16. Wind speed prediction using statistical regression and neural network

    Indian Academy of Sciences (India)

    Prediction of wind speed in the atmospheric boundary layer is important for wind energy assessment, satellite launching and aviation, etc. There are a few techniques available for wind speed prediction, which require a minimum number of input parameters. Four different statistical techniques, viz., curve fitting, Auto Regressive ...

  17. Factors predicting work outcome in Japanese patients with schizophrenia: role of multiple functioning levels

    Directory of Open Access Journals (Sweden)

    Chika Sumiyoshi

    2015-09-01

    Functional outcomes in individuals with schizophrenia suggest recovery of cognitive, everyday, and social functioning. Specifically, improvement of work status is considered to be most important for their independent living and self-efficacy. The main purposes of the present study were (1) to identify which outcome factors predict occupational functioning, quantified as work hours, and (2) to provide cut-offs on the scales for those factors to attain better work status. Forty-five Japanese patients with schizophrenia and 111 healthy controls entered the study. Cognition, capacity for everyday activities, and social functioning were assessed by the Japanese versions of the MATRICS Cognitive Consensus Battery (MCCB), the UCSD Performance-based Skills Assessment-Brief (UPSA-B), and the Social Functioning Scale Individuals' version modified for the MATRICS-PASS (Modified SFS for PASS), respectively. Potential factors for work outcome were estimated by multiple linear regression analyses (predicting work hours directly) and multiple logistic regression analyses (predicting dichotomized work status based on work hours). ROC curve analyses were performed to determine cut-off points for differentiating between better and poor work status. The results showed that a cognitive component, comprising visual/verbal learning and emotional management, and a social functioning component, comprising independent living and vocational functioning, were potential factors for predicting work hours/status. Cut-off points obtained in the ROC analyses indicated that 60-70% achievement on the measures of those factors was expected to maintain the better work status. Our findings suggest that improvements in specific aspects of cognitive and social functioning are important for work outcome in patients with schizophrenia.

  18. Statistical model for prediction of hearing loss in patients receiving cisplatin chemotherapy.

    Science.gov (United States)

    Johnson, Andrew; Tarima, Sergey; Wong, Stuart; Friedland, David R; Runge, Christina L

    2013-03-01

    This statistical model might be used to predict cisplatin-induced hearing loss, particularly in patients undergoing concomitant radiotherapy. To create a statistical model based on pretreatment hearing thresholds to provide an individual probability for hearing loss from cisplatin therapy and, secondarily, to investigate the use of hearing classification schemes as predictive tools for hearing loss. Retrospective case-control study. Tertiary care medical center. A total of 112 subjects receiving chemotherapy and audiometric evaluation were evaluated for the study. Of these subjects, 31 met inclusion criteria for analysis. The primary outcome measurement was a statistical model providing the probability of hearing loss following the use of cisplatin chemotherapy. Fifteen of the 31 subjects had significant hearing loss following cisplatin chemotherapy. American Academy of Otolaryngology-Head and Neck Society and Gardner-Robertson hearing classification schemes revealed little change in hearing grades between pretreatment and posttreatment evaluations for subjects with or without hearing loss. The Chang hearing classification scheme could effectively be used as a predictive tool in determining hearing loss with a sensitivity of 73.33%. Pretreatment hearing thresholds were used to generate a statistical model, based on quadratic approximation, to predict hearing loss (C statistic = 0.842, cross-validated = 0.835). The validity of the model improved when only subjects who received concurrent head and neck irradiation were included in the analysis (C statistic = 0.91). A calculated cutoff of 0.45 for predicted probability has a cross-validated sensitivity and specificity of 80%. Pretreatment hearing thresholds can be used as a predictive tool for cisplatin-induced hearing loss, particularly with concomitant radiotherapy.

  19. Comparing statistical and machine learning classifiers: alternatives for predictive modeling in human factors research.

    Science.gov (United States)

    Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann

    2003-01-01

    Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.

  20. Multiple types of motives don't multiply the motivation of West Point cadets.

    Science.gov (United States)

    Wrzesniewski, Amy; Schwartz, Barry; Cong, Xiangyu; Kane, Michael; Omar, Audrey; Kolditz, Thomas

    2014-07-29

    Although people often assume that multiple motives for doing something will be more powerful and effective than a single motive, research suggests that different types of motives for the same action sometimes compete. More specifically, research suggests that instrumental motives, which are extrinsic to the activities at hand, can weaken internal motives, which are intrinsic to the activities at hand. We tested whether holding both instrumental and internal motives yields negative outcomes in a field context in which various motives occur naturally and long-term educational and career outcomes are at stake. We assessed the impact of the motives of over 10,000 West Point cadets over the period of a decade on whether they would become commissioned officers, extend their officer service beyond the minimum required period, and be selected for early career promotions. For each outcome, motivation internal to military service itself predicted positive outcomes, a relationship that was negatively affected when instrumental motives were also in evidence. These results suggest that holding multiple motives damages persistence and performance in educational and occupational contexts over long periods of time.

  1. A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction.

    Science.gov (United States)

    Qiu, Shibin; Lane, Terran

    2009-01-01

    The cell defense mechanism of RNA interference has applications in gene function analysis and promising potential in human disease therapy. To effectively silence a target gene, it is desirable to select appropriate initiator siRNA molecules having satisfactory silencing capabilities. Computational prediction of the silencing efficacy of siRNAs can assist this screening process before using them in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for the prediction and improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize the information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework where a linear combination of the kernels is used. We formulate the multiple kernel learning as a quadratically constrained quadratic programming (QCQP) problem, which, although it yields a globally optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computational performance dramatically. In addition, multiple kernel regression evaluates the importance of constituent kernels, which, for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.
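
    The kernel-target alignment heuristic mentioned in the abstract can be illustrated in a few lines. The sketch below is a minimal, hypothetical illustration, not the authors' QCQP formulation: each candidate kernel is weighted by its alignment with the ideal target kernel yy^T, and the weighted kernels are summed into one combined kernel. The toy RBF and linear kernels and all data are invented for the example.

    ```python
    import numpy as np

    def alignment(K, y):
        """Kernel-target alignment <K, yy^T>_F / (||K||_F ||yy^T||_F)."""
        yyT = np.outer(y, y)
        return np.sum(K * yyT) / (np.linalg.norm(K) * np.linalg.norm(yyT))

    def combine_kernels(kernels, y):
        """Weight each kernel by its (clipped) alignment with the target, then sum."""
        w = np.array([max(alignment(K, y), 0.0) for K in kernels])
        w = w / w.sum()
        return sum(wi * K for wi, K in zip(w, kernels)), w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))                      # numerical descriptors (toy data)
    y = np.sign(X[:, 0] + 0.1 * rng.normal(size=30))  # silencing efficacy labels (toy)
    sq = np.linalg.norm(X[:, None] - X[None, :], axis=2) ** 2
    K_rbf = np.exp(-sq)      # stand-in for a numerical-descriptor kernel
    K_lin = X @ X.T          # stand-in for, e.g., a string kernel's Gram matrix
    K_mix, weights = combine_kernels([K_rbf, K_lin], y)
    print("kernel weights:", np.round(weights, 3))
    ```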

  2. IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics.

    Science.gov (United States)

    Hoyt, Robert Eugene; Snider, Dallas; Thompson, Carla; Mantravadi, Sarita

    2016-10-11

    We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes. To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs. The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets. Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix

  3. ERROR DISTRIBUTION EVALUATION OF THE THIRD VANISHING POINT BASED ON RANDOM STATISTICAL SIMULATION

    Directory of Open Access Journals (Sweden)

    C. Li

    2012-07-01

    POS, integrated by GPS/INS (Inertial Navigation Systems), has allowed rapid and accurate determination of position and attitude of remote sensing equipment for MMS (Mobile Mapping Systems). However, not only does INS have system error, but it is also very expensive. Therefore, in this paper error distributions of vanishing points are studied and tested in order to substitute INS for MMS in some special land-based scenes, such as ground façades where usually only two vanishing points can be detected. Thus, the traditional calibration approach based on three orthogonal vanishing points is being challenged. In this article, firstly, the line clusters, which are parallel to each other in object space and correspond to the vanishing points, are detected based on RANSAC (Random Sample Consensus) and a parallelism geometric constraint. Secondly, condition adjustment with parameters is utilized to estimate the nonlinear error equations of two vanishing points (VX, VY), and how to set initial weights for the adjustment solution of single-image vanishing points is presented. The vanishing points are solved and their error distributions estimated based on an iteration method with variable weights, the co-factor matrix and error ellipse theory. Thirdly, under the condition of known error ellipses of two vanishing points (VX, VY) and on the basis of the triangle geometric relationship of three vanishing points, the error distribution of the third vanishing point (VZ) is calculated and evaluated by random statistical simulation, ignoring camera distortion. Moreover, Monte Carlo methods utilized for random statistical estimation are presented. Finally, experimental results of vanishing point coordinates and their error distributions are shown and analyzed.

  5. Tracking Multiple Statistics: Simultaneous Learning of Object Names and Categories in English and Mandarin Speakers.

    Science.gov (United States)

    Chen, Chi-Hsin; Gershkoff-Stowe, Lisa; Wu, Chih-Yi; Cheung, Hintat; Yu, Chen

    2017-08-01

    Two experiments were conducted to examine adult learners' ability to extract multiple statistics in simultaneously presented visual and auditory input. Experiment 1 used a cross-situational learning paradigm to test whether English speakers were able to use co-occurrences to learn word-to-object mappings and concurrently form object categories based on the commonalities across training stimuli. Experiment 2 replicated the first experiment and further examined whether speakers of Mandarin, a language in which final syllables of object names are more predictive of category membership than English, were able to learn words and form object categories when trained with the same type of structures. The results indicate that both groups of learners successfully extracted multiple levels of co-occurrence and used them to learn words and object categories simultaneously. However, marked individual differences in performance were also found, suggesting possible interference and competition in processing the two concurrent streams of regularities. Copyright © 2016 Cognitive Science Society, Inc.

  6. Multiplicity: discussion points from the Statisticians in the Pharmaceutical Industry multiplicity expert group.

    Science.gov (United States)

    Phillips, Alan; Fletcher, Chrissie; Atkinson, Gary; Channon, Eddie; Douiri, Abdel; Jaki, Thomas; Maca, Jeff; Morgan, David; Roger, James Henry; Terrill, Paul

    2013-01-01

    In May 2012, the Committee of Health and Medicinal Products issued a concept paper on the need to review the points to consider document on multiplicity issues in clinical trials. In preparation for the release of the updated guidance document, Statisticians in the Pharmaceutical Industry held a one-day expert group meeting in January 2013. Topics debated included multiplicity and the drug development process, the usefulness and limitations of newly developed strategies to deal with multiplicity, multiplicity issues arising from interim decisions and multiregional development, and the need for simultaneous confidence intervals (CIs) corresponding to multiple test procedures. A clear message from the meeting was that multiplicity adjustments need to be considered when the intention is to make a formal statement about efficacy or safety based on hypothesis tests. Statisticians have a key role when designing studies to assess what adjustment really means in the context of the research being conducted. More thought during the planning phase needs to be given to multiplicity adjustments for secondary endpoints given these are increasing in importance in differentiating products in the market place. No consensus was reached on the role of simultaneous CIs in the context of superiority trials. It was argued that unadjusted intervals should be employed as the primary purpose of the intervals is estimation, while the purpose of hypothesis testing is to formally establish an effect. The opposing view was that CIs should correspond to the test decision whenever possible. Copyright © 2013 John Wiley & Sons, Ltd.

  7. Common Fixed Points of Generalized Rational Type Cocyclic Mappings in Multiplicative Metric Spaces

    Directory of Open Access Journals (Sweden)

    Mujahid Abbas

    2015-01-01

    The aim of this paper is to present fixed point results for mappings satisfying a generalized rational contractive condition in the setup of multiplicative metric spaces. As an application, we obtain a common fixed point of a pair of weakly compatible mappings. Some common fixed point results for pairs of rational contractive type mappings involved in cocyclic representations of a nonempty subset of a multiplicative metric space are also obtained. Some examples are presented to support the results proved herein. Our results generalize and extend various results in the existing literature.

  8. Multiple Monte Carlo Testing with Applications in Spatial Point Processes

    DEFF Research Database (Denmark)

    Mrkvička, Tomáš; Myllymäki, Mari; Hahn, Ute

    ... 2) a Monte Carlo test with a function as the test statistic, 3) several Monte Carlo tests with functions as test statistics. The rank test has correct (global) type I error in each case, and it is accompanied by a p-value and by a graphical interpretation which shows which subtest, or which distances of the used test function(s), lead to the rejection at the prescribed significance level of the test. Examples of null hypotheses from point process and random set statistics are used to demonstrate the strength of the rank envelope test. The examples include a goodness-of-fit test with several test functions, a goodness-of-fit test ...
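
    A minimal sketch of the global rank idea follows, under simplifying assumptions: no tie handling and a single conservative p-value rather than the liberal/conservative interval of the actual rank envelope test. The simulated test functions are synthetic stand-ins for, e.g., estimated summary functions of a point process.

    ```python
    import numpy as np

    # Global rank test sketch: rank the observed test function among simulated ones,
    # pointwise, and use the most extreme (two-sided) pointwise rank per curve.
    def extreme_ranks(T):
        """T: (s+1, m) array of test functions; row 0 is the observed one."""
        n = T.shape[0]
        r_lo = T.argsort(axis=0).argsort(axis=0) + 1   # 1 = smallest value at that point
        r_hi = n + 1 - r_lo                            # 1 = largest value at that point
        return np.minimum(r_lo, r_hi).min(axis=1)      # most extreme rank along the curve

    def rank_envelope_pvalue(T_obs, T_sim):
        R = extreme_ranks(np.vstack([T_obs, T_sim]))
        return np.mean(R <= R[0])   # curves at least as extreme as the observed one

    rng = np.random.default_rng(1)
    T_sim = rng.normal(size=(999, 50))     # test functions simulated under H0
    T_obs = rng.normal(size=50) + 0.8      # observed function, shifted away from H0
    print("global p-value ~", rank_envelope_pvalue(T_obs, T_sim))
    ```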

  9. Estimates of statistical significance for comparison of individual positions in multiple sequence alignments

    Directory of Open Access Journals (Sweden)

    Sadreyev Ruslan I

    2004-08-01

    Background: Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) an MSA position and a set of predicted residue frequencies, and (2) two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families. Results: For problems (1) and (2), we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a) We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b) We evaluate our method by its ability to detect erroneous position matches produced by an automatic sequence aligner. (c) We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d) We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. Conclusion: The proposed computational method is of significant potential value for the analysis of protein families.

  10. Drivers and seasonal predictability of extreme wind speeds in the ECMWF System 4 and a statistical model

    Science.gov (United States)

    Walz, M. A.; Donat, M.; Leckebusch, G. C.

    2017-12-01

    As extreme wind speeds are responsible for large socio-economic losses in Europe, a skillful prediction would be of great benefit for disaster prevention as well as for the actuarial community. Here we evaluate patterns of large-scale atmospheric variability and the seasonal predictability of extreme wind speeds (e.g. >95th percentile) in the European domain in the dynamical seasonal forecast system ECMWF System 4, and compare to the predictability based on a statistical prediction model. The dominant patterns of atmospheric variability show distinct differences between reanalysis and ECMWF System 4, with most patterns in System 4 extended downstream in comparison to ERA-Interim. The dissimilar manifestations of the patterns within the two models lead to substantially different drivers associated with the occurrence of extreme winds in the respective model. While the ECMWF System 4 is shown to provide some predictive power over Scandinavia and the eastern Atlantic, only very few grid cells in the European domain have significant correlations for extreme wind speeds in System 4 compared to ERA-Interim. In contrast, a statistical model predicts extreme wind speeds during boreal winter in better agreement with the observations. Our results suggest that System 4 does not seem to capture the potential predictability of extreme winds that exists in the real world, and therefore fails to provide reliable seasonal predictions for lead months 2-4. This is likely related to the unrealistic representation of large-scale patterns of atmospheric variability. Hence our study points to potential improvements of dynamical prediction skill by improving the simulation of large-scale atmospheric dynamics.

  11. Joint Clustering and Component Analysis of Correspondenceless Point Sets: Application to Cardiac Statistical Modeling.

    Science.gov (United States)

    Gooya, Ali; Lekadir, Karim; Alba, Xenia; Swift, Andrew J; Wild, Jim M; Frangi, Alejandro F

    2015-01-01

    Construction of Statistical Shape Models (SSMs) from arbitrary point sets is a challenging problem due to significant shape variation and lack of explicit point correspondence across the training data set. In medical imaging, point sets can generally represent different shape classes that span healthy and pathological exemplars. In such cases, the constructed SSM may not generalize well, largely because the probability density function (pdf) of the point sets deviates from the underlying assumption of Gaussian statistics. To this end, we propose a generative model for unsupervised learning of the pdf of point sets as a mixture of distinctive classes. A Variational Bayesian (VB) method is proposed for making joint inferences on the labels of point sets, and the principal modes of variations in each cluster. The method provides a flexible framework to handle point sets with no explicit point-to-point correspondences. We also show that by maximizing the marginalized likelihood of the model, the optimal number of clusters of point sets can be determined. We illustrate this work in the context of understanding the anatomical phenotype of the left and right ventricles in the heart. To this end, we use a database containing hearts of healthy subjects, patients with Pulmonary Hypertension (PH), and patients with Hypertrophic Cardiomyopathy (HCM). We demonstrate that our method can outperform traditional PCA in both generalization and specificity measures.

  12. Statistical γ-ray multiplicity distributions in Dy and Yb nuclei

    International Nuclear Information System (INIS)

    Tveter, T.S.; Bergholt, L.; Guttormsen, M.; Rekstad, J.

    1994-03-01

    The statistical γ-ray multiplicity distributions following the reactions ¹⁶³Dy(³He,αxn)¹⁶²⁻ˣDy and ¹⁷³Yb(³He,αxn)¹⁷²⁻ˣYb have been studied. The mean value and standard deviation have been extracted as functions of excitation energy. The method is based on the probability distribution of k-fold events, where an α-particle is observed in coincidence with signals in k γ-ray detectors. Techniques for isolating statistical γ-rays and subtracting random background, cross-talk and neutron contributions are discussed. 22 refs., 10 figs., 3 tabs.

  13. Evaluation of multiple protein docking structures using correctly predicted pairwise subunits

    Directory of Open Access Journals (Sweden)

    Esquivel-Rodríguez Juan

    2012-03-01

    Background: Many functionally important proteins in a cell form complexes with multiple chains. Therefore, computational prediction of multiple protein complexes is an important task in bioinformatics. In the development of multiple protein docking methods, it is important to establish a metric for evaluating prediction results in a reasonable and practical fashion. However, since only a few works have been done on developing methods for multiple protein docking, there is no study that investigates how accurate structural models of multiple protein complexes should be to allow scientists to gain biological insights. Methods: We generated a series of predicted models (decoys) of various accuracies by our multiple protein docking pipeline, Multi-LZerD, for three multi-chain complexes with 3, 4, and 6 chains. We analyzed the decoys in terms of the number of correctly predicted pair conformations in the decoys. Results and conclusion: We found that pairs of chains with the correct mutual orientation exist even in decoys with a large overall root mean square deviation (RMSD) to the native structure. Therefore, in addition to a global structure similarity measure, such as the global RMSD, the quality of models of multiple-chain complexes can be better evaluated by using a local measurement, the number of chain pairs with correct mutual orientation. We termed the fraction of correctly predicted pairs (RMSD at the interface of less than 4.0 Å) fpair and propose to use it for evaluating the accuracy of multiple protein docking.

  14. Statistical theory of dislocation configurations in a random array of point obstacles

    International Nuclear Information System (INIS)

    Labusch, R.

    1977-01-01

    The stable configurations of a dislocation in an infinite random array of point obstacles are analyzed using the mathematical methods of statistical mechanics. The theory provides exact distribution functions of the forces on pinning points and of the link lengths between points on the line. The expected number of stable configurations is a function of the applied stress. This number drops to zero at the critical stress. Due to a degeneracy problem in the line count, the value of the flow stress cannot be determined rigorously, but we can give a good approximation that is very close to the empirical value

  15. Statistical Diagnosis Method of Conductor Motions in Superconducting Magnets to Predict their Quench Performance

    CERN Document Server

    Khomenko, B A; Rijllart, A; Sanfilippo, S; Siemko, A

    2001-01-01

    Premature training quenches are usually caused by the transient energy released within the magnet coil as it is energised. Two distinct varieties of disturbances exist; they are thought to be electrical and mechanical in origin. The first type of disturbance comes from non-uniform current distribution in superconducting cables, whereas the second usually originates from conductor motions or micro-fractures of insulating materials under the action of Lorentz forces. In general, all of these mechanical events produce a rapid variation of the voltages in the so-called quench antennas and across the magnet coil, called spikes. A statistical method to treat the spatial localisation and the time occurrence of spikes will be presented. It allows identification of the mechanically weak points in the magnet without the need to increase the current to provoke a quench. The prediction of the quench level from detailed analysis of the spike statistics can be expected.

  16. Statistical Analysis of Coherent Ultrashort Light Pulse CDMA With Multiple Optical Amplifiers Using Additive Noise Model

    Science.gov (United States)

    Jamshidi, Kambiz; Salehi, Jawad A.

    2005-05-01

    This paper describes a study of the performance of various configurations for placing multiple optical amplifiers in a typical coherent ultrashort light pulse code-division multiple access (CULP-CDMA) communication system using the additive noise model. For this study, a comprehensive performance analysis was developed that takes into account multiple-access noise, noise due to optical amplifiers, and thermal noise using the saddle-point approximation technique. Prior to obtaining the overall system performance, the input/output statistical models for different elements of the system, such as encoders/decoders, star coupler, and optical amplifiers, were obtained. Performance comparisons between an ideal and lossless quantum-limited case and a typical CULP-CDMA with various losses exhibit more than 30 dB more power requirement to obtain the same bit-error rate (BER). Considering the saturation effect of optical amplifiers, this paper discusses an algorithm for amplifiers' gain setting in various stages of the network in order to overcome the nonlinear effects on signal modulation in optical amplifiers. Finally, using this algorithm, various configurations of multiple optical amplifiers in CULP-CDMA are discussed and the rules for the required optimum number of amplifiers are shown with their corresponding optimum locations to be implemented along the CULP-CDMA system.

  17. Recall of Point-of-Sale Marketing Predicts Cigar and E-Cigarette Use among Texas Youth.

    Science.gov (United States)

    Pasch, Keryn E; Nicksic, Nicole E; Opara, Samuel C; Jackson, Christian; Harrell, Melissa B; Perry, Cheryl L

    2017-10-23

    While research has documented associations between recall of point-of-sale tobacco marketing and youth tobacco use, much of the research is cross-sectional and focused on cigarettes. The present longitudinal study examined recall of tobacco marketing at the point-of-sale and multiple types of tobacco use six months later. The Texas Adolescent Tobacco Advertising and Marketing Surveillance System (TATAMS) is a large-scale, representative study of 6th, 8th, and 10th graders in 79 middle and high schools in five counties in Texas. Weighted logistic regression examined associations between recall of tobacco advertisements and products on display at baseline and ever use, current use, and susceptibility to use for cigarette, e-cigarette, cigar, and smokeless products six months later. Students' recall of signs marketing e-cigarettes at baseline predicted ever e-cigarette use and increased susceptibility to use e-cigarettes at follow-up across all store types. Recall of e-cigarette displays only predicted susceptibility to use e-cigarettes at follow-up, across all store types. Both recall of signs marketing cigars and cigar product displays predicted current and ever cigar smoking and increased susceptibility to smoking cigars at follow-up, across all store types. Recall of cigarette and smokeless product marketing and displays was not associated with tobacco use measures. The point-of-sale environment continues to be an important influence on youth tobacco use. Restrictions on point-of-sale marketing, particularly around schools, are warranted. Cross-sectional studies have shown that exposure to point-of-sale cigarette marketing is associated with use of cigarettes among youth, though longitudinal evidence of the same is sparse and mixed. Cross-sectional studies have found that recall of cigar, smokeless product, and e-cigarette tobacco marketing at point-of-sale is associated with curiosity about tobacco use or intentions to use tobacco among youth, but limited ...

  18. Vitamin D Levels Predict Multiple Sclerosis Progression

    Science.gov (United States)

    NIH Research Matters, February 3, 2014. Among people with multiple sclerosis (MS), those with higher blood levels of vitamin D had better outcomes during 5 years of ...

  19. Assessment of saddle-point-mass predictions for astrophysical applications

    Energy Technology Data Exchange (ETDEWEB)

    Kelic, A.; Schmidt, K.H.

    2005-07-01

    Using available experimental data on fission barriers and ground-state masses, a detailed study on the predictions of different models concerning the isospin dependence of saddle-point masses is performed. Evidence is found that several macroscopic models yield unrealistic saddle-point masses for very neutron-rich nuclei, which are relevant for the r-process nucleosynthesis. (orig.)

  20. Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation.

    Science.gov (United States)

    Pearce, Marcus T

    2018-05-11

    Music perception depends on internal psychological models derived through exposure to a musical culture. It is hypothesized that this musical enculturation depends on two cognitive processes: (1) statistical learning, in which listeners acquire internal cognitive models of statistical regularities present in the music to which they are exposed; and (2) probabilistic prediction based on these learned models that enables listeners to organize and process their mental representations of music. To corroborate these hypotheses, I review research that uses a computational model of probabilistic prediction based on statistical learning (the information dynamics of music (IDyOM) model) to simulate data from empirical studies of human listeners. The results show that a broad range of psychological processes involved in music perception-expectation, emotion, memory, similarity, segmentation, and meter-can be understood in terms of a single, underlying process of probabilistic prediction using learned statistical models. Furthermore, IDyOM simulations of listeners from different musical cultures demonstrate that statistical learning can plausibly predict causal effects of differential cultural exposure to musical styles, providing a quantitative model of cultural distance. Understanding the neural basis of musical enculturation will benefit from close coordination between empirical neuroimaging and computational modeling of underlying mechanisms, as outlined here. © 2018 The Authors. Annals of the New York Academy of Sciences published by Wiley Periodicals, Inc. on behalf of New York Academy of Sciences.
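
    IDyOM itself uses variable-order Markov models over multiple musical viewpoints; as a hedged, first-order stand-in, the sketch below estimates a bigram model from toy note sequences and reports the information content (negative log probability) of possible continuations, the quantity that drives expectation in this framework. The corpus and the add-one smoothing are illustrative choices, not the model's actual configuration.

    ```python
    import math
    from collections import defaultdict

    def train_bigram(sequences):
        """Count first-order transitions between events (e.g., pitches)."""
        counts = defaultdict(lambda: defaultdict(int))
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                counts[a][b] += 1
        return counts

    def information_content(counts, context, event, alphabet):
        """IC = -log2 P(event | context), with add-one smoothing over the alphabet."""
        total = sum(counts[context].values()) + len(alphabet)
        p = (counts[context][event] + 1) / total
        return -math.log2(p)

    corpus = [["C", "D", "E", "C"], ["C", "D", "E", "G"], ["C", "D", "E", "E"]]
    model = train_bigram(corpus)
    alphabet = {note for seq in corpus for note in seq}
    for nxt in sorted(alphabet):   # expectedness of each continuation after "E"
        print(f"IC(E -> {nxt}) = {information_content(model, 'E', nxt, alphabet):.2f} bits")
    ```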

  1. Multiple Suboptimal Solutions for Prediction Rules in Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Osamu Komori

    2013-01-01

    This paper discusses mathematical and statistical aspects of analysis methods applied to microarray gene expressions. We focus on pattern recognition to extract informative features embedded in the data for prediction of phenotypes. It has been pointed out that there are severely difficult problems due to the imbalance between the number of observed genes and the number of observed subjects. We reanalyze published microarray gene expression data and detect many other gene sets with almost the same performance. We conclude that, at the current stage, it is not possible to extract only informative genes with high performance from among all the observed genes. We investigate the reason why this difficulty still exists even though analysis methods and learning algorithms are actively being proposed in statistical machine learning. We focus on the mutual coherence, or the absolute value of the Pearson correlation between two genes, and describe the distributions of the correlation for the selected set of genes and for the total set. We show that the problem of finding informative genes in high-dimensional data is ill-posed and that the difficulty is closely related to the mutual coherence.

  2. Full counting statistics of multiple Andreev reflections in incoherent diffusive superconducting junctions

    International Nuclear Information System (INIS)

    Samuelsson, P.

    2007-01-01

    We present a theory for the full distribution of current fluctuations in incoherent diffusive superconducting junctions, subjected to a voltage bias. This theory of full counting statistics of incoherent multiple Andreev reflections is valid for an arbitrary applied voltage. We present a detailed discussion of the properties of the first four cumulants as well as the low and high voltage regimes of the full counting statistics. (orig.)

  3. Analysis and prediction of Multiple-Site Damage (MSD) fatigue crack growth

    Science.gov (United States)

    Dawicke, D. S.; Newman, J. C., Jr.

    1992-08-01

    A technique was developed to calculate the stress intensity factor for multiple interacting cracks. The analysis was verified through comparison with accepted methods of calculating stress intensity factors. The technique was incorporated into a fatigue crack growth prediction model and used to predict the fatigue crack growth life for multiple-site damage (MSD). The analysis was verified through comparison with experiments conducted on uniaxially loaded flat panels with multiple cracks. Configurations with nearly equal and unequal crack distributions were examined. The fatigue crack growth predictions agreed within 20 percent of the experimental lives for all crack configurations considered.

  5. A neighborhood statistics model for predicting stream pathogen indicator levels.

    Science.gov (United States)

    Pandey, Pramod K; Pasternack, Gregory B; Majumder, Mahbubul; Soupir, Michelle L; Kaiser, Mark S

    2015-03-01

    Because elevated levels of water-borne Escherichia coli in streams are a leading cause of water quality impairments in the U.S., water-quality managers need tools for predicting aqueous E. coli levels. Presently, E. coli levels may be predicted using complex mechanistic models that have a high degree of unchecked uncertainty or simpler statistical models. To assess spatio-temporal patterns of instream E. coli levels, herein we measured E. coli, a pathogen indicator, at 16 sites (at four different times) within the Squaw Creek watershed, Iowa, and subsequently, the Markov Random Field model was exploited to develop a neighborhood statistics model for predicting instream E. coli levels. Two observed covariates, local water temperature (degrees Celsius) and mean cross-sectional depth (meters), were used as inputs to the model. Predictions of E. coli levels in the water column were compared with independent observational data collected from 16 in-stream locations. The results revealed that spatio-temporal averages of predicted and observed E. coli levels were extremely close. Approximately 66 % of individual predicted E. coli concentrations were within a factor of 2 of the observed values. In only one event, the difference between prediction and observation was beyond one order of magnitude. The mean of all predicted values at 16 locations was approximately 1 % higher than the mean of the observed values. The approach presented here will be useful while assessing instream contaminations such as pathogen/pathogen indicator levels at the watershed scale.
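
    The full Markov Random Field formulation is not reproduced in the abstract, but its neighborhood idea can be sketched: regress each site's (log) E. coli level on the two observed covariates plus the mean level of its neighboring sites along the stream. Everything below (the chain neighborhood, the synthetic temperatures, depths and levels) is hypothetical and only illustrates the structure of such a model.

    ```python
    import numpy as np

    # Sketch of a neighborhood-statistics predictor: each site's log10 E. coli level
    # is regressed on local covariates plus the mean level of its stream neighbors.
    def fit_neighborhood_model(y, X, neighbors):
        nbr_mean = np.array([y[list(nb)].mean() for nb in neighbors])
        A = np.column_stack([np.ones_like(y), X, nbr_mean])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        return beta

    def predict(beta, x_site, nbr_values):
        return beta[0] + x_site @ beta[1:-1] + beta[-1] * np.mean(nbr_values)

    rng = np.random.default_rng(2)
    n = 16                           # 16 monitoring sites, as in the study
    temp = rng.normal(20, 3, n)      # water temperature (deg C), synthetic
    depth = rng.normal(0.5, 0.1, n)  # mean cross-sectional depth (m), synthetic
    X = np.column_stack([temp, depth])
    y = 2.0 + 0.05 * temp - 1.0 * depth + rng.normal(0, 0.2, n)  # synthetic log10 levels
    neighbors = [tuple(j for j in (i - 1, i + 1) if 0 <= j < n) for i in range(n)]
    beta = fit_neighborhood_model(y, X, neighbors)
    print("coefficients [intercept, temp, depth, neighbor]:", np.round(beta, 3))
    print("prediction at site 5:", round(predict(beta, X[5], y[[4, 6]]), 3))
    ```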

  6. Prediction of the Flash Point of Binary and Ternary Straight-Chain Alkane Mixtures

    Directory of Open Access Journals (Sweden)

    X. Li

    2014-01-01

    The flash point is an important physical property used to estimate the fire hazard of a flammable liquid. To avoid the occurrence of fire or explosion, many models are used to predict the flash point; however, these models are complex, and the calculation process is cumbersome. For pure flammable substances, research on predicting the flash point is systematic and comprehensive. For multicomponent mixtures, especially hydrocarbon mixtures, the current research is insufficient to predict the flash point. In this study, a model was developed to predict the flash point of straight-chain alkane mixtures using a simple calculation process. The pressure, activity coefficients, and other associated physicochemical parameters are not required for the calculation in the proposed model. A series of flash points of binary and ternary mixtures of straight-chain alkanes were determined. The model results are consistent with the experimental results, with an average absolute deviation of 0.7% or lower for the binary mixtures and 1.03% or lower for the ternary mixtures.

  7. EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to Improve Prediction Accuracy.

    Science.gov (United States)

    Zhou, Xianxiao; Wang, Minghui; Katsyv, Igor; Irie, Hanna; Zhang, Bin

    2018-04-24

    Availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications of existing drugs. However, the prediction accuracy of existing methods, such as the Kolmogorov-Smirnov statistic, remains low. Here we present a novel high-performance drug repositioning approach that improves over the state-of-the-art methods. We first designed an expression-weighted cosine method (EWCos) to minimize the influence of uninformative expression changes and then developed an ensemble approach termed EMUDRA (Ensemble of Multiple Drug Repositioning Approaches) to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. Using EMUDRA, we predicted, and experimentally validated, the antibiotic rifabutin as an inhibitor of cell growth in triple-negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs. The EMUDRA R package is available at doi:10.7303/syn11510888. bin.zhang@mssm.edu or zhangb@hotmail.com. Supplementary data are available at Bioinformatics online.
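
    The exact weighting scheme of EWCos is not given in the abstract, so the sketch below only illustrates the general idea of an expression-weighted cosine: down-weight genes with small, uninformative changes and score how strongly a drug signature reverses a disease signature (a strongly negative score suggests a repositioning candidate). The weighting by magnitude of disease change and all data are assumptions for the example.

    ```python
    import numpy as np

    def weighted_cosine(drug_sig, disease_sig, weights):
        """Expression-weighted cosine between two log-fold-change signatures.
        A strongly negative score means the drug tends to reverse the disease."""
        a, b = weights * drug_sig, weights * disease_sig
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    rng = np.random.default_rng(3)
    genes = 500
    disease = rng.normal(size=genes)                      # disease signature (toy)
    drug = -disease + rng.normal(scale=0.8, size=genes)   # partially reversing drug (toy)
    w = np.abs(disease) / np.abs(disease).max()           # down-weight small changes
    print("weighted cosine score:", round(weighted_cosine(drug, disease, w), 3))
    ```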

  8. Error analysis of dimensionless scaling experiments with multiple points using linear regression

    International Nuclear Information System (INIS)

    Guercan, Oe.D.; Vermare, L.; Hennequin, P.; Bourdelle, C.

    2010-01-01

    A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)
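
    The flavor of this result can be reproduced with ordinary least squares: the standard error of the fitted scaling exponent follows from the parameter covariance matrix and shrinks as points are added to the scan. This is a generic OLS sketch with invented scan data, not the letter's or Cordey's exact formula.

    ```python
    import numpy as np

    # OLS fit of a multi-point dimensionless scan; the standard error of the
    # scaling exponent comes from the parameter covariance matrix.
    def scaling_exponent(x, y):
        n = len(x)
        A = np.column_stack([np.ones(n), x])
        coef, res, *_ = np.linalg.lstsq(A, y, rcond=None)
        s2 = res[0] / (n - 2)               # residual variance
        cov = s2 * np.linalg.inv(A.T @ A)   # covariance of [intercept, slope]
        return coef[1], np.sqrt(cov[1, 1])

    rng = np.random.default_rng(4)
    x = np.linspace(0.0, 1.0, 6)                # scanned dimensionless parameter (toy)
    y = 0.7 * x + 0.1 + rng.normal(0, 0.05, 6)  # measured response with scatter (toy)
    slope, err = scaling_exponent(x, y)
    print(f"exponent = {slope:.3f} +/- {err:.3f}")  # err shrinks roughly as 1/sqrt(n)
    ```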

  9. Active control on high-order coherence and statistic characterization on random phase fluctuation of two classical point sources.

    Science.gov (United States)

    Hong, Peilong; Li, Liming; Liu, Jianji; Zhang, Guoquan

    2016-03-29

    Young's double-slit or two-beam interference is of fundamental importance for understanding various interference effects, in which the stationary phase difference between two beams plays the key role in the first-order coherence. Different from the case of first-order coherence, in high-order optical coherence the statistical behavior of the optical phase plays the key role. In this article, by employing a fundamental interfering configuration with two classical point sources, we show that the high-order optical coherence between two classical point sources can be actively designed by controlling the statistical behavior of the relative phase difference between the two point sources. Synchronous-position Nth-order subwavelength interference with an effective wavelength of λ/M was demonstrated, in which λ is the wavelength of the point sources and M is an integer not larger than N. Interestingly, we found that the synchronous-position Nth-order interference fringe fingerprints the statistical trace of the random phase fluctuation of the two classical point sources; therefore, it provides an effective way to characterize the statistical properties of phase fluctuation for incoherent light sources.

  10. Mirrored pyramidal wells for simultaneous multiple vantage point microscopy.

    Science.gov (United States)

    Seale, K T; Reiserer, R S; Markov, D A; Ges, I A; Wright, C; Janetopoulos, C; Wikswo, J P

    2008-10-01

    We report a novel method for obtaining simultaneous images from multiple vantage points of a microscopic specimen using size-matched microscopic mirrors created from anisotropically etched silicon. The resulting pyramidal wells enable bright-field and fluorescent side-view images, and when combined with z-sectioning, provide additional information for 3D reconstructions of the specimen. We have demonstrated the 3D localization and tracking over time of the centrosome of a live Dictyostelium discoideum. The simultaneous acquisition of images from multiple perspectives also provides a five-fold increase in the theoretical collection efficiency of emitted photons, a property which may be useful for low-light imaging modalities such as bioluminescence, or low abundance surface-marker labelling.

  11. Effects of (α,n) contaminants and sample multiplication on statistical neutron correlation measurements

    International Nuclear Information System (INIS)

    Dowdy, E.J.; Hansen, G.E.; Robba, A.A.; Pratt, J.C.

    1980-01-01

    The complete formalism for the use of statistical neutron fluctuation measurements for the nondestructive assay of fissionable materials has been developed. This formalism includes the effect of detector deadtime, neutron multiplicity, random neutron pulse contributions from (α,n) contaminants in the sample, and the sample multiplication of both fission-related and background neutrons

  12. Prediction of boiling points of organic compounds by QSPR tools.

    Science.gov (United States)

    Dai, Yi-min; Zhu, Zhi-ping; Cao, Zhong; Zhang, Yue-fei; Zeng, Ju-lan; Li, Xun

    2013-07-01

    The novel electro-negativity topological descriptors YC and WC were derived from molecular structure using the equilibrium electro-negativity of atoms and the relative bond lengths of the molecule. Quantitative structure-property relationships (QSPR) between the descriptors YC and WC, together with the path number parameter P3, and the normal boiling points of 80 alkanes, 65 unsaturated hydrocarbons and 70 alcohols were obtained separately. The high quality of the prediction models was evidenced by the coefficient of determination (R²), the standard error (S), the average absolute error (AAE) and the predictive parameters (Q²ext, R²CV, R²m). According to the regression equations, the influences of the length of the carbon backbone, the size and degree of branching of a molecule, and the role of functional groups on the normal boiling point were analyzed. Comparison with reference models demonstrated that the novel topological descriptors based on the equilibrium electro-negativity of atoms and relative bond lengths are useful molecular descriptors for predicting the normal boiling points of organic compounds. Copyright © 2013 Elsevier Inc. All rights reserved.
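
    A QSPR model of this type is, at its core, a multiple linear regression of boiling point on descriptor values, judged by R² and the average absolute error. The sketch below uses random stand-ins for the YC, WC and P3 descriptors (the actual descriptor definitions are not reproduced here), so the fitted coefficients are illustrative only.

    ```python
    import numpy as np

    def fit_qspr(D, bp):
        """Least-squares fit of normal boiling point against topological descriptors."""
        A = np.column_stack([np.ones(len(bp)), D])
        coef, *_ = np.linalg.lstsq(A, bp, rcond=None)
        pred = A @ coef
        r2 = 1 - np.sum((bp - pred) ** 2) / np.sum((bp - bp.mean()) ** 2)
        aae = np.mean(np.abs(bp - pred))    # average absolute error
        return coef, r2, aae

    rng = np.random.default_rng(5)
    n = 80                        # e.g., the 80 alkanes
    D = rng.normal(size=(n, 3))   # random stand-ins for YC, WC, P3
    bp = 350 + 40 * D[:, 0] + 15 * D[:, 1] + 5 * D[:, 2] + rng.normal(0, 4, n)
    coef, r2, aae = fit_qspr(D, bp)
    print(f"R^2 = {r2:.3f}, AAE = {aae:.2f} K")
    ```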

  13. Nonparametric Bayesian predictive distributions for future order statistics

    Science.gov (United States)

    Richard A. Johnson; James W. Evans; David W. Green

    1999-01-01

    We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...

  14. Statistical Multiplicity in Systematic Reviews of Anaesthesia Interventions: A Quantification and Comparison between Cochrane and Non-Cochrane Reviews

    DEFF Research Database (Denmark)

    Imberger, Georgina; Vejlby, Alexandra Hedvig Damgaard; Hansen, Sara Bohnstedt

    2011-01-01

    Systematic reviews with meta-analyses often contain many statistical tests. This multiplicity may increase the risk of type I error. Few attempts have been made to address the problem of statistical multiplicity in systematic reviews. Before the implications are properly considered, the size of the issue deserves clarification. Because of the emphasis on bias evaluation and because of the editorial processes involved, Cochrane reviews may contain more multiplicity than their non-Cochrane counterparts. This study measured the quantity of statistical multiplicity present in a population of systematic reviews and aimed to assess whether this quantity is different in Cochrane and non-Cochrane reviews.

  15. Systematic prediction of new ferroelectric inorganic materials in point group 6

    International Nuclear Information System (INIS)

    Abrahams, S.C.

    1990-01-01

    A total of seven new families and sixteen structurally different inorganic materials with point group 6 are shown to satisfy the criteria presented previously by the present author for predicting ferroelectricity. If each prediction is experimentally verified, the 183 individual entries for point group 6 listed in the Inorganic Crystal Structure Database will result in over 80 new ferroelectrics, of which about 30 are rare-earth isomorphs. The total number of 'pure' ...

  16. A feature point identification method for positron emission particle tracking with multiple tracers

    Energy Technology Data Exchange (ETDEWEB)

    Wiggins, Cody, E-mail: cwiggin2@vols.utk.edu [University of Tennessee-Knoxville, Department of Physics and Astronomy, 1408 Circle Drive, Knoxville, TN 37996 (United States); Santos, Roque [University of Tennessee-Knoxville, Department of Nuclear Engineering (United States); Escuela Politécnica Nacional, Departamento de Ciencias Nucleares (Ecuador); Ruggles, Arthur [University of Tennessee-Knoxville, Department of Nuclear Engineering (United States)

    2017-01-21

    A novel detection algorithm for Positron Emission Particle Tracking (PEPT) with multiple tracers based on optical feature point identification (FPI) methods is presented. This new method, the FPI method, is compared to a previous multiple PEPT method via analyses of experimental and simulated data. The FPI method outperforms the older method in cases of large particle numbers and fine time resolution. Simulated data show the FPI method to be capable of identifying 100 particles at 0.5 mm average spatial error. Detection error is seen to vary with the inverse square root of the number of lines of response (LORs) used for detection and increases as particle separation decreases. - Highlights: • A new approach to positron emission particle tracking is presented. • Using optical feature point identification analogs, multiple particle tracking is achieved. • Method is compared to previous multiple particle method. • Accuracy and applicability of method is explored.

  17. Brain atrophy and lesion load predict long term disability in multiple sclerosis

    DEFF Research Database (Denmark)

    Popescu, Veronica; Agosta, Federica; Hulst, Hanneke E

    2013-01-01

    To determine whether brain atrophy and lesion volumes predict subsequent 10 year clinical evolution in multiple sclerosis (MS).

  18. Systems near a critical point under multiplicative noise and the concept of effective potential

    Science.gov (United States)

    Shapiro, V. E.

    1993-07-01

    This paper presents a general approach to, and elucidates the main features of, the effective potential, friction, and diffusion exerted on systems near a critical point by the nonlinear influence of noise. The model is that of a general many-dimensional system of coupled nonlinear oscillators of finite damping under frequently alternating influences, multiplicative or additive, with an arbitrary form of the power spectrum, provided the time scales of the system's drift due to noise are large compared to the scales of unperturbed relaxation behavior. The conventional statistical approach and the widespread deterministic effective potential concept rely on small-parameter assumptions that are particular cases of those considered here. We show a close correspondence between the asymptotic methods of these approaches and base the analysis on this. The results include an analytical treatment of the system's long-time behavior as a function of the noise, covering the full range of its table- and bell-shaped spectra, from the monochromatic limit to white noise. The trend is considered both in coordinate-momentum space and in the coordinate space of the system. Particular attention is paid to the stabilization behavior forced by multiplicative noise. Intermittency, over a broad area of the control parameter space, is shown to be an intrinsic feature of these phenomena.

  19. Prediction of failure enthalpy and reliability of irradiated fuel rod under reactivity-initiated accidents by means of statistical approach

    International Nuclear Information System (INIS)

    Nam, Cheol; Choi, Byeong Kwon; Jeong, Yong Hwan; Jung, Youn Ho

    2001-01-01

    During the last decade, the failure behavior of high-burnup fuel rods under RIA has been of extensive concern since fuel rod failures were observed at low enthalpy. Great importance is placed on fuel rod failure prediction from the standpoint of licensing criteria and safety as burnup is extended. To address the issue, a statistics-based methodology is introduced to predict the failure probability of irradiated fuel rods. Based on RIA simulation results in the literature, a failure enthalpy correlation for irradiated fuel rods is constructed as a function of oxide thickness, fuel burnup, and pulse width. From the failure enthalpy correlation, a single damage parameter, the equivalent enthalpy, is defined to reflect the effects of the three primary factors as well as the peak fuel enthalpy. Moreover, the failure distribution function of equivalent enthalpy is derived by applying a two-parameter Weibull statistical model. Using these equations, a sensitivity analysis is carried out to estimate the effects of burnup, corrosion, peak fuel enthalpy, pulse width and cladding materials used
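
    As a rough illustration of the Weibull step described above, the sketch below evaluates a two-parameter Weibull failure distribution over a range of equivalent enthalpies. The scale and shape values are hypothetical placeholders, not the constants fitted in the paper.

    ```python
    import math

    def failure_probability(equiv_enthalpy_cal_g, scale=150.0, shape=4.0):
        """Two-parameter Weibull failure distribution, F(H) = 1 - exp[-(H/eta)^beta].

        `scale` (eta, cal/g) and `shape` (beta) are hypothetical values chosen
        for illustration, not the fitted constants from the paper.
        """
        h = equiv_enthalpy_cal_g
        return 1.0 - math.exp(-((h / scale) ** shape))

    # Sensitivity sweep over equivalent enthalpy for a fixed rod state.
    for h in (60, 100, 140, 180):
        print(f"equivalent enthalpy {h:3d} cal/g -> failure prob {failure_probability(h):.3f}")
    ```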

  20. Sensitivity study of experimental measures for the nuclear liquid-gas phase transition in the statistical multifragmentation model

    Science.gov (United States)

    Lin, W.; Ren, P.; Zheng, H.; Liu, X.; Huang, M.; Wada, R.; Qu, G.

    2018-05-01

    The experimental measures of the multiplicity derivatives, namely the moment parameters, the bimodal parameter, the fluctuation of the maximum fragment charge number (normalized variance of Zmax, or NVZ), the Fisher exponent (τ), and the Zipf law parameter (ξ), are examined to search for the liquid-gas phase transition in nuclear multifragmentation processes within the framework of the statistical multifragmentation model (SMM). The sensitivities of these measures are studied. All these measures predict a critical signature at or near the critical point, both for the primary and the secondary fragments. Among these measures, the total multiplicity derivative and the NVZ provide accurate measures of the critical point from the final cold fragments as well as from the primary fragments. The present study will provide a guide for future experiments and analyses in the study of the nuclear liquid-gas phase transition.

  1. Shape-correlated deformation statistics for respiratory motion prediction in 4D lung

    Science.gov (United States)

    Liu, Xiaoxiao; Oguz, Ipek; Pizer, Stephen M.; Mageras, Gig S.

    2010-02-01

    4D image-guided radiation therapy (IGRT) for free-breathing lungs is challenging due to the complicated respiratory dynamics. Effective modeling of respiratory motion is crucial to account for the motion effects on the dose to tumors. We propose a shape-correlated statistical model on dense image deformations for patient-specific respiratory motion estimation in 4D lung IGRT. Using the shape deformations of the high-contrast lungs as the surrogate, the statistical model trained from the planning CTs can be used to predict the image deformation at delivery verification time, under the assumption that the respiratory motion at both times is similar for the same patient. Dense image deformation fields obtained by diffeomorphic image registrations characterize the respiratory motion within one breathing cycle. A point-based particle optimization algorithm is used to obtain the shape models of lungs with group-wise surface correspondences. Canonical correlation analysis (CCA) is adopted in training to maximize the linear correlation between the shape variations of the lungs and the corresponding dense image deformations. Both intra- and inter-session CT studies are carried out on a small group of lung cancer patients and evaluated in terms of tumor location accuracy. The results suggest potential applications of the proposed method.
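
    The CCA training step can be sketched as follows, assuming the lung shapes and deformation fields have already been reduced to coefficient vectors (replaced here by synthetic arrays with toy dimensions); scikit-learn's CCA learns the correlated subspace and maps a new shape surrogate to a predicted deformation.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n_phases = 10                   # breathing phases from the planning 4D CT
    shape_dim, deform_dim = 6, 12   # toy dimensions; real models are far larger

    # Synthetic stand-ins for reduced lung-shape and deformation coefficients.
    shapes = rng.normal(size=(n_phases, shape_dim))
    deformations = (shapes @ rng.normal(size=(shape_dim, deform_dim))
                    + 0.05 * rng.normal(size=(n_phases, deform_dim)))

    cca = CCA(n_components=3)       # maximize linear shape/deformation correlation
    cca.fit(shapes, deformations)

    new_shape = rng.normal(size=(1, shape_dim))   # surrogate observed at treatment time
    predicted_deformation = cca.predict(new_shape)
    print(predicted_deformation.shape)            # (1, deform_dim)
    ```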

  2. A note on the statistical analysis of point judgment matrices

    Directory of Open Access Journals (Sweden)

    MG Kabera

    2013-06-01

    Full Text Available The Analytic Hierarchy Process is a multicriteria decision making technique developed by Saaty in the 1970s. The core of the approach is the pairwise comparison of objects according to a single criterion using a 9-point ratio scale and the estimation of weights associated with these objects based on the resultant judgment matrix. In the present paper some statistical approaches to extracting the weights of objects from a judgment matrix are reviewed and new ideas which are rooted in the traditional method of paired comparisons are introduced.

  3. Toward a community ecology of landscapes: predicting multiple predator-prey interactions across geographic space.

    Science.gov (United States)

    Schmitz, Oswald J; Miller, Jennifer R B; Trainor, Anne M; Abrahms, Briana

    2017-09-01

    Community ecology was traditionally an integrative science devoted to studying interactions between species and their abiotic environments in order to predict species' geographic distributions and abundances. Yet for philosophical and methodological reasons, it has become divided into two enterprises: one devoted to local experimentation on species interactions to predict community dynamics; the other devoted to statistical analyses of abiotic and biotic information to describe geographic distribution. Our goal here is to instigate thinking about ways to reconnect the two enterprises and thereby return to a tradition of integrative science. We focus specifically on the community ecology of predators and prey, which is ripe for integration. This is because there is active, simultaneous interest in experimentally resolving the nature and strength of predator-prey interactions as well as explaining patterns across landscapes and seascapes. We begin by describing a conceptual theory rooted in classical analyses of non-spatial food web modules used to predict species interactions. We show how such modules can be extended to consideration of spatial context using the concept of habitat domain. Habitat domain describes the spatial extent of habitat space that predators and prey use while foraging, which differs from home range, the spatial extent used by an animal to meet all of its daily needs. This conceptual theory can be used to predict how different spatial relations of predators and prey could lead to different emergent multiple predator-prey interactions, such as whether predator consumptive or non-consumptive effects should dominate, and whether intraguild predation, predator interference or predator complementarity are expected. We then review the literature on studies of large predator-prey interactions that draw conclusions about the nature of multiple predator-prey interactions. This analysis reveals that while many studies provide sufficient information

  4. Prediction model for initial point of net vapor generation for low-flow boiling

    International Nuclear Information System (INIS)

    Sun Qi; Zhao Hua; Yang Ruichang

    2003-01-01

    The prediction of the initial point of net vapor generation is significant for the calculation of phase distribution in sub-cooled boiling. However, most investigations have addressed high-flow boiling, and there is no common model that can be successfully applied to low-flow boiling. A predictive model for the initial point of net vapor generation in low-flow forced convection and natural circulation is established here by analyzing evaporation and condensation heat transfer. The comparison between experimental data and calculated results shows that this model can successfully predict the net vapor generation point in low-flow sub-cooled boiling

  5. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
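
    To make the concepts concrete, the following sketch (synthetic data; variable names and thresholds are illustrative, not from the article) fits a model with an interaction term and screens the predictors for multicollinearity using variance inflation factors.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(1)
    n = 200
    age = rng.normal(60, 10, n)
    bmi = 0.2 * age + rng.normal(25, 3, n)   # correlated with age -> some multicollinearity
    outcome = 1.5 * age + 2.0 * bmi + 0.05 * age * bmi + rng.normal(0, 10, n)
    df = pd.DataFrame({"age": age, "bmi": bmi, "outcome": outcome})

    # Multiple regression with main effects and an age:bmi interaction term.
    model = smf.ols("outcome ~ age * bmi", data=df).fit()
    print(model.summary().tables[1])

    # Variance inflation factors flag multicollinearity among the predictors.
    X = model.model.exog
    for i, name in enumerate(model.model.exog_names):
        if name == "Intercept":
            continue
        print(f"VIF({name}) = {variance_inflation_factor(X, i):.1f}")
    ```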

  6. Using machine learning, neural networks and statistics to predict bankruptcy

    NARCIS (Netherlands)

    Pompe, P.P.M.; Feelders, A.J.; Feelders, A.J.

    1997-01-01

    Recent literature strongly suggests that machine learning approaches to classification outperform "classical" statistical methods. We make a comparison between the performance of linear discriminant analysis, classification trees, and neural networks in predicting corporate bankruptcy. Linear

  7. A Simple Statistical Thermodynamics Experiment

    Science.gov (United States)

    LoPresto, Michael C.

    2010-01-01

    Comparing the predicted and actual rolls of combinations of both two and three dice can help to introduce many of the basic concepts of statistical thermodynamics, including multiplicity, probability, microstates, and macrostates, and demonstrate that entropy is indeed a measure of randomness, that disordered states (those of higher entropy) are…
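
    A short script in the spirit of the exercise tabulates the multiplicity of each macrostate (the sum) over all microstates (ordered rolls) for two and three dice; this is an illustrative sketch, not the article's material.

    ```python
    from collections import Counter
    from itertools import product

    for n_dice in (2, 3):
        # Each ordered roll is a microstate; each sum is a macrostate.
        multiplicity = Counter(sum(roll) for roll in product(range(1, 7), repeat=n_dice))
        total = 6 ** n_dice
        print(f"{n_dice} dice:")
        for macrostate, count in sorted(multiplicity.items()):
            print(f"  sum {macrostate:2d}: multiplicity {count:2d}, probability {count / total:.4f}")
    ```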

  8. Controlling cyclic combustion timing variations using a symbol-statistics predictive approach in an HCCI engine

    International Nuclear Information System (INIS)

    Ghazimirsaied, Ahmad; Koch, Charles Robert

    2012-01-01

    Highlights: ► Misfire reduction in a combustion engine based on chaotic theory methods. ► Chaotic theory analysis of cyclic variation of a HCCI engine near misfire. ► Symbol sequence approach is used to predict ignition timing one cycle-ahead. ► Prediction is combined with feedback control to lower HCCI combustion variation. ► Feedback control extends the HCCI operating range into the misfire region. -- Abstract: Cyclic variation of a Homogeneous Charge Compression Ignition (HCCI) engine near misfire is analyzed using chaotic theory methods and feedback control is used to stabilize high cyclic variations. Variation of consecutive cycles of θ Pmax (the crank angle of maximum cylinder pressure over an engine cycle) for a Primary Reference Fuel engine is analyzed near misfire operation for five test points with similar conditions but different octane numbers. The return map of the time series of θ Pmax at each combustion cycle reveals the deterministic and random portions of the dynamics near misfire for this HCCI engine. A symbol-statistic approach is used to predict θ Pmax one cycle-ahead. Predicted θ Pmax has similar dynamical behavior to the experimental measurements. Based on this cycle ahead prediction, and using fuel octane as the input, feedback control is used to stabilize the instability of θ Pmax variations at this engine condition near misfire.
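
    A minimal sketch of the symbol-statistics idea: partition the θ Pmax series into equiprobable symbols and take the most frequent historical successor of the current symbol sequence as the one-cycle-ahead prediction. Bin counts, sequence length, and the synthetic series are assumptions for illustration, not the paper's implementation.

    ```python
    import numpy as np

    def symbolize(series, n_symbols=4):
        # Equiprobable partition: each symbol bin holds the same fraction of samples.
        edges = np.quantile(series, np.linspace(0, 1, n_symbols + 1)[1:-1])
        return list(np.digitize(series, edges))

    def predict_next(symbols, seq_len=3):
        """Most frequent historical successor of the trailing symbol sequence."""
        tail = tuple(symbols[-seq_len:])
        successors = [symbols[i + seq_len]
                      for i in range(len(symbols) - seq_len)
                      if tuple(symbols[i:i + seq_len]) == tail]
        return max(set(successors), key=successors.count) if successors else None

    rng = np.random.default_rng(2)
    theta_pmax = rng.normal(8.0, 2.5, 500)   # stand-in for measured crank angles of peak pressure
    symbols = symbolize(theta_pmax)
    print("predicted next symbol:", predict_next(symbols))
    ```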

  9. Multiple-shock initiation via statistical crack mechanics

    Energy Technology Data Exchange (ETDEWEB)

    Dienes, J.K.; Kershner, J.D.

    1998-12-31

    Statistical Crack Mechanics (SCRAM) is a theoretical approach to the behavior of brittle materials that accounts for the behavior of an ensemble of microcracks, including their opening, shear, growth, and coalescence. Mechanical parameters are based on measured strain-softening behavior. In applications to explosive and propellant sensitivity it is assumed that closed cracks act as hot spots, and that the heating due to interfacial friction initiates reactions which are modeled as one-dimensional heat flow with an Arrhenius source term, and computed in a subscale grid. Post-ignition behavior of hot spots is treated with the burn model of Ward, Son and Brewster. Numerical calculations using SCRAM-HYDROX are compared with the multiple-shock experiments of Mulford et al. in which the particle velocity in PBX 9501 is measured with embedded wires, and reactions are initiated and quenched.

  10. Generalized statistical criterion for distinguishing random optical groupings from physical multiple systems

    International Nuclear Information System (INIS)

    Anosova, Z.P.

    1988-01-01

    A statistical criterion is proposed for distinguishing between random and physical groupings of stars and galaxies. The criterion is applied to nearby wide multiple stars, triplets of galaxies in the list of Karachentsev, Karachentseva, and Shcherbanovskii, and double galaxies in the list of Dahari, in which the principal components are Seyfert galaxies. Systems that are almost certainly physical, probably physical, probably optical, and almost certainly optical are identified. The limiting difference between the radial velocities of the components of physical multiple galaxies is estimated

  11. Developing points-based risk-scoring systems in the presence of competing risks.

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S; D'Agostino, Ralph B; Fine, Jason P

    2016-09-30

    Predicting the occurrence of an adverse event over time is an important issue in clinical medicine. Clinical prediction models and associated points-based risk-scoring systems are popular statistical methods for summarizing the relationship between a multivariable set of patient risk factors and the risk of the occurrence of an adverse event. Points-based risk-scoring systems are popular amongst physicians as they permit a rapid assessment of patient risk without the use of computers or other electronic devices. The use of such points-based risk-scoring systems facilitates evidence-based clinical decision making. There is a growing interest in cause-specific mortality and in non-fatal outcomes. However, when considering these types of outcomes, one must account for competing risks whose occurrence precludes the occurrence of the event of interest. We describe how points-based risk-scoring systems can be developed in the presence of competing events. We illustrate the application of these methods by developing risk-scoring systems for predicting cardiovascular mortality in patients hospitalized with acute myocardial infarction. Code in the R statistical programming language is provided for the implementation of the described methods. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  13. Dynamic Bus Travel Time Prediction Models on Road with Multiple Bus Routes

    Science.gov (United States)

    Bai, Cong; Peng, Zhong-Ren; Lu, Qing-Chang; Sun, Jian

    2015-01-01

    Accurate and real-time travel time information for buses can help passengers better plan their trips and minimize waiting times. A dynamic travel time prediction model for buses addressing the cases on road with multiple bus routes is proposed in this paper, based on support vector machines (SVMs) and Kalman filtering-based algorithm. In the proposed model, the well-trained SVM model predicts the baseline bus travel times from the historical bus trip data; the Kalman filtering-based dynamic algorithm can adjust bus travel times with the latest bus operation information and the estimated baseline travel times. The performance of the proposed dynamic model is validated with the real-world data on road with multiple bus routes in Shenzhen, China. The results show that the proposed dynamic model is feasible and applicable for bus travel time prediction and has the best prediction performance among all the five models proposed in the study in terms of prediction accuracy on road with multiple bus routes. PMID:26294903
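
    A highly simplified sketch of the dynamic-adjustment idea: a one-dimensional Kalman filter tracks the deviation of observed travel times from the (e.g., SVM-predicted) baseline and corrects subsequent predictions. The noise variances and data below are placeholders, not the paper's calibrated values.

    ```python
    class ScalarKalman:
        """1-D Kalman filter tracking the deviation of actual travel time from a
        baseline prediction. Noise variances are illustrative assumptions."""
        def __init__(self, q=4.0, r=9.0):
            self.x, self.p = 0.0, 100.0   # deviation estimate and its variance
            self.q, self.r = q, r         # process / measurement noise variances

        def update(self, observed, baseline):
            self.p += self.q                  # time update (random-walk deviation model)
            k = self.p / (self.p + self.r)    # Kalman gain
            self.x += k * ((observed - baseline) - self.x)
            self.p *= (1.0 - k)
            return baseline + self.x          # dynamically adjusted prediction

    kf = ScalarKalman()
    baselines = [300, 305, 310, 300]   # baseline link travel times (s), hypothetical
    observed = [320, 330, 335, 325]    # latest buses' actual travel times (s)
    for b, z in zip(baselines, observed):
        print(f"adjusted prediction: {kf.update(z, b):.1f} s")
    ```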

  14. Can we predict podiatric medical school grade point average using an admission screen?

    Science.gov (United States)

    Shaw, Graham P; Velis, Evelio; Molnar, David

    2012-01-01

    Most medical school admission committees use cognitive and noncognitive measures to inform their final admission decisions. We evaluated the use of admission data to predict academic success for podiatric medical students, using first-semester grade point average (GPA) and cumulative GPA at graduation as outcome measures. In this study, we used linear multiple regression to examine the predictive power of an admission screen. A cross-validation technique was used to assess how the results of the regression model would generalize to an independent data set. Undergraduate GPA and Medical College Admission Test score accounted for only 22% of the variance in cumulative GPA at graduation. Undergraduate GPA, Medical College Admission Test score, and a time trend variable accounted for only 24% of the variance in first-semester GPA. Seventy-five percent of the individual variation in cumulative GPA at graduation and first-semester GPA remains unaccounted for by admission screens that rely only on cognitive measures, such as undergraduate GPA and Medical College Admission Test score. A reevaluation of admission screens is warranted, and medical educators should consider broadening the criteria used to select the podiatric physicians of the future.

  15. Daily Suspended Sediment Discharge Prediction Using Multiple Linear Regression and Artificial Neural Network

    Science.gov (United States)

    Uca; Toriman, Ekhwan; Jaafar, Othman; Maru, Rosmini; Arfan, Amal; Saleh Ahmar, Ansari

    2018-01-01

    Prediction of suspended sediment discharge in a catchment area is very important because it can be used to evaluate erosion hazard, to manage water resources, water quality, and hydrology projects (dams, reservoirs, and irrigation), and to determine the extent of the damage that has occurred in the catchment. Multiple linear regression analysis and artificial neural networks can be used to predict the amount of daily suspended sediment discharge. The regression analysis used the least-squares method, whereas the artificial neural networks used a Radial Basis Function (RBF) network and a feedforward multilayer perceptron with three learning algorithms, namely Levenberg-Marquardt (LM), Scaled Conjugate Gradient (SCG) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) Quasi-Newton. The number of neurons in the hidden layer ranges from three to sixteen, while the output layer has only one neuron because there is only one output target. In terms of the mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R2) and coefficient of efficiency (CE), the multiple linear regression (MLRg) Model 2 (six independent input variables) has the lowest MAE and RMSE (0.0000002 and 13.6039) and the highest R2 and CE (0.9971 and 0.9971). When LM, SCG and RBF are compared, the BFGS model with structure 3-7-1 is the best and most accurate for predicting suspended sediment discharge in the Jenderam catchment. Its performance in the testing process, with MAE and RMSE of 13.5769 and 17.9011, is the smallest, while its R2 and CE (0.9999 and 0.9998) are the highest when compared with the other BFGS Quasi-Newton models (6-3-1, 9-10-1 and 12-12-1). Based on the performance statistics, MLRg, LM, SCG, BFGS and RBF are suitable and accurate for prediction by modeling the non-linear, complex behavior of suspended sediment responses to rainfall, water depth and discharge. In the comparison between the artificial neural networks (ANN) and MLRg, the MLRg Model 2 is accurate for predicting suspended sediment discharge (kg

  16. Statistical analysis in the design of nuclear fuel cells and training of a neural network to predict safety parameters for reactors BWR

    International Nuclear Information System (INIS)

    Jauregui Ch, V.

    2013-01-01

    In this work the results of a statistical analysis are shown, with the purpose of studying the performance of the fuel lattice, taking into account the frequency of the pins that were used. For this purpose, different statistical distributions were used: one approximately normal, another of χ2 type but in inverted form, and a random distribution. In addition, some parameters of the nuclear reactor in a fuel reload were predicted by means of a trained neural network. The statistical analysis was made using the parameters of fuel lattices generated through three heuristic techniques: Ant Colony Optimization, Neural Networks, and a hybrid of Scatter Search and Path Relinking. The behavior of the local power peaking factor in the fuel lattice was reviewed for different frequencies of enriched-uranium pins, using the three techniques mentioned before; in the same way, the infinite neutron multiplication factor (k∞) was analyzed to determine the range within which this factor lies in the reactor. Taking into account all the information obtained through the statistical analysis, a neural network was trained to predict the behavior of some parameters of the nuclear reactor, considering a fixed fuel reload with its respective control rod pattern. In the same way, the quality of the training was evaluated using different fuel lattices. The neural network learned to predict the following parameters: the shutdown margin (SDM), the pin burnup peaks for two different fuel batches, the thermal limits, and the effective neutron multiplication factor (keff). The results show that the fuel lattices that used the inverted form of the χ2 frequency distribution revealed the best values of the local power peaking factor. Additionally, it is shown that the performance of a fuel lattice can be enhanced by controlling the frequency of the uranium enrichment rods and the variety of the gadolinium

  17. A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

    Directory of Open Access Journals (Sweden)

    Olson James M

    2006-04-01

    Full Text Available Abstract Background Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip®, which uses multiple oligonucleotide probes (i.e. a probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with the abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data. Results We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma samples (27 metastatic and 42 non-metastatic) and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic
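
    The two-step procedure might be sketched as follows: probes are grouped into pseudo-exons by intensity similarity and physical adjacency, and each pseudo-exon is tested for a group difference in probe intensity. The tolerance, group sizes, and data are illustrative assumptions only.

    ```python
    import numpy as np
    from scipy.stats import ttest_ind

    def pseudo_exons(mean_intensity, tol=0.5):
        """Group physically adjacent probes whose mean intensities are within
        `tol` (a hypothetical threshold) into pseudo-exons."""
        groups, current = [], [0]
        for i in range(1, len(mean_intensity)):
            if abs(mean_intensity[i] - mean_intensity[i - 1]) <= tol:
                current.append(i)
            else:
                groups.append(current)
                current = [i]
        groups.append(current)
        return groups

    rng = np.random.default_rng(3)
    # Toy probe set: 20 probes x (27 metastatic + 42 non-metastatic) samples.
    g1 = rng.normal(8.0, 0.4, size=(20, 27))
    g2 = rng.normal(8.0, 0.4, size=(20, 42))
    g2[5:9] += 1.0   # pretend probes 5-8 cover a differentially included exon

    probe_means = np.hstack([g1, g2]).mean(axis=1)
    for probes in pseudo_exons(probe_means):
        t, p = ttest_ind(g1[probes].ravel(), g2[probes].ravel())
        flag = "alternative splicing?" if p < 0.01 else ""
        print(f"probes {probes[0]}-{probes[-1]}: p={p:.2e} {flag}")
    ```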

  18. Application of statistical classification methods for predicting the acceptability of well-water quality

    Science.gov (United States)

    Cameron, Enrico; Pilla, Giorgio; Stella, Fabio A.

    2018-01-01

    The application of statistical classification methods is investigated, in comparison also with spatial interpolation methods, for predicting the acceptability of well-water quality in a situation where an effective quantitative model of the hydrogeological system under consideration cannot be developed. In the example area in northern Italy, in particular, the aquifer is locally affected by saline water, and the concentration of chloride is the main indicator of both saltwater occurrence and groundwater quality. The goal is to predict whether the chloride concentration in a water well will exceed the allowable concentration, so that the water is unfit for the intended use. A statistical classification algorithm achieved the best predictive performance, and the results of the study show that statistical classification methods provide further tools for dealing with groundwater quality problems concerning hydrogeological systems that are too difficult to describe analytically or to simulate effectively.

  19. Output from Statistical Predictive Models as Input to eLearning Dashboards

    Directory of Open Access Journals (Sweden)

    Marlene A. Smith

    2015-06-01

    Full Text Available We describe how statistical predictive models might play an expanded role in educational analytics by giving students automated, real-time information about what their current performance means for eventual success in eLearning environments. We discuss how an online messaging system might tailor information to individual students using predictive analytics. The proposed system would be data-driven and quantitative; e.g., a message might furnish the probability that a student will successfully complete the certificate requirements of a massive open online course. Repeated messages would prod underperforming students and alert instructors to those in need of intervention. Administrators responsible for accreditation or outcomes assessment would have ready documentation of learning outcomes and actions taken to address unsatisfactory student performance. The article’s brief introduction to statistical predictive models sets the stage for a description of the messaging system. Resources and methods needed to develop and implement the system are discussed.

  20. Risk Prediction Models for Other Cancers or Multiple Sites

    Science.gov (United States)

    Developing statistical models that estimate the probability of developing other multiple cancers over a defined period of time will help clinicians identify individuals at higher risk of specific cancers, allowing for earlier or more frequent screening and counseling of behavioral changes to decrease risk.

  1. Predicting statistical properties of open reading frames in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Katharina Mir

    Full Text Available An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes, such as codon composition and the sequence length of all reading frames, was developed. This new model predicts the average length, maximum length, and length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accuracy. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot be completely explained by a biased codon usage in the +1 frame. While it is unknown whether the stop codon depletion has a biological function, it could be due to a protein-coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes therefore leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet-lab experiments.
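
    The core of such a model can be illustrated in a few lines: under independent bases at a given GC content, the probability that a random codon is a stop codon fixes a geometric expectation for ORF length. This is a simplified sketch of the idea, not the authors' full model.

    ```python
    def expected_orf_length(gc_content):
        """Expected ORF length (in codons) under independent bases with the
        given GC content; the stop codons are TAA, TAG, TGA."""
        p = {"G": gc_content / 2, "C": gc_content / 2,
             "A": (1 - gc_content) / 2, "T": (1 - gc_content) / 2}
        p_stop = sum(p[a] * p[b] * p[c] for a, b, c in ("TAA", "TAG", "TGA"))
        return 1.0 / p_stop   # geometric waiting time until the first stop codon

    for gc in (0.21, 0.50, 0.74):   # roughly the GC range covered by the 70 species
        print(f"GC={gc:.2f}: expected ORF length ~ {expected_orf_length(gc):.0f} codons")
    ```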

  2. Statistical timing for parametric yield prediction of digital integrated circuits

    NARCIS (Netherlands)

    Jess, J.A.G.; Kalafala, K.; Naidu, S.R.; Otten, R.H.J.M.; Visweswariah, C.

    2006-01-01

    Uncertainty in circuit performance due to manufacturing and environmental variations is increasing with each new generation of technology. It is therefore important to predict the performance of a chip as a probabilistic quantity. This paper proposes three novel path-based algorithms for statistical

  3. Predictive value of ventilatory inflection points determined under field conditions.

    Science.gov (United States)

    Heyde, Christian; Mahler, Hubert; Roecker, Kai; Gollhofer, Albert

    2016-01-01

    The aim of this study was to evaluate the predictive potential provided by two ventilatory inflection points (VIP1 and VIP2) examined in the field without using gas analysis systems or uncomfortable facemasks. A calibrated respiratory inductance plethysmograph (RIP) and a computerised routine were used, respectively, to derive ventilation and to detect VIP1 and VIP2 during a standardised field ramp test on a 400 m running track with 81 participants. In addition, the average running speed of a competitive 1000 m run (S1k) was observed as the criterion. The predictive value of the running speed at VIP1 (SVIP1) and of the speed range between VIP1 and VIP2 in relation to VIP2 (VIPSPAN) was analysed via regression analysis. VIPSPAN, rather than the running speed at VIP2 (SVIP2), was operationalised as a predictor to account for the covariance between SVIP1 and SVIP2. SVIP1 and VIPSPAN provided 58.9% and 22.9% of explained variance, respectively, with regard to S1k. Considering covariance, the timing of the two ventilatory inflection points provides predictive value with regard to a competitive 1000 m run. This is the first study to apply computerised detection of ventilatory inflection points in a field setting, independent of measurements of respiratory gas exchange and without using any facemasks.

  4. Bayesian models based on test statistics for multiple hypothesis testing problems.

    Science.gov (United States)

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
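
    A stylized sketch of the modeling idea follows, with an assumed two-component Gaussian model and made-up mixture parameters standing in for fitted ones: posterior null probabilities are computed for a set of z-statistics and a Bayesian FDR cutoff is applied.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(4)
    # Toy z-statistics: 90% null N(0,1), 10% alternative N(3,1).
    z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1, 100)])

    # Known-component model for illustration: pi0*N(0,1) + (1-pi0)*N(mu1,1),
    # with pi0 and mu1 treated as if estimated (values assumed here).
    pi0, mu1 = 0.9, 3.0
    post_null = pi0 * norm.pdf(z) / (pi0 * norm.pdf(z) + (1 - pi0) * norm.pdf(z, mu1))

    # Bayesian FDR: reject the set with the smallest posterior null probabilities
    # whose running mean stays below the target level.
    order = np.argsort(post_null)
    running_fdr = np.cumsum(post_null[order]) / np.arange(1, len(z) + 1)
    n_reject = int(np.sum(running_fdr <= 0.05))
    print(f"rejected {n_reject} hypotheses at Bayesian FDR 0.05")
    ```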

  5. Statistical spatial properties of speckle patterns generated by multiple laser beams

    International Nuclear Information System (INIS)

    Le Cain, A.; Sajer, J. M.; Riazuelo, G.

    2011-01-01

    This paper investigates hot spot characteristics generated by the superposition of multiple laser beams. First, properties of speckle statistics are studied in the context of only one laser beam by computing the autocorrelation function. The case of multiple laser beams is then considered. In certain conditions, it is shown that speckles have an ellipsoidal shape. Analytical expressions of hot spot radii generated by multiple laser beams are derived and compared to numerical estimates made from the autocorrelation function. They are also compared to numerical simulations performed within the paraxial approximation. Excellent agreement is found for the speckle width as well as for the speckle length. Application to the speckle patterns generated in the Laser MegaJoule configuration in the zone where all the beams overlap is presented. Influence of polarization on the size of the speckles as well as on their abundance is studied.

  6. How Do Users Map Points Between Dissimilar Shapes?

    KAUST Repository

    Hecher, Michael

    2017-07-25

    Finding similar points in globally or locally similar shapes has been studied extensively through the use of various point descriptors or shape-matching methods. However, little work exists on finding similar points in dissimilar shapes. In this paper, we present the results of a study where users were given two dissimilar two-dimensional shapes and asked to map a given point in the first shape to the point in the second shape they consider most similar. We find that user mappings in this study correlate strongly with simple geometric relationships between points and shapes. To predict the probability distribution of user mappings between any pair of simple two-dimensional shapes, two distinct statistical models are defined using these relationships. We perform a thorough validation of the accuracy of these predictions and compare our models qualitatively and quantitatively to well-known shape-matching methods. Using our predictive models, we propose an approach to map objects or procedural content between different shapes in different design scenarios.

  7. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    Science.gov (United States)

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is small. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
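
    The permutation logic can be sketched directly: compute the minimum p-value over the pre-specified statistics, then compare it with its own permutation distribution under the null that the groups are indistinguishable. The two statistics chosen below are assumptions for illustration.

    ```python
    import numpy as np
    from scipy.stats import mannwhitneyu, ttest_ind

    def min_p(x, y):
        """Smallest p-value over two pre-specified test statistics."""
        return min(ttest_ind(x, y).pvalue, mannwhitneyu(x, y).pvalue)

    rng = np.random.default_rng(5)
    treated, control = rng.normal(0.5, 1, 30), rng.normal(0.0, 1, 30)
    observed = min_p(treated, control)

    # Permutation distribution of the minimum p-value under the strong null.
    pooled = np.concatenate([treated, control])
    perm_stats = []
    for _ in range(2000):
        rng.shuffle(pooled)
        perm_stats.append(min_p(pooled[:30], pooled[30:]))
    p_adjusted = np.mean(np.array(perm_stats) <= observed)
    print(f"observed min-p {observed:.4f}, permutation p-value {p_adjusted:.4f}")
    ```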

  8. Detecting Change-Point via Saddlepoint Approximations

    Institute of Scientific and Technical Information of China (English)

    Zhaoyuan LI; Maozai TIAN

    2017-01-01

    It is well known that the change-point problem is an important part of statistical model analysis. Most existing methods are not robust to the criteria used to evaluate change-point problems. In this article, we consider the "mean-shift" problem in change-point studies. A quantile test for a single quantile is proposed based on the saddlepoint approximation method. In order to utilize the information at different quantiles of the sequence, we further construct a "composite quantile test" to calculate the probability that each location in the sequence is a change-point. The location of a change-point can thus be pinpointed rather than estimated within an interval. The proposed tests make no assumptions about the functional form of the sequence distribution and work sensitively on both large and small samples, on change-points in the tails, and in multiple change-point situations. The good performance of the tests is confirmed by simulations and real data analysis. The saddlepoint-approximation-based distribution of the test statistic developed in the paper is appealing and may be of independent interest to readers in this research area.

  9. Predicting Protein Function via Semantic Integration of Multiple Networks.

    Science.gov (United States)

    Yu, Guoxian; Fu, Guangyuan; Wang, Jun; Zhu, Hailong

    2016-01-01

    Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapid accumulation of large volumes of proteomic and genomic data has driven the development of computational models for automatically predicting protein function at large scale. Recent approaches focus on integrating multiple heterogeneous data sources, and they often get better results than methods that use a single data source alone. In this paper, we investigate how to integrate multiple biological data sources with biological knowledge, i.e., the Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogeneous data sources. SimNet first utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on this similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experimental results on heterogeneous proteomic data sources of Yeast, Human, Mouse, and Fly show that SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The Matlab codes of SimNet are available at https://sites.google.com/site/guoxian85/simnet.

  10. Prediction on dielectric strength and boiling point of gaseous molecules for replacement of SF6.

    Science.gov (United States)

    Yu, Xiaojuan; Hou, Hua; Wang, Baoshan

    2017-04-15

    Developing environment-friendly insulation gases to replace sulfur hexafluoride (SF6) has attracted considerable experimental and theoretical attention, but so far without success. A computational methodology is presented herein for the prediction of the dielectric strength and boiling point of arbitrary gaseous molecules for the purpose of molecular design and screening. New structure-activity relationship (SAR) models have been established by combining the density-dependent properties of the electrostatic potential surface, including the surface area and the statistical variance of the surface potentials, with molecular properties including polarizability, electronegativity, and hardness. All the descriptors in the SAR models were calculated using density functional theory. The effect of substituting SF6 with various functional groups was studied systematically. It was found that CF3 is the most effective functional group for improving the dielectric strength, due to its large surface area and polarizability. However, all the substitutes exhibit higher boiling points than SF6 because the molecular hardness decreases. The balance between the dielectric strength (Er) and the boiling point (Tb) could be achieved by minimizing the local polarity of the molecules. SF5CN and SF5CFO were found to be potent candidates to replace SF6 in view of their large dielectric strengths and low boiling points. © 2017 Wiley Periodicals, Inc.

  11. REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH

    Directory of Open Access Journals (Sweden)

    R. SATISHKUMAR

    2017-01-01

    Full Text Available The role of bearings is significant in reducing the downtime of all rotating machinery. The increasing trend of bearing failures in recent times has underscored the need for, and importance of, condition monitoring. There are multiple factors associated with a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of bearings in operation. In the past, predictive models with regression techniques were widely used for bearing lifetime estimation. The objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach, with the ultimate aim of strengthening predictive maintenance. The present study used a classification approach following the concepts of machine learning, and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment wherein the bearings were run until they failed naturally. It should be noted that the experiment was carried out with new bearings at pre-defined load and speed conditions until the bearings failed on their own. In the present work, statistical features were deployed and the feature selection process was carried out using a J48 decision tree; the selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique, was used to build a predictive model estimating the lifetime of bearings. The performance of the classifier was cross-validated with distinct data. The results show that the K-Star classification model gives 98.56% classification accuracy with the selected features.

  12. Four points function fitted and first derivative procedure for determining the end points in potentiometric titration curves: statistical analysis and method comparison.

    Science.gov (United States)

    Kholeif, S A

    2001-06-01

    A new method belonging to the differential category for determining end points from potentiometric titration curves is presented. It uses a preprocessing step to find first-derivative values by fitting four data points in and around the region of inflection to a non-linear function, and then locates the end point, usually as a maximum or minimum, using an inverse parabolic interpolation procedure that has an analytical solution. The behavior and accuracy of the sigmoid and cumulative non-linear functions used are investigated against three factors. A statistical evaluation of the new method using linear least-squares method validation and multifactor data analysis is covered. The new method is generally applicable to symmetrical and unsymmetrical potentiometric titration curves, and the end point is calculated using numerical procedures only. It outperforms the "parent" regular differential method at almost all factor levels and gives accurate results comparable to the true or estimated true end points. Calculated end points from selected experimental titration curves are also compared between the new method and methods of the equivalence point category, such as those of Gran or Fortuin.
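
    A bare-bones sketch of the interpolation step, substituting a simple numerical gradient for the paper's four-point non-linear fit and using a synthetic titration curve:

    ```python
    import numpy as np

    def end_point(volumes, derivs):
        """Locate the extremum of the first derivative by inverse parabolic
        interpolation through the three points bracketing the maximum."""
        i = int(np.argmax(derivs))
        x0, x1, x2 = volumes[i - 1], volumes[i], volumes[i + 1]
        y0, y1, y2 = derivs[i - 1], derivs[i], derivs[i + 1]
        num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
        den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
        return x1 - 0.5 * num / den   # analytical vertex of the fitted parabola

    # Synthetic sigmoid titration curve (potential vs titrant volume).
    v = np.linspace(0, 20, 41)
    emf = 200 + 150 * np.tanh((v - 10.3) / 0.8)
    dE = np.gradient(emf, v)          # stand-in for the paper's four-point fit
    print(f"estimated end point: {end_point(v, dE):.3f} mL (true 10.300)")
    ```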

  13. Point defect characterization in HAADF-STEM images using multivariate statistical analysis

    International Nuclear Information System (INIS)

    Sarahan, Michael C.; Chi, Miaofang; Masiel, Daniel J.; Browning, Nigel D.

    2011-01-01

    Quantitative analysis of point defects is demonstrated through the use of multivariate statistical analysis. This analysis consists of principal component analysis for dimensionality estimation and reduction, followed by independent component analysis to obtain physically meaningful, statistically independent factor images. Results from these analyses are presented in the form of factor images and scores. Factor images show characteristic intensity variations corresponding to physical structure changes, while scores relate how much those variations are present in the original data. The application of this technique is demonstrated on a set of experimental images of dislocation cores along a low-angle tilt grain boundary in strontium titanate. A relationship between chemical composition and lattice strain is highlighted in the analysis results, with picometer-scale shifts in several columns measurable from compositional changes in a separate column. -- Research Highlights: → Multivariate analysis of HAADF-STEM images. → Distinct structural variations among SrTiO3 dislocation cores. → Picometer atomic column shifts correlated with atomic column population changes.
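
    The PCA-then-ICA pipeline on an image stack can be sketched with scikit-learn and synthetic data; the component cutoff and all dimensions below are assumptions for illustration.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(6)
    n_images, h, w = 40, 32, 32
    # Toy image stack: two latent "structure" factors mixed per image.
    factor_a = np.outer(np.hanning(h), np.hanning(w)).ravel()
    factor_b = rng.normal(size=h * w)
    scores = rng.random((n_images, 2))
    stack = scores @ np.vstack([factor_a, factor_b]) + 0.01 * rng.normal(size=(n_images, h * w))

    # Step 1: PCA estimates dimensionality and reduces noise.
    pca = PCA(n_components=5).fit(stack)
    n_sig = int(np.sum(pca.explained_variance_ratio_ > 0.01))   # crude cutoff, assumed

    # Step 2: ICA rotates the retained components into statistically
    # independent factor images with interpretable contrast.
    ica = FastICA(n_components=n_sig, random_state=0)
    ica_scores = ica.fit_transform(stack)           # how much each factor is present
    factor_images = ica.components_.reshape(n_sig, h, w)
    print(n_sig, factor_images.shape, ica_scores.shape)
    ```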

  14. QSPR using MOLGEN-QSPR: the challenge of fluoroalkane boiling points.

    Science.gov (United States)

    Rücker, Christoph; Meringer, Markus; Kerber, Adalbert

    2005-01-01

    By means of the new software MOLGEN-QSPR, a multilinear regression model for the boiling points of lower fluoroalkanes is established. The model is based exclusively on simple descriptors derived directly from molecular structure and nevertheless describes a broader set of data more precisely than previous attempts that used either more demanding (quantum chemical) descriptors or more demanding (nonlinear) statistical methods such as neural networks. The model's internal consistency was confirmed by leave-one-out cross-validation. The model was used to predict all unknown boiling points of fluorobutanes, and the quality of predictions was estimated by means of comparison with boiling point predictions for fluoropentanes.
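
    Leave-one-out cross-validation of a multilinear QSPR model can be sketched in a few lines; the descriptors and coefficients below are synthetic stand-ins, not the MOLGEN-QSPR descriptor set.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(7)
    # Toy stand-ins for simple structure-derived descriptors of fluoroalkanes
    # (e.g., atom counts, branching indices); values are synthetic.
    X = rng.normal(size=(25, 4))
    bp = X @ np.array([30.0, -12.0, 5.0, 8.0]) + rng.normal(0, 3, 25)

    model = LinearRegression()
    pred = cross_val_predict(model, X, bp, cv=LeaveOneOut())
    q2 = 1 - np.sum((bp - pred) ** 2) / np.sum((bp - bp.mean()) ** 2)
    print(f"leave-one-out q^2 = {q2:.3f}")   # internal-consistency check
    ```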

  15. Estimation of Multiple Point Sources for Linear Fractional Order Systems Using Modulating Functions

    KAUST Repository

    Belkhatir, Zehor; Laleg-Kirati, Taous-Meriem

    2017-01-01

    This paper proposes an estimation algorithm for the characterization of multiple point inputs for linear fractional order systems. First, using the polynomial modulating functions method and a suitable change of variables, the problem of estimating

  16. A statistical model for predicting muscle performance

    Science.gov (United States)

    Byerly, Diane Leslie De Caix

    The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing

  17. Short-Term Power Load Point Prediction Based on the Sharp Degree and Chaotic RBF Neural Network

    Directory of Open Access Journals (Sweden)

    Dongxiao Niu

    2015-01-01

    Full Text Available In order to realize the prediction and positioning of the short-term load inflection point, this paper draws on related research in the field of computer image recognition. A load sharp-degree sequence is obtained by transforming the original load sequence using the sharp-degree algorithm. A forecasting model is then designed based on chaos theory and an RBF neural network, and the load sharp-degree sequence is predicted with this model to realize the positioning of the short-term load inflection point. Finally, in an empirical example analysis, the daily load points of a region are predicted using the actual load data of that region to verify the effectiveness and applicability of the method. Prediction results showed that most of the test-sample load points could be accurately predicted.

  18. Development of a Whole-Body Haptic Sensor with Multiple Supporting Points and Its Application to a Manipulator

    Science.gov (United States)

    Hanyu, Ryosuke; Tsuji, Toshiaki

    This paper proposes a whole-body haptic sensing system that has multiple supporting points between the body frame and the end-effector. The system consists of an end-effector and multiple force sensors. Using this mechanism, the position of a contact force on the surface can be calculated without any sensor array. A haptic sensing system with a single supporting point structure has previously been developed by the present authors. However, the system has drawbacks such as low stiffness and low strength. Therefore, in this study, a mechanism with multiple supporting points was proposed and its performance was verified. In this paper, the basic concept of the mechanism is first introduced. Next, an evaluation of the proposed method, performed by conducting some experiments, is presented.

  19. MULTIPLE ACCESS POINTS WITHIN THE ONLINE CLASSROOM: WHERE STUDENTS LOOK FOR INFORMATION

    Directory of Open Access Journals (Sweden)

    John STEELE

    2017-01-01

    Full Text Available The purpose of this study is to examine the impact of information placement within the confines of the online classroom architecture. Also reviewed was the impact of other variables such as course design, teaching presence and student patterns in looking for information. The sample population included students from a major online university in their first year course sequence. Students were tasked with completing a survey at the end of the course, indicating their preference for accessing information within the online classroom. The qualitative data indicated that student preference is to receive information from multiple access points and sources within the online classroom architecture. Students also expressed a desire to have information delivered through the usage of technology such as email and text messaging. In addition to receiving information from multiple sources, the qualitative data indicated students were satisfied overall, with the current ways in which they received and accessed information within the online classroom setting. Major findings suggest that instructors teaching within the online classroom should have multiple data access points within the classroom architecture. Furthermore, instructors should use a variety of communication venues to enhance the ability for students to access and receive information pertinent to the course.

  20. Method of Check of Statistical Hypotheses for Revealing of “Fraud” Point of Sale

    Directory of Open Access Journals (Sweden)

    T. M. Bolotskaya

    2011-06-01

    Full Text Available The application of statistical hypothesis testing to reveal "fraud" points of sale that work with purchasing cards and are suspected of carrying out unauthorized operations is analyzed. On the basis of the results obtained, an algorithm is developed that makes it possible to assess the operation of terminals in off-line mode.

  1. Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop

    Science.gov (United States)

    Morrison, Joseph H.

    2010-01-01

    A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.

  2. Predictive performance models and multiple task performance

    Science.gov (United States)

    Wickens, Christopher D.; Larish, Inge; Contorer, Aaron

    1989-01-01

    Five models that predict how performance of multiple tasks will interact in complex task scenarios are discussed. The models are shown in terms of the assumptions they make about human operator divided attention. The different assumptions about attention are then empirically validated in a multitask helicopter flight simulation. It is concluded from this simulation that the most important assumption relates to the coding of demand level of different component tasks.

  3. Multiple nano elements of SCC--transition from phenomenology to predictive mechanistics

    International Nuclear Information System (INIS)

    Staehle, R.W.

    2009-01-01

    Full text of publication follows: Predicting the occurrence and rate of stress corrosion cracking in materials of construction is one of the most critical pathways for assuring the reliability of light water nuclear reactor plants. It is the general intention of operators of nuclear plants that they continue performing satisfactorily for times of 60 to 80 years at least. Such times are beyond existing experience, and there are no bases for choosing credible predictions. Present bases for predicting SCC rely on anecdotal experience for predicting what materials sustain SCC in specified environments and on phenomenological correlations using such parameters as K (stress intensity), 1/T (temperature), E(corr) (corrosion potential), pH, [x] a (concentration), other established quantities, and statistical correlations. While these phenomenological correlations have served the industry well in the past, they have also allowed grievous mistakes. Further, such correlations are flawed in their fundamental credibility. Predicting SCC in aqueous solutions means to predict its dependence upon the seven primary variables: potential, pH, species, alloy composition, alloy structure, stress and temperature. A serious prediction of SCC upon these seven primary variables can only be achieved by moving to fundamental nano elements. Unfortunately, useful predictability from the nano approach cannot be achieved quickly or easily; thus, it will continue to be necessary to rely on existing phenomenology. However, as the nano approach evolves, it can contribute increasingly to the quantitative capacity of the phenomenological approach. The nano approach will require quite different talents and thinking than are now applied to the prediction of SCC; while some of the boundary conditions of phenomenology must continue to be applied, elements of the nano approach will include accounting for at least, typically, the following multiple elements as they apply at the sites of initiation and at

  4. Statistical signatures of a targeted search by bacteria

    Science.gov (United States)

    Jashnsaz, Hossein; Anderson, Gregory G.; Pressé, Steve

    2017-12-01

    Chemoattractant gradients are rarely well-controlled in nature and recent attention has turned to bacterial chemotaxis toward typical bacterial food sources such as food patches or even bacterial prey. In environments with localized food sources reminiscent of a bacterium’s natural habitat, striking phenomena—such as the volcano effect or banding—have been predicted or expected to emerge from chemotactic models. However, in practice, from limited bacterial trajectory data it is difficult to distinguish targeted searches from an untargeted search strategy for food sources. Here we use a theoretical model to identify statistical signatures of a targeted search toward point food sources, such as prey. Our model is constructed on the basis that bacteria use temporal comparisons to bias their random walk, exhibit finite memory and are subject to random (Brownian) motion as well as signaling noise. The advantage with using a stochastic model-based approach is that a stochastic model may be parametrized from individual stochastic bacterial trajectories but may then be used to generate a very large number of simulated trajectories to explore average behaviors obtained from stochastic search strategies. For example, our model predicts that a bacterium’s diffusion coefficient increases as it approaches the point source and that, in the presence of multiple sources, bacteria may take substantially longer to locate their first source giving the impression of an untargeted search strategy.

  5. Method of Fusion Diagnosis for Dam Service Status Based on Joint Distribution Function of Multiple Points

    Directory of Open Access Journals (Sweden)

    Zhenxiang Jiang

    2016-01-01

    Full Text Available The traditional methods of diagnosing dam service status are always suitable for single measuring point. These methods also reflect the local status of dams without merging multisource data effectively, which is not suitable for diagnosing overall service. This study proposes a new method involving multiple points to diagnose dam service status based on joint distribution function. The function, including monitoring data of multiple points, can be established with t-copula function. Therefore, the possibility, which is an important fusing value in different measuring combinations, can be calculated, and the corresponding diagnosing criterion is established with typical small probability theory. Engineering case study indicates that the fusion diagnosis method can be conducted in real time and the abnormal point can be detected, thereby providing a new early warning method for engineering safety.

  6. Predictability of the recent slowdown and subsequent recovery of large-scale surface warming using statistical methods

    Science.gov (United States)

    Mann, Michael E.; Steinman, Byron A.; Miller, Sonya K.; Frankcombe, Leela M.; England, Matthew H.; Cheung, Anson H.

    2016-04-01

    The temporary slowdown in large-scale surface warming during the early 2000s has been attributed to both external and internal sources of climate variability. Using semiempirical estimates of the internal low-frequency variability component in Northern Hemisphere, Atlantic, and Pacific surface temperatures in concert with statistical hindcast experiments, we investigate whether the slowdown and its recent recovery were predictable. We conclude that the internal variability of the North Pacific, which played a critical role in the slowdown, does not appear to have been predictable using statistical forecast methods. An additional minor contribution from the North Atlantic, by contrast, appears to exhibit some predictability. While our analyses focus on combining semiempirical estimates of internal climatic variability with statistical hindcast experiments, possible implications for initialized model predictions are also discussed.

  7. Generating and executing programs for a floating point single instruction multiple data instruction set architecture

    Science.gov (United States)

    Gschwind, Michael K

    2013-04-16

    Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.

  8. A Point Kinetics Model for Estimating Neutron Multiplication of Bare Uranium Metal in Tagged Neutron Measurements

    Science.gov (United States)

    Tweardy, Matthew C.; McConchie, Seth; Hayward, Jason P.

    2017-07-01

    An extension of the point kinetics model is developed to describe the neutron multiplicity response of a bare uranium object under interrogation by an associated particle imaging deuterium-tritium (D-T) measurement system. This extended model is used to estimate the total neutron multiplication of the uranium. Both MCNPX-PoliMi simulations and data from active interrogation measurements of highly enriched and depleted uranium geometries are used to evaluate the potential of this method and to identify the sources of systematic error. The detection efficiency correction for measured coincidence response is identified as a large source of systematic error. If the detection process is not considered, results suggest that the method can estimate total multiplication to within 13% of the simulated value. Values for multiplicity constants in the point kinetics equations are sensitive to enrichment due to (n, xn) interactions by D-T neutrons and can introduce another significant source of systematic bias. This can theoretically be corrected if isotopic composition is known a priori. The spatial dependence of multiplication is also suspected of introducing further systematic bias for high multiplication uranium objects.

  9. Uncertainty propagation for statistical impact prediction of space debris

    Science.gov (United States)

    Hoogendoorn, R.; Mooij, E.; Geul, J.

    2018-01-01

    Predictions of the impact time and location of space debris in a decaying trajectory are highly influenced by uncertainties. The traditional Monte Carlo (MC) method can be used to perform accurate statistical impact predictions, but requires a large computational effort. A method is investigated that directly propagates a Probability Density Function (PDF) in time, which has the potential to obtain more accurate results with less computational effort. The decaying trajectory of Delta-K rocket stages was used to test the methods using a six degrees-of-freedom state model. The PDF of the state of the body was propagated in time to obtain impact-time distributions. This Direct PDF Propagation (DPP) method results in a multi-dimensional scattered dataset of the PDF of the state, which is highly challenging to process. No accurate results could be obtained, because of the structure of the DPP data and the high dimensionality. Therefore, the DPP method is less suitable for practical uncontrolled entry problems and the traditional MC method remains superior. Additionally, the MC method was used with two improved uncertainty models to obtain impact-time distributions, which were validated using observations of true impacts. For one of the two uncertainty models, statistically more valid impact-time distributions were obtained than in previous research.

  10. A perceptual space of local image statistics.

    Science.gov (United States)

    Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M

    2015-12-01

    Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Point-counterpoint in physics: theoretical prediction and experimental discovery of elementary particles

    International Nuclear Information System (INIS)

    Leite Lopes, J.

    1984-01-01

    A report is given on the theoretical prediction and the experimental discovery of elementary particles from the electron to the weak intermediate vector bosons. The work of Lattes, Occhialini and Powell which put in evidence the pions predicted by Yukawa was the starting point of the modern experimental particle physics

  12. Predicting radiotherapy outcomes using statistical learning techniques

    International Nuclear Information System (INIS)

    El Naqa, Issam; Bradley, Jeffrey D; Deasy, Joseph O; Lindsay, Patricia E; Hope, Andrew J

    2009-01-01

    Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model

  13. The shooting method and multiple solutions of two/multi-point BVPs of second-order ODE

    Directory of Open Access Journals (Sweden)

    Man Kam Kwong

    2006-06-01

    Full Text Available Within the last decade, there has been growing interest in the study of multiple solutions of two- and multi-point boundary value problems of nonlinear ordinary differential equations as fixed points of a cone mapping. Undeniably many good results have emerged. The purpose of this paper is to point out that, in the special case of second-order equations, the shooting method can be an effective tool, sometimes yielding better results than those obtainable via fixed point techniques.

  14. AIR POLLUITON INDEX PREDICTION USING MULTIPLE NEURAL NETWORKS

    Directory of Open Access Journals (Sweden)

    Zainal Ahmad

    2017-05-01

    Full Text Available Air quality monitoring and forecasting tools are necessary for the purpose of taking precautionary measures against air pollution, such as reducing the effect of a predicted air pollution peak on the surrounding population and ecosystem. In this study a single Feed-forward Artificial Neural Network (FANN is shown to be able to predict the Air Pollution Index (API with a Mean Squared Error (MSE and coefficient determination, R2, of 0.1856 and 0.7950 respectively. However, due to the non-robust nature of single FANN, a selective combination of Multiple Neural Networks (MNN is introduced using backward elimination and a forward selection method. The results show that both selective combination methods can improve the robustness and performance of the API prediction with the MSE and R2 of 0.1614 and 0.8210 respectively. This clearly shows that it is possible to reduce the number of networks combined in MNN for API prediction, without losses of any information in terms of the performance of the final API prediction model.

  15. An analytical model for predicting dryout point in bilaterally heated vertical narrow annuli

    International Nuclear Information System (INIS)

    Aye Myint; Tian Wenxi; Jia Dounan; Li Zhihui, Li Hao

    2005-02-01

    Based on the the droplet-diffusion model by Kirillov and Smogalev (1969, 1972), a new analytical model of dryout point prediction in the steam-water flow for bilaterally and uniformly heated narrow annular gap was developed. Comparison of the present model predictions with experimental results indicated that a good agreement in accuracy for the experimental parametric range (pressure from 0.8 to 3.5 MPa, mass flux of 60.39 to 135.6 kg· -2 ·s -1 and the heat flus of 50 kW·m -2 . Prediction of dryout point was experimentally investigated with deionized water upflowing through narrow annular channel with 1.0 mm and 1.5 mm gap heated by AC power supply. (author)

  16. Influence of Immersion Conditions on The Tensile Strength of Recycled Kevlar®/Polyester/Low-Melting-Point Polyester Nonwoven Geotextiles through Applying Statistical Analyses

    Directory of Open Access Journals (Sweden)

    Jing-Chzi Hsieh

    2016-05-01

    Full Text Available The recycled Kevlar®/polyester/low-melting-point polyester (recycled Kevlar®/PET/LPET nonwoven geotextiles are immersed in neutral, strong acid, and strong alkali solutions, respectively, at different temperatures for four months. Their tensile strength is then tested according to various immersion periods at various temperatures, in order to determine their durability to chemicals. For the purpose of analyzing the possible factors that influence mechanical properties of geotextiles under diverse environmental conditions, the experimental results and statistical analyses are incorporated in this study. Therefore, influences of the content of recycled Kevlar® fibers, implementation of thermal treatment, and immersion periods on the tensile strength of recycled Kevlar®/PET/LPET nonwoven geotextiles are examined, after which their influential levels are statistically determined by performing multiple regression analyses. According to the results, the tensile strength of nonwoven geotextiles can be enhanced by adding recycled Kevlar® fibers and thermal treatment.

  17. Point-counterpoint in physics: theoretical prediction and experimental discovery of elementary particles

    International Nuclear Information System (INIS)

    Lopes, J.L.

    1984-01-01

    A report is given on the theoretical prediction and the experimental discovery of elementary particles from the electron to the weak intermediate vector bosons. The work of Lattes, Occhialini and Powell which put in evidence the pions predicted by Yukawa was the starting point of the modern experimental particle physics. (Author) [pt

  18. A comparison of artificial neural networks with other statistical approaches for the prediction of true metabolizable energy of meat and bone meal.

    Science.gov (United States)

    Perai, A H; Nassiri Moghaddam, H; Asadpour, S; Bahrampour, J; Mansoori, Gh

    2010-07-01

    There has been a considerable and continuous interest to develop equations for rapid and accurate prediction of the ME of meat and bone meal. In this study, an artificial neural network (ANN), a partial least squares (PLS), and a multiple linear regression (MLR) statistical method were used to predict the TME(n) of meat and bone meal based on its CP, ether extract, and ash content. The accuracy of the models was calculated by R(2) value, MS error, mean absolute percentage error, mean absolute deviation, bias, and Theil's U. The predictive ability of an ANN was compared with a PLS and a MLR model using the same training data sets. The squared regression coefficients of prediction for the MLR, PLS, and ANN models were 0.38, 0.36, and 0.94, respectively. The results revealed that ANN produced more accurate predictions of TME(n) as compared with PLS and MLR methods. Based on the results of this study, ANN could be used as a promising approach for rapid prediction of nutritive value of meat and bone meal.

  19. Statistical Analysis of Clinical Data on a Pocket Calculator, Part 2 Statistics on a Pocket Calculator, Part 2

    CERN Document Server

    Cleophas, Ton J

    2012-01-01

    The first part of this title contained all statistical tests relevant to starting clinical investigations, and included tests for continuous and binary data, power, sample size, multiple testing, variability, confounding, interaction, and reliability. The current part 2 of this title reviews methods for handling missing data, manipulated data, multiple confounders, predictions beyond observation, uncertainty of diagnostic tests, and the problems of outliers. Also robust tests, non-linear modeling , goodness of fit testing, Bhatacharya models, item response modeling, superiority testing, variab

  20. Multiple contacts with diversion at the point of arrest.

    Science.gov (United States)

    Riordan, Sharon; Wix, Stuart; Haque, M Sayeed; Humphreys, Martin

    2003-04-01

    A diversion at the point of arrest (DAPA) scheme was set up in five police stations in South Birmingham in 1992. In a study of all referrals made over a four-year period a sub group of multiple contact individuals was identified. During that time four hundred and ninety-two contacts were recorded in total, of which 130 were made by 58 individuals. The latter group was generally no different from the single contact group but did have a tendency to be younger. This research highlights the need for a re-evaluation of service provision and associated education of police officers and relevant mental health care professionals.

  1. A RANS knock model to predict the statistical occurrence of engine knock

    International Nuclear Information System (INIS)

    D'Adamo, Alessandro; Breda, Sebastiano; Fontanesi, Stefano; Irimescu, Adrian; Merola, Simona Silvia; Tornatore, Cinzia

    2017-01-01

    Highlights: • Development of a new RANS model for SI engine knock probability. • Turbulence-derived transport equations for variances of mixture fraction and enthalpy. • Gasoline autoignition delay times calculated from detailed chemical kinetics. • Knock probability validated against experiments on optically accessible GDI unit. • PDF-based knock model accounting for the random nature of SI engine knock in RANS simulations. - Abstract: In the recent past engine knock emerged as one of the main limiting aspects for the achievement of higher efficiency targets in modern spark-ignition (SI) engines. To attain these requirements, engine operating points must be moved as close as possible to the onset of abnormal combustions, although the turbulent nature of flow field and SI combustion leads to possibly ample fluctuations between consecutive engine cycles. This forces engine designers to distance the target condition from its theoretical optimum in order to prevent abnormal combustion, which can potentially damage engine components because of few individual heavy-knocking cycles. A statistically based RANS knock model is presented in this study, whose aim is the prediction not only of the ensemble average knock occurrence, poorly meaningful in such a stochastic event, but also of a knock probability. The model is based on look-up tables of autoignition times from detailed chemistry, coupled with transport equations for the variance of mixture fraction and enthalpy. The transported perturbations around the ensemble average value are based on variable gradients and on a local turbulent time scale. A multi-variate cell-based Gaussian-PDF model is proposed for the unburnt mixture, resulting in a statistical distribution for the in-cell reaction rate. An average knock precursor and its variance are independently calculated and transported; this results in the prediction of an earliest knock probability preceding the ensemble average knock onset, as confirmed by

  2. Interior Point Methods on GPU with application to Model Predictive Control

    DEFF Research Database (Denmark)

    Gade-Nielsen, Nicolai Fog

    The goal of this thesis is to investigate the application of interior point methods to solve dynamical optimization problems, using a graphical processing unit (GPU) with a focus on problems arising in Model Predictice Control (MPC). Multi-core processors have been available for over ten years now...... software package called GPUOPT, available under the non-restrictive MIT license. GPUOPT includes includes a primal-dual interior-point method, which supports both the CPU and the GPU. It is implemented as multiple components, where the matrix operations and solver for the Newton directions is separated...

  3. Second Language Experience Facilitates Statistical Learning of Novel Linguistic Materials.

    Science.gov (United States)

    Potter, Christine E; Wang, Tianlin; Saffran, Jenny R

    2017-04-01

    Recent research has begun to explore individual differences in statistical learning, and how those differences may be related to other cognitive abilities, particularly their effects on language learning. In this research, we explored a different type of relationship between language learning and statistical learning: the possibility that learning a new language may also influence statistical learning by changing the regularities to which learners are sensitive. We tested two groups of participants, Mandarin Learners and Naïve Controls, at two time points, 6 months apart. At each time point, participants performed two different statistical learning tasks: an artificial tonal language statistical learning task and a visual statistical learning task. Only the Mandarin-learning group showed significant improvement on the linguistic task, whereas both groups improved equally on the visual task. These results support the view that there are multiple influences on statistical learning. Domain-relevant experiences may affect the regularities that learners can discover when presented with novel stimuli. Copyright © 2016 Cognitive Science Society, Inc.

  4. Statistics, synergy, and mechanism of multiple photogeneration of excitons in quantum dots: Fundamental and applied aspects

    International Nuclear Information System (INIS)

    Oksengendler, B. L.; Turaeva, N. N.; Uralov, I.; Marasulov, M. B.

    2012-01-01

    The effect of multiple exciton generation is analyzed based on statistical physics, quantum mechanics, and synergetics. Statistical problems of the effect of multiple exciton generation (MEG) are broadened and take into account not only exciton generation, but also background excitation. The study of the role of surface states of quantum dots is based on the synergy of self-catalyzed electronic reactions. An analysis of the MEG mechanism is based on the idea of electronic shaking using the sudden perturbation method in quantum mechanics. All of the above-mentioned results are applied to the problem of calculating the limiting efficiency to transform solar energy into electric energy. (authors)

  5. SU-E-T-205: MLC Predictive Maintenance Using Statistical Process Control Analysis.

    Science.gov (United States)

    Able, C; Hampton, C; Baydush, A; Bright, M

    2012-06-01

    MLC failure increases accelerator downtime and negatively affects the clinic treatment delivery schedule. This study investigates the use of Statistical Process Control (SPC), a modern quality control methodology, to retrospectively evaluate MLC performance data thereby predicting the impending failure of individual MLC leaves. SPC, a methodology which detects exceptional variability in a process, was used to analyze MLC leaf velocity data. A MLC velocity test is performed weekly on all leaves during morning QA. The leaves sweep 15 cm across the radiation field with the gantry pointing down. The leaf speed is analyzed from the generated dynalog file using quality assurance software. MLC leaf speeds in which a known motor failure occurred (8) and those in which no motor replacement was performed (11) were retrospectively evaluated for a 71 week period. SPC individual and moving range (I/MR) charts were used in the analysis. The I/MR chart limits were calculated using the first twenty weeks of data and set at 3 standard deviations from the mean. The MLCs in which a motor failure occurred followed two general trends: (a) no data indicating a change in leaf speed prior to failure (5 of 8) and (b) a series of data points exceeding the limit prior to motor failure (3 of 8). I/MR charts for a high percentage (8 of 11) of the non-replaced MLC motors indicated that only a single point exceeded the limit. These single point excesses were deemed false positives. SPC analysis using MLC performance data may be helpful in detecting a significant percentage of impending failures of MLC motors. The ability to detect MLC failure may depend on the method of failure (i.e. gradual or catastrophic). Further study is needed to determine if increasing the sampling frequency could increase reliability. Project was support by a grant from Varian Medical Systems, Inc. © 2012 American Association of Physicists in Medicine.

  6. NEWTONIAN IMPERIALIST COMPETITVE APPROACH TO OPTIMIZING OBSERVATION OF MULTIPLE TARGET POINTS IN MULTISENSOR SURVEILLANCE SYSTEMS

    Directory of Open Access Journals (Sweden)

    A. Afghan-Toloee

    2013-09-01

    Full Text Available The problem of specifying the minimum number of sensors to deploy in a certain area to face multiple targets has been generally studied in the literatures. In this paper, we are arguing the multi-sensors deployment problem (MDP. The Multi-sensor placement problem can be clarified as minimizing the cost required to cover the multi target points in the area. We propose a more feasible method for the multi-sensor placement problem. Our method makes provision the high coverage of grid based placements while minimizing the cost as discovered in perimeter placement techniques. The NICA algorithm as improved ICA (Imperialist Competitive Algorithm is used to decrease the performance time to explore an enough solution compared to other meta-heuristic schemes such as GA, PSO and ICA. A three dimensional area is used for clarify the multiple target and placement points, making provision x, y, and z computations in the observation algorithm. A structure of model for the multi-sensor placement problem is proposed: The problem is constructed as an optimization problem with the objective to minimize the cost while covering all multiple target points upon a given probability of observation tolerance.

  7. A statistical forecast model using the time-scale decomposition technique to predict rainfall during flood period over the middle and lower reaches of the Yangtze River Valley

    Science.gov (United States)

    Hu, Yijia; Zhong, Zhong; Zhu, Yimin; Ha, Yao

    2018-04-01

    In this paper, a statistical forecast model using the time-scale decomposition method is established to do the seasonal prediction of the rainfall during flood period (FPR) over the middle and lower reaches of the Yangtze River Valley (MLYRV). This method decomposites the rainfall over the MLYRV into three time-scale components, namely, the interannual component with the period less than 8 years, the interdecadal component with the period from 8 to 30 years, and the interdecadal component with the period larger than 30 years. Then, the predictors are selected for the three time-scale components of FPR through the correlation analysis. At last, a statistical forecast model is established using the multiple linear regression technique to predict the three time-scale components of the FPR, respectively. The results show that this forecast model can capture the interannual and interdecadal variation of FPR. The hindcast of FPR during 14 years from 2001 to 2014 shows that the FPR can be predicted successfully in 11 out of the 14 years. This forecast model performs better than the model using traditional scheme without time-scale decomposition. Therefore, the statistical forecast model using the time-scale decomposition technique has good skills and application value in the operational prediction of FPR over the MLYRV.

  8. Monthly to seasonal low flow prediction: statistical versus dynamical models

    Science.gov (United States)

    Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke

    2016-04-01

    the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO) the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated showing regions where streamflow and climate variable from previous months are significantly correlated in a 21 (31) years moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of a valuable predictability of the low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.

  9. Farey Statistics in Time n^{2/3} and Counting Primitive Lattice Points in Polygons

    OpenAIRE

    Patrascu, Mihai

    2007-01-01

    We present algorithms for computing ranks and order statistics in the Farey sequence, taking time O (n^{2/3}). This improves on the recent algorithms of Pawlewicz [European Symp. Alg. 2007], running in time O (n^{3/4}). We also initiate the study of a more general algorithmic problem: counting primitive lattice points in planar shapes.

  10. A point-based prediction model for cardiovascular risk in orthotopic liver transplantation: The CAR-OLT score.

    Science.gov (United States)

    VanWagner, Lisa B; Ning, Hongyan; Whitsett, Maureen; Levitsky, Josh; Uttal, Sarah; Wilkins, John T; Abecassis, Michael M; Ladner, Daniela P; Skaro, Anton I; Lloyd-Jones, Donald M

    2017-12-01

    Cardiovascular disease (CVD) complications are important causes of morbidity and mortality after orthotopic liver transplantation (OLT). There is currently no preoperative risk-assessment tool that allows physicians to estimate the risk for CVD events following OLT. We sought to develop a point-based prediction model (risk score) for CVD complications after OLT, the Cardiovascular Risk in Orthotopic Liver Transplantation risk score, among a cohort of 1,024 consecutive patients aged 18-75 years who underwent first OLT in a tertiary-care teaching hospital (2002-2011). The main outcome measures were major 1-year CVD complications, defined as death from a CVD cause or hospitalization for a major CVD event (myocardial infarction, revascularization, heart failure, atrial fibrillation, cardiac arrest, pulmonary embolism, and/or stroke). The bootstrap method yielded bias-corrected 95% confidence intervals for the regression coefficients of the final model. Among 1,024 first OLT recipients, major CVD complications occurred in 329 (32.1%). Variables selected for inclusion in the model (using model optimization strategies) included preoperative recipient age, sex, race, employment status, education status, history of hepatocellular carcinoma, diabetes, heart failure, atrial fibrillation, pulmonary or systemic hypertension, and respiratory failure. The discriminative performance of the point-based score (C statistic = 0.78, bias-corrected C statistic = 0.77) was superior to other published risk models for postoperative CVD morbidity and mortality, and it had appropriate calibration (Hosmer-Lemeshow P = 0.33). The point-based risk score can identify patients at risk for CVD complications after OLT surgery (available at www.carolt.us); this score may be useful for identification of candidates for further risk stratification or other management strategies to improve CVD outcomes after OLT. (Hepatology 2017;66:1968-1979). © 2017 by the American Association for the Study of Liver

  11. SPY: a new scission-point model based on microscopic inputs to predict fission fragment properties

    Energy Technology Data Exchange (ETDEWEB)

    Panebianco, Stefano; Lemaître, Jean-Francois; Sida, Jean-Luc [CEA Centre de Saclay, Gif-sur-Ivette (France); Dubray, Noëel [CEA, DAM, DIF, Arpajon (France); Goriely, Stephane [Institut d' Astronomie et d' Astrophisique, Universite Libre de Bruxelles, Brussels (Belgium)

    2014-07-01

    Despite the difficulty in describing the whole fission dynamics, the main fragment characteristics can be determined in a static approach based on a so-called scission-point model. Within this framework, a new Scission-Point model for the calculations of fission fragment Yields (SPY) has been developed. This model, initially based on the approach developed by Wilkins in the late seventies, consists in performing a static energy balance at scission, where the two fragments are supposed to be completely separated so that their macroscopic properties (mass and charge) can be considered as fixed. Given the knowledge of the system state density, averaged quantities such as mass and charge yields, mean kinetic and excitation energy can then be extracted in the framework of a microcanonical statistical description. The main advantage of the SPY model is the introduction of one of the most up-to-date microscopic descriptions of the nucleus for the individual energy of each fragment and, in the future, for their state density. These quantities are obtained in the framework of HFB calculations using the Gogny nucleon-nucleon interaction, ensuring an overall coherence of the model. Starting from a description of the SPY model and its main features, a comparison between the SPY predictions and experimental data will be discussed for some specific cases, from light nuclei around mercury to major actinides. Moreover, extensive predictions over the whole chart of nuclides will be discussed, with particular attention to their implication in stellar nucleosynthesis. Finally, future developments, mainly concerning the introduction of microscopic state densities, will be briefly discussed. (author)

  12. Statistical Analysis of a Method to Predict Drug-Polymer Miscibility

    DEFF Research Database (Denmark)

    Knopp, Matthias Manne; Olesen, Niels Erik; Huang, Yanbin

    2016-01-01

    In this study, a method proposed to predict drug-polymer miscibility from differential scanning calorimetry measurements was subjected to statistical analysis. The method is relatively fast and inexpensive and has gained popularity as a result of the increasing interest in the formulation of drug...... as provided in this study. © 2015 Wiley Periodicals, Inc. and the American Pharmacists Association J Pharm Sci....

  13. Using ANFIS for selection of more relevant parameters to predict dew point temperature

    International Nuclear Information System (INIS)

    Mohammadi, Kasra; Shamshirband, Shahaboddin; Petković, Dalibor; Yee, Por Lip; Mansor, Zulkefli

    2016-01-01

    Highlights: • ANFIS is used to select the most relevant variables for dew point temperature prediction. • Two cities from the central and south central parts of Iran are selected as case studies. • Influence of 5 parameters on dew point temperature is evaluated. • Appropriate selection of input variables has a notable effect on prediction. • Considering the most relevant combination of 2 parameters would be more suitable. - Abstract: In this research work, for the first time, the adaptive neuro fuzzy inference system (ANFIS) is employed to propose an approach for identifying the most significant parameters for prediction of daily dew point temperature (T_d_e_w). The ANFIS process for variable selection is implemented, which includes a number of ways to recognize the parameters offering favorable predictions. According to the physical factors influencing the dew formation, 8 variables of daily minimum, maximum and average air temperatures (T_m_i_n, T_m_a_x and T_a_v_g), relative humidity (R_h), atmospheric pressure (P), water vapor pressure (V_P), sunshine hour (n) and horizontal global solar radiation (H) are considered to investigate their effects on T_d_e_w. The used data include 7 years daily measured data of two Iranian cities located in the central and south central parts of the country. The results indicate that despite climate difference between the considered case studies, for both stations, V_P is the most influential variable while R_h is the least relevant element. Furthermore, the combination of T_m_i_n and V_P is recognized as the most influential set to predict T_d_e_w. The conducted examinations show that there is a remarkable difference between the errors achieved for most and less relevant input parameters, which highlights the importance of appropriate selection of input parameters. The use of more than two inputs may not be advisable and appropriate; thus, considering the most relevant combination of 2 parameters would be more suitable

  14. Statistical MOSFET Parameter Extraction with Parameter Selection for Minimal Point Measurement

    Directory of Open Access Journals (Sweden)

    Marga Alisjahbana

    2013-11-01

    Full Text Available A method to statistically extract MOSFET model parameters from a minimal number of transistor I(V characteristic curve measurements, taken during fabrication process monitoring. It includes a sensitivity analysis of the model, test/measurement point selection, and a parameter extraction experiment on the process data. The actual extraction is based on a linear error model, the sensitivity of the MOSFET model with respect to the parameters, and Newton-Raphson iterations. Simulated results showed good accuracy of parameter extraction and I(V curve fit for parameter deviations of up 20% from nominal values, including for a process shift of 10% from nominal.

  15. Testing statistical hypotheses

    CERN Document Server

    Lehmann, E L

    2005-01-01

    The third edition of Testing Statistical Hypotheses updates and expands upon the classic graduate text, emphasizing optimality theory for hypothesis testing and confidence sets. The principal additions include a rigorous treatment of large sample optimality, together with the requisite tools. In addition, an introduction to the theory of resampling methods such as the bootstrap is developed. The sections on multiple testing and goodness of fit testing are expanded. The text is suitable for Ph.D. students in statistics and includes over 300 new problems out of a total of more than 760. E.L. Lehmann is Professor of Statistics Emeritus at the University of California, Berkeley. He is a member of the National Academy of Sciences and the American Academy of Arts and Sciences, and the recipient of honorary degrees from the University of Leiden, The Netherlands and the University of Chicago. He is the author of Elements of Large-Sample Theory and (with George Casella) he is also the author of Theory of Point Estimat...

  16. A Point Kinetics Model for Estimating Neutron Multiplication of Bare Uranium Metal in Tagged Neutron Measurements

    International Nuclear Information System (INIS)

    Tweardy, Matthew C.; McConchie, Seth; Hayward, Jason P.

    2017-01-01

    An extension of the point kinetics model is developed in this paper to describe the neutron multiplicity response of a bare uranium object under interrogation by an associated particle imaging deuterium-tritium (D-T) measurement system. This extended model is used to estimate the total neutron multiplication of the uranium. Both MCNPX-PoliMi simulations and data from active interrogation measurements of highly enriched and depleted uranium geometries are used to evaluate the potential of this method and to identify the sources of systematic error. The detection efficiency correction for measured coincidence response is identified as a large source of systematic error. If the detection process is not considered, results suggest that the method can estimate total multiplication to within 13% of the simulated value. Values for multiplicity constants in the point kinetics equations are sensitive to enrichment due to (n, xn) interactions by D-T neutrons and can introduce another significant source of systematic bias. This can theoretically be corrected if isotopic composition is known a priori. Finally, the spatial dependence of multiplication is also suspected of introducing further systematic bias for high multiplication uranium objects.

  17. Connecting functional and statistical definitions of genotype by genotype interactions in coevolutionary studies

    Directory of Open Access Journals (Sweden)

    Katy Denise Heath

    2014-04-01

    Full Text Available Predicting how species interactions evolve requires that we understand the mechanistic basis of coevolution, and thus the functional genotype-by-genotype interactions (G × G that drive reciprocal natural selection. Theory on host-parasite coevolution provides testable hypotheses for empiricists, but depends upon models of functional G × G that remain loosely tethered to the molecular details of any particular system. In practice, reciprocal cross-infection studies are often used to partition the variation in infection or fitness in a population that is attributable to G × G (statistical G × G. Here we use simulations to demonstrate that within-population statistical G × G likely tells us little about the existence of coevolution, its strength, or the genetic basis of functional G × G. Combined with studies of multiple populations or points in time, mapping and molecular techniques can bridge the gap between natural variation and mechanistic models of coevolution, while model-based statistics can formally confront coevolutionary models with cross-infection data. Together these approaches provide a robust framework for inferring the infection genetics underlying statistical G × G, helping unravel the genetic basis of coevolution.

  18. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

    Directory of Open Access Journals (Sweden)

    Lei Jia

    Full Text Available Thermostability issue of protein point mutations is a common occurrence in protein engineering. An application which predicts the thermostability of mutants can be helpful for guiding decision making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html is a public database that consists of thousands of protein mutants' experimentally measured thermostability. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG and melting temperature change (dTm were obtained from this database. Folding free energy change calculation from Rosetta, structural information of the point mutations as well as amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. Rosetta calculated folding free energy change ranked as the most influential features in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.

  19. Two-point versus multiple-point geostatistics: the ability of geostatistical methods to capture complex geobodies and their facies associations—an application to a channelized carbonate reservoir, southwest Iran

    International Nuclear Information System (INIS)

    Hashemi, Seyyedhossein; Javaherian, Abdolrahim; Ataee-pour, Majid; Khoshdel, Hossein

    2014-01-01

    Facies models try to explain facies architectures which have a primary control on the subsurface heterogeneities and the fluid flow characteristics of a given reservoir. In the process of facies modeling, geostatistical methods are implemented to integrate different sources of data into a consistent model. The facies models should describe facies interactions; the shape and geometry of the geobodies as they occur in reality. Two distinct categories of geostatistical techniques are two-point and multiple-point (geo) statistics (MPS). In this study, both of the aforementioned categories were applied to generate facies models. A sequential indicator simulation (SIS) and a truncated Gaussian simulation (TGS) represented two-point geostatistical methods, and a single normal equation simulation (SNESIM) selected as an MPS simulation representative. The dataset from an extremely channelized carbonate reservoir located in southwest Iran was applied to these algorithms to analyze their performance in reproducing complex curvilinear geobodies. The SNESIM algorithm needs consistent training images (TI) in which all possible facies architectures that are present in the area are included. The TI model was founded on the data acquired from modern occurrences. These analogies delivered vital information about the possible channel geometries and facies classes that are typically present in those similar environments. The MPS results were conditioned to both soft and hard data. Soft facies probabilities were acquired from a neural network workflow. In this workflow, seismic-derived attributes were implemented as the input data. Furthermore, MPS realizations were conditioned to hard data to guarantee the exact positioning and continuity of the channel bodies. A geobody extraction workflow was implemented to extract the most certain parts of the channel bodies from the seismic data. These extracted parts of the channel bodies were applied to the simulation workflow as hard data

  20. Translating visual information into action predictions: Statistical learning in action and nonaction contexts.

    Science.gov (United States)

    Monroy, Claire D; Gerson, Sarah A; Hunnius, Sabine

    2018-05-01

    Humans are sensitive to the statistical regularities in action sequences carried out by others. In the present eyetracking study, we investigated whether this sensitivity can support the prediction of upcoming actions when observing unfamiliar action sequences. In two between-subjects conditions, we examined whether observers would be more sensitive to statistical regularities in sequences performed by a human agent versus self-propelled 'ghost' events. Secondly, we investigated whether regularities are learned better when they are associated with contingent effects. Both implicit and explicit measures of learning were compared between agent and ghost conditions. Implicit learning was measured via predictive eye movements to upcoming actions or events, and explicit learning was measured via both uninstructed reproduction of the action sequences and verbal reports of the regularities. The findings revealed that participants, regardless of condition, readily learned the regularities and made correct predictive eye movements to upcoming events during online observation. However, different patterns of explicit-learning outcomes emerged following observation: Participants were most likely to re-create the sequence regularities and to verbally report them when they had observed an actor create a contingent effect. These results suggest that the shift from implicit predictions to explicit knowledge of what has been learned is facilitated when observers perceive another agent's actions and when these actions cause effects. These findings are discussed with respect to the potential role of the motor system in modulating how statistical regularities are learned and used to modify behavior.

  1. A comparison of random forest regression and multiple linear regression for prediction in neuroscience.

    Science.gov (United States)

    Smith, Paul F; Ganesh, Siva; Liu, Ping

    2013-10-30

    Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R(2) values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R(2) values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. A Method of Calculating Functional Independence Measure at Discharge from Functional Independence Measure Effectiveness Predicted by Multiple Regression Analysis Has a High Degree of Predictive Accuracy.

    Science.gov (United States)

    Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru

    2017-09-01

    Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy than direct prediction of mFIM at discharge. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
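
    The conversion formula quoted above is simple enough to state directly; a minimal sketch (the admission score and regression-predicted effectiveness below are hypothetical examples):

        def mfim_discharge(mfim_admission: float, predicted_effectiveness: float) -> float:
            """Formula from the paper: mFIM at discharge =
            effectiveness * (91 points - mFIM at admission) + mFIM at admission."""
            return predicted_effectiveness * (91.0 - mfim_admission) + mfim_admission

        # Example: admission score 40, regression-predicted effectiveness 0.6
        print(mfim_discharge(40.0, 0.6))  # -> 70.6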

  3. Distributed video coding with multiple side information

    DEFF Research Database (Denmark)

    Huang, Xin; Brites, C.; Ascenso, J.

    2009-01-01

    Distributed Video Coding (DVC) is a new video coding paradigm which mainly exploits the source statistics at the decoder based on the availability of some decoder side information. The quality of the side information has a major impact on the DVC rate-distortion (RD) performance in the same way...... the quality of the predictions had a major impact in predictive video coding. In this paper, a DVC solution exploiting multiple side information is proposed; the multiple side information is generated by frame interpolation and frame extrapolation targeting to improve the side information of a single...

  4. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    Science.gov (United States)

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  5. Identifying influential data points in hydrological model calibration and their impact on streamflow predictions

    Science.gov (United States)

    Wright, David; Thyer, Mark; Westra, Seth

    2015-04-01

    Highly influential data points are those that have a disproportionately large impact on model performance, parameters and predictions. However, in current hydrological modelling practice the relative influence of individual data points on hydrological model calibration is not commonly evaluated. This presentation illustrates and evaluates several influence diagnostics tools that hydrological modellers can use to assess the relative influence of data. The feasibility and importance of including influence detection diagnostics as a standard tool in hydrological model calibration is discussed. Two classes of influence diagnostics are evaluated: (1) computationally demanding numerical "case deletion" diagnostics; and (2) computationally efficient analytical diagnostics, based on Cook's distance. These diagnostics are compared against hydrologically orientated diagnostics that describe changes in the model parameters (measured through the Mahalanobis distance), performance (objective function displacement) and predictions (mean and maximum streamflow). These influence diagnostics are applied to two case studies: a stage/discharge rating curve model, and a conceptual rainfall-runoff model (GR4J). Removing a single data point from the calibration resulted in differences to mean flow predictions of up to 6% for the rating curve model, and differences to mean and maximum flow predictions of up to 10% and 17%, respectively, for the hydrological model. When using the Nash-Sutcliffe efficiency in calibration, the computationally cheaper Cook's distance metrics produce similar results to the case-deletion metrics at a fraction of the computational cost. However, Cook's distance is adapted from linear regression with inherent assumptions about the data and is therefore less flexible than case deletion. Influential point detection diagnostics show great potential to improve current hydrological modelling practices by identifying highly influential data points. The findings of this
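
    A minimal sketch of the analytical diagnostic in Python with statsmodels (simulated data, not the study's rating-curve or GR4J setup; the 4/n flagging threshold is a common rule of thumb, not the paper's criterion):

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(1)
        x = rng.uniform(0, 10, 50)
        y = 2.0 * x + rng.normal(scale=1.0, size=50)
        y[10] += 8.0                      # plant one influential point

        X = sm.add_constant(x)
        fit = sm.OLS(y, X).fit()
        cooks_d, _ = fit.get_influence().cooks_distance

        # Flag points whose Cook's distance exceeds the common 4/n heuristic
        n = len(y)
        print(np.where(cooks_d > 4.0 / n)[0])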

  6. Statistics for experimentalists

    CERN Document Server

    Cooper, B E

    2014-01-01

    Statistics for Experimentalists aims to provide experimental scientists with a working knowledge of statistical methods and search approaches to the analysis of data. The book first elaborates on probability and continuous probability distributions. Discussions focus on properties of continuous random variables and normal variables, independence of two random variables, central moments of a continuous distribution, prediction from a normal distribution, binomial probabilities, and multiplication of probabilities and independence. The text then examines estimation and tests of significance. Topics include estimators and estimates, expected values, minimum variance linear unbiased estimators, sufficient estimators, methods of maximum likelihood and least squares, and the test of significance method. The manuscript ponders on distribution-free tests, Poisson process and counting problems, correlation and function fitting, balanced incomplete randomized block designs and the analysis of covariance, and experiment...

  7. On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.

    Directory of Open Access Journals (Sweden)

    Paul B Conn

    Full Text Available Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook's notion of an independent variable hull (IVH), developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH) can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models).
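
    A minimal sketch of Cook's original leverage-based IVH check for linear models (the paper's gIVH generalizes this to prediction variance in generalized models; the data below are synthetic):

        import numpy as np

        def outside_ivh(X_train: np.ndarray, X_new: np.ndarray) -> np.ndarray:
            """Flag rows of X_new falling outside the independent variable hull.

            A new point counts as an extrapolation if its leverage exceeds the
            maximum leverage in the training design matrix (Cook's IVH criterion).
            """
            XtX_inv = np.linalg.inv(X_train.T @ X_train)
            lev_train = np.einsum("ij,jk,ik->i", X_train, XtX_inv, X_train)
            lev_new = np.einsum("ij,jk,ik->i", X_new, XtX_inv, X_new)
            return lev_new > lev_train.max()

        rng = np.random.default_rng(0)
        X = np.column_stack([np.ones(30), rng.uniform(0, 1, 30)])  # intercept + covariate
        X_new = np.array([[1.0, 0.5], [1.0, 3.0]])                 # in range vs far outside
        print(outside_ivh(X, X_new))                               # [False  True]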

  8. C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein families

    Directory of Open Access Journals (Sweden)

    Cutler Sean R

    2007-06-01

    Full Text Available Abstract Background The carboxy termini of proteins are a frequent site of activity for a variety of biologically important functions, ranging from post-translational modification to protein targeting. Several short peptide motifs involved in protein sorting roles and dependent upon their proximity to the C-terminus for proper function have already been characterized. As a limited number of such motifs have been identified, the potential exists for genome-wide statistical analysis and comparative genomics to reveal novel peptide signatures functioning in a C-terminal dependent manner. We have applied a novel methodology to the prediction of C-terminal-anchored peptide motifs involving a simple z-statistic and several techniques for improving the signal-to-noise ratio. Results We examined the statistical over-representation of position-specific C-terminal tripeptides in 7 eukaryotic proteomes. Sequence randomization models and simple-sequence masking were applied to the successful reduction of background noise. Similarly, as C-terminal homology among members of large protein families may artificially inflate tripeptide counts in an irrelevant and obfuscating manner, gene-family clustering was performed prior to the analysis in order to assess tripeptide over-representation across protein families as opposed to across all proteins. Finally, comparative genomics was used to identify tripeptides significantly occurring in multiple species. This approach has been able to predict, to our knowledge, all C-terminally anchored targeting motifs present in the literature. These include the PTS1 peroxisomal targeting signal (SKL*), the ER-retention signal (K/HDEL*), the ER-retrieval signal for membrane bound proteins (KKxx*), the prenylation signal (CC*) and the CaaX box prenylation motif. In addition to a high statistical over-representation of these known motifs, a collection of significant tripeptides with a high propensity for biological function exists
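
    A minimal sketch of the simple z-statistic idea under a binomial null for position-specific tripeptide counts (the counts and background frequency below are hypothetical, and the paper's exact statistic and noise-reduction steps are not reproduced):

        import math

        def tripeptide_z(observed: int, n_proteins: int, p_background: float) -> float:
            """z-score for over-representation of a C-terminal tripeptide.

            Under a binomial null, expected count = n*p, variance = n*p*(1-p).
            """
            expected = n_proteins * p_background
            sd = math.sqrt(n_proteins * p_background * (1.0 - p_background))
            return (observed - expected) / sd

        # Example: 'SKL' seen 40 times at the C-terminus of 10,000 proteins,
        # with a hypothetical background frequency of 1/8000 per tripeptide
        print(round(tripeptide_z(40, 10_000, 1.0 / 8000), 1))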

  9. Reproducible statistical analysis with multiple languages

    DEFF Research Database (Denmark)

    Lenth, Russell; Højsgaard, Søren

    2011-01-01

    This paper describes a system for making reproducible statistical analyses. It differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be used in the same document. (2) Documents can be prepared using OpenOffice or ......Office or \LaTeX. The main part of this paper is an example showing how to use them together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....

  10. Methods of fast, multiple-point in vivo T1 determination

    International Nuclear Information System (INIS)

    Zhang, Y.; Spigarelli, M.; Fencil, L.E.; Yeung, H.N.

    1989-01-01

    Two methods of rapid, multiple-point determination of T1 in vivo have been evaluated with a phantom consisting of vials of gel in different Mn++ concentrations. The first method was an inversion-recovery-on-the-fly technique, and the second method used a variable-tip-angle (α) progressive saturation with two subsequences of different repetition times. In the first method, 1/T1 was evaluated by an exponential fit. In the second method, 1/T1 was obtained iteratively with a linear fit and then readjusted together with α to a model equation until self-consistency was reached
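
    A minimal sketch of the exponential fit for inversion-recovery data (synthetic signal; the paper's pulse-sequence details are not reproduced):

        import numpy as np
        from scipy.optimize import curve_fit

        def ir_signal(ti, m0, t1):
            """Inversion-recovery magnetization: M(TI) = M0 * (1 - 2*exp(-TI/T1))."""
            return m0 * (1.0 - 2.0 * np.exp(-ti / t1))

        ti = np.array([50., 100., 200., 400., 800., 1600., 3200.])   # ms
        rng = np.random.default_rng(0)
        sig = ir_signal(ti, 1.0, 600.0) + rng.normal(scale=0.01, size=ti.size)

        (m0_fit, t1_fit), _ = curve_fit(ir_signal, ti, sig, p0=[1.0, 500.0])
        print(f"T1 ~ {t1_fit:.0f} ms")   # recovers ~600 ms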

  11. Rapid point-of-care breath test for biomarkers of breast cancer and abnormal mammograms.

    Directory of Open Access Journals (Sweden)

    Michael Phillips

    Full Text Available BACKGROUND: Previous studies have reported volatile organic compounds (VOCs) in breath as biomarkers of breast cancer and abnormal mammograms, apparently resulting from increased oxidative stress and cytochrome p450 induction. We evaluated a six-minute point-of-care breath test for VOC biomarkers in women screened for breast cancer at centers in the USA and the Netherlands. METHODS: 244 women had a screening mammogram (93/37 normal/abnormal) or a breast biopsy (cancer/no cancer 35/79). A mobile point-of-care system collected and concentrated breath and air VOCs for analysis with gas chromatography and surface acoustic wave detection. Chromatograms were segmented into a time series of alveolar gradients (breath minus room air). Segmental alveolar gradients were ranked as candidate biomarkers by C-statistic value (area under curve [AUC] of the receiver operating characteristic [ROC] curve). Multivariate predictive algorithms were constructed employing significant biomarkers identified with multiple Monte Carlo simulations and cross-validated with a leave-one-out (LOO) procedure. RESULTS: Performance of breath biomarker algorithms was determined in three groups: breast cancer on biopsy versus normal screening mammograms (81.8% sensitivity, 70.0% specificity, accuracy 79% (73% on LOO) [C-statistic value], negative predictive value 99.9%); normal versus abnormal screening mammograms (86.5% sensitivity, 66.7% specificity, accuracy 83%, 62% on LOO); and cancer versus no cancer on breast biopsy (75.8% sensitivity, 74.0% specificity, accuracy 78%, 67% on LOO). CONCLUSIONS: A pilot study of a six-minute point-of-care breath test for volatile biomarkers accurately identified women with breast cancer and with abnormal mammograms. Breath testing could potentially reduce the number of needless mammograms without loss of diagnostic sensitivity.

  12. System for prediction and determination of the subcritical multiplication

    International Nuclear Information System (INIS)

    Martinez, Aquilino S.; Pereira, Valmir; Silva, Fernando C. da

    1997-01-01

    A concept is presented for a system that may be used to calculate and predict the subcritical multiplication of a PWR nuclear power plant. The system is divided into two modules. The first module allows the theoretical prediction of the subcritical multiplication factor through the solution of the multigroup diffusion equation. The second module determines this factor based on data acquired from the neutron detectors of the NPP's external nuclear detection system. (author). 3 refs., 3 figs., 2 tabs
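
    For orientation, a minimal sketch of the standard point relation between the subcritical multiplication factor and k_eff (a textbook relation, not the paper's multigroup diffusion solution):

        def subcritical_multiplication(k_eff: float) -> float:
            """Standard relation M = 1 / (1 - k_eff) for a subcritical core (k_eff < 1)."""
            if not 0.0 <= k_eff < 1.0:
                raise ValueError("k_eff must be in [0, 1) for a subcritical system")
            return 1.0 / (1.0 - k_eff)

        for k in (0.90, 0.95, 0.99):
            print(k, "->", round(subcritical_multiplication(k), 1))  # 10.0, 20.0, 100.0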

  13. Estimation and prediction of convection-diffusion-reaction systems from point measurement

    NARCIS (Netherlands)

    Vries, D.

    2008-01-01

    Different procedures with respect to estimation and prediction of systems characterized by convection, diffusion and reactions on the basis of point measurement data, have been studied. Two applications of these convection-diffusion-reaction (CDR) systems have been used as a case study of the

  14. Statistical prediction of biomethane potentials based on the composition of lignocellulosic biomass

    DEFF Research Database (Denmark)

    Thomsen, Sune Tjalfe; Spliid, Henrik; Østergård, Hanne

    2014-01-01

    Mixture models are introduced as a new and stronger methodology for statistical prediction of biomethane potentials (BMP) from lignocellulosic biomass, compared to the linear regression models previously used. A large dataset from the literature combined with our own data was analysed using canonical

  15. Robust set-point regulation for ecological models with multiple management goals.

    Science.gov (United States)

    Guiver, Chris; Mueller, Markus; Hodgson, Dave; Townley, Stuart

    2016-05-01

    Population managers will often have to deal with problems of meeting multiple goals, for example, keeping at specific levels both the total population and population abundances in given stage-classes of a stratified population. In control engineering, such set-point regulation problems are commonly tackled using multi-input, multi-output proportional and integral (PI) feedback controllers. Building on our recent results for population management with single goals, we develop a PI control approach in a context of multi-objective population management. We show that robust set-point regulation is achieved by using a modified PI controller with saturation and anti-windup elements, both described in the paper, and illustrate the theory with examples. Our results apply more generally to linear control systems with positive state variables, including a class of infinite-dimensional systems, and thus have broader appeal.
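
    A minimal sketch of the multi-goal PI idea on a toy two-stage population (the projection matrix, gains, bounds, and targets are all hypothetical, and the paper's modified controller is not reproduced; only plain saturation with simple back-calculation anti-windup is shown):

        import numpy as np

        A = np.array([[0.6, 0.2],        # hypothetical stage-structured projection matrix
                      [0.1, 0.7]])
        B = np.eye(2)                    # management effort acts on both stages
        target = np.array([50.0, 30.0])  # goals for the two stage abundances
        kp, ki = 0.3, 0.1

        x = np.array([10.0, 5.0])
        integral = np.zeros(2)
        for t in range(100):
            error = target - x
            u = kp * error + ki * integral
            u_sat = np.clip(u, 0.0, 20.0)     # saturation: bounded, non-negative effort
            integral += error + (u_sat - u)   # back-calculation anti-windup
            x = np.maximum(A @ x + B @ u_sat, 0.0)
        print(np.round(x, 1))                 # settles near the target [50. 30.]

    The integral term drives the steady-state error for both goals to zero, which is the set-point regulation property the paper establishes for positive systems.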

  16. Probably not future prediction using probability and statistical inference

    CERN Document Server

    Dworsky, Lawrence N

    2008-01-01

    An engaging, entertaining, and informative introduction to probability and prediction in our everyday lives Although Probably Not deals with probability and statistics, it is not heavily mathematical and is not filled with complex derivations, proofs, and theoretical problem sets. This book unveils the world of statistics through questions such as what is known based upon the information at hand and what can be expected to happen. While learning essential concepts including "the confidence factor" and "random walks," readers will be entertained and intrigued as they move from chapter to chapter. Moreover, the author provides a foundation of basic principles to guide decision making in almost all facets of life including playing games, developing winning business strategies, and managing personal finances. Much of the book is organized around easy-to-follow examples that address common, everyday issues such as: How travel time is affected by congestion, driving speed, and traffic lights Why different gambling ...

  17. Three-dimensional Reconstruction and Homogenization of Heterogeneous Materials Using Statistical Correlation Functions and FEM

    Energy Technology Data Exchange (ETDEWEB)

    Baniassadi, Majid; Mortazavi, Behzad; Hamedani, Amani; Garmestani, Hamid; Ahzi, Said; Fathi-Torbaghan, Madjid; Ruch, David; Khaleel, Mohammad A.

    2012-01-31

    In this study, a previously developed reconstruction methodology is extended to three-dimensional reconstruction of a three-phase microstructure, based on two-point correlation functions and two-point cluster functions. The reconstruction process has been implemented based on a hybrid stochastic methodology for simulating the virtual microstructure. While different phases of the heterogeneous medium are represented by different cells, growth of these cells is controlled by optimizing parameters such as rotation, shrinkage, translation, distribution and growth rates of the cells. Based on the reconstructed microstructure, the finite element method (FEM) was used to compute the effective elastic modulus and effective thermal conductivity. A statistical approach, based on two-point correlation functions, was also used to directly estimate the effective properties of the developed microstructures. Good agreement between the predicted results from FEM analysis and the statistical methods was found, confirming the efficiency of the statistical methods for prediction of thermo-mechanical properties of three-phase composites.
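
    A minimal sketch of estimating a two-point probability function for one phase of a digital microstructure via FFT autocorrelation (synthetic binary image, periodic boundaries assumed; the paper's three-phase cell-growth reconstruction is not reproduced):

        import numpy as np

        rng = np.random.default_rng(0)
        img = (rng.random((128, 128)) < 0.3).astype(float)   # binary phase indicator

        # Two-point probability S2(r): probability that two points separated by r
        # both fall in the phase; estimated by periodic autocorrelation via FFT.
        F = np.fft.fft2(img)
        s2 = np.fft.ifft2(F * np.conj(F)).real / img.size

        print(s2[0, 0])             # equals the phase volume fraction (~0.3)
        print(s2[0, 1], s2[1, 0])   # for this uncorrelated field, near vf**2 ~ 0.09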

  18. The goal of ape pointing.

    Science.gov (United States)

    Halina, Marta; Liebal, Katja; Tomasello, Michael

    2018-01-01

    Captive great apes regularly use pointing gestures in their interactions with humans. However, the precise function of this gesture is unknown. One possibility is that apes use pointing primarily to direct attention (as in "please look at that"); another is that they point mainly as an action request (such as "can you give that to me?"). We investigated these two possibilities here by examining how the looking behavior of recipients affects pointing in chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). Upon pointing to food, subjects were faced with a recipient who either looked at the indicated object (successful-look) or failed to look at the indicated object (failed-look). We predicted that, if apes point primarily to direct attention, subjects would spend more time pointing in the failed-look condition because the goal of their gesture had not been met. Alternatively, we expected that, if apes point primarily to request an object, subjects would not differ in their pointing behavior between the successful-look and failed-look conditions because these conditions differed only in the looking behavior of the recipient. We found that subjects did differ in their pointing behavior across the successful-look and failed-look conditions, but contrary to our prediction subjects spent more time pointing in the successful-look condition. These results suggest that apes are sensitive to the attentional states of gestural recipients, but their adjustments are aimed at multiple goals. We also found a greater number of individuals with a strong right-hand than left-hand preference for pointing.

  19. Modeling fixation locations using spatial point processes.

    Science.gov (United States)

    Barthelmé, Simon; Trukenbrod, Hans; Engbert, Ralf; Wichmann, Felix

    2013-10-01

    Whenever eye movements are measured, a central part of the analysis has to do with where subjects fixate and why they fixated where they fixated. To a first approximation, a set of fixations can be viewed as a set of points in space; this implies that fixations are spatial data and that the analysis of fixation locations can be beneficially thought of as a spatial statistics problem. We argue that thinking of fixation locations as arising from point processes is a very fruitful framework for eye-movement data, helping turn qualitative questions into quantitative ones. We provide a tutorial introduction to some of the main ideas of the field of spatial statistics, focusing especially on spatial Poisson processes. We show how point processes help relate image properties to fixation locations. In particular we show how point processes naturally express the idea that image features' predictability for fixations may vary from one image to another. We review other methods of analysis used in the literature, show how they relate to point process theory, and argue that thinking in terms of point processes substantially extends the range of analyses that can be performed and clarifies their interpretation.
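
    A minimal sketch of the core simulation idea, sampling fixation-like locations from an inhomogeneous spatial Poisson process by thinning (the 'saliency' intensity function below is hypothetical):

        import numpy as np

        def sample_ipp(intensity, lam_max, width=1.0, height=1.0, rng=None):
            """Sample an inhomogeneous Poisson process on [0,width]x[0,height]
            by thinning a homogeneous process of rate lam_max."""
            rng = rng or np.random.default_rng()
            n = rng.poisson(lam_max * width * height)
            pts = rng.random((n, 2)) * [width, height]
            keep = rng.random(n) < intensity(pts[:, 0], pts[:, 1]) / lam_max
            return pts[keep]

        # Hypothetical saliency peaked at the image center
        intensity = lambda x, y: 400.0 * np.exp(-((x - 0.5)**2 + (y - 0.5)**2) / 0.02)
        fixations = sample_ipp(intensity, lam_max=400.0, rng=np.random.default_rng(1))
        print(len(fixations), fixations[:3])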

  20. Testing earthquake prediction algorithms: Statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992-1997

    Science.gov (United States)

    Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.

    1999-01-01

    Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, respectively, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events doubled and all of them became exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier

  1. Etiologies of Acute Undifferentiated Fever and Clinical Prediction of Scrub Typhus in a Non-Tropical Endemic Area

    Science.gov (United States)

    Jung, Ho-Chul; Chon, Sung-Bin; Oh, Won Sup; Lee, Dong-Hyun; Lee, Ho-Jin

    2015-01-01

    Scrub typhus usually presents as an acute undifferentiated fever. This cross-sectional study included adult patients presenting with acute undifferentiated fever, defined as any febrile illness of ≤ 14 days without evidence of localized infection. Scrub typhus cases were defined by a ≥ fourfold rise in antibody titer in paired sera, a titer ≥ 1:160 in a single serum using indirect immunofluorescence assay, or a positive result of the immunochromatographic test. Multiple regression analysis identified predictors associated with scrub typhus to develop a prediction rule. Of 250 cases with known etiology of acute undifferentiated fever, influenza (28.0%), hepatitis A (25.2%), and scrub typhus (16.4%) were the major causes. A prediction rule for identifying suspected cases of scrub typhus consisted of age ≥ 65 years (two points), recent fieldwork/outdoor activities (one point), onset of illness during an outbreak period (two points), myalgia (one point), and eschar (two points). The c statistic was 0.977 (95% confidence interval = 0.960–0.994). At a cutoff value ≥ 4, the sensitivity and specificity were 92.7% (79.0–98.1%) and 90.9% (86.0–94.3%), respectively. Scrub typhus, the third leading cause of acute undifferentiated fever in our region, can be identified early using the prediction rule. PMID:25448236
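
    The published rule is a simple additive score; a minimal sketch with the weights and cutoff exactly as reported above (the example patient is hypothetical):

        def scrub_typhus_score(age_ge_65, fieldwork, outbreak_period, myalgia, eschar):
            """Prediction rule: age >= 65 (2 points), fieldwork/outdoor activity (1),
            onset during an outbreak period (2), myalgia (1), eschar (2)."""
            return (2 * age_ge_65 + 1 * fieldwork + 2 * outbreak_period
                    + 1 * myalgia + 2 * eschar)

        # Suspect scrub typhus when the score is >= 4
        patient = dict(age_ge_65=True, fieldwork=True, outbreak_period=False,
                       myalgia=True, eschar=False)
        score = scrub_typhus_score(**patient)
        print(score, "suspected" if score >= 4 else "not suspected")  # 4 suspected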

  2. Effect of incisor inclination changes on cephalometric points A and B

    International Nuclear Information System (INIS)

    Hassan, S.; Shaikh, A.; Fida, M.

    2015-01-01

    The positions of cephalometric points A and B are liable to be affected by alveolar remodelling caused by orthodontic tooth movement during incisor retraction. This study was conducted to evaluate the change in the positions of cephalometric points A and B in the sagittal and vertical dimensions due to changes in incisor inclination. Methods: A total sample of 31 subjects was recruited into the study. The inclusion criteria were extraction of premolars in the upper and lower arches, completion of growth, and completion of orthodontic treatment. The exclusion criteria were craniofacial anomalies and a history of previous orthodontic treatment. By superimposition of pre- and post-treatment tracings, various linear and angular parameters were measured. Various tests and multiple linear regression analysis were performed to determine changes in the outcome variables. A p-value <0.05 was considered statistically significant. Results: A one-sample t-test showed that only the change in the position of point A was statistically significant: 1.61 mm (p<0.01) in the sagittal direction and 1.49 mm (p<0.01) in the vertical direction. Multiple linear regression analysis showed that if the upper incisor is retroclined by 10°, point A will move superiorly by 0.6 mm. Conclusions: The total change in the position of point A is in a downward and forward direction. Change in upper incisor inclination causes a change in the position of point A only in the vertical direction. (author)

  3. Novel evaluation metrics for sparse spatio-temporal point process hotspot predictions - a crime case study

    OpenAIRE

    Adepeju, M.; Rosser, G.; Cheng, T.

    2016-01-01

    Many physical and sociological processes are represented as discrete events in time and space. These spatio-temporal point processes are often sparse, meaning that they cannot be aggregated and treated with conventional regression models. Models based on the point process framework may be employed instead for prediction purposes. Evaluating the predictive performance of these models poses a unique challenge, as the same sparseness prevents the use of popular measures such as the root mean squ...

  4. Statistical Modeling of Antenna: Urban Equipment Interactions for LTE Access Points

    Directory of Open Access Journals (Sweden)

    Xin Zeng

    2012-01-01

    Full Text Available The latest standards for wireless networks such as LTE are essentially based on small cells in order to achieve a large network capacity. This applies to antennas deployed at street level or even within buildings. However, antennas are commonly designed, simulated, and measured in ideal conditions, which is not the real situation for most applications, where antennas are often deployed in proximity to objects acting as disturbers. In this paper, three conventional wireless access point scenarios (antenna-wall, antenna-shelter, and antenna-lamppost) are investigated for directional or omnidirectional antennas. The paper first addresses the definition of three performance indicators for such scenarios and then uses these parameters for the statistical analysis of the interactions between the wall and the antennas.

  5. Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

    Science.gov (United States)

    Wang, Ming; Long, Qi

    2016-09-01

    Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
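
    For orientation, a minimal sketch of the unweighted concordance (c) statistic for censored data (Harrell's c; the paper's IPCW estimators reweight these comparable pairs by inverse censoring probabilities, which is omitted here, and the data are synthetic):

        import numpy as np

        def harrell_c(time, event, risk):
            """Unweighted c-statistic: fraction of usable pairs in which the
            higher-risk subject fails earlier. event=1 observed, 0 censored."""
            num = den = 0.0
            n = len(time)
            for i in range(n):
                if not event[i]:
                    continue                  # pairs are anchored on observed events
                for j in range(n):
                    if time[j] > time[i]:     # j outlived i -> comparable pair
                        den += 1.0
                        num += 1.0 if risk[i] > risk[j] else 0.5 * (risk[i] == risk[j])
            return num / den

        time = np.array([2., 5., 6., 8., 9.])
        event = np.array([1, 1, 0, 1, 0])
        risk = np.array([0.9, 0.6, 0.7, 0.4, 0.2])   # model-predicted risk scores
        print(harrell_c(time, event, risk))          # 0.875 for this toy data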

  6. Statistical approach to predict compressive strength of high workability slag-cement mortars

    International Nuclear Information System (INIS)

    Memon, N.A.; Memon, N.A.; Sumadi, S.R.

    2009-01-01

    This paper reports an attempt to develop empirical expressions to estimate/predict the compressive strength of high-workability slag-cement mortars. Experimental data from 54 mortar mixes were used. The mortars were prepared with slag as cement replacement of the order of 0, 50 and 60%. The flow (workability) was maintained at 136±3%. The numerical and statistical analysis was performed using the database computer software Microsoft Office Excel 2003. Three empirical mathematical models were developed to estimate/predict the 28-day compressive strength of high-workability slag-cement mortars with 0, 50 and 60% slag, which predict the values with an accuracy between 97 and 98%. Finally, a generalized empirical mathematical model was proposed which can predict the 28-day compressive strength of high-workability mortars with an accuracy of up to 95%. (author)

  7. Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields

    Science.gov (United States)

    Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.

    2018-01-01

    This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
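
    A minimal sketch of the parameterization step: given an analytically predicted mean arrival time and a regression-calibrated log-variance, the lognormal breakthrough curve follows directly (the numeric values below are hypothetical):

        import numpy as np
        from scipy.stats import lognorm

        mean_arrival = 120.0   # days, e.g. from an advective travel-time estimate
        log_var = 0.35         # hypothetical calibrated log-variance of arrival

        # Lognormal with E[T] = exp(mu + sigma^2/2) fixed to the predicted mean
        sigma = np.sqrt(log_var)
        mu = np.log(mean_arrival) - 0.5 * log_var
        btc = lognorm(s=sigma, scale=np.exp(mu))

        t = np.linspace(1.0, 500.0, 5)
        print(btc.pdf(t))                   # breakthrough curve (arrival-time density)
        print(btc.ppf([0.05, 0.5, 0.95]))   # early, median, and late arrival quantiles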

  8. The Meta-Analysis of Clinical Judgment Project: Fifty-Six Years of Accumulated Research on Clinical Versus Statistical Prediction

    Science.gov (United States)

    Aegisdottir, Stefania; White, Michael J.; Spengler, Paul M.; Maugherman, Alan S.; Anderson, Linda A.; Cook, Robert S.; Nichols, Cassandra N.; Lampropoulos, Georgios K.; Walker, Blain S.; Cohen, Genna; Rush, Jeffrey D.

    2006-01-01

    Clinical predictions made by mental health practitioners are compared with those using statistical approaches. Sixty-seven studies were identified from a comprehensive search of 56 years of research; 92 effect sizes were derived from these studies. The overall effect of clinical versus statistical prediction showed a somewhat greater accuracy for…

  9. Statistical model predictions for p+p and Pb+Pb collisions at LHC

    NARCIS (Netherlands)

    Kraus, I.; Cleymans, J.; Oeschler, H.; Redlich, K.; Wheaton, S.

    2009-01-01

    Particle production in p+p and central collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in

  10. Evaluation of clustering statistics with N-body simulations

    International Nuclear Information System (INIS)

    Quinn, T.R.

    1986-01-01

    Two series of N-body simulations are used to determine the effectiveness of various clustering statistics in revealing initial conditions from evolved models. All the simulations contained 16384 particles and were integrated with the PPPM code. One series is a family of models with power at only one wavelength. The family contains five models with the wavelength of the power separated by factors of √2. The second series is a family of all equal power combinations of two wavelengths taken from the first series. The clustering statistics examined are the two-point correlation function, the multiplicity function, the nearest neighbor distribution, the void probability distribution, the distribution of counts in cells, and the peculiar velocity distribution. It is found that the covariance function, the nearest neighbor distribution, and the void probability distribution are relatively insensitive to the initial conditions. The distribution of counts in cells show a little more sensitivity, but the multiplicity function is the best of the statistics considered for revealing the initial conditions

  11. Statistical and extra-statistical considerations in differential item functioning analyses

    Directory of Open Access Journals (Sweden)

    G. K. Huysamen

    2004-10-01

    Full Text Available This article briefly describes the main procedures for performing differential item functioning (DIF) analyses and points out some of the statistical and extra-statistical implications of these methods. Research findings on the sources of DIF, including those associated with translated tests, are reviewed. As DIF analyses are oblivious of correlations between a test and relevant criteria, the elimination of differentially functioning items does not necessarily improve predictive validity or reduce any predictive bias. The implications of the results of past DIF research for test development in the multilingual and multi-cultural South African society are considered.

  12. A prediction method based on wavelet transform and multiple models fusion for chaotic time series

    International Nuclear Information System (INIS)

    Zhongda, Tian; Shujiang, Li; Yanhong, Wang; Yi, Sha

    2017-01-01

    In order to improve the prediction accuracy for chaotic time series, a prediction method based on wavelet transform and multiple-model fusion is proposed. The chaotic time series is decomposed and reconstructed by wavelet transform, and approximation components and detail components are obtained. According to the different characteristics of each component, a least squares support vector machine (LSSVM) is used as the predictive model for the approximation components, with an improved free search algorithm utilized for predictive-model parameter optimization. An autoregressive integrated moving average (ARIMA) model is used as the predictive model for the detail components. The predictions of the multiple models are fused by the Gauss–Markov algorithm; the error variance of the fused predictions is less than that of any single model, and the prediction accuracy is improved. The simulation results are compared for two typical chaotic time series, the Lorenz and the Mackey–Glass series, and show that the proposed method achieves better prediction accuracy.
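
    A minimal sketch of the decompose-model-recombine skeleton with PyWavelets (SVR stands in for LSSVM, which is not in common libraries; a plain sum over component forecasts stands in for the Gauss–Markov fusion; the ARIMA detail models and parameter optimization are omitted, and all settings are hypothetical):

        import numpy as np
        import pywt
        from sklearn.svm import SVR

        # Toy quasi-periodic series standing in for a chaotic signal
        t = np.arange(512)
        x = np.sin(0.08 * t) + 0.3 * np.sin(0.023 * t + 1.0)

        # Decompose, then reconstruct one signal per component by zeroing the others
        coeffs = pywt.wavedec(x, 'db4', level=3)
        components = []
        for k in range(len(coeffs)):
            kept = [c if i == k else np.zeros_like(c) for i, c in enumerate(coeffs)]
            components.append(pywt.waverec(kept, 'db4')[:len(x)])

        # One-step-ahead forecast per component from lagged values
        lags, preds = 8, []
        for comp in components:
            X = np.array([comp[i - lags:i] for i in range(lags, len(comp))])
            y = comp[lags:]
            model = SVR(C=10.0).fit(X, y)
            preds.append(model.predict(comp[-lags:].reshape(1, -1))[0])

        # Components sum to the signal, so component forecasts sum to a signal forecast
        print("fused one-step forecast:", sum(preds))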

  13. Investigating the interactive role of stressful life events, reinforcement sensitivity and personality traits in prediction of the severity of Multiple Sclerosis (MS) symptoms

    Directory of Open Access Journals (Sweden)

    2017-06-01

    Full Text Available Background & Objective: Multiple sclerosis is a chronic neurological condition characterized by demyelination in the central nervous system. The present study was conducted to investigate the interactive role of stressful life events, reinforcement sensitivity, and personality traits in prediction of the severity of Multiple Sclerosis (MS) symptoms. Materials & Methods: This is a correlational study whose statistical population consisted of all the patients suffering from Multiple Sclerosis in Shiraz in the first half of 1394, among whom 162 patients were included in this research by means of a purposive sampling method. The Five-Factor Personality Inventory, Jackson Personality Inventory, Stressful Life Events Scale, and Expanded Disability Status Scale (EDSS) were utilised as research tools. In order to analyze the data, descriptive and inferential methods were used. The data were analysed using Pearson correlation and hierarchical regression. Results: The findings revealed that stressful life events (β = 0.41, p < 0.001), the Behavioral Inhibition System (β = 0.26, p < 0.05), and the neuroticism index (β = 0.92, p < 0.05) were able to significantly predict variance in the scores of the severity of Multiple Sclerosis symptoms. Conclusion: Stressful life events, the Behavioral Inhibition System, and neuroticism showed a significant relationship with the severity of Multiple Sclerosis symptoms; thus, it seems that the interaction of personality traits and environmental conditions is among the influential factors in the severity of Multiple Sclerosis symptoms. This implies that individuals' personal traits play a prominent role in the progression of the disease.

  14. Very high multiplicity hadron processes

    International Nuclear Information System (INIS)

    Mandzhavidze, I.; Sisakyan, A.

    2000-01-01

    The paper contains a description of a first attempt to understand extremely inelastic high-energy hadron collisions, in which the multiplicity of produced hadrons considerably exceeds its mean value. Problems with existing model predictions are discussed. A real-time finite-temperature S-matrix theory is constructed to make model-free predictions possible. This allows one to take statistical effects into consideration and to build the phenomenology. Questions for experiment are formulated at the very end of the paper.

  15. CADASTER QSPR Models for Predictions of Melting and Boiling Points of Perfluorinated Chemicals.

    Science.gov (United States)

    Bhhatarai, Barun; Teetz, Wolfram; Liu, Tao; Öberg, Tomas; Jeliazkova, Nina; Kochev, Nikolay; Pukalov, Ognyan; Tetko, Igor V; Kovarich, Simona; Papa, Ester; Gramatica, Paola

    2011-03-14

    Quantitative structure-property relationship (QSPR) studies of the melting point (MP) and boiling point (BP) of per- and polyfluorinated chemicals (PFCs) are presented. The training and prediction chemicals used for developing and validating the models were selected from the Syracuse PhysProp database and the literature. The available experimental data sets were split in two different ways: a) random selection on response value, and b) structural similarity verified by self-organizing map (SOM), in order to propose reliable predictive models, developed only on the training sets and externally verified on the prediction sets. Individual models based on linear and non-linear approaches, developed by different CADASTER partners using 0D-2D Dragon descriptors, E-state descriptors and fragment-based descriptors, as well as a consensus model and their predictions, are presented. In addition, the predictive performance of the developed models was verified on a blind external validation set (EV-set) prepared using the PERFORCE database, with 15 MP and 25 BP data points, respectively. This database contains only long-chain perfluoro-alkylated chemicals, particularly monitored by regulatory agencies such as US-EPA and EU-REACH. QSPR models with internal and external validation on two different external prediction/validation sets, and a study of the applicability domain highlighting the robustness and high accuracy of the models, are discussed. Finally, MPs for an additional 303 PFCs and BPs for 271 PFCs, for which experimental measurements are unknown, were predicted. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Quantitative Prediction of Coalbed Gas Content Based on Seismic Multiple-Attribute Analyses

    Directory of Open Access Journals (Sweden)

    Renfang Pan

    2015-09-01

    Full Text Available Accurate prediction of gas planar distribution is crucial to the selection and development of new CBM exploration areas. Based on seismic attributes, well logging and testing data, we found that seismic absorption attenuation, after eliminating the effects of burial depth, shows an evident correlation with CBM gas content; positive structure curvature has a negative correlation with gas content; and density has a negative correlation with gas content. It is feasible to use the hydrocarbon index (P*G) and pseudo-Poisson ratio attributes for detection of gas enrichment zones. Based on seismic multiple-attribute analyses, a multiple linear regression equation was established between the seismic attributes and gas content at the drilling wells. Application of this equation to the seismic attributes at locations other than the drilling wells yielded a quantitative prediction of planar gas distribution. Prediction calculations were performed for two different models, one using pre-stack inversion and the other disregarding pre-stack inversion. A comparison of the results indicates that both models predicted a similar trend for gas content distribution, except that the model using pre-stack inversion yielded a prediction result with considerably higher precision than the other model.

  17. Sensitivity of the Hydrogen Epoch of Reionization Array and its build-out stages to one-point statistics from redshifted 21 cm observations

    Science.gov (United States)

    Kittiwisit, Piyanat; Bowman, Judd D.; Jacobs, Daniel C.; Beardsley, Adam P.; Thyagarajan, Nithyanandan

    2018-03-01

    We present a baseline sensitivity analysis of the Hydrogen Epoch of Reionization Array (HERA) and its build-out stages to one-point statistics (variance, skewness, and kurtosis) of redshifted 21 cm intensity fluctuations from the Epoch of Reionization (EoR), based on realistic mock observations. By developing a full-sky 21 cm light-cone model, taking into account the proper field of view and frequency bandwidth, utilizing a realistic measurement scheme, and assuming perfect foreground removal, we show that HERA will be able to recover statistics of the sky model with high sensitivity by averaging over measurements from multiple fields. All build-out stages will be able to detect variance, while skewness and kurtosis should be detectable for HERA128 and larger. We identify sample variance as the limiting constraint of the measurements at the end of reionization. The sensitivity can also be further improved by performing frequency windowing. In addition, we find that strong sample-variance fluctuation in the kurtosis measured from an individual field of observation indicates the presence of outlying cold or hot regions in the underlying fluctuations, a feature that can potentially be used as an EoR bubble indicator.
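
    The one-point statistics themselves are straightforward to compute from an image cube; a minimal sketch on a synthetic brightness-temperature field (the mock field below is hypothetical and has no cosmological meaning):

        import numpy as np
        from scipy.stats import skew, kurtosis

        rng = np.random.default_rng(0)
        cube = rng.gamma(shape=2.0, scale=5.0, size=(64, 64, 32))  # mock 21 cm field (mK)

        # One-point statistics per frequency channel (last axis)
        maps = cube.reshape(-1, cube.shape[-1])
        print("variance:", maps.var(axis=0)[:3])
        print("skewness:", skew(maps, axis=0)[:3])
        print("kurtosis:", kurtosis(maps, axis=0)[:3])   # excess kurtosis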

  18. A density functional theory based approach for predicting melting points of ionic liquids.

    Science.gov (United States)

    Chen, Lihua; Bryantsev, Vyacheslav S

    2017-02-01

    Accurate prediction of melting points of ILs is important both from the fundamental point of view and from the practical perspective of screening ILs with low melting points and broadening their utilization in a wider temperature range. In this work, we present an ab initio approach to calculate melting points of ILs with known crystal structures and illustrate its application for a series of 11 ILs containing imidazolium/pyrrolidinium cations and halide/polyatomic fluoro-containing anions. The melting point is determined as the temperature at which the Gibbs free energy of fusion is zero. The Gibbs free energy of fusion can be expressed through the use of the Born-Fajans-Haber cycle via the lattice free energy of forming a solid IL from gaseous phase ions and the sum of the solvation free energies of the ions comprising the IL. Dispersion-corrected density functional theory (DFT) involving (semi)local (PBE-D3) and hybrid exchange-correlation (HSE06-D3) functionals is applied to estimate the lattice enthalpy, entropy, and free energy. The ions' solvation free energies are calculated with the SMD-generic-IL solvation model at the M06-2X/6-31+G(d) level of theory under standard conditions. The melting points of ILs computed with the HSE06-D3 functional are in good agreement with the experimental data, with a mean absolute error of 30.5 K and a mean relative error of 8.5%. The model is capable of accurately reproducing the trends in melting points upon variation of alkyl substituents in organic cations and replacement of one anion by another. The results verify that the lattice energies of ILs containing polyatomic fluoro-containing anions can be approximated reasonably well using the volume-based thermodynamic approach. However, there is no correlation of the computed lattice energies with molecular volume for ILs containing halide anions. Moreover, entropies of solid ILs follow two different linear relationships with molecular volume for halides and polyatomic fluoro
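
    A minimal sketch of the final step, finding the temperature where the Gibbs free energy of fusion vanishes, assuming temperature-independent ΔH_fus and ΔS_fus (the values below are hypothetical; the paper evaluates these terms from DFT lattice and solvation calculations):

        from scipy.optimize import brentq

        DH_FUS = 21_000.0   # J/mol, hypothetical enthalpy of fusion
        DS_FUS = 60.0       # J/(mol K), hypothetical entropy of fusion

        def gibbs_fusion(T: float) -> float:
            """dG_fus(T) = dH_fus - T * dS_fus, with T-independent dH and dS."""
            return DH_FUS - T * DS_FUS

        # Melting point is the root of dG_fus(T) = 0
        Tm = brentq(gibbs_fusion, 100.0, 1000.0)
        print(f"Tm = {Tm:.0f} K")   # here simply dH/dS = 350 K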

  19. Projecting future precipitation and temperature at sites with diverse climate through multiple statistical downscaling schemes

    Science.gov (United States)

    Vallam, P.; Qin, X. S.

    2017-10-01

    Anthropogenically driven climate change would affect the global ecosystem and is becoming a world-wide concern. Numerous studies have been undertaken to determine the future trends of meteorological variables at different scales. Despite these studies, there remains significant uncertainty in the prediction of future climates. To examine the uncertainty arising from the choice of downscaling scheme for future horizons, projections from different statistical downscaling schemes were compared. These schemes included the statistical downscaling method (SDSM), the change-factor method incorporated with LARS-WG, and the bias-corrected disaggregation (BCD) method. Global circulation models (GCMs) based on CMIP3 (HadCM3) and CMIP5 (CanESM2) were utilized to perturb the changes in the future climate. Five study sites (i.e., Alice Springs, Edmonton, Frankfurt, Miami, and Singapore) with diverse climatic conditions were chosen for examining the spatial variability of applying various statistical downscaling schemes. The study results indicated that regions experiencing heavy precipitation intensities were most likely to demonstrate divergence between the predictions from the various statistical downscaling methods. The variance computed in projecting the weather extremes also indicated the uncertainty derived from the selection of downscaling tools and climate models. This study could help improve understanding of the features of different downscaling approaches and the overall downscaling uncertainty.
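
    A minimal sketch of one of the schemes named above, the change-factor (delta) method: additive factors for temperature, multiplicative factors for precipitation (all numbers below are hypothetical monthly climatologies):

        import numpy as np

        obs_temp = np.array([26.5, 27.1, 27.8])        # observed station climate, degC
        gcm_hist_temp = np.array([25.0, 25.6, 26.0])   # GCM, historical run
        gcm_fut_temp = np.array([26.2, 27.0, 27.5])    # GCM, future scenario

        # Additive change factor for temperature
        future_temp = obs_temp + (gcm_fut_temp - gcm_hist_temp)

        obs_pr = np.array([180.0, 60.0, 210.0])        # observed precipitation, mm
        gcm_hist_pr = np.array([150.0, 50.0, 170.0])
        gcm_fut_pr = np.array([165.0, 40.0, 190.0])

        # Multiplicative change factor for precipitation
        future_pr = obs_pr * gcm_fut_pr / gcm_hist_pr

        print(future_temp, future_pr)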

  20. Energy dependence of multiplicity fluctuations in heavy ion collisions at 20A to 158A GeV

    CERN Document Server

    Alt, C.; Baatar, B.; Barna, D.; Bartke, J.; Betev, L.; Bialkowska, Helena; Blume, Christoph; Boimska, B.; Botje, Michiel; Bracinik, J.; Bramm, R.; Buncic, P.; Cerny, V.; Christakoglou, P.; Chung, P.; Chvala, O.; Cramer, J.G.; Csato, P.; Dinkelaker, P.; Eckardt, V.; Flierl, D.; Fodor, Zoltan; Foka, P.; Friese, Volker; Gal, J.; Gazdzicki, Marek; Genchev, V.; Georgopoulos, G.; Gladysz, E.; Grebieszkow, K.; Hegyi, S.; Hohne, C.; Kadija, K.; Karev, A.; Kikola, D.; Kliemant, M.; Kniege, S.; Kolesnikov, V.I.; Kornas, E.; Korus, R.; Kowalski, M.; Kraus, I.; Kreps, M.; Laszlo, A.; Lacey, Roy A.; van Leeuwen, M.; Levai, P.; Litov, Leandar; Makariev, M.; Malakhov, A.I.; Mateev, M.; Melkumov, G.L.; Mischke, A.; Mitrovski, M.; Molnar, J.; Mrowczynski, St.; Nicolic, V.; Palla, G.; Panagiotou, Apostolos D.; Panayotov, D.; Peryt, W.; Pikna, M.; Pluta, J.; Prindle, D.; Puhlhofer, F.; Renfordt, R.; Roland, C.; Roland, Gunther; Rybczynski, M.; Rybicki, A.; Sandoval, Andres; Schmitz, Norbert; Schuster, T.; Seyboth, P.; Sikler, F.; Sitar, B.; Skrzypczak, E.; Slodkowski, M.; Stefanek, Grzegorz; Stock, R.; Strabel, C.; Strobele, H.; Susa, T.; Szentpetery, I.; Sziklai, J.; Szuba, M.; Szymanski, P.; Trubnikov, V.; Utvic, M.; Varga, D.; Vassiliou, M.; Veres, G.I.; Vesztergombi, G.; Vranic, D.; Wetzler, A.; Wlodarczyk, Z.; Wojtaszek, A.; Yoo, I.K.

    2008-01-01

    Multiplicity fluctuations of positively, negatively and all charged hadrons in the forward hemisphere were studied in central Pb+Pb collisions at 20A, 30A, 40A, 80A and 158A GeV. The multiplicity distributions and their scaled variances $\omega$ are presented as functions of collision energy as well as of rapidity and transverse momentum. The distributions have a bell-like shape and their scaled variances are in the range from 0.8 to 1.2, without any significant structure in their energy dependence. No indication of critical-point fluctuations is observed. The string-hadronic model UrQMD significantly overpredicts the mean, but approximately reproduces the scaled variance of the multiplicity distributions. The predictions of the statistical hadron-resonance gas model obtained within the grand-canonical and canonical ensembles disagree with the measured scaled variances. The narrower than Poissonian multiplicity fluctuations measured in numerous cases may be explained by the impact of conservation laws on f...

  1. Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy

    Science.gov (United States)

    Jia, Yi; Jannink, Jean-Luc

    2012-01-01

    Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes are not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored. PMID:23086217

  2. Comparison of classical statistical methods and artificial neural network in traffic noise prediction

    International Nuclear Information System (INIS)

    Nedic, Vladimir; Despotovic, Danijela; Cvetanovic, Slobodan; Despotovic, Milan; Babic, Sasa

    2014-01-01

    Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level Leq in the given time period. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise, using an originally developed user-friendly software package. It is shown that artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate the equivalent noise level by means of classical methods, and a comparative analysis is given. The results clearly show that the ANN approach is superior to any other statistical method in traffic noise level prediction. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed an originally designed user-friendly software package. • The results are compared with classical statistical methods. • The results show the much better predictive capabilities of the ANN model.
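
    A minimal sketch of such an ANN in Python with scikit-learn (synthetic data with a loosely noise-like dependence; the paper's network architecture and software package are not reproduced):

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n = 400
        flow = rng.uniform(100, 2000, n)     # total traffic flow, vehicles/h
        heavy = rng.uniform(0.0, 0.3, n)     # share of heavy vehicles in the flow
        speed = rng.uniform(20, 90, n)       # average flow speed, km/h
        leq = 40 + 10*np.log10(flow) + 8*heavy + 0.05*speed + rng.normal(0, 1, n)

        X = np.column_stack([flow, heavy, speed])
        X_tr, X_te, y_tr, y_te = train_test_split(X, leq, random_state=0)

        ann = make_pipeline(StandardScaler(),
                            MLPRegressor(hidden_layer_sizes=(16, 8),
                                         max_iter=5000, random_state=0))
        ann.fit(X_tr, y_tr)
        print("held-out R^2:", round(ann.score(X_te, y_te), 2))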

  3. Comparison of classical statistical methods and artificial neural network in traffic noise prediction

    Energy Technology Data Exchange (ETDEWEB)

    Nedic, Vladimir, E-mail: vnedic@kg.ac.rs [Faculty of Philology and Arts, University of Kragujevac, Jovana Cvijića bb, 34000 Kragujevac (Serbia); Despotovic, Danijela, E-mail: ddespotovic@kg.ac.rs [Faculty of Economics, University of Kragujevac, Djure Pucara Starog 3, 34000 Kragujevac (Serbia); Cvetanovic, Slobodan, E-mail: slobodan.cvetanovic@eknfak.ni.ac.rs [Faculty of Economics, University of Niš, Trg kralja Aleksandra Ujedinitelja, 18000 Niš (Serbia); Despotovic, Milan, E-mail: mdespotovic@kg.ac.rs [Faculty of Engineering, University of Kragujevac, Sestre Janjic 6, 34000 Kragujevac (Serbia); Babic, Sasa, E-mail: babicsf@yahoo.com [College of Applied Mechanical Engineering, Trstenik (Serbia)

    2014-11-15

    Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. It is therefore very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which is generally not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. The structure of the traffic flow and the average speed of the traffic flow are chosen as input variables of the neural network. The output variable of the network is the equivalent noise level Leq in the given time period. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise, using an originally developed user-friendly software package. It is shown that artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate the equivalent noise level by means of classical methods, and a comparative analysis is given. The results clearly show that the ANN approach is superior to the other statistical methods in traffic noise level prediction. - Highlights: • We propose an ANN model for prediction of traffic noise. • We developed an originally designed user-friendly software package. • The results are compared with classical statistical methods. • The ANN model shows much better predictive capability than the classical methods.

  4. Predicting the initial freezing point and water activity of meat products from composition data

    NARCIS (Netherlands)

    Sman, van der R.G.M.; Boer, E.P.J.

    2005-01-01

    In this paper we predict the water activity and initial freezing point of food products (meat and fish) based on their composition. The prediction is based on thermodynamics (the Clausius-Clapeyron equation, the Ross equation and an approximation of the Pitzer equation). Furthermore, we have taken
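
    The thermodynamic chain mentioned above can be sketched in a few lines: estimate the water activity of the product from its composition (here via the Ross product rule, with assumed per-solute activities) and convert it to an initial freezing point with the Clausius-Clapeyron relation. The constants are textbook values and the example composition is an illustrative assumption, not data from the paper.

    import math

    R = 8.314        # J/(mol K), gas constant
    DH_FUS = 6010.0  # J/mol, enthalpy of fusion of water
    T0 = 273.15      # K, freezing point of pure water

    def water_activity_ross(aw_components):
        """Ross equation: a_w of the mixture is approximated by the product of
        the water activities each solute would give alone at the same level."""
        aw = 1.0
        for a in aw_components:
            aw *= a
        return aw

    def freezing_point_from_aw(a_w):
        """Initial freezing point (K) from water activity, via the
        Clausius-Clapeyron relation ln(a_w) = (DH_FUS / R) * (1/T0 - 1/Tf)."""
        inv_tf = 1.0 / T0 - R * math.log(a_w) / DH_FUS
        return 1.0 / inv_tf

    # Hypothetical meat product whose salt and sugars depress a_w individually
    a_w = water_activity_ross([0.97, 0.99])
    tf = freezing_point_from_aw(a_w)
    print(f"a_w = {a_w:.3f}, initial freezing point = {tf - 273.15:.2f} degC")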

  5. Prediction of transmission loss through an aircraft sidewall using statistical energy analysis

    Science.gov (United States)

    Ming, Ruisen; Sun, Jincai

    1989-06-01

    The transmission loss of randomly incident sound through an aircraft sidewall is investigated using statistical energy analysis. Formulas are also obtained for the simple calculation of sound transmission loss through single- and double-leaf panels. Both resonant and nonresonant sound transmissions can be easily calculated using the formulas. The formulas are used to predict sound transmission losses through a Y-7 propeller airplane panel. The panel measures 2.56 m x 1.38 m and has two windows. The agreement between predicted and measured values through most of the frequency ranges tested is quite good.
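
    For orientation, the nonresonant (mass-controlled) part of such single-panel formulas is commonly approximated by the field-incidence mass law (a textbook relation, not the paper's exact expression):

    \[
      \mathrm{TL} \;\approx\; 20 \log_{10}(m f) - 47 \ \mathrm{dB},
    \]

    with surface mass density $m$ in kg/m^2 and frequency $f$ in Hz; the statistical energy analysis adds the resonant transmission paths to this nonresonant baseline.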

  6. Predicting fracture of mortar beams under three-point bending using non-extensive statistical modeling of electric emissions

    Science.gov (United States)

    Stergiopoulos, Ch.; Stavrakas, I.; Triantis, D.; Vallianatos, F.; Stonham, J.

    2015-02-01

    Weak electric signals termed 'pressure stimulated currents' (PSCs) are generated and detected while cement-based materials are under mechanical load; they are related to the creation of cracks and the consequent evolution of the crack network in the bulk of the specimen. In the experiment, a set of cement mortar beams of rectangular cross-section was subjected to Three-Point Bending (3PB). For each specimen an abrupt mechanical load step was applied, increasing from a low load level (Lo) to a high final value (Lh), where Lh was different for each specimen and was maintained constant for a long time. The temporal behavior of the recorded PSC shows that a spike-like PSC emission occurs during the load increase and that, after the load reaches its final value, a relaxation of the PSC follows. The relaxation process of the PSC was studied using non-extensive statistical physics (NESP) based on the Tsallis entropy. The behavior of the Tsallis q-parameter in relaxation PSCs was studied in order to investigate its potential use as an index for monitoring the crack evolution process, with a potential use in non-destructive laboratory testing of cement-based specimens of unknown internal damage level. The q value increases as specimens are subjected to gradually higher bending loads Lh, reaching a maximum close to 1.4 when the applied Lh exceeds 0.8Lf, where Lf denotes the 3PB strength of the specimen. When the applied Lh exceeds 0.9Lf, the value of the q-parameter gradually decreases. This analysis of the experimental data shows that the entropic index q decreases characteristically as the specimen approaches its ultimate strength, and thus could be used as a forerunner of the expected failure.
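
    The NESP relaxation analysis referred to above is conventionally based on fitting the decaying current with a q-exponential; a standard form from the Tsallis literature (not quoted from this record) is

    \[
      I(t) \;=\; I_{0}\, e_{q}^{-t/\tau}
           \;=\; I_{0}\left[ 1 - (1 - q)\,\frac{t}{\tau} \right]^{\frac{1}{1-q}},
      \qquad
      \lim_{q \to 1} I(t) \;=\; I_{0}\, e^{-t/\tau},
    \]

    so the fitted entropic index q measures how strongly the relaxation deviates from ordinary exponential decay, which is what lets it track the internal damage level as described above.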

  7. Performance analysis of commercial multiple-input-multiple-output access point in distributed antenna system.

    Science.gov (United States)

    Fan, Yuting; Aighobahi, Anthony E; Gomes, Nathan J; Xu, Kun; Li, Jianqiang

    2015-03-23

    In this paper, we experimentally investigate the throughput of IEEE 802.11n 2x2 multiple-input-multiple-output (MIMO) signals in a radio-over-fiber-based distributed antenna system (DAS) with different fiber lengths and power imbalance. Both a MIMO-supported access point (AP) and a spatial-diversity-supported AP were separately employed in the experiments. Throughput measurements were carried out with wireless users at different locations in a typical office environment. Regarding the effect of different fiber lengths, the results indicate that MIMO signals can maintain high throughput when the fiber length difference between the two remote antenna units (RAUs) is under 100 m, and that throughput falls quickly when the length difference is greater. For the spatial diversity signals, high throughput can be maintained even when the difference is 150 m. On the other hand, the separation of the MIMO antennas allows additional freedom in placing the antennas in strategic locations for overall improved system performance, although it may also lead to received power imbalance problems. The results show that throughput performance drops in specific positions when the received power imbalance is above around 13 dB. Hence, there is a trade-off between the extent of the wireless coverage for moderate bit-rates and the area over which peak bit-rates can be achieved.

  8. Statistical prediction of AVB wear growth and initiation in model F steam generator tubes using Monte Carlo method

    International Nuclear Information System (INIS)

    Lee, Jae Bong; Park, Jae Hak; Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong

    2005-01-01

    The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, a wear growth model is proposed and applied to predict the wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and of the maximum wear depth at EOC are obtained from the analysis. By comparing the predicted EOC wear flaw data with the known EOC data, the usefulness of the proposed method is examined, and satisfactory results are obtained.
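
    A toy Monte Carlo version of the kind of EOC projection described above; the initiation and growth distributions (Poisson counts, lognormal growth rates) and all parameter values are illustrative assumptions, since the paper derives its statistics from ISI NDE data.

    import numpy as np

    rng = np.random.default_rng(42)
    n_sim, cycle_years = 10_000, 1.5

    n_flaws = np.empty(n_sim, dtype=int)
    max_depths = np.empty(n_sim)
    for i in range(n_sim):
        # Existing flaws at beginning of cycle (depths in % through-wall)
        depths = rng.uniform(5, 20, rng.poisson(3))
        # Assumed wear initiation: newly initiated flaws start at zero depth
        depths = np.concatenate([depths, np.zeros(rng.poisson(1.0 * cycle_years))])
        # Assumed wear growth: lognormal rate in %TW per year
        growth = rng.lognormal(mean=1.0, sigma=0.5, size=depths.size)
        depths_eoc = depths + growth * cycle_years
        n_flaws[i] = depths_eoc.size
        max_depths[i] = depths_eoc.max() if depths_eoc.size else 0.0

    print("mean number of flaws at EOC:", n_flaws.mean())
    print("P(max wear depth > 40 %TW):", (max_depths > 40).mean())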

  9. Statistical prediction of AVB wear growth and initiation in model F steam generator tubes using Monte Carlo method

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jae Bong; Park, Jae Hak [Chungbuk National Univ., Cheongju (Korea, Republic of); Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong [Korea Electric Power Research Institute, Daejeon (Korea, Republic of)

    2005-07-01

    The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, a wear growth model is proposed and applied to predict the wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and of the maximum wear depth at EOC are obtained from the analysis. By comparing the predicted EOC wear flaw data with the known EOC data, the usefulness of the proposed method is examined, and satisfactory results are obtained.

  10. Flood statistics of simple and multiple scaling; Invarianza di scala del regime di piena

    Energy Technology Data Exchange (ETDEWEB)

    Rosso, Renzo; Mancini, Marco; Burlando, Paolo; De Michele, Carlo [Milan, Politecnico Univ. (Italy). DIIAR]; Brath, Armando [Bologna, Univ. (Italy). DISTART]

    1996-09-01

    The variability of flood probabilities throughout the river network is investigated by introducing the concepts of simple and multiple scaling. Flood statistics and quantiles as parametrized by drainage area are considered, and a distributed geomorphoclimatic model is used to analyze in detail their scaling properties for two river basins in Tyrrhenian Liguria (North-Western Italy). Although temporal storm precipitation and spatial runoff production are not scaling, the resulting flood flows do not display substantial deviations from statistical self-similarity, or simple scaling. This result has wide potential for assessing the concept of hydrological homogeneity, and it indicates a new route towards establishing physically based procedures for flood frequency regionalization.
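
    In the notation commonly used for this topic (standard definitions, not quoted from the record), simple scaling of the peak flow Q with drainage area A means equality in distribution under rescaling, so that all flood quantiles follow one power law:

    \[
      Q(\lambda A) \overset{d}{=} \lambda^{\theta} Q(A)
      \quad\Longrightarrow\quad
      q_{T}(A) = A^{\theta}\, q_{T}(1),
    \]

    where $q_T(A)$ is the $T$-year flood quantile and the exponent $\theta$ does not depend on $T$; under multiple scaling, by contrast, the effective exponent varies across quantiles.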

  11. Photonic crystals possessing multiple Weyl points and the experimental observation of robust surface states

    Science.gov (United States)

    Chen, Wen-Jie; Xiao, Meng; Chan, C. T.

    2016-01-01

    Weyl points, as monopoles of Berry curvature in momentum space, have captured much attention recently in various branches of physics. Realizing topological materials that exhibit such nodal points is challenging and indeed, Weyl points have been found experimentally in transition metal arsenide and phosphide and gyroid photonic crystal whose structure is complex. If realizing even the simplest type of single Weyl nodes with a topological charge of 1 is difficult, then making a real crystal carrying higher topological charges may seem more challenging. Here we design, and fabricate using planar fabrication technology, a photonic crystal possessing single Weyl points (including type-II nodes) and multiple Weyl points with topological charges of 2 and 3. We characterize this photonic crystal and find nontrivial 2D bulk band gaps for a fixed kz and the associated surface modes. The robustness of these surface states against kz-preserving scattering is experimentally observed for the first time. PMID:27703140

  12. On the performance of metrics to predict quality in point cloud representations

    Science.gov (United States)

    Alexiou, Evangelos; Ebrahimi, Touradj

    2017-09-01

    Point clouds are a promising alternative for immersive representation of visual contents. Recently, an increased interest has been observed in the acquisition, processing and rendering of this modality. Although subjective and objective evaluations are critical in order to assess the visual quality of media content, they still remain open problems for point cloud representation. In this paper we focus our efforts on subjective quality assessment of point cloud geometry, subject to typical types of impairments such as noise corruption and compression-like distortions. In particular, we propose a subjective methodology that is closer to real-life scenarios of point cloud visualization. The performance of the state-of-the-art objective metrics is assessed by considering the subjective scores as the ground truth. Moreover, we investigate the impact of adopting different test methodologies by comparing them. Advantages and drawbacks of every approach are reported, based on statistical analysis. The results and conclusions of this work provide useful insights that could be considered in future experimentation.

  13. Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens.

    Science.gov (United States)

    Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi

    2017-01-01

    High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects, with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data, and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens, but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Analysis of gene expression profiles of soft tissue sarcoma using a combination of knowledge-based filtering with integration of multiple statistics.

    Directory of Open Access Journals (Sweden)

    Anna Takahashi

    Full Text Available The diagnosis and treatment of soft tissue sarcomas (STS) have been difficult. Of the diverse histological subtypes, undifferentiated pleomorphic sarcoma (UPS) is particularly difficult to diagnose accurately, and its classification per se is still controversial. Recent advances in genomic technologies provide an excellent way to address such problems. However, it is often difficult, if not impossible, to identify definitive disease-associated genes using genome-wide analysis alone, primarily because of multiple testing problems. In the present study, we analyzed microarray data from 88 STS patients using a combination method that used knowledge-based filtering and a simulation based on the integration of multiple statistics to reduce multiple testing problems. We identified 25 genes, including hypoxia-related genes (e.g., MIF, SCD1, P4HA1, ENO1, and STAT1) and cell cycle- and DNA repair-related genes (e.g., TACC3, PRDX1, PRKDC, and H2AFY). These genes showed significant differential expression among histological subtypes, including UPS, and showed associations with overall survival. STAT1 showed a strong association with overall survival in UPS patients (log-rank p = 1.84 × 10^(-6) and adjusted p value 2.99 × 10^(-3) after the permutation test). According to the literature, the 25 genes selected are useful not only as markers of differential diagnosis but also as prognostic/predictive markers and/or therapeutic targets for STS. Our combination method can identify genes that are potential prognostic/predictive factors and/or therapeutic targets in STS and possibly in other cancers. These disease-associated genes deserve further preclinical and clinical validation.

  15. Parallel Beam-Beam Simulation Incorporating Multiple Bunches and Multiple Interaction Regions

    CERN Document Server

    Jones, F W; Pieloni, T

    2007-01-01

    The simulation code COMBI has been developed to enable the study of coherent beam-beam effects in the full collision scenario of the LHC, with multiple bunches interacting at multiple crossing points over many turns. The program structure and input are conceived in a general way which allows arbitrary numbers and placements of bunches and interaction points (IP's), together with procedural options for head-on and parasitic collisions (in the strong-strong sense), beam transport, statistics gathering, harmonic analysis, and periodic output of simulation data. The scale of this problem, once we go beyond the simplest case of a pair of bunches interacting once per turn, quickly escalates into the parallel computing arena, and herein we will describe the construction of an MPI-based version of COMBI able to utilize arbitrary numbers of processors to support efficient calculation of multi-bunch multi-IP interactions and transport. Implementing the parallel version did not require extensive disruption of the basic ...

  16. Statistics for Locally Scaled Point Patterns

    DEFF Research Database (Denmark)

    Prokesová, Michaela; Hahn, Ute; Vedel Jensen, Eva B.

    2006-01-01

    scale factor. The main emphasis of the present paper is on analysis of such models. Statistical methods are developed for estimation of scaling function and template parameters as well as for model validation. The proposed methods are assessed by simulation and used in the analysis of a vegetation...

  17. Student nurse selection and predictability of academic success: The Multiple Mini Interview project.

    Science.gov (United States)

    Gale, Julia; Ooms, Ann; Grant, Robert; Paget, Kris; Marks-Maran, Di

    2016-05-01

    With recent reports of public enquiries into failure to care, universities are under pressure to ensure that candidates selected for undergraduate nursing programmes demonstrate academic potential as well as characteristics and values such as compassion, empathy and integrity. The Multiple Mini Interview (MMI) was used in one university as a way of ensuring that candidates had the appropriate numeracy and literacy skills; a range of communication, empathy, decision-making and problem-solving skills; and ethical insight, integrity, initiative and team-work. The aims were to ascertain whether there is evidence of bias in MMIs (gender, age, nationality and location of secondary education) and to determine the extent to which the MMI is predictive of academic success in nursing. A longitudinal retrospective analysis of student demographics, MMI data and the assessment marks for years 1, 2 and 3 was carried out at one university in southwest London, on one cohort of students who commenced their programme in September 2011, including students in all four fields of nursing (adult, child, mental health and learning disability). Inferential statistics and a Bayesian multilevel model were used. The MMI in conjunction with the MMI numeracy test and MMI literacy test shows little or no bias in terms of age, gender, nationality or location of secondary school education. Although the MMI in conjunction with numeracy and literacy testing is predictive of academic success, it is only weakly predictive. The MMI used in conjunction with literacy and numeracy testing appears to be a successful technique for selecting candidates for nursing. However, other selection methods such as psychological profiling or testing of emotional intelligence may add to the extent to which selection methods are predictive of academic success in nursing. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Quantifying natural delta variability using a multiple-point geostatistics prior uncertainty model

    Science.gov (United States)

    Scheidt, Céline; Fernandes, Anjali M.; Paola, Chris; Caers, Jef

    2016-10-01

    We address the question of quantifying uncertainty associated with autogenic pattern variability in a channelized transport system by means of a modern geostatistical method. This question has considerable relevance for practical subsurface applications as well, particularly those related to uncertainty quantification relying on Bayesian approaches. Specifically, we show how the autogenic variability in a laboratory experiment can be represented and reproduced by a multiple-point geostatistical prior uncertainty model. The latter geostatistical method requires selection of a limited set of training images from which a possibly infinite set of geostatistical model realizations, mimicking the training image patterns, can be generated. To that end, we investigate two methods to determine how many training images and what training images should be provided to reproduce natural autogenic variability. The first method relies on distance-based clustering of overhead snapshots of the experiment; the second method relies on a rate of change quantification by means of a computer vision algorithm termed the demon algorithm. We show quantitatively that with either training image selection method, we can statistically reproduce the natural variability of the delta formed in the experiment. In addition, we study the nature of the patterns represented in the set of training images as a representation of the "eigenpatterns" of the natural system. The eigenpatterns in the training image sets display patterns consistent with previous physical interpretations of the fundamental modes of this type of delta system: a highly channelized, incisional mode; a poorly channelized, depositional mode; and an intermediate mode between the two.
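
    A minimal sketch of the first (distance-based clustering) selection strategy described above: flatten each overhead snapshot, cluster, and keep the snapshot nearest each cluster centre as a training image. The array shapes, the number of clusters, and the synthetic snapshots are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def select_training_images(snapshots, k=5):
        """snapshots: (n_snapshots, ny, nx) binary overhead images.
        Returns indices of the k snapshots closest to the cluster centres,
        intended for use as multiple-point statistics training images."""
        X = snapshots.reshape(len(snapshots), -1).astype(float)
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        medoids = []
        for c in range(k):
            members = np.flatnonzero(km.labels_ == c)
            d = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
            medoids.append(int(members[np.argmin(d)]))
        return medoids

    # Synthetic binary snapshots standing in for the experiment overheads
    rng = np.random.default_rng(0)
    snaps = (rng.random((120, 64, 64)) > 0.7).astype(np.uint8)
    print(select_training_images(snaps, k=5))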

  19. Limited Sampling Strategy for Accurate Prediction of Pharmacokinetics of Saroglitazar: A 3-point Linear Regression Model Development and Successful Prediction of Human Exposure.

    Science.gov (United States)

    Joshi, Shuchi N; Srinivas, Nuggehally R; Parmar, Deven V

    2018-03-01

    Our aim was to develop and validate the extrapolative performance of a regression model using a limited sampling strategy for accurate estimation of the area under the plasma concentration versus time curve for saroglitazar. Healthy subject pharmacokinetic data from a well-powered food-effect study (fasted vs fed treatments; n = 50) were used in this work. The first 25 subjects' serial plasma concentration data up to 72 hours and the corresponding AUC0-t (ie, 72 hours) from the fasting group comprised a training dataset to develop the limited sampling model. The internal datasets for prediction included the remaining 25 subjects from the fasting group and all 50 subjects from the fed condition of the same study. The external datasets included pharmacokinetic data for saroglitazar from previous single-dose clinical studies. Limited sampling models were composed of 1-, 2-, and 3-concentration-time points' correlation with AUC0-t of saroglitazar. Only models with regression coefficients (R2) >0.90 were screened for further evaluation. The best R2 model was validated for its utility based on mean prediction error, mean absolute prediction error, and root mean square error. Both correlation between predicted and observed AUC0-t of saroglitazar and verification of precision and bias using a Bland-Altman plot were carried out. None of the evaluated 1- and 2-concentration-time point models achieved R2 > 0.90. Among the various 3-concentration-time point models, only 4 equations passed the predefined criterion of R2 > 0.90. Limited sampling models with time points 0.5, 2, and 8 hours (R2 = 0.9323) and 0.75, 2, and 8 hours (R2 = 0.9375) were validated. Mean prediction error, mean absolute prediction error, and root mean square error were within acceptable limits for prediction of saroglitazar. The same models, when applied to the AUC0-t prediction of saroglitazar sulfoxide, showed comparable mean prediction error, mean absolute prediction error, and root mean square error; the validated 3-point model thus predicts the exposure of saroglitazar and its sulfoxide metabolite.
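
    A sketch of how such a 3-point limited sampling model is fitted and checked, using the validated time points quoted above (0.5, 2, and 8 hours); the concentration data are synthetic placeholders, and the error metrics follow their usual definitions.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 50

    # Synthetic stand-ins: concentrations at 0.5, 2 and 8 h, and observed AUC0-t
    C = rng.lognormal(mean=[1.5, 2.0, 1.0], sigma=0.3, size=(n, 3))
    auc_obs = 2.0 * C[:, 0] + 5.0 * C[:, 1] + 9.0 * C[:, 2] + rng.normal(0, 2, n)

    # Fit AUC0-t = b0 + b1*C(0.5 h) + b2*C(2 h) + b3*C(8 h) by least squares
    X = np.column_stack([np.ones(n), C])
    beta, *_ = np.linalg.lstsq(X, auc_obs, rcond=None)
    auc_pred = X @ beta

    r2 = 1 - np.sum((auc_obs - auc_pred) ** 2) / np.sum((auc_obs - auc_obs.mean()) ** 2)
    mpe = np.mean((auc_pred - auc_obs) / auc_obs) * 100   # mean prediction error, %
    mape = np.mean(np.abs(auc_pred - auc_obs) / auc_obs) * 100
    rmse = np.sqrt(np.mean((auc_pred - auc_obs) ** 2))
    print(f"R2={r2:.4f}  MPE={mpe:.2f}%  MAPE={mape:.2f}%  RMSE={rmse:.2f}")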

  20. Validating Whole-Airway CFD Predictions of DPI Aerosol Deposition at Multiple Flow Rates.

    Science.gov (United States)

    Longest, P Worth; Tian, Geng; Khajeh-Hosseini-Dalasm, Navvab; Hindle, Michael

    2016-12-01

    The objective of this study was to compare aerosol deposition predictions of a new whole-airway CFD model with available in vivo data for a dry powder inhaler (DPI) considered across multiple inhalation waveforms, which affect both the particle size distribution (PSD) and particle deposition. The Novolizer DPI with a budesonide formulation was selected based on the availability of 2D gamma scintigraphy data in humans for three different well-defined inhalation waveforms. Initial in vitro cascade impaction experiments were conducted at multiple constant (square-wave) particle sizing flow rates to characterize PSDs. The whole-airway CFD modeling approach implemented the experimentally determined PSDs at the point of aerosol formation in the inhaler. Complete characteristic airway geometries for an adult were evaluated through the lobar bronchi, followed by stochastic individual pathway (SIP) approximations through the tracheobronchial region and new acinar moving wall models of the alveolar region. It was determined that the PSD used for each inhalation waveform should be based on a constant particle sizing flow rate equal to the average of the inhalation waveform's peak inspiratory flow rate (PIFR) and mean flow rate [i.e., AVG(PIFR, Mean)]. Using this technique, agreement with the in vivo data was acceptable with <15% relative differences averaged across the three regions considered for all inhalation waveforms. Defining a peripheral to central deposition ratio (P/C) based on alveolar and tracheobronchial compartments, respectively, large flow-rate-dependent differences were observed, which were not evident in the original 2D in vivo data. The agreement between the CFD predictions and in vivo data was dependent on accurate initial estimates of the PSD, emphasizing the need for a combination in vitro-in silico approach. Furthermore, use of the AVG(PIFR, Mean) value was identified as a potentially useful method for characterizing a DPI aerosol at a constant flow rate.
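
    The sizing-flow-rate rule identified above amounts to

    \[
      Q_{\text{sizing}} \;=\; \tfrac{1}{2}\,\bigl(Q_{\text{PIFR}} + Q_{\text{mean}}\bigr),
    \]

    i.e., the constant flow rate used for in vitro particle sizing is the average of the inhalation waveform's peak and mean inspiratory flow rates.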

  1. Channel capacity of TDD-OFDM-MIMO for multiple access points in a wireless single-frequency-network

    DEFF Research Database (Denmark)

    Takatori, Y.; Fitzek, Frank; Tsunekawa, K.

    2005-01-01

    The multiple-input-multiple-output (MIMO) technique is the most attractive candidate to improve the spectrum efficiency in the next generation wireless communication systems. However, the efficiency of MIMO techniques reduces in line-of-sight (LOS) environments. In this paper, we propose a new MIMO data transmission scheme, which combines Single-Frequency-Network (SFN) operation with TDD-OFDM-MIMO, applied to wireless LAN networks. In our proposal, we advocate the use of SFN for multiple access point (MAP) MIMO data transmission. The goal of this approach is to achieve very high channel capacity in both...

  2. Foundations of Complex Systems Nonlinear Dynamics, Statistical Physics, and Prediction

    CERN Document Server

    Nicolis, Gregoire

    2007-01-01

    Complexity is emerging as a post-Newtonian paradigm for approaching a large body of phenomena of concern at the crossroads of physical, engineering, environmental, life and human sciences from a unifying point of view. This book outlines the foundations of modern complexity research as it arose from the cross-fertilization of ideas and tools from nonlinear science, statistical physics and numerical simulation. It is shown how these developments lead to an understanding, both qualitative and quantitative, of the complex systems encountered in nature and in everyday experience and, conversely, h

  3. Accuracy of the paracetamol-aminotransferase multiplication product to predict hepatotoxicity in modified-release paracetamol overdose.

    Science.gov (United States)

    Wong, Anselm; Sivilotti, Marco L A; Graudins, Andis

    2017-06-01

    The paracetamol-aminotransferase multiplication product (APAP × ALT) is a risk predictor of hepatotoxicity that is somewhat independent of time and type of ingestion. However, its accuracy following ingestion of modified-release formulations is not known, as the product has been derived and validated after immediate-release paracetamol overdoses. The aim of this retrospective cohort study was to evaluate the accuracy of the multiplication product to predict hepatotoxicity in a cohort of patients with modified-release paracetamol overdose. We assessed all patients with modified-release paracetamol overdose presenting to our hospital network from October 2009 to July 2016. Ingestion of a modified-release formulation was identified by patient self-report or retrieval of the original container. Hepatotoxicity was defined as peak alanine aminotransferase ≥1000 IU/L, and acute liver injury (ALI) as a doubling of baseline ALT to more than 50 IU/L. Of 1989 paracetamol overdose presentations, we identified 73 modified-release paracetamol exposures treated with acetylcysteine. Five patients developed hepatotoxicity, including one who received acetylcysteine within eight hours of an acute ingestion. No patient with an initial multiplication product below the lower cut-point developed hepatotoxicity. In modified-release paracetamol overdose treated with acetylcysteine, the paracetamol-aminotransferase multiplication product demonstrated similar accuracy and temporal profile to previous reports involving mostly immediate-release formulations. Above a cut-point of 10,000 mg/L × IU/L, it was very strongly associated with the development of acute liver injury and hepatotoxicity, especially when calculated more than eight hours post-ingestion. When below 1500 mg/L × IU/L, the likelihood of developing hepatotoxicity was very low. Persistently high serial multiplication product calculations were associated with the greatest risk of hepatotoxicity.
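
    A small helper showing how the multiplication product and the cut-points quoted above would be applied in code; the function and its name are illustrative, while the thresholds (1500 and 10,000 mg/L × IU/L) are the ones reported in the abstract.

    def apap_alt_risk(paracetamol_mg_l, alt_iu_l):
        """Classify hepatotoxicity risk from the paracetamol-aminotransferase
        multiplication product (APAP x ALT), in mg/L x IU/L."""
        product = paracetamol_mg_l * alt_iu_l
        if product < 1_500:
            return f"{product:.0f}: likelihood of hepatotoxicity very low"
        if product >= 10_000:
            return f"{product:.0f}: strongly associated with ALI/hepatotoxicity"
        return f"{product:.0f}: intermediate - continue serial monitoring"

    print(apap_alt_risk(40.0, 320.0))  # 12800 -> falls in the high-risk band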

  4. Increased short-term variability of repolarization predicts d-sotalol-induced torsades de pointes in dogs

    DEFF Research Database (Denmark)

    Thomsen, Morten Bækgaard; Verduyn, S Cora; Stengl, Milan

    2004-01-01

    Identification of patients at risk for drug-induced torsades de pointes arrhythmia (TdP) is difficult. Increased temporal lability of repolarization has been suggested as being valuable to predict proarrhythmia. The predictive value of different repolarization parameters, including beat...

  5. Point spread function due to multiple scattering of light in the atmosphere

    International Nuclear Information System (INIS)

    Pękala, J.; Wilczyński, H.

    2013-01-01

    The atmospheric scattering of light has a significant influence on the results of optical observations of air showers. It causes attenuation of direct light from the shower, but also contributes a delayed signal to the observed light. The scattering of light therefore should be accounted for, both in simulations of air shower detection and in reconstruction of observed events. In this work a Monte Carlo simulation of multiple scattering of light has been used to determine the contribution of the scattered light in observations of a point source of light. Results of the simulations and a parameterization of the angular distribution of the scattered light contribution to the observed signal (the point spread function) are presented. - Author highlights: • Analysis of atmospheric scattering of light from an isotropic point source. • Different geometries and atmospheric conditions were investigated. • A parameterization of scattered light distribution has been developed. • The parameterization allows one to easily account for the light scattering in air. • The results will be useful in analyses of observations of extensive air showers.

  6. Methods for meta-analysis of multiple traits using GWAS summary statistics.

    Science.gov (United States)

    Ray, Debashree; Boehnke, Michael

    2018-03-01

    Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses

  7. Methodology to predict the initiation of multiple transverse fractures from horizontal wellbores

    Energy Technology Data Exchange (ETDEWEB)

    Crosby, D. G.; Yang, Z.; Rahman, S. S. [Univ. of New South Wales (Australia)]

    2001-10-01

    The criterion based on Drucker and Prager which is designed to predict the pressure required to initiate secondary multiple transverse fractures in close proximity to primary fractures is discussed. Results based on this criterion compare favorably with those measured during a series of laboratory-scale hydraulic fracture interaction tests. It is concluded that the multiple fracture criterion and laboratory results demonstrate that transversely fractured horizontal wellbores have a limited capacity to resist the initiation of multiple fractures from adjacent perforations, or intersecting induced and natural fractures. 23 refs., 1 tab., 9 figs.

  8. A statistical model for aggregating judgments by incorporating peer predictions

    OpenAIRE

    McCoy, John; Prelec, Drazen

    2017-01-01

    We propose a probabilistic model to aggregate the answers of respondents answering multiple-choice questions. The model does not assume that everyone has access to the same information, and so does not assume that the consensus answer is correct. Instead, it infers the most probable world state, even if only a minority vote for it. Each respondent is modeled as receiving a signal contingent on the actual world state, and as using this signal to both determine their own answer and predict the ...

  9. Exploration of machine learning techniques in predicting multiple sclerosis disease course

    OpenAIRE

    Zhao, Yijun; Healy, Brian C.; Rotstein, Dalia; Guttmann, Charles R. G.; Bakshi, Rohit; Weiner, Howard L.; Brodley, Carla E.; Chitnis, Tanuja

    2017-01-01

    Objective To explore the value of machine learning methods for predicting multiple sclerosis disease course. Methods 1693 CLIMB study patients were classified as increased EDSS ≥ 1.5 (worsening) or not (non-worsening) at up to five years after baseline visit. Support vector machines (SVM) were used to build the classifier, and compared to logistic regression (LR) using demographic, clinical and MRI data obtained at years one and two to predict EDSS at five years follow-up. Results Baseline data...

  10. Usefulness of optic nerve ultrasound to predict clinical progression in multiple sclerosis.

    Science.gov (United States)

    Pérez Sánchez, S; Eichau Madueño, S; Rus Hidalgo, M; Domínguez Mayoral, A M; Vilches-Arenas, A; Navarro Mascarell, G; Izquierdo, G

    2018-03-21

    Progressive neuronal and axonal loss are considered the main causes of disability in patients with multiple sclerosis (MS). The disease frequently involves the visual system; the accessibility of the system for several functional and structural tests has made it a model for the in vivo study of MS pathogenesis. Orbital ultrasound is a non-invasive technique that enables various structures of the orbit, including the optic nerve, to be evaluated in real time. We conducted an observational, ambispective study of MS patients. Disease progression data were collected. Orbital ultrasound was performed on all patients, with power set according to the 'as low as reasonably achievable' (ALARA) principle. Optical coherence tomography (OCT) data were also collected for those patients who underwent the procedure. Statistical analysis was conducted using SPSS version 22.0. Disease progression was significantly correlated with ultrasound findings (P=.041 for the right eye and P=.037 for the left eye) and with Expanded Disability Status Scale (EDSS) score at the end of the follow-up period (P=.07 for the right eye and P=.043 for the left eye). No statistically significant differences were found with relation to relapses or other clinical variables. Ultrasound measurement of optic nerve diameter constitutes a useful, predictive factor for the evaluation of patients with MS. Smaller diameters are associated with poor clinical progression and greater disability (measured by EDSS). Copyright © 2018 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.

  11. Prediction of hearing outcomes by multiple regression analysis in patients with idiopathic sudden sensorineural hearing loss.

    Science.gov (United States)

    Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki

    2014-12-01

    This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
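
    A sketch of the multiple regression step described above, with synthetic data standing in for the 205-patient cohort; the predictor names follow the abstract, while the coefficients and noise level are assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 205

    # Synthetic explanatory variables named after the abstract's factors
    age = rng.uniform(20, 80, n)
    days_to_tx = rng.uniform(0, 30, n)
    hl_initial = rng.uniform(40, 110, n)
    vertigo = rng.integers(0, 2, n).astype(float)

    # Synthetic post-treatment hearing level HLpost (dB)
    hl_post = (0.2 * age + 0.8 * days_to_tx + 0.5 * hl_initial
               + 8.0 * vertigo + rng.normal(0, 10, n))

    X = np.column_stack([np.ones(n), age, days_to_tx, hl_initial, vertigo])
    beta, *_ = np.linalg.lstsq(X, hl_post, rcond=None)

    new_patient = np.array([1.0, 55.0, 7.0, 70.0, 1.0])
    pred = float(new_patient @ beta)
    print(f"predicted HLpost: {pred:.1f} dB (the study attaches a 40-dB-wide "
          f"prediction interval to such point predictions)")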

  12. Five year prediction of Sea Surface Temperature in the Tropical Atlantic: a comparison of simple statistical methods

    OpenAIRE

    Laepple, Thomas; Jewson, Stephen; Meagher, Jonathan; O'Shay, Adam; Penzer, Jeremy

    2007-01-01

    We are developing schemes that predict future hurricane numbers by first predicting future sea surface temperatures (SSTs), and then apply the observed statistical relationship between SST and hurricane numbers. As part of this overall goal, in this study we compare the historical performance of three simple statistical methods for making five-year SST forecasts. We also present SST forecasts for 2006-2010 using these methods and compare them to forecasts made from two structural time series ...

  13. Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images

    Science.gov (United States)

    Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina

    2012-03-01

    The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in-slope (WIS), wash-out-slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA analysis was done to select features that differ significantly between node positive and node negative women. Using the computed kinetic statistic features a leave-one-out SVM classifier was learned that performs with AUC=0.77 under the ROC curve, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER), (AUCs of 0.61 and 0.57 respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
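
    The per-pixel kinetic features named above can be computed directly from each enhancement curve; a minimal sketch assuming uniformly sampled curves (the definitions follow common DCE-MRI conventions, and the variable names are ours).

    import numpy as np

    def kinetic_features(curve, t):
        """curve: signal enhancement over time for one pixel; t: acquisition
        times. Returns time-to-peak, peak enhancement, wash-in/out slopes."""
        i_peak = int(np.argmax(curve))
        ttp = float(t[i_peak])                       # time-to-peak (TTP)
        pe = float(curve[i_peak])                    # peak enhancement (PE)
        wis = pe / ttp if ttp > 0 else 0.0           # wash-in slope (WIS)
        wos = ((float(curve[-1]) - pe) / (float(t[-1]) - ttp)
               if i_peak < len(curve) - 1 else 0.0)  # wash-out slope (WOS)
        return {"TTP": ttp, "PE": pe, "WIS": wis, "WOS": wos}

    t = np.arange(0.0, 8.0, 0.5)                     # minutes, synthetic example
    curve = np.exp(-((t - 2.5) ** 2) / 4.0)          # toy enhancement curve
    print(kinetic_features(curve, t))

    Pixels would then be partitioned by TTP, and partition-wise statistics of PE, WIS, and WOS collected as the 21 kinetic statistic features described above.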

  14. Multi-target QSPR modeling for simultaneous prediction of multiple gas-phase kinetic rate constants of diverse chemicals

    Science.gov (United States)

    Basant, Nikita; Gupta, Shikha

    2018-03-01

    The reactions of molecular ozone (O3), hydroxyl (•OH) and nitrate (NO3) radicals are among the major pathways of removal of volatile organic compounds (VOCs) in the atmospheric environment. The gas-phase kinetic rate constants (kO3, kOH, kNO3) are thus, important in assessing the ultimate fate and exposure risk of atmospheric VOCs. Experimental data for rate constants are not available for many emerging VOCs and the computational methods reported so far address a single target modeling only. In this study, we have developed a multi-target (mt) QSPR model for simultaneous prediction of multiple kinetic rate constants (kO3, kOH, kNO3) of diverse organic chemicals considering an experimental data set of VOCs for which values of all the three rate constants are available. The mt-QSPR model identified and used five descriptors related to the molecular size, degree of saturation and electron density in a molecule, which were mechanistically interpretable. These descriptors successfully predicted three rate constants simultaneously. The model yielded high correlations (R2 = 0.874-0.924) between the experimental and simultaneously predicted endpoint rate constant (kO3, kOH, kNO3) values in test arrays for all the three systems. The model also passed all the stringent statistical validation tests for external predictivity. The proposed multi-target QSPR model can be successfully used for predicting reactivity of new VOCs simultaneously for their exposure risk assessment.
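
    A schematic of the multi-target regression setting above; scikit-learn's MultiOutputRegressor stands in for the paper's mt-QSPR machinery, and the five descriptors and three rate constants are represented by synthetic columns.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import MultiOutputRegressor

    rng = np.random.default_rng(5)
    n, n_desc = 300, 5

    X = rng.standard_normal((n, n_desc))             # 5 molecular descriptors
    W = rng.standard_normal((n_desc, 3))
    Y = X @ W + 0.3 * rng.standard_normal((n, 3))    # log kO3, log kOH, log kNO3

    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
    model = MultiOutputRegressor(
        RandomForestRegressor(n_estimators=200, random_state=0))
    model.fit(X_tr, Y_tr)
    print("per-target R^2:", [round(est.score(X_te, Y_te[:, i]), 3)
                              for i, est in enumerate(model.estimators_)])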

  15. Distribution of degrees of polymerization in statistically branched polymers with tetrafunctional branch points: model calculations

    Czech Academy of Sciences Publication Activity Database

    Netopilík, Miloš; Kratochvíl, Pavel

    2006-01-01

    Roč. 55, č. 2 (2006), s. 196-203 ISSN 0959-8103 R&D Projects: GA AV ČR IAA100500501; GA AV ČR IAA4050403; GA AV ČR IAA4050409; GA ČR GA203/03/0617 Institutional research plan: CEZ:AV0Z40500505 Keywords : statistical branching * tetrafunctional branch points * molecular-weight distribution Subject RIV: CD - Macromolecular Chemistry Impact factor: 1.475, year: 2006

  16. Solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators.

    Science.gov (United States)

    Zhao, Jing; Zong, Haili

    2018-01-01

    In this paper, we propose parallel and cyclic iterative algorithms for solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators. We also combine the process of cyclic and parallel iterative methods and propose two mixed iterative algorithms. Our several algorithms do not need any prior information about the operator norms. Under mild assumptions, we prove weak convergence of the proposed iterative sequences in Hilbert spaces. As applications, we obtain several iterative algorithms to solve the multiple-set split equality problem.

  17. Statistical inference an integrated approach

    CERN Document Server

    Migon, Helio S; Louzada, Francisco

    2014-01-01

    Contents: Introduction (Information; The concept of probability; Assessing subjective probabilities; An example; Linear algebra and probability; Notation; Outline of the book). Elements of Inference (Common statistical models; Likelihood-based functions; Bayes theorem; Exchangeability; Sufficiency and exponential family; Parameter elimination). Prior Distribution (Entirely subjective specification; Specification through functional forms; Conjugacy with the exponential family; Non-informative priors; Hierarchical priors). Estimation (Introduction to decision theory; Bayesian point estimation; Classical point estimation; Empirical Bayes estimation; Comparison of estimators; Interval estimation; Estimation in the Normal model). Approximating Methods (The general problem of inference; Optimization techniques; Asymptotic theory; Other analytical approximations; Numerical integration methods; Simulation methods). Hypothesis Testing (Introduction; Classical hypothesis testing; Bayesian hypothesis testing; Hypothesis testing and confidence intervals; Asymptotic tests). Prediction...

  18. A Comparison of Combustion Dynamics for Multiple 7-Point Lean Direct Injection Combustor Configurations

    Science.gov (United States)

    Tacina, K. M.; Hicks, Y. R.

    2017-01-01

    The combustion dynamics of multiple 7-point lean direct injection (LDI) combustor configurations are compared. LDI is a fuel-lean combustor concept for aero gas turbine engines in which multiple small fuel-air mixers replace one traditionally-sized fuel-air mixer. This 7-point LDI configuration has a circular cross section, with a center (pilot) fuel-air mixer surrounded by six outer (main) fuel-air mixers. Each fuel-air mixer consists of an axial air swirler followed by a converging-diverging venturi. A simplex fuel injector is inserted through the center of the air swirler, with the fuel injector tip located near the venturi throat. All 7 fuel-air mixers are identical except for the swirler blade angle, which varies with the configuration. Testing was done in a 5-atm flame tube with inlet air temperatures from 600 to 800 F and equivalence ratios from 0.4 to 0.7. Combustion dynamics were measured using a cooled PCB pressure transducer flush-mounted in the wall of the combustor test section.

  19. Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages

    Science.gov (United States)

    Kim, Yoonsang; Emery, Sherry

    2013-01-01

    Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415

  20. Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages.

    Science.gov (United States)

    Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry

    2013-08-01

    Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.

  1. The prediction of parental self-efficacy and hyper-anxiety symptoms based on the components of mindfulness in women with multiple sclerosis

    Directory of Open Access Journals (Sweden)

    Mohammad Mohammadi Pour

    2017-08-01

    Full Text Available Introduction: The present study aimed to predict parental self-efficacy and hyper-anxiety symptoms based on the components of mindfulness in women with multiple sclerosis (MS). Materials and Methods: The statistical population of this descriptive-correlational study included all women with MS in Mashhad during March-June 2016 who presented for treatment to clinics, neurologists and psychological centers. The statistical sample consisted of 105 women with MS who were selected using the convenience sampling method. In order to collect data, the Parental Self-Efficacy Questionnaire, the Beck Anxiety Inventory (BAI) and the Mindfulness Questionnaire were used. Data were analyzed using the multivariate regression method. Results: The results revealed that the mindfulness components of judgment and non-reactivity can reduce anxiety significantly in women with MS. In addition, action with awareness, judgment and non-reactivity can increase parental self-efficacy (P

  2. Statistical learning in social action contexts.

    Science.gov (United States)

    Monroy, Claire; Meyer, Marlene; Gerson, Sarah; Hunnius, Sabine

    2017-01-01

    Sensitivity to the regularities and structure contained within sequential, goal-directed actions is an important building block for generating expectations about the actions we observe. Until now, research on statistical learning for actions has solely focused on individual action sequences, but many actions in daily life involve multiple actors in various interaction contexts. The current study is the first to investigate the role of statistical learning in tracking regularities between actions performed by different actors, and whether the social context characterizing their interaction influences learning. That is, are observers more likely to track regularities across actors if they are perceived as acting jointly as opposed to in parallel? We tested adults and toddlers to explore whether social context guides statistical learning and, if so, whether it does so from early in development. In a between-subjects eye-tracking experiment, participants were primed with a social context cue between two actors who either shared a goal of playing together ('Joint' condition) or stated the intention to act alone ('Parallel' condition). In subsequent videos, the actors performed sequential actions in which, for certain action pairs, the first actor's action reliably predicted the second actor's action. We analyzed predictive eye movements to upcoming actions as a measure of learning, and found that both adults and toddlers learned the statistical regularities across actors when their actions caused an effect. Further, adults with high statistical learning performance were sensitive to social context: those who observed actors with a shared goal were more likely to correctly predict upcoming actions. In contrast, there was no effect of social context in the toddler group, regardless of learning performance. These findings shed light on how adults and toddlers perceive statistical regularities across actors depending on the nature of the observed social situation and the

  3. Towards single embryo transfer? Modelling clinical outcomes of potential treatment choices using multiple data sources: predictive models and patient perspectives.

    Science.gov (United States)

    Roberts, Sa; McGowan, L; Hirst, Wm; Brison, Dr; Vail, A; Lieberman, Ba

    2010-07-01

    In vitro fertilisation (IVF) treatments involve an egg retrieval process, fertilisation and culture of the resultant embryos in the laboratory, and the transfer of embryos back to the mother over one or more transfer cycles. The first transfer is usually of fresh embryos and the remainder may be cryopreserved for future frozen cycles. Most commonly in UK practice two embryos are transferred (double embryo transfer, DET). IVF techniques have led to an increase in the number of multiple births, carrying an increased risk of maternal and infant morbidity. The UK Human Fertilisation and Embryology Authority (HFEA) has adopted a multiple birth minimisation strategy. One way of achieving this would be by increased use of single embryo transfer (SET). To collate cohort data from treatment centres and the HFEA; to develop predictive models for live birth and twinning probabilities from fresh and frozen embryo transfers and predict outcomes from treatment scenarios; to understand patients' perspectives and use the modelling results to investigate the acceptability of twin reduction policies. A multidisciplinary approach was adopted, combining statistical modelling with qualitative exploration of patients' perspectives: interviews were conducted with 27 couples at various stages of IVF treatment at both UK NHS and private clinics; datasets were collated of over 90,000 patients from the HFEA registry and nearly 9000 patients from five clinics, both over the period 2000-5; models were developed to determine live birth and twin outcomes and predict the outcomes of policies for selecting patients for SET or DET in the fresh cycle following egg retrieval and fertilisation, and the predictions were used in simulations of treatments; two focus groups were convened, one NHS and one web based on a patient organisation's website, to present the results of the statistical analyses and explore potential treatment policies. The statistical analysis revealed no characteristics that

  4. Statistical analysis of the influence of wheat black point kernels on selected indicators of wheat flour quality

    Directory of Open Access Journals (Sweden)

    Petrov Verica D.

    2011-01-01

    The influence of wheat black point kernels on selected indicators of wheat flour quality - farinograph and extensograph indicators, amylolytic activity, wet gluten and flour ash content - was examined in this study. The examinations were conducted on samples of wheat harvested in the years 2007 and 2008 from the area of Central Banat, in four treatments: control (without black point flour) and with 2, 4 and 10% of black point flour added as a replacement for a part of the control sample. Statistically significant differences between treatments were observed in dough stability, falling number and extensibility. The samples with 10% of black point flour had the lowest dough stability and the highest amylolytic activity and extensibility. There was a trend of increasing 15-min drop and water absorption with the increased share of black point flour. Extensograph area, resistance and the ratio of resistance to extensibility decreased with the addition of black point flour, though not proportionally. Mahalanobis distance indicates that the addition of 10% black point flour had the greatest influence on the observed quality indicators, confirming that black point affects the technological quality of wheat, i.e. flour.

  5. 77 FR 34211 - Modification of Multiple Compulsory Reporting Points; Continental United States, Alaska and Hawaii

    Science.gov (United States)

    2012-06-11

    ... DEPARTMENT OF TRANSPORTATION Federal Aviation Administration 14 CFR Part 71 [Docket No. FAA-2012-0130; Airspace Docket No. 12-AWA-2] RIN 2120-AA66 Modification of Multiple Compulsory Reporting Points; Continental United States, Alaska and Hawaii AGENCY: Federal Aviation Administration (FAA), DOT. ACTION: Final...

  6. Statistical and Machine-Learning Data Mining Techniques for Better Predictive Modeling and Analysis of Big Data

    CERN Document Server

    Ratner, Bruce

    2011-01-01

    The second edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The first edition, titled Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, contained 17 chapters of innovative and practical statistical data mining techniques. In this second edition, renamed to reflect the increased coverage of machine-learning data mining techniques, the author has

  7. A statistical approach to the prediction of pressure tube fracture toughness

    International Nuclear Information System (INIS)

    Pandey, M.D.; Radford, D.D.

    2008-01-01

    The fracture toughness of the zirconium alloy (Zr-2.5Nb) is an important parameter in determining the flaw tolerance for operation of pressure tubes in a nuclear reactor. Fracture toughness data have been generated by performing rising pressure burst tests on sections of pressure tubes removed from operating reactors. The test data were used to generate a lower-bound fracture toughness curve, which is used in defining the operational limits of pressure tubes. The paper presents a comprehensive statistical analysis of burst test data and develops a multivariate statistical model to relate toughness with material chemistry, mechanical properties, and operational history. The proposed model can be useful in predicting fracture toughness of specific in-service pressure tubes, thereby minimizing conservatism associated with a generic lower-bound approach

  8. Predicting energy performance of a net-zero energy building: A statistical approach

    International Nuclear Information System (INIS)

    Kneifel, Joshua; Webb, David

    2016-01-01

    Highlights: • A regression model is applied to actual energy data from a net-zero energy building. • The model is validated through a rigorous statistical analysis. • Comparisons are made between model predictions and those of a physics-based model. • The model is a viable baseline for evaluating future models from the energy data. - Abstract: Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict the energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid Climate Zone, and compares these
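
    As an illustration of the statistical approach described above, the following is a minimal sketch in Python; the file and column names are hypothetical placeholders, and plain ordinary least squares on two daily weather variables stands in for the NIST model's exact specification.

        # Sketch: daily energy-performance regression on two weather variables.
        # File and column names are hypothetical placeholders.
        import numpy as np
        import pandas as pd

        df = pd.read_csv("nzertf_daily.csv")    # hypothetical daily aggregates
        X = df[["avg_outdoor_temp", "solar_irradiance"]].to_numpy()
        y = df["net_energy_kwh"].to_numpy()

        # Ordinary least squares with an intercept term.
        A = np.column_stack([np.ones(len(X)), X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)

        r2 = 1 - np.var(y - A @ coef) / np.var(y)
        print("coefficients:", coef, "R^2:", round(r2, 3))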

  9. Seismic activity prediction using computational intelligence techniques in northern Pakistan

    Science.gov (United States)

    Asim, Khawaja M.; Awais, Muhammad; Martínez-Álvarez, F.; Iqbal, Talat

    2017-10-01

    An earthquake prediction study is carried out for the region of northern Pakistan. The prediction methodology includes interdisciplinary interaction of seismology and computational intelligence. Eight seismic parameters are computed based upon the past earthquakes. The predictive ability of these eight seismic parameters is evaluated in terms of information gain, which leads to the selection of six parameters to be used in prediction. Multiple computationally intelligent models have been developed for earthquake prediction using the selected seismic parameters. These models include feed-forward neural network, recurrent neural network, random forest, multilayer perceptron, radial basis neural network, and support vector machine. The performance of every prediction model is evaluated and McNemar's statistical test is applied to assess the statistical significance of the computational methodologies. The feed-forward neural network shows statistically significant predictions along with an accuracy of 75% and a positive predictive value of 78% in the context of northern Pakistan.
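
    The pipeline described (information-gain screening, multiple classifiers, McNemar's test) can be sketched as follows; the data are synthetic stand-ins, mutual information approximates the information-gain computation, and only two of the named classifier families are shown.

        # Sketch: feature screening + two classifiers + McNemar's test.
        import numpy as np
        from scipy.stats import chi2
        from sklearn.feature_selection import mutual_info_classif
        from sklearn.neural_network import MLPClassifier
        from sklearn.svm import SVC

        X = np.random.rand(500, 8)                    # 8 seismic parameters (synthetic)
        y = np.random.randint(0, 2, 500)              # earthquake / no earthquake
        keep = np.argsort(mutual_info_classif(X, y))[-6:]   # 6 most informative
        Xtr, Xte, ytr, yte = X[:400, keep], X[400:, keep], y[:400], y[400:]

        pa = MLPClassifier(max_iter=2000).fit(Xtr, ytr).predict(Xte)
        pb = SVC().fit(Xtr, ytr).predict(Xte)

        # McNemar's test on discordant pairs, with continuity correction.
        b = np.sum((pa == yte) & (pb != yte))
        c = np.sum((pa != yte) & (pb == yte))
        stat = (abs(b - c) - 1) ** 2 / (b + c) if (b + c) > 0 else 0.0
        print("McNemar chi2:", stat, "p:", chi2.sf(stat, df=1))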

  10. Modern Statistics for Spatial Point Processes

    DEFF Research Database (Denmark)

    Møller, Jesper; Waagepetersen, Rasmus

    2007-01-01

    We summarize and discuss the current state of spatial point process theory and directions for future research, making an analogy with generalized linear models and random effect models, and illustrating the theory with various examples of applications. In particular, we consider Poisson, Gibbs...

  12. Quantum statistics and liquid helium 3 - helium 4 mixtures

    International Nuclear Information System (INIS)

    Cohen, E.G.D.

    1979-01-01

    The behaviour of liquid helium 3-helium 4 mixtures is considered from the point of view of the manifestation of quantum statistics effects in macrophysics. The Bose-Einstein statistics is shown to be of great importance for understanding superfluid helium-4 properties, whereas the Fermi-Dirac statistics is of importance for understanding helium-3 properties. Without taking into consideration the interaction between the helium atoms it is impossible to understand the basic properties of liquid helium 3-helium 4 mixtures at constant pressure. A simple model of the liquid helium 3-helium 4 mixture is proposed, namely a binary mixture consisting of solid spheres of two types obeying the Fermi-Dirac and Bose-Einstein statistics respectively. This model correctly predicts the most surprising peculiarities of the phase diagrams of concentration dependence on temperature for helium solutions. In particular, the helium 4 Bose-Einstein statistics is responsible for the phase separation of helium solutions at low temperatures, which starts at a peculiar critical point. The helium 3 Fermi-Dirac statistics results in incomplete phase separation close to absolute zero temperature, which permits the operation of a powerful cooling facility, namely the dilution refrigerator operating on helium solutions.

  13. Development of a prognostic model for predicting spontaneous singleton preterm birth.

    Science.gov (United States)

    Schaaf, Jelle M; Ravelli, Anita C J; Mol, Ben Willem J; Abu-Hanna, Ameen

    2012-10-01

    To develop and validate a prognostic model for prediction of spontaneous preterm birth. Prospective cohort study using data of the nationwide perinatal registry in The Netherlands. We studied 1,524,058 singleton pregnancies between 1999 and 2007. We developed a multiple logistic regression model to estimate the risk of spontaneous preterm birth based on maternal and pregnancy characteristics. We used bootstrapping techniques to internally validate our model. Discrimination (AUC), accuracy (Brier score) and calibration (calibration graphs and Hosmer-Lemeshow C-statistic) were used to assess the model's predictive performance. Our primary outcome measure was spontaneous preterm birth. The model included 13 variables for predicting preterm birth. The predicted probabilities ranged from 0.01 to 0.71 (IQR 0.02-0.04). The model had an area under the receiver operator characteristic curve (AUC) of 0.63 (95% CI 0.63-0.63), the Brier score was 0.04 (95% CI 0.04-0.04) and the Hosmer-Lemeshow C-statistic was significant across values of predicted probability. The positive predictive value was 26% (95% CI 20-33%) for the 0.4 probability cut-off point. The model's discrimination was fair and it had modest calibration. Previous preterm birth, drug abuse and vaginal bleeding in the first half of pregnancy were the most important predictors of spontaneous preterm birth. Although not yet applicable in clinical practice, this model is a next step towards early prediction of spontaneous preterm birth that would enable caregivers to start preventive therapy in women at higher risk. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
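
    The validation recipe used here (a logistic model, bootstrap internal validation, AUC and Brier score) can be sketched on synthetic data as follows; this illustrates the general technique, not the authors' registry analysis.

        # Sketch: logistic model with bootstrap optimism correction of the AUC.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score, brier_score_loss

        rng = np.random.default_rng(0)
        X = rng.normal(size=(5000, 13))            # 13 predictors, as in the abstract
        y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 2))))

        model = LogisticRegression(max_iter=1000).fit(X, y)
        p = model.predict_proba(X)[:, 1]
        apparent_auc = roc_auc_score(y, p)

        optimism = []
        for _ in range(200):                       # bootstrap resamples
            i = rng.integers(0, len(y), len(y))
            m = LogisticRegression(max_iter=1000).fit(X[i], y[i])
            optimism.append(roc_auc_score(y[i], m.predict_proba(X[i])[:, 1])
                            - roc_auc_score(y, m.predict_proba(X)[:, 1]))

        print("optimism-corrected AUC:", apparent_auc - np.mean(optimism))
        print("Brier score:", brier_score_loss(y, p))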

  14. A system for learning statistical motion patterns.

    Science.gov (United States)

    Hu, Weiming; Xiao, Xuejuan; Fu, Zhouyu; Xie, Dan; Tan, Tieniu; Maybank, Steve

    2006-09-01

    Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy K-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction.

  15. Predicting seizures in untreated temporal lobe epilepsy using point-process nonlinear models of heartbeat dynamics.

    Science.gov (United States)

    Valenza, G; Romigi, A; Citi, L; Placidi, F; Izzi, F; Albanese, M; Scilingo, E P; Marciani, M G; Duggento, A; Guerrisi, M; Toschi, N; Barbieri, R

    2016-08-01

    Symptoms of temporal lobe epilepsy (TLE) are frequently associated with autonomic dysregulation, whose underlying biological processes are thought to strongly contribute to sudden unexpected death in epilepsy (SUDEP). While abnormal cardiovascular patterns commonly occur during ictal events, putative patterns of autonomic cardiac effects during pre-ictal (PRE) periods (i.e. periods preceding seizures) are still unknown. In this study, we investigated TLE-related heart rate variability (HRV) through instantaneous, nonlinear estimates of cardiovascular oscillations during inter-ictal (INT) and PRE periods. ECG recordings from 12 patients with TLE were processed to extract standard HRV indices, as well as indices of instantaneous HRV complexity (dominant Lyapunov exponent and entropy) and higher-order statistics (bispectra) obtained through definition of inhomogeneous point-process nonlinear models, employing Volterra-Laguerre expansions of linear, quadratic, and cubic kernels. Experimental results demonstrate that the best INT vs. PRE classification performance (balanced accuracy: 73.91%) was achieved only when retaining the time-varying, nonlinear, and non-stationary structure of heartbeat dynamical features. The proposed approach opens novel important avenues in predicting ictal events using information gathered from cardiovascular signals exclusively.

  16. SU-F-BRB-10: A Statistical Voxel Based Normal Organ Dose Prediction Model for Coplanar and Non-Coplanar Prostate Radiotherapy

    Energy Technology Data Exchange (ETDEWEB)

    Tran, A; Yu, V; Nguyen, D; Woods, K; Low, D; Sheng, K [UCLA, Los Angeles, CA (United States)

    2015-06-15

    Purpose: Knowledge learned from previous plans can be used to guide future treatment planning. Existing knowledge-based treatment planning methods study the correlation between organ geometry and the dose volume histogram (DVH), which is a lossy representation of the complete dose distribution. A statistical voxel dose learning (SVDL) model was developed that includes the complete dose volume information. Its accuracy in predicting volumetric-modulated arc therapy (VMAT) and non-coplanar 4π radiotherapy was quantified. SVDL provided more isotropic dose gradients and may improve knowledge-based planning. Methods: 12 prostate SBRT patients originally treated using two full-arc VMAT techniques were re-planned with 4π using 20 intensity-modulated non-coplanar fields to a prescription dose of 40 Gy. The bladder and rectum voxels were binned based on their distances to the PTV. The dose distribution in each bin was resampled by convolving with a Gaussian kernel, resulting in 1000 data points in each bin that predicted the statistical dose information of a voxel with unknown dose in a new patient, without triaging information that may be collectively important to a particular patient. We used this method to predict the DVHs, mean and max doses in a leave-one-out cross validation (LOOCV) test and compared its performance against lossy estimators including the mean, median, mode, Poisson and Rayleigh estimates of the voxelized dose distributions. Results: SVDL predicted the bladder and rectum doses more accurately than the other estimators, giving mean percentile errors ranging from 13.35–19.46%, 4.81–19.47%, 22.49–28.69%, 23.35–30.5%, 21.05–53.93% for predicting mean dose, max dose, V20, V35, and V40, respectively, to OARs in both planning techniques. The prediction errors were generally lower for 4π than VMAT. Conclusion: By employing all dose volume information in the SVDL model, the OAR doses were more accurately predicted. 4π plans are better suited for knowledge-based planning than

  17. SAAFEC: Predicting the Effect of Single Point Mutations on Protein Folding Free Energy Using a Knowledge-Modified MM/PBSA Approach.

    Science.gov (United States)

    Getov, Ivan; Petukh, Marharyta; Alexov, Emil

    2016-04-07

    Folding free energy is an important biophysical characteristic of proteins that reflects the overall stability of the 3D structure of macromolecules. Changes in the amino acid sequence, naturally occurring or made in vitro, may affect the stability of the corresponding protein and thus could be associated with disease. Several approaches that predict the changes of the folding free energy caused by mutations have been proposed, but there is no method that is clearly superior to the others. The optimal goal is not only to accurately predict the folding free energy changes, but also to characterize the structural changes induced by mutations and the physical nature of the predicted folding free energy changes. Here we report a new method to predict the Single Amino Acid Folding free Energy Changes (SAAFEC) based on a knowledge-modified Molecular Mechanics Poisson-Boltzmann (MM/PBSA) approach. The method is comprised of two main components: a MM/PBSA component and a set of knowledge-based terms derived from a statistical study of the biophysical characteristics of proteins. The predictor utilizes a multiple linear regression model with weighted coefficients of various terms optimized against a set of experimental data. The approach yields a correlation coefficient of 0.65 when benchmarked against 983 cases from 42 proteins in the ProTherm database. The webserver can be accessed via http://compbio.clemson.edu/SAAFEC/.

  18. Foreign exchange market data analysis reveals statistical features that predict price movement acceleration.

    Science.gov (United States)

    Nacher, Jose C; Ochiai, Tomoshiro

    2012-05-01

    Increasingly accessible financial data allow researchers to infer market-dynamics-based laws and to propose models that are able to reproduce them. In recent years, several stylized facts have been uncovered. Here we perform an extensive analysis of foreign exchange data that leads to the unveiling of a statistical financial law. First, our findings show that, on average, volatility increases more when the price exceeds the highest (or lowest) value, i.e., breaks the resistance line. We call this the breaking-acceleration effect. Second, our results show that the probability P(T) of breaking the resistance line within the past time T follows a power law in both real data and theoretically simulated data. However, the probability calculated using real data is rather lower than the one obtained using a traditional Black-Scholes (BS) model. Taken together, the present analysis characterizes a different stylized fact of financial markets and shows that the market exceeds a past (historical) extreme price fewer times than expected by the BS model (the resistance effect). However, when the market does, we predict that the average volatility at that time point will be much higher. These findings indicate that no Markovian model faithfully captures the market dynamics.

  19. Predicting the Diagnostic and Statistical Manual of Mental Disorders (Fifth Edition): The Mystery of How to Constrain Unchecked Growth.

    Science.gov (United States)

    Blashfield, Roger K; Fuller, A Kenneth

    2016-06-01

    Twenty years ago, slightly after the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition was published, we predicted the characteristics of the future Diagnostic and Statistical Manual of Mental Disorders (fifth edition). Included in our predictions were how many diagnoses it would contain, the physical size of the Diagnostic and Statistical Manual of Mental Disorders (fifth edition), who its leader would be, how many professionals would be involved in creating it, the revenue generated, and the color of its cover. This article reports on the accuracy of our predictions. Our largest prediction error concerned financial revenue. The earnings growth of the DSMs has been remarkable. Drug company investments, insurance benefits, the financial need of the American Psychiatric Association, and the research grant process are factors that have stimulated the growth of the DSMs. Restoring order and simplicity to the classification of mental disorders will not be a trivial task.

  20. Comparison of statistical and clinical predictions of functional outcome after ischemic stroke.

    Directory of Open Access Journals (Sweden)

    Douglas D Thompson

    To determine whether the predictions of functional outcome after ischemic stroke made at the bedside using a doctor's clinical experience were more or less accurate than the predictions made by clinical prediction models (CPMs). A prospective cohort study of nine hundred and thirty one ischemic stroke patients recruited consecutively at the outpatient, inpatient and emergency departments of the Western General Hospital, Edinburgh between 2002 and 2005. Doctors made informal predictions of six month functional outcome on the Oxford Handicap Scale (OHS). Patients were followed up at six months with a validated postal questionnaire. For each patient we calculated the absolute predicted risk of death or dependence (OHS≥3) using five previously described CPMs. The specificity of a doctor's informal predictions of OHS≥3 at six months was good, 0.96 (95% CI: 0.94 to 0.97), and similar to CPMs (range 0.94 to 0.96); however, the sensitivity of both informal clinical predictions, 0.44 (95% CI: 0.39 to 0.49), and clinical prediction models (range 0.38 to 0.45) was poor. The prediction of the level of disability after stroke was similar for informal clinical predictions (ordinal c-statistic 0.74 with 95% CI 0.72 to 0.76) and CPMs (range 0.69 to 0.75). No patient or clinician characteristic affected the accuracy of informal predictions, though predictions were more accurate in outpatients. CPMs are at least as good as informal clinical predictions in discriminating between good and bad functional outcome after ischemic stroke. The place of these models in clinical practice has yet to be determined.

  1. The Research of Multiple Attenuation Based on Feedback Iteration and Independent Component Analysis

    Science.gov (United States)

    Xu, X.; Tong, S.; Wang, L.

    2017-12-01

    Multiple suppression is a difficult problem in seismic data processing. The traditional technology for multiple attenuation is based on the principle of minimum output energy of the seismic signal; this criterion is based on second-order statistics, and it cannot achieve multiple attenuation when the primaries and multiples are non-orthogonal. In order to solve the above problems, we combine the feedback iteration method based on the wave equation and an improved independent component analysis (ICA) based on higher-order statistics to suppress the multiples. We first use the iterative feedback method to predict the free-surface multiples of each order. Then, in order to match the predicted multiples to the real multiples in amplitude and phase, we design an expanded pseudo multi-channel matching filtering method to get a more accurate matching result. Finally, we present an improved fast ICA algorithm, based on the maximum non-Gaussianity criterion of the output signal, which is applied to the matched multiples and gives better separation of the primaries and the multiples. The advantage of our method is that we do not need any prior information for the prediction of the multiples, and we obtain a better separation result. The method has been applied to several synthetic datasets generated by the finite-difference modelling technique and to the Sigsbee2B model multiple data; the primaries and multiples are non-orthogonal in these models. The experiments show that after three to four iterations we obtain very good multiple predictions. Using our matching method and fast ICA adaptive multiple subtraction, we can not only effectively preserve the energy of the effective waves (primaries) in seismic records, but also effectively suppress the free-surface multiples, especially the multiples related to the middle and deep areas.
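
    The final separation step can be illustrated with scikit-learn's FastICA on synthetic traces; the signals below are invented stand-ins for a recorded trace and a matched multiple estimate, not real seismic data.

        # Sketch: ICA-based adaptive subtraction on synthetic traces.
        import numpy as np
        from sklearn.decomposition import FastICA

        t = np.linspace(0, 1, 2000)
        primary = np.sin(40 * np.pi * t) * np.exp(-3 * t)       # stand-in primary
        multiple = 0.6 * np.sin(40 * np.pi * (t - 0.2)) ** 3    # stand-in multiple
        record = primary + multiple                             # "recorded" trace
        matched = multiple + 0.05 * np.random.randn(len(t))     # matched-multiple estimate

        # Treat the record and the matched multiple as two mixtures and unmix
        # them by maximizing the non-Gaussianity of the outputs; one recovered
        # source approximates the primary (up to ICA's scale/sign ambiguity).
        sources = FastICA(n_components=2, random_state=0).fit_transform(
            np.column_stack([record, matched]))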

  2. A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

    Science.gov (United States)

    Manning, Robert M.

    1990-01-01

    A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics database is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0.5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.

  3. Statistical Energy Analysis (SEA) and Energy Finite Element Analysis (EFEA) Predictions for a Floor-Equipped Composite Cylinder

    Science.gov (United States)

    Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.

    2011-01-01

    Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.

  4. Using the Direct Sampling Multiple-Point Geostatistical Method for Filling Gaps in Landsat 7 ETM+ SLC-off Imagery

    KAUST Repository

    Yin, Gaohong

    2016-05-01

    Since the failure of the Scan Line Corrector (SLC) instrument on Landsat 7, observable gaps occur in the acquired Landsat 7 imagery, impacting the spatial continuity of the observed imagery. Due to the high geometric and radiometric accuracy provided by Landsat 7, a number of approaches have been proposed to fill the gaps. However, all proposed approaches have evident constraints for universal application. The main issues in gap-filling are an inability to describe continuity features such as meandering streams or roads, and difficulty maintaining the shape of small objects when filling gaps in heterogeneous areas. The aim of the study is to validate the feasibility of using the Direct Sampling multiple-point geostatistical method, which has been shown to reconstruct complicated geological structures satisfactorily, to fill Landsat 7 gaps. The Direct Sampling method uses a conditional stochastic resampling of known locations within a target image to fill gaps and can generate multiple reconstructions for one simulation case. The method was examined across a range of land cover types including deserts, sparse rural areas, dense farmlands, urban areas, braided rivers and coastal areas to demonstrate its capacity to recover gaps accurately for various land cover types. The prediction accuracy of the Direct Sampling method was also compared with other gap-filling approaches, which have previously been demonstrated to offer satisfactory results, under both homogeneous-area and heterogeneous-area situations. The study shows that the Direct Sampling method provides sufficiently accurate prediction results for a variety of land cover types, from homogeneous areas to heterogeneous land cover types. Likewise, it exhibits superior performance when used to fill gaps in heterogeneous land cover types without an input image, or with an input image that is temporally far from the target image, in comparison with other gap-filling approaches.
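
    A toy version of the Direct Sampling idea, reduced to its essentials (visit gap pixels along a random path, randomly scan the known pixels, and paste the value whose neighborhood best matches), might look as follows; real implementations use richer neighborhoods, multivariate patterns and acceptance thresholds.

        # Sketch: toy Direct Sampling gap fill for a 2-D array with NaN gaps.
        import numpy as np

        def ds_fill(img, half=2, n_tries=200, seed=0):
            rng = np.random.default_rng(seed)
            out = img.copy()
            gap_r, gap_c = np.where(np.isnan(out))
            known_r, known_c = np.where(~np.isnan(img))
            for k in rng.permutation(len(gap_r)):      # random simulation path
                r, c = gap_r[k], gap_c[k]
                best_val, best_d = np.nan, np.inf
                for _ in range(n_tries):               # random scan of known pixels
                    j = rng.integers(len(known_r))
                    rr, cc = known_r[j], known_c[j]
                    a = out[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]
                    b = img[max(rr - half, 0):rr + half + 1, max(cc - half, 0):cc + half + 1]
                    n = min(a.size, b.size)
                    diff = a.ravel()[:n] - b.ravel()[:n]
                    if np.all(np.isnan(diff)):
                        continue
                    d = np.nanmean(diff ** 2)          # neighborhood mismatch
                    if d < best_d:
                        best_val, best_d = img[rr, cc], d
                out[r, c] = best_val                   # may stay NaN if no candidate fit
            return out

    No statistics are stored in advance: as in the Direct Sampling method proper, patterns are sampled directly from the known part of the image, which is why repeated runs produce multiple plausible reconstructions.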

  5. Predicting tube repair at French nuclear steam generators using statistical modeling

    Energy Technology Data Exchange (ETDEWEB)

    Mathon, C., E-mail: cedric.mathon@edf.fr [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Chaudhary, A. [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Gay, N.; Pitner, P. [EDF Generation, Nuclear Operation Division (UNIE), Saint-Denis (France)

    2014-04-01

    Electricité de France (EDF) currently operates a total of 58 Nuclear Pressurized Water Reactors (PWR) which are composed of 34 units of 900 MWe, 20 units of 1300 MWe and 4 units of 1450 MWe. This report provides an overall status of SG tube bundles on the 1300 MWe units. These units are 4 loop reactors using the AREVA 68/19 type SG model which are equipped either with Alloy 600 thermally treated (TT) tubes or Alloy 690 TT tubes. As of 2011, the effective full power years of operation (EFPY) ranges from 13 to 20 and during this time, the main degradation mechanisms observed on SG tubes are primary water stress corrosion cracking (PWSCC) and wear at anti-vibration bars (AVB) level. Statistical models have been developed for each type of degradation in order to predict the growth rate and number of affected tubes. Additional plugging is also performed to prevent other degradations such as tube wear due to foreign objects or high-cycle flow-induced fatigue. The contribution of these degradation mechanisms on the rate of tube plugging is described. The results from the statistical models are then used in predicting the long-term life of the steam generators and therefore providing a useful tool toward their effective life management and possible replacement.

  6. The Abdominal Aortic Aneurysm Statistically Corrected Operative Risk Evaluation (AAA SCORE) for predicting mortality after open and endovascular interventions.

    Science.gov (United States)

    Ambler, Graeme K; Gohel, Manjit S; Mitchell, David C; Loftus, Ian M; Boyle, Jonathan R

    2015-01-01

    Accurate adjustment of surgical outcome data for risk is vital in an era of surgeon-level reporting. Current risk prediction models for abdominal aortic aneurysm (AAA) repair are suboptimal. We aimed to develop a reliable risk model for in-hospital mortality after intervention for AAA, using rigorous contemporary statistical techniques to handle missing data. Using data collected during a 15-month period in the United Kingdom National Vascular Database, we applied multiple imputation methodology together with stepwise model selection to generate preoperative and perioperative models of in-hospital mortality after AAA repair, using two thirds of the available data. Model performance was then assessed on the remaining third of the data by receiver operating characteristic curve analysis and compared with existing risk prediction models. Model calibration was assessed by Hosmer-Lemeshow analysis. A total of 8088 AAA repair operations were recorded in the National Vascular Database during the study period, of which 5870 (72.6%) were elective procedures. Both preoperative and perioperative models showed excellent discrimination, with areas under the receiver operating characteristic curve of .89 and .92, respectively, significantly better than any of the existing models (area under the receiver operating characteristic curve for the best comparator model, .84 and .88) for AAA repair. These models were carefully developed with rigorous statistical methodology and significantly outperform existing methods for both elective cases and overall AAA mortality. These models will be invaluable for both preoperative patient counseling and accurate risk adjustment of published outcome data. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
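
    A compact sketch of the same general workflow (impute missing values, fit a logistic model, validate by ROC analysis) on synthetic data; scikit-learn's IterativeImputer stands in for the paper's multiple imputation, and the stepwise selection is omitted.

        # Sketch: imputation + logistic model + ROC validation on held-out data.
        import numpy as np
        from sklearn.experimental import enable_iterative_imputer  # noqa: F401
        from sklearn.impute import IterativeImputer
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(1)
        X = rng.normal(size=(3000, 10))
        y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0] + 2)))
        X[rng.random(X.shape) < 0.1] = np.nan          # 10% missing at random

        X_imp = IterativeImputer(random_state=0).fit_transform(X)
        train, test = slice(0, 2000), slice(2000, None)
        model = LogisticRegression(max_iter=1000).fit(X_imp[train], y[train])
        auc = roc_auc_score(y[test], model.predict_proba(X_imp[test])[:, 1])
        print("held-out AUC:", round(auc, 3))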

  7. Statistical characterization of pitting corrosion process and life prediction

    International Nuclear Information System (INIS)

    Sheikh, A.K.; Younas, M.

    1995-01-01

    In order to prevent corrosion failures of machines and structures, it is desirable to know in advance when corrosion damage will take place, and appropriate measures are needed to mitigate the damage. Corrosion predictions are needed both at the development stage and at the operational stage of machines and structures. There are several forms of corrosion process through which varying degrees of damage can occur. Under certain conditions these corrosion processes act alone, while under other sets of conditions several of these processes may occur simultaneously. Certain types of machine elements and structures, such as gears, bearings, tubes, pipelines, containers and storage tanks, are particularly prone to pitting corrosion, which is an insidious form of corrosion. Corrosion predictions are usually based on experimental results obtained from test coupons and/or field experience with similar machines or parts of a structure. Considerable scatter is observed in corrosion processes. The probabilistic nature and kinetics of the pitting process make it necessary to use statistical methods to forecast the residual life of machines or structures. The focus of this paper is the characterization of pitting as a time-dependent random process; using this characterization, the prediction of life to reach a critical level of pitting damage can be made. Using several data sets from the literature on pitting corrosion, the extreme value modeling of the pitting corrosion process, the evolution of the extreme value distribution in time, and their relationship to the reliability of machines and structures are explained. (author)
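
    The extreme-value step can be sketched as follows: fit a Gumbel distribution to per-coupon maximum pit depths and compute the probability of exceeding a critical depth. All numbers are invented for illustration.

        # Sketch: Gumbel (extreme-value) model of maximum pit depths.
        import numpy as np
        from scipy.stats import gumbel_r

        max_pit_mm = np.array([0.41, 0.52, 0.38, 0.61, 0.47,
                               0.55, 0.49, 0.66, 0.44, 0.58])  # one maximum per coupon
        loc, scale = gumbel_r.fit(max_pit_mm)

        critical = 1.0                                 # hypothetical critical depth, mm
        p_exceed = gumbel_r.sf(critical, loc, scale)
        print(f"P(max pit depth > {critical} mm) = {p_exceed:.4f}")
        print("return period (coupons):", round(1 / p_exceed, 1))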

  8. Comparison of four statistical and machine learning methods for crash severity prediction.

    Science.gov (United States)

    Iranitalab, Amirfarrokh; Khattak, Aemal

    2017-11-01

    Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparison of the performance of four statistical and machine learning methods, including Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash-costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods, comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States was obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset, and the correct prediction rates for each crash severity level, the overall correct prediction rate and a proposed crash-costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed that NNC had the best prediction performance overall and in more severe crashes. RF and SVM had the next best performances and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC caused improvement in MNL and RF but weakened the performance of NNC. The overall correct prediction rate gave almost the exact opposite results compared to the proposed approach, showing that neglecting crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
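
    A crash-costs-based accuracy measure of the kind proposed can be sketched as a cost-weighted correct prediction rate; the severity costs and data below are invented placeholders, and only two of the four methods are shown.

        # Sketch: comparing classifiers under a cost-weighted accuracy measure.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(2)
        X = rng.normal(size=(4000, 6))
        y = rng.integers(0, 3, 4000)          # 0/1/2 = increasing severity (synthetic)
        Xtr, Xte, ytr, yte = X[:3000], X[3000:], y[:3000], y[3000:]

        cost = np.array([1.0, 10.0, 100.0])   # hypothetical relative crash costs

        def cost_weighted_accuracy(y_true, y_pred):
            # Correct predictions earn the cost of the true severity level, so
            # getting rare severe crashes right outweighs frequent minor ones.
            w = cost[y_true]
            return np.sum(w * (y_true == y_pred)) / np.sum(w)

        for clf in (RandomForestClassifier(random_state=0), KNeighborsClassifier()):
            pred = clf.fit(Xtr, ytr).predict(Xte)
            print(type(clf).__name__,
                  "plain:", round(float(np.mean(pred == yte)), 3),
                  "cost-weighted:", round(float(cost_weighted_accuracy(yte, pred)), 3))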

  9. Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques

    Science.gov (United States)

    Mishra, D.; Goyal, P.

    2014-12-01

    Urban air pollution forecasting has emerged as an acute problem in recent years because of severe environmental degradation due to the increase in harmful air pollutants in the ambient atmosphere. In this study, different statistical as well as artificial intelligence techniques are used for forecasting and analysis of air pollution over the Delhi urban area. These techniques are principal component analysis (PCA), multiple linear regression (MLR) and artificial neural networks (ANN), and the forecasts are in good agreement with the concentrations observed by the Central Pollution Control Board (CPCB) at different locations in Delhi. But such methods suffer from disadvantages: they provide limited accuracy because they are unable to predict the extreme points, i.e. the pollution maximum and minimum cut-offs cannot be determined using such an approach, and they are an inefficient route to better forecast output. With advances in technology and research, an alternative to the above traditional methods has been proposed: the coupling of statistical techniques with artificial intelligence (AI) can be used for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used for forecasting of air pollutants over the Delhi urban area. The statistical measures, e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA), of the proposed model are in better agreement than those of all the other models. Hence, the coupling of statistical and artificial intelligence techniques can be used for the forecasting of air pollutants over urban areas.
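
    The four evaluation statistics named above have compact standard forms (exact definitions vary slightly between papers); a sketch:

        # Sketch: R, NMSE, FB and IOA in their common air-quality forms.
        import numpy as np

        def evaluate(obs, pred):
            obs, pred = np.asarray(obs, float), np.asarray(pred, float)
            r = np.corrcoef(obs, pred)[0, 1]                     # correlation coefficient
            nmse = np.mean((obs - pred) ** 2) / (obs.mean() * pred.mean())
            fb = 2 * (obs.mean() - pred.mean()) / (obs.mean() + pred.mean())
            ioa = 1 - np.sum((obs - pred) ** 2) / np.sum(
                (np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
            return {"R": r, "NMSE": nmse, "FB": fb, "IOA": ioa}

        print(evaluate([30, 42, 55, 61], [28, 45, 50, 66]))      # toy concentrations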

  10. Evaluation of use of MPAD trajectory tape and number of orbit points for orbiter mission thermal predictions

    Science.gov (United States)

    Vogt, R. A.

    1979-01-01

    The application of using the mission planning and analysis division (MPAD) common format trajectory data tape to predict temperatures for preflight and post-flight mission analysis is presented and evaluated. All of the analyses utilized the latest Space Transportation System 1 flight (STS-1) MPAD trajectory tape, and the simplified '136 node' midsection/payload bay thermal math model. For the first 6.7 hours of the STS-1 flight profile, transient temperatures are presented for selected nodal locations with the current standard method, and the trajectory tape method. Whether the differences are considered significant or not depends upon the viewpoint. Other transient temperature predictions are also presented. These results were obtained to investigate an initial concern that the predicted temperature differences between the two methods would be caused not only by the inaccuracies of the current method's assumed nominal attitude profile but also by a lack of a sufficient number of orbit points in the current method. Comparison between 6, 12, and 24 orbit point parameters showed a surprising insensitivity to the number of orbit points.

  11. Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models.

    Science.gov (United States)

    Ledonne, Norman C; Rissolo, Kevin; Bulgarelli, James; Tini, Leonard

    2011-02-07

    Standard approaches to assessing the performance of predictive models use common statistical measurements over the entire data set; they provide an overview of the average performance of the models across the entire predictive space, but give little insight into the applicability of the model across the prediction space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI) curves via the SALI curve integral (SCI) as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. The results indicate that the choice of training sets used to build the prediction models is of great importance to the resulting model quality, and that the SCI values calculated for these models were very similar to their Kendall τ values. This leads us to suggest an approach that uses this SALI/SCI paradigm to evaluate predictive model performance, allowing more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.

  12. The optimal hormonal replacement modality selection for multiple organ procurement from brain-dead organ donors

    Directory of Open Access Journals (Sweden)

    Mi Z

    2014-12-01

    The management of brain-dead organ donors is complex. The use of inotropic agents and replacement of depleted hormones (hormonal replacement therapy) is crucial for successful multiple organ procurement, yet the optimal hormonal replacement has not been identified, and the statistical adjustment to determine the best selection is not trivial. Traditional pair-wise comparisons between every pair of treatments, and multiple comparisons to all (MCA), are statistically conservative. Hsu's multiple comparisons with the best (MCB), adapted from Dunnett's multiple comparisons with control (MCC), has been used for selecting the best treatment based on continuous variables. We selected the best hormonal replacement modality for successful multiple organ procurement using a two-step approach. First, we estimated the predicted margins by constructing generalized linear models (GLM) or generalized linear mixed models (GLMM), and then we applied the multiple comparison methods to identify the best hormonal replacement modality, given that the testing of hormonal replacement modalities is independent. Based on 10-year data from the United Network for Organ Sharing (UNOS), among 16 hormonal replacement modalities, and using 95% simultaneous confidence intervals, we found that the combination of thyroid hormone, a corticosteroid, antidiuretic hormone, and insulin was the best modality for multiple organ procurement for transplantation. Keywords: best treatment selection, brain-dead organ donors, hormonal replacement, multiple binary endpoints, organ procurement, multiple comparisons

  13. Predictive geochemical mapping using environmental correlation

    International Nuclear Information System (INIS)

    Wilford, John; Caritat, Patrice de; Bui, Elisabeth

    2016-01-01

    The distribution of chemical elements at and near the Earth's surface, the so-called critical zone, is complex and reflects the geochemistry and mineralogy of the original substrate modified by environmental factors that include physical, chemical and biological processes over time. Geochemical data typically are illustrated in the form of plan view maps or vertical cross-sections, where the composition of regolith, soil, bedrock or any other material is represented. These are primarily point observations that frequently are interpolated to produce rasters of element distributions. Here we propose the application of environmental or covariate regression modelling to predict and better understand the controls on major and trace element geochemistry within the regolith. Available environmental covariate datasets (raster or vector) representing factors influencing regolith or soil composition are intersected with the geochemical point data in a spatial statistical correlation model to develop a system of multiple linear correlations. The spatial resolution of the environmental covariates, which typically is much finer (e.g. ∼90 m pixel) than that of geochemical surveys (e.g. 1 sample per 10-10,000 km²), carries over to the predictions. Therefore the derived predictive models of element concentrations take the form of continuous geochemical landscape representations that are potentially much more informative than geostatistical interpolations. Environmental correlation is applied to the Sir Samuel 1:250,000 scale map sheet in Western Australia to produce distribution models of individual elements describing the geochemical composition of the regolith and exposed bedrock. As an example we model the distribution of two elements – chromium and sodium. We show that the environmental correlation approach generates high resolution predictive maps that are statistically more accurate and effective than ordinary kriging and inverse distance weighting interpolation.

  14. Optimal day-ahead wind-thermal unit commitment considering statistical and predicted features of wind speeds

    International Nuclear Information System (INIS)

    Sun, Yanan; Dong, Jizhe; Ding, Lijuan

    2017-01-01

    Highlights: • A day–ahead wind–thermal unit commitment model is presented. • Wind speed transfer matrix is formed to depict the sequential wind features. • Spinning reserve setting considering wind power accuracy and variation is proposed. • Verified study is performed to check the correctness of the program. - Abstract: The increasing penetration of intermittent wind power affects the secure operation of power systems and leads to a requirement of robust and economic generation scheduling. This paper presents an optimal day–ahead wind–thermal generation scheduling method that considers the statistical and predicted features of wind speeds. In this method, the statistical analysis of historical wind data, which represents the local wind regime, is first implemented. Then, according to the statistical results and the predicted wind power, the spinning reserve requirements for the scheduling period are calculated. Based on the calculated spinning reserve requirements, the wind–thermal generation scheduling is finally conducted. To validate the program, a verified study is performed on a test system. Then, numerical studies to demonstrate the effectiveness of the proposed method are conducted.
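
    The wind speed transfer matrix mentioned in the highlights can be sketched as a first-order Markov transition matrix estimated from an hourly series; the series below is synthetic.

        # Sketch: wind-speed transfer (transition) matrix from an hourly series.
        import numpy as np

        speeds = np.abs(np.random.default_rng(3).normal(8, 3, 24 * 365))  # m/s, synthetic
        edges = np.array([0.0, 5.0, 10.0, 15.0])   # bin boundaries; last bin open-ended
        states = np.digitize(speeds, edges) - 1    # states 0..3

        n = len(edges)
        T = np.zeros((n, n))
        for a, b in zip(states[:-1], states[1:]):
            T[a, b] += 1                           # count hour-to-hour transitions
        T /= T.sum(axis=1, keepdims=True)          # rows become probabilities

        # Row i: probability of moving from speed bin i to each bin in the next
        # hour; a heavy diagonal means persistent wind, flat rows volatile wind.
        print(np.round(T, 2))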

  15. Statistical inference, the bootstrap, and neural-network modeling with application to foreign exchange rates.

    Science.gov (United States)

    White, H; Racine, J

    2001-01-01

    We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
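
    A permutation-style variant of such input-relevance testing can be sketched as follows; this is in the spirit of the resampling tests described, not the authors' exact procedure.

        # Sketch: permutation test of input relevance for a fitted network.
        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(4)
        X = rng.normal(size=(1000, 3))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=1000)   # only input 0 matters

        net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                           random_state=0).fit(X, y)
        base = np.mean((net.predict(X) - y) ** 2)

        for j in range(X.shape[1]):
            null = []
            for _ in range(100):                 # resample one input at a time
                Xp = X.copy()
                Xp[:, j] = rng.permutation(Xp[:, j])
                null.append(np.mean((net.predict(Xp) - y) ** 2))
            # A large rise over the baseline MSE marks the input as relevant.
            print(f"input {j}: permuted MSE {np.median(null):.3f} vs base {base:.3f}")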

  16. Three-point statistics of cosmological stochastic gravitational waves

    International Nuclear Information System (INIS)

    Adshead, Peter; Lim, Eugene A.

    2010-01-01

    We consider the three-point function (i.e. the bispectrum or non-Gaussianity) for stochastic backgrounds of gravitational waves. We estimate the amplitude of this signal for the primordial inflationary background, gravitational waves generated during preheating, and for gravitational waves produced by self-ordering scalar fields following a global phase transition. To assess detectability, we describe how to extract the three-point signal from an idealized interferometric experiment and compute the signal to noise ratio as a function of integration time. The three-point signal for the stochastic gravitational wave background generated by inflation is unsurprisingly tiny. For gravitational radiation generated by purely causal, classical mechanisms we find that, no matter how nonlinear the process is, the three-point correlations produced vanish in direct detection experiments. On the other hand, we show that in scenarios where the B-mode of the cosmic microwave background is sourced by gravitational waves generated by a global phase transition, a strong three-point signal among the polarization modes is also produced. This may provide another method of distinguishing inflationary B-modes. To carry out this computation, we have developed a diagrammatic approach to the calculation of stochastic gravitational waves sourced by scalar fluids, which has applications beyond the present scenario.

  17. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis

    Directory of Open Access Journals (Sweden)

    BUDIMAN

    2012-01-01

    Budiman, Arisoesilaningsih E. 2012. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis. Biodiversitas 13: 18-22. The aims of this research were to determine multiple regression models of vegetative and corm growth of Amorphophallus muelleri Blume across several age variations and habitat conditions of agroforestry in East Java. A descriptive exploratory research method was conducted by systematic random sampling at five agroforestries on four plantations in East Java: Saradan, Bojonegoro, Nganjuk and Blitar. In each agroforestry, we observed A. muelleri vegetative and corm growth at four growing ages (1, 2, 3 and 4 years old, respectively) as well as environmental variables such as altitude, vegetation, climate and soil conditions. Data were analyzed using descriptive statistics to compare A. muelleri habitat in the five agroforestries. Meanwhile, the influence and contribution of each environmental variable to the growth of A. muelleri vegetative parts and corms were determined using multiple regression analysis in SPSS 17.0. The multiple regression models of A. muelleri vegetative and corm growth generated from characteristics of the agroforestries and age showed high validity, with R2 = 88-99%. The regression models showed that age, monthly temperature, percentage of radiation and soil calcium (Ca) content, either simultaneously or partially, determined the growth of A. muelleri vegetative parts and corms. Based on these models, the A. muelleri corms reach optimal growth after four years of cultivation and are then ready to be harvested. Additionally, the soil Ca content should reach 25.3 me.hg-1, as in the Sugihwaras agroforestry, with maximal radiation of 60%.

  18. Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data

    Directory of Open Access Journals (Sweden)

    Alexander P. Kartun-Giles

    2018-04-01

    A projective network model is a model that enables predictions to be made based on a subsample of the network data, with the predictions remaining unchanged if a larger sample is taken into consideration. An exchangeable model is a model that does not depend on the order in which nodes are sampled. Despite a large variety of non-equilibrium (growing) and equilibrium (static) sparse complex network models that are widely used in network science, how to reconcile sparseness (constant average degree) with the desired statistical properties of projectivity and exchangeability is currently an outstanding scientific problem. Here we propose a network process with hidden variables which is projective and can generate sparse power-law networks. Despite the model not being exchangeable, it can be closely related to exchangeable uncorrelated networks, as indicated by its information theory characterization and its network entropy. The use of the proposed network process as a null model is here tested on real data, indicating that the model offers a promising avenue for statistical network modelling.
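
    One way to realize a sparse hidden-variable construction of this general kind (this mirrors the model class discussed, not the authors' exact process) is to give each node a Pareto-distributed hidden variable and link pairs with probability proportional to the product of their hidden variables, scaled so the mean degree stays constant as the network grows.

        # Sketch: sparse network with power-law expected degrees via hidden variables.
        import numpy as np

        rng = np.random.default_rng(5)
        N, gamma = 2000, 2.5
        h = (1 - rng.random(N)) ** (-1 / (gamma - 1))   # Pareto hidden variables
        h = np.minimum(h, np.sqrt(N))                   # structural cutoff

        # p_ij = min(1, h_i h_j / (N <h>)) keeps the average degree finite.
        p = np.minimum(1.0, np.outer(h, h) / (N * h.mean()))
        A = (rng.random((N, N)) < p).astype(int)
        A = np.triu(A, 1)
        A = A + A.T                                     # undirected, no self-loops

        print("mean degree:", A.sum() / N)

    The expected degree of node i is proportional to h_i, so the degree sequence inherits the power law while sparseness is preserved.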

  19. Comparison of accuracy in predicting emotional instability from MMPI data: fisherian versus contingent probability statistics

    Energy Technology Data Exchange (ETDEWEB)

    Berghausen, P.E. Jr.; Mathews, T.W.

    1987-01-01

    The security plans of nuclear power plants generally require that all personnel who are to have access to protected areas or vital islands be screened for emotional stability. In virtually all instances, the screening involves the administration of one or more psychological tests, usually including the Minnesota Multiphasic Personality Inventory (MMPI). At some plants, all employees receive a structured clinical interview after they have taken the MMPI and results have been obtained. At other plants, only those employees with dirty MMPIs are interviewed. This latter protocol is referred to as interviews by exception. Behaviordyne Psychological Corp. has succeeded in removing some of the uncertainty associated with interview-by-exception protocols by developing an empirically based, predictive equation. This equation permits utility companies to make informed choices regarding the risks they are assuming. A conceptual problem exists with the predictive equation, however. Like most predictive equations currently in use, it is based on Fisherian statistics, involving least-squares analyses. Consequently, Behaviordyne Psychological Corp., in conjunction with T.W. Mathews and Associates, has developed a second predictive equation, one based on contingent probability statistics. The particular technique used is the multi-contingent analysis of probability systems (MAPS) approach. The present paper presents a comparison of the predictive accuracy of the two equations: the one derived using Fisherian techniques versus the one using contingent probability techniques.

  1. Model Predictive Control of Z-source Neutral Point Clamped Inverter

    DEFF Research Database (Denmark)

    Mo, Wei; Loh, Poh Chiang; Blaabjerg, Frede

    2011-01-01

    This paper presents Model Predictive Control (MPC) of the Z-source Neutral Point Clamped (NPC) inverter. For illustration, current control of a Z-source NPC grid-connected inverter is analyzed and simulated. With MPC’s advantage of easily including system constraints, load current, impedance network...... response are obtained at the same time with a formulated Z-source NPC inverter network model. Steady-state and transient-state simulation results of MPC are presented, which show the good reference tracking ability of this method. It provides a new control method for the Z-source NPC inverter...

  2. Statistical thermodynamics -- A tool for understanding point defects in intermetallic compounds

    International Nuclear Information System (INIS)

    Ipser, H.; Krachler, R.

    1996-01-01

    The principles of the derivation of statistical-thermodynamic models to interpret the compositional variation of thermodynamic properties in non-stoichiometric intermetallic compounds are discussed. Two types of models are distinguished: the Bragg-Williams type, where the total energy of the crystal is taken as the sum of the interaction energies of all nearest-neighbor pairs of atoms, and the Wagner-Schottky type, where the internal energy, the volume, and the vibrational entropy of the crystal are assumed to be linear functions of the numbers of atoms or vacancies on the different sublattices. A Wagner-Schottky type model is used for the description of two examples with different crystal structures: for β'-FeAl (with B2-structure) defect concentrations and their variation with composition are derived from the results of measurements of the aluminum vapor pressure, the resulting values are compared with results of other independent experimental methods; for Rh3Te4 (with an NiAs-derivative structure) the defect mechanism responsible for non-stoichiometry is worked out by application of a theoretical model to the results of tellurium vapor pressure measurements. In addition it is shown that the shape of the activity curve indicates a certain sequence of superstructures. In principle, there are no limitations to the application of statistical thermodynamics to experimental thermodynamic data as long as these are available with sufficient accuracy, and as long as it is ensured that the distribution of the point defects is truly random, i.e. that there are no aggregates of defects

  3. Predicting protein folding rate change upon point mutation using residue-level coevolutionary information.

    Science.gov (United States)

    Mallik, Saurav; Das, Smita; Kundu, Sudip

    2016-01-01

    Change in folding kinetics of globular proteins upon point mutation is crucial to a wide spectrum of biological research, such as protein misfolding, toxicity, and aggregations. Here we seek to address whether residue-level coevolutionary information of globular proteins can be informative about folding rate changes upon point mutations. Generating residue-level coevolutionary networks of globular proteins, we analyze three parameters: relative coevolution order (rCEO), network density (ND), and characteristic path length (CPL). A point mutation is considered to be equivalent to a node deletion of this network, and the respective percentage changes in rCEO, ND, and CPL are found to be linearly correlated (0.84, 0.73, and -0.61, respectively) with experimental folding rate changes. The three parameters predict the folding rate change upon a point mutation with 0.031, 0.045, and 0.059 standard errors, respectively. © 2015 Wiley Periodicals, Inc.
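
    As a rough illustration of the network side of this idea, the sketch below (Python, assuming networkx is available) treats a point mutation as a node deletion and reports the percentage change in network density and characteristic path length. The toy residue graph is invented, and the paper-specific rCEO parameter is not reproduced here.

      import networkx as nx

      def perturbation_profile(G, residue):
          """Percentage change in ND and CPL after deleting one residue node."""
          nd0 = nx.density(G)
          cpl0 = nx.average_shortest_path_length(G)
          H = G.copy()
          H.remove_node(residue)
          # Keep the largest connected component so CPL stays defined.
          H = H.subgraph(max(nx.connected_components(H), key=len)).copy()
          nd1 = nx.density(H)
          cpl1 = nx.average_shortest_path_length(H)
          return 100 * (nd1 - nd0) / nd0, 100 * (cpl1 - cpl0) / cpl0

      # Toy 6-residue coevolutionary network (illustrative, not a real protein).
      G = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 5), (1, 4)])
      d_nd, d_cpl = perturbation_profile(G, residue=2)
      print(f"ΔND = {d_nd:+.1f}%, ΔCPL = {d_cpl:+.1f}%")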

  4. Growth Curve Analysis and Change-Points Detection in Extremes

    KAUST Repository

    Meng, Rui

    2016-05-15

    The thesis consists of two coherent projects. The first project presents the results of evaluating salinity tolerance in barley using growth curve analysis, where different growth trajectories are observed within barley families. The study of salinity tolerance in plants is crucial to understanding plant growth and productivity. Because fully-automated smarthouses with conveyor systems allow non-destructive and high-throughput phenotyping of a large number of plants, it is now possible to apply advanced statistical tools to analyze daily measurements and to study salinity tolerance. To compare different growth patterns of barley varieties, we use functional data analysis techniques to analyze the daily projected shoot areas. In particular, we apply the curve registration method to align all the curves from the same barley family in order to summarize the family-wise features. We also illustrate how to use statistical modeling to account for spatial variation in microclimate in smarthouses and for temporal variation across runs, which is crucial for identifying traits of the barley varieties. In our analysis, we show that the concentrations of sodium and potassium in leaves are negatively correlated, and their interactions are associated with the degree of salinity tolerance. The second project studies change-point detection methods in extremes when multiple time series data are available. Motivated by the scientific question of whether the chances of experiencing extreme weather are different in different seasons of a year, we develop a change-point detection model to study changes in extremes or in the tail of a distribution. Most existing models identify seasons from multiple yearly time series assuming a season or a change-point location remains exactly the same across years. In this work, we propose a random effect model that allows the change-point to vary from year to year, following a given distribution. Both parametric and nonparametric methods are developed

  5. A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

    OpenAIRE

    Li, Guohui; Zhang, Songling; Yang, Hong

    2017-01-01

    Aiming at the irregularity of nonlinear signals and the difficulty of predicting them, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and cluster analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain a finite number of intrinsic mode functions (IMFs) and residuals. Secondly, fuzzy c-means is used to cluster the decomposed components, and then a deep belief network (DBN) is used to predict them. Finally, the reconstructed ...

  6. Using Patient Demographics and Statistical Modeling to Predict Knee Tibia Component Sizing in Total Knee Arthroplasty.

    Science.gov (United States)

    Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James

    2018-06-01

    Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict the component size within a very close range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected the data of 1992 primary Persona Knee System TKA procedures. Of them, 199 procedures were randomly selected as testing data and the rest of the data were randomly partitioned between model training data and model evaluation data with a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within 1 size difference 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated that the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
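
    A proportional-odds model of this kind is straightforward to sketch. The snippet below is a minimal illustration, assuming statsmodels >= 0.12 for its OrderedModel class; the synthetic data and coefficients are hypothetical stand-ins for the study's real gender, age, weight, and height predictors.

      import numpy as np
      import pandas as pd
      from statsmodels.miscmodels.ordinal_model import OrderedModel

      rng = np.random.default_rng(0)
      n = 500
      df = pd.DataFrame({
          "male": rng.integers(0, 2, n),
          "age": rng.normal(68, 9, n),
          "weight": rng.normal(85, 15, n),
          "height": rng.normal(170, 10, n),
      })
      # Hypothetical latent size driven mostly by height and weight.
      latent = 0.08 * df["height"] + 0.03 * df["weight"] + 0.5 * df["male"]
      df["size"] = pd.cut(latent, bins=5, labels=False)  # ordinal sizes 0..4

      exog = df[["male", "age", "weight", "height"]]
      model = OrderedModel(df["size"], exog, distr="logit")  # proportional odds
      res = model.fit(method="bfgs", disp=False)

      probs = np.asarray(res.predict(exog))        # per-size probabilities
      pred = probs.argmax(axis=1)
      within_one = np.mean(np.abs(pred - df["size"]) <= 1)
      print(f"predicted within one size: {within_one:.1%}")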

  7. Prediction of high-temperature point defect formation in TiO2 from combined ab initio and thermodynamic calculations

    International Nuclear Information System (INIS)

    He, J.; Behera, R.K.; Finnis, M.W.; Li, X.; Dickey, E.C.; Phillpot, S.R.; Sinnott, S.B.

    2007-01-01

    A computational approach that integrates ab initio electronic structure and thermodynamic calculations is used to determine point defect stability in rutile TiO2 over a range of temperatures, oxygen partial pressures and stoichiometries. Both donors (titanium interstitials and oxygen vacancies) and acceptors (titanium vacancies) are predicted to have shallow defect transition levels in the electronic-structure calculations. The resulting defect formation energies for all possible charge states are then used in thermodynamic calculations to predict the influence of temperature and oxygen partial pressure on the relative stabilities of the point defects. Their ordering is found to be the same as temperature increases and oxygen partial pressure decreases: titanium vacancy → oxygen vacancy → titanium interstitial. The charges on these defects, however, are quite sensitive to the Fermi level. Finally, the combined formation energies of point defect complexes, including Schottky, Frenkel and anti-Frenkel defects, are predicted to limit the further formation of point defects

  8. Prediction of Ionizing Radiation Resistance in Bacteria Using a Multiple Instance Learning Model.

    Science.gov (United States)

    Aridhi, Sabeur; Sghaier, Haïtham; Zoghlami, Manel; Maddouri, Mondher; Nguifo, Engelbert Mephu

    2016-01-01

    Ionizing-radiation-resistant bacteria (IRRB) are important in biotechnology. In this context, in silico methods of phenotypic prediction and genotype-phenotype relationship discovery are limited. In this work, we analyzed basal DNA repair proteins of most known proteome sequences of IRRB and ionizing-radiation-sensitive bacteria (IRSB) in order to learn a classifier that correctly predicts this bacterial phenotype. We formulated the problem of predicting bacterial ionizing radiation resistance (IRR) as a multiple-instance learning (MIL) problem, and we proposed a novel approach for this purpose. We provide a MIL-based prediction system that classifies a bacterium to either IRRB or IRSB. The experimental results of the proposed system are satisfactory with 91.5% of successful predictions.

  9. A Statistical Approach for Gain Bandwidth Prediction of Phoenix-Cell Based Reflectarrays

    Directory of Open Access Journals (Sweden)

    Hassan Salti

    2018-01-01

    A new statistical approach to predict the gain bandwidth of Phoenix-cell based reflectarrays is proposed. It combines the effects of both main factors that limit the bandwidth of reflectarrays: spatial phase delays and intrinsic bandwidth of radiating cells. As an illustration, the proposed approach is successfully applied to two reflectarrays based on new Phoenix cells.

  10. Using the expected detection delay to assess the performance of different multivariate statistical process monitoring methods for multiplicative and drift faults.

    Science.gov (United States)

    Zhang, Kai; Shardt, Yuri A W; Chen, Zhiwen; Peng, Kaixiang

    2017-03-01

    The expected detection delay (EDD) index has recently been developed to measure the performance of multivariate statistical process monitoring (MSPM) methods for constant additive faults. This paper, based on a statistical investigation of the T2- and Q-test statistics, extends the EDD index to the multiplicative and drift fault cases. As well, it is used to assess the performance of common MSPM methods that adopt these two test statistics. Based on how they use the measurement space, these methods can be divided into two groups: those which consider the complete measurement space, for example, principal component analysis-based methods, and those which only consider some subspace that reflects changes in key performance indicators, such as partial least squares-based methods. Furthermore, a generic form in which they use the T2- and Q-test statistics is given. With the extended EDD index, the performance of these methods to detect drift and multiplicative faults is assessed using both numerical simulations and the Tennessee Eastman process. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
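
    For readers unfamiliar with the two test statistics, the sketch below is a minimal PCA-based monitoring example in Python/NumPy: T2 is computed in the retained principal subspace and Q (the squared prediction error) in the residual subspace. The correlated training data, drift magnitude, and retained dimension are all illustrative assumptions; control limits and the EDD evaluation itself are omitted.

      import numpy as np

      rng = np.random.default_rng(1)
      # Normal-operation training data with correlated sensors (illustrative).
      A = rng.normal(size=(5, 5))
      X_train = rng.normal(size=(500, 5)) @ A
      mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
      Z = (X_train - mu) / sd

      U, s, Vt = np.linalg.svd(Z, full_matrices=False)
      k = 2
      P = Vt[:k].T                          # retained loadings
      lam = s[:k] ** 2 / (len(Z) - 1)       # retained eigenvalues

      def t2_q(x):
          z = (x - mu) / sd
          t = z @ P                         # scores in the principal subspace
          t2 = float(np.sum(t ** 2 / lam))  # Hotelling T2 statistic
          e = z - t @ P.T                   # residual part
          return t2, float(e @ e)           # Q (squared prediction error)

      x0 = rng.normal(size=5) @ A           # a normal sample
      for step in range(5):
          x = x0.copy()
          x[0] += 0.8 * step                # drift fault on sensor 0
          t2, q = t2_q(x)
          print(f"step {step}: T2={t2:6.2f}  Q={q:6.2f}")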

  11. A comparison of Landsat point and rectangular field training sets for land-use classification

    Science.gov (United States)

    Tom, C. H.; Miller, L. D.

    1984-01-01

    Rectangular training fields of homogeneous spectroreflectance are commonly used in supervised pattern recognition efforts. Trial image classification with manually selected training sets gives irregular and misleading results due to statistical bias. A self-verifying, grid-sampled training point approach is proposed as a more statistically valid feature extraction technique. A systematic pixel sampling network of every ninth row and ninth column efficiently replaced the full image scene with smaller statistical vectors which preserved the necessary characteristics for classification. The composite second- and third-order average classification accuracy of 50.1 percent for 331,776 pixels in the full image substantially agreed with the 51 percent value predicted by the grid-sampled, 4,100-point training set.
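
    The grid-sampling idea itself reduces to array slicing. The following sketch (Python/NumPy, with a synthetic stand-in for a labelled Landsat scene) keeps every ninth row and ninth column as training points, as described above.

      import numpy as np

      rng = np.random.default_rng(0)
      pixels = rng.random((648, 512, 4))          # rows x cols x bands (illustrative)
      labels = rng.integers(0, 5, (648, 512))     # land-use classes

      grid_X = pixels[::9, ::9].reshape(-1, 4)    # every 9th row and 9th column
      grid_y = labels[::9, ::9].ravel()
      print(grid_X.shape, grid_y.shape)           # ~1/81 of the scene, spread evenly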

  12. Analyzing Statistical Mediation with Multiple Informants: A New Approach with an Application in Clinical Psychology.

    Science.gov (United States)

    Papa, Lesther A; Litson, Kaylee; Lockhart, Ginger; Chassin, Laurie; Geiser, Christian

    2015-01-01

    Testing mediation models is critical for identifying potential variables that need to be targeted to effectively change one or more outcome variables. In addition, it is now common practice for clinicians to use multiple informant (MI) data in studies of statistical mediation. By coupling the use of MI data with statistical mediation analysis, clinical researchers can combine the benefits of both techniques. Integrating the information from MIs into a statistical mediation model creates various methodological and practical challenges. The authors review prior methodological approaches to MI mediation analysis in clinical research and propose a new latent variable approach that overcomes some limitations of prior approaches. An application of the new approach to mother, father, and child reports of impulsivity, frustration tolerance, and externalizing problems (N = 454) is presented. The results showed that frustration tolerance mediated the relationship between impulsivity and externalizing problems. The new approach allows for a more comprehensive and effective use of MI data when testing mediation models.

  13. Modelling short- and long-term statistical learning of music as a process of predictive entropy reduction

    DEFF Research Database (Denmark)

    Hansen, Niels Christian; Loui, Psyche; Vuust, Peter

    Statistical learning underlies the generation of expectations with different degrees of uncertainty. In music, uncertainty applies to expectations for pitches in a melody. This uncertainty can be quantified by Shannon entropy from distributions of expectedness ratings for multiple continuations o...

  14. Phase transitions in multiplicative competitive processes

    International Nuclear Information System (INIS)

    Shimazaki, Hideaki; Niebur, Ernst

    2005-01-01

    We introduce a discrete multiplicative process as a generic model of competition. Players with different abilities successively join the game and compete for finite resources. Emergence of dominant players and evolutionary development occur as a phase transition. The competitive dynamics underlying this transition is understood from a formal analogy to statistical mechanics. The theory is applicable to bacterial competition, predicting novel population dynamics near criticality

  15. Multiple Illuminant Colour Estimation via Statistical Inference on Factor Graphs.

    Science.gov (United States)

    Mutimbu, Lawrence; Robles-Kelly, Antonio

    2016-08-31

    This paper presents a method to recover a spatially varying illuminant colour estimate from scenes lit by multiple light sources. Starting with the image formation process, we formulate the illuminant recovery problem in a statistically data-driven setting. To do this, we use a factor graph defined across the scale space of the input image. In the graph, we utilise a set of illuminant prototypes computed using a data-driven approach. As a result, our method delivers a pixelwise illuminant colour estimate without requiring libraries or user input. The use of a factor graph also allows the illuminant estimates to be recovered via a maximum a posteriori (MAP) inference process. Moreover, we compute the probability marginals by performing a Delaunay triangulation on our factor graph. We illustrate the utility of our method for pixelwise illuminant colour recovery on widely available datasets and compare against a number of alternatives. We also show sample colour correction results on real-world images.

  16. On statistical analysis of compound point process

    Czech Academy of Sciences Publication Activity Database

    Volf, Petr

    2006-01-01

    Roč. 35, 2-3 (2006), s. 389-396 ISSN 1026-597X R&D Projects: GA ČR(CZ) GA402/04/1294 Institutional research plan: CEZ:AV0Z10750506 Keywords: counting process * compound process * hazard function * Cox model Subject RIV: BB - Applied Statistics, Operational Research

  17. Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes.

    Science.gov (United States)

    Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M

    2011-12-01

    This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate adaptive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.

  18. Geometrical prediction of maximum power point for photovoltaics

    International Nuclear Information System (INIS)

    Kumar, Gaurav; Panchal, Ashish K.

    2014-01-01

    Highlights: • Direct MPP finding by a parallelogram constructed from the geometry of the I–V curve of the cell. • Exact values of V and P at the MPP obtained by Lagrangian interpolation exploration. • Extensive use of Lagrangian interpolation for implementation of the proposed method. • Method programmed on a C platform with minimum computational burden. - Abstract: It is important to drive a solar photovoltaic (PV) system to its utmost capacity using maximum power point (MPP) tracking algorithms. This paper presents a direct MPP prediction method for a PV system considering the geometry of the I–V characteristic of a solar cell and a module. In the first step, known as parallelogram exploration (PGE), the MPP is determined from a parallelogram constructed using the open circuit (OC) and short circuit (SC) points of the I–V characteristic and Lagrangian interpolation. In the second step, accurate values of voltage and power at the MPP, defined as Vmp and Pmp respectively, are decided by the Lagrangian interpolation formula, known as Lagrangian interpolation exploration (LIE). Specifically, this method works with a few (V, I) data points, whereas most MPP algorithms work with (P, V) data points. The performance of the method is examined for several PV technologies including silicon, copper indium gallium selenide (CIGS), copper zinc tin sulphide selenide (CZTSSe), organic, dye sensitized solar cell (DSSC) and organic tandem cells' data previously reported in literature. The effectiveness of the method is tested experimentally for a few silicon cells' I–V characteristics considering variation in the light intensity and the temperature. At last, the method is also employed for a 10 W silicon module tested in the field. To test the precision of the method, the absolute value of the derivative of power (P) with respect to voltage (V), |dP/dV|, is evaluated and plotted against V. The method estimates the MPP parameters with high accuracy for any
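
    The interpolation step can be sketched compactly. The example below (Python with scipy.interpolate.lagrange) fits a polynomial through a handful of hypothetical (V, I) points, forms P(V) = V·I(V), and locates the MPP from the roots of dP/dV; it illustrates the Lagrangian-interpolation idea, not the paper's full PGE/LIE procedure.

      import numpy as np
      from scipy.interpolate import lagrange

      V = np.array([0.00, 0.15, 0.30, 0.45, 0.55])   # volts (illustrative cell)
      I = np.array([3.00, 2.97, 2.85, 2.30, 0.00])   # amps, from SC to OC

      i_of_v = lagrange(V, I)                 # interpolating polynomial for I(V)
      p_of_v = np.poly1d([1, 0]) * i_of_v     # P(V) = V * I(V)
      dp = p_of_v.deriv()                     # dP/dV

      r = dp.roots
      r = r[np.abs(r.imag) < 1e-9].real       # keep real roots only
      r = r[(r > 0) & (r < V[-1])]            # keep roots inside (0, V_oc)
      v_mp = r[np.argmax(p_of_v(r))]
      print(f"V_mp ≈ {v_mp:.3f} V, P_mp ≈ {p_of_v(v_mp):.3f} W")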

  19. Multiple-Factor Based Sparse Urban Travel Time Prediction

    Directory of Open Access Journals (Sweden)

    Xinyan Zhu

    2018-02-01

    The prediction of travel time is challenging given the sparseness of real-time traffic data and the uncertainty of travel, because it is influenced by multiple factors on congested urban road networks. In our paper, we propose a three-layer neural network built from big probe vehicle data and incorporating multiple factors to estimate travel time. The procedure includes the following three steps. First, we aggregate data according to the travel time of a single taxi traveling a target link on working days, as traffic flows display similar traffic patterns over a weekly cycle. We then extract feature relationships between target and adjacent links at 30 min intervals. About 224,830,178 records are extracted from probe vehicles. Second, we design a three-layer artificial neural network model. The number of neurons in the input layer is eight, and the number of neurons in the output layer is one. Finally, the trained neural network model is used for link travel time prediction. Different factors are included to examine their influence on the link travel time. Our model is verified using historical data from probe vehicles collected from May to July 2014 in Wuhan, China. The results show that we could obtain the link travel time prediction results using the designed artificial neural network model and detect the influence of different factors on link travel time.
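
    A network of the shape described (eight inputs, one hidden layer, one output) can be prototyped in a few lines. The sketch below uses scikit-learn's MLPRegressor on synthetic features standing in for the target-link and adjacent-link travel times; the layer size and data are assumptions for illustration only.

      import numpy as np
      from sklearn.neural_network import MLPRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      X = rng.random((5000, 8))           # 8 link/time-of-day features (synthetic)
      y = 60 + 120 * X[:, 0] + 40 * X[:, 1] + rng.normal(0, 5, 5000)  # seconds

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
      net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
      net.fit(X_tr, y_tr)                 # train the three-layer network
      print(f"R^2 on held-out samples: {net.score(X_te, y_te):.3f}")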

  20. A study of single multiplicative neuron model with nonlinear filters for hourly wind speed prediction

    International Nuclear Information System (INIS)

    Wu, Xuedong; Zhu, Zhiyu; Su, Xunliang; Fan, Shaosheng; Du, Zhaoping; Chang, Yanchao; Zeng, Qingjun

    2015-01-01

    Wind speed prediction is one important method to guarantee that wind energy is integrated into the whole power system smoothly. However, wind power has a non-schedulable nature due to the strong stochasticity and dynamic uncertainty of wind speed. Therefore, wind speed prediction is an indispensable requirement for power system operators. Two new approaches for hourly wind speed prediction are developed in this study by integrating the single multiplicative neuron model and iterated nonlinear filters for updating the wind speed sequence accurately. In the presented methods, a nonlinear state-space model is first formed based on the single multiplicative neuron model, and then the iterated nonlinear filters are employed to perform dynamic state estimation on the wind speed sequence with stochastic uncertainty. The suggested approaches are demonstrated using three cases of wind speed data and are compared with autoregressive moving average, artificial neural network, kernel ridge regression based residual active learning and single multiplicative neuron model methods. Three types of prediction errors, the mean absolute error improvement ratio and running time are employed for comparing the different models' performance. Comparison results from Tables 1–3 indicate that the presented strategies have much better performance for hourly wind speed prediction than other technologies. - Highlights: • Developed two novel hybrid modeling methods for hourly wind speed prediction. • Uncertainty and fluctuations of wind speed can be better explained by novel methods. • Proposed strategies have online adaptive learning ability. • Proposed approaches have shown better performance compared with existing approaches. • Comparison and analysis of two proposed novel models for three cases are provided
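
    The single multiplicative neuron itself is easy to write down: it multiplies weighted, biased inputs instead of summing them, u = Π(w_i·x_i + b_i), followed by a logistic activation. The sketch below fits such a neuron to a synthetic wind-speed-like series with a generic optimizer; the state-space formulation and iterated nonlinear filters of the paper are beyond this illustration, and all data and settings are assumptions.

      import numpy as np
      from scipy.optimize import minimize

      def smn_predict(theta, X):
          """Single multiplicative neuron: u = prod(w*x + b), y = sigmoid(u)."""
          d = X.shape[1]
          w, b = theta[:d], theta[d:]
          u = np.prod(w * X + b, axis=1)
          return 1.0 / (1.0 + np.exp(-u))

      rng = np.random.default_rng(0)
      # Synthetic, normalized wind-speed-like series (illustrative only).
      speeds = 0.5 + 0.3 * np.sin(np.linspace(0, 20, 300)) + rng.normal(0, 0.02, 300)
      lags = 3
      X = np.column_stack([speeds[i:len(speeds) - lags + i] for i in range(lags)])
      y = speeds[lags:]

      def loss(theta):
          return np.mean((smn_predict(theta, X) - y) ** 2)

      res = minimize(loss, x0=np.full(2 * lags, 0.5), method="Nelder-Mead")
      print(f"training MSE: {loss(res.x):.5f}")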

  1. Prediction of individual probabilities of livebirth and multiple birth events following in vitro fertilization (IVF): a new outcomes counselling tool for IVF providers and patients using HFEA metrics.

    Science.gov (United States)

    Jones, Christopher A; Christensen, Anna L; Salihu, Hamisu; Carpenter, William; Petrozzino, Jeffrey; Abrams, Elizabeth; Sills, Eric Scott; Keith, Louis G

    2011-01-01

    In vitro fertilization (IVF) has become a standard treatment for subfertility after it was demonstrated to be of value to humans in 1978. However, the introduction of IVF into mainstream clinical practice has been accompanied by concerns regarding the number of multiple gestations that it can produce, as multiple births present significant medical consequences to mothers and offspring. When considering IVF as a treatment modality, a balance must be set between the chance of having a live birth and the risk of having a multiple birth. As IVF is often a costly decision for patients-financially, medically, and emotionally-there is benefit from estimating a patient's specific chance that IVF could result in a birth as fertility treatment options are contemplated. Historically, a patient's "chance of success" with IVF has been approximated from institution-based statistics, rather than on the basis of any particular clinical parameter (except age). Furthermore, the likelihood of IVF resulting in a twin or triplet outcome must be acknowledged for each patient, given the known increased complications of multiple gestation and consequent increased risk of poor birth outcomes. In this research, we describe a multivariate risk assessment model that incorporates metrics adapted from a national 7.5-year sampling of the Human Fertilisation & Embryology Authority (HFEA) dataset (1991-1998) to predict reproductive outcome (including estimation of multiple birth) after IVF. To our knowledge, http://www.formyodds.com is the first Software-as-a-Service (SaaS) application to predict IVF outcome. The approach also includes a confirmation functionality, where clinicians can agree or disagree with the computer-generated outcome predictions. It is anticipated that the emergence of predictive tools will augment the reproductive endocrinology consultation, improve the medical informed consent process by tailoring the outcome assessment to each patient, and reduce the potential for adverse

  2. Statistics and predictions of population, energy and environment problems

    International Nuclear Information System (INIS)

    Sobajima, Makoto

    1999-03-01

    With the world's population growing rapidly, especially in developing countries, humankind faces global problems: it cannot live sustainably for centuries unless people can find places to live, obtain food, and peacefully secure the energy necessary for living. To that end, humankind has to consider what behavior to adopt in a finite environment, and then discuss, agree, and act. Although energy has long been regarded as a symbol of improved living, and has been demanded and used accordingly, its use has come to be limited as the burden on the global environment grows more serious. Key questions include whether there is a sufficient energy source that does not impose a cost on the environment; whether nuclear energy, regarded as such a source, can sustain its resource base for a long time and retain market competitiveness; what the prospects are for compensating new energy sources if the use of nuclear energy is restricted by a society fearing radioactivity; and which of these are promising for the future. Anyone concerned with the study of energy cannot proceed without knowing these things. The statistical materials compiled here are thought to be useful for that purpose, and are collected mainly from sources offering future predictions based on past records. Studies of prediction are so important for future planning that these databases are expected to be improved for better accuracy. (author)

  3. Information trimming: Sufficient statistics, mutual information, and predictability from effective channel states

    Science.gov (United States)

    James, Ryan G.; Mahoney, John R.; Crutchfield, James P.

    2017-06-01

    One of the most basic characterizations of the relationship between two random variables, X and Y, is the value of their mutual information. Unfortunately, calculating it analytically and estimating it empirically are often stymied by the extremely large dimension of the variables. One might hope to replace such a high-dimensional variable by a smaller one that preserves its relationship with the other. It is well known that either X (or Y) can be replaced by its minimal sufficient statistic about Y (or X) while preserving the mutual information. While intuitively reasonable, it is not obvious or straightforward that both variables can be replaced simultaneously. We demonstrate that this is in fact possible: the information X's minimal sufficient statistic preserves about Y is exactly the information that Y's minimal sufficient statistic preserves about X. We call this procedure information trimming. As an important corollary, we consider the case where one variable is a stochastic process' past and the other its future. In this case, the mutual information is the channel transmission rate between the channel's effective states. That is, the past-future mutual information (the excess entropy) is the amount of information about the future that can be predicted using the past. Translating our result about minimal sufficient statistics, this is equivalent to the mutual information between the forward- and reverse-time causal states of computational mechanics. We close by discussing multivariate extensions to this use of minimal sufficient statistics.

  4. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    Science.gov (United States)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature-dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the ozone values using the meteorological variables as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model predicted that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations, which points to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.

  5. Effect of reheating on predictions following multiple-field inflation

    Science.gov (United States)

    Hotinli, Selim C.; Frazer, Jonathan; Jaffe, Andrew H.; Meyers, Joel; Price, Layne C.; Tarrant, Ewan R. M.

    2018-01-01

    We study the sensitivity of cosmological observables to the reheating phase following inflation driven by many scalar fields. We describe a method which allows semianalytic treatment of the impact of perturbative reheating on cosmological perturbations using the sudden decay approximation. Focusing on N-quadratic inflation, we show how the scalar spectral index and tensor-to-scalar ratio are affected by the rates at which the scalar fields decay into radiation. We find that for certain choices of decay rates, reheating following multiple-field inflation can have a significant impact on the prediction of cosmological observables.

  6. Predicting The Exit Time Of Employees In An Organization Using Statistical Model

    Directory of Open Access Journals (Sweden)

    Ahmed Al Kuwaiti

    2015-08-01

    Employees are considered an asset to any organization, and each organization provides a better and more flexible working environment to retain its best and most resourceful workforce. As such, continuous efforts are taken to avoid or delay the exit/withdrawal of employees from the organization. Human resource managers face a challenge in predicting the exit time of employees, and there is no precise model existing at present in the literature. This study has been conducted to predict the probability of exit of an employee in an organization using an appropriate statistical model. Accordingly, the authors designed a model using the Additive Weibull distribution to predict the expected exit time of an employee in an organization. In addition, a Shock model approach is also executed to check how well the Additive Weibull distribution suits an organization. The analytical results showed that when the inter-arrival time increases, the expected time for the employees to exit also increases. This study concluded that the Additive Weibull distribution can be considered as an alternative to the Shock model approach to predict the exit time of an employee in an organization.
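
    To make the distributional idea concrete: the Additive Weibull survival function combines two Weibull hazard terms, S(t) = exp(-(a·t^b + c·t^d)), and the expected exit time is the integral of S(t) over t. The sketch below evaluates this numerically; the parameter values are illustrative assumptions, not fitted to the study's data.

      import numpy as np
      from scipy.integrate import quad

      a, b, c, d = 0.02, 1.8, 0.10, 0.7   # hypothetical shape/scale parameters

      def S(t):
          """P(employee still in the organization at time t), in years."""
          return np.exp(-(a * t**b + c * t**d))

      expected_exit, _ = quad(S, 0, np.inf)   # E[T] = integral of S(t) dt
      print(f"expected exit time ≈ {expected_exit:.2f} years")
      print(f"P(still employed after 5 years) = {S(5):.2f}")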

  7. Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models

    Directory of Open Access Journals (Sweden)

    LeDonne Norman C

    2011-02-01

    Abstract Background Standard approaches to address the performance of predictive models that use common statistical measurements for the entire data set provide an overview of the average performance of the models across the entire predictive space, but give little insight into the applicability of the model across the prediction space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI) curves via the SALI curve integral (SCI) as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. Results The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. Conclusion The results indicate that the choice of training sets used to build the prediction models is of great importance to the resulting model quality, and that the SCI values calculated for these models were very similar to their Kendall τ values, leading to our suggestion of using this SALI/SCI paradigm to evaluate predictive model performance, which will allow more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.
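
    The reported agreement between SCI and Kendall τ has a simple basis: for tie-free data, τ equals twice the pairwise ordering accuracy minus one. The sketch below checks this on synthetic activities (illustrative values, not the MDR1-MDCK data).

      import numpy as np
      from itertools import combinations
      from scipy.stats import kendalltau

      rng = np.random.default_rng(0)
      y_true = rng.normal(size=40)                        # "measured" activities
      y_pred = y_true + rng.normal(scale=0.5, size=40)    # imperfect model

      tau, _ = kendalltau(y_true, y_pred)
      pairs = combinations(range(len(y_true)), 2)
      correct = np.mean([(y_true[i] > y_true[j]) == (y_pred[i] > y_pred[j])
                         for i, j in pairs])              # pairwise ordering accuracy
      print(f"Kendall tau = {tau:.3f}, 2*accuracy - 1 = {2 * correct - 1:.3f}")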

  8. Predictive Systems Toxicology

    KAUST Repository

    Kiani, Narsis A.; Shang, Ming-Mei; Zenil, Hector; Tegner, Jesper

    2018-01-01

    In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.

  9. Predictive Systems Toxicology

    KAUST Repository

    Kiani, Narsis A.

    2018-01-15

    In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.

  10. Finite Control Set Model Predictive Control for Multiple Distributed Generators Microgrids

    Science.gov (United States)

    Babqi, Abdulrahman Jamal

    This dissertation proposes two control strategies for AC microgrids that consist of multiple distributed generators (DGs). The control strategies are valid for both grid-connected and islanded modes of operation. In general, a microgrid can operate as a stand-alone system (i.e., islanded mode) or while it is connected to the utility grid (i.e., grid-connected mode). To enhance the performance of a microgrid, a sophisticated control scheme should be employed. The control strategies of microgrids can be divided into primary and secondary controls. The primary control regulates the output active and reactive powers of each DG in grid-connected mode as well as the output voltage and frequency of each DG in islanded mode. The secondary control is responsible for regulating the microgrid voltage and frequency in the islanded mode. Moreover, it provides power sharing schemes among the DGs. In other words, the secondary control specifies the set points (i.e., reference values) for the primary controllers. In this dissertation, Finite Control Set Model Predictive Control (FCS-MPC) was proposed for controlling microgrids. FCS-MPC was used as the primary controller to regulate the output power of each DG (in the grid-connected mode) or the voltage of the point of DG coupling (in the islanded mode of operation). In the grid-connected mode, Direct Power Model Predictive Control (DPMPC) was implemented to manage the power flow between each DG and the utility grid. In the islanded mode, Voltage Model Predictive Control (VMPC), as the primary control, and droop control, as the secondary control, were employed to control the output voltage of each DG and the system frequency. The controller was equipped with a supplementary current limiting technique in order to limit the output current of each DG in abnormal incidents. The control approach also enabled smooth transition between the two modes. The performance of the control strategy was investigated and verified using PSCAD/EMTDC software

  11. A heuristic model for computational prediction of human branch point sequence.

    Science.gov (United States)

    Wen, Jia; Wang, Jue; Zhang, Qing; Guo, Dianjing

    2017-10-24

    Pre-mRNA splicing is the removal of introns from precursor mRNAs (pre-mRNAs) and the concurrent ligation of the flanking exons to generate mature mRNA. This process is catalyzed by the spliceosome, where splicing factor 1 (SF1) specifically recognizes the seven-nucleotide branch point sequence (BPS) and the U2 snRNP later displaces the SF1 and binds to the BPS. In mammals, the degeneracy of BPS motifs together with the lack of a large set of experimentally verified BPSs complicates the task of BPS prediction in silico. In this paper, we develop a simple and yet efficient heuristic model for human BPS prediction based on a novel scoring scheme, which quantifies the splicing strength of putative BPSs. The candidate BPS is restricted exclusively within a defined BPS search region to avoid the influence of other elements in the intron, and therefore the prediction accuracy is improved. Moreover, using two types of relative frequencies for human BPS prediction, we demonstrate that our model outperforms other current implementations on experimentally verified human introns. We propose that the binding energy contributes to the molecular recognition involved in human pre-mRNA splicing. In addition, a genome-wide human BPS prediction is carried out. The characteristics of predicted BPSs are in accordance with experimentally verified human BPSs, and branch site positions relative to the 3'ss and the 5' end of the shortened AGEZ are consistent with the results of published papers. Meanwhile, a web server for the BPS predictor is freely available at http://biocomputer.bio.cuhk.edu.hk/BPS .
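
    A scoring scheme of this general family can be sketched as a position-specific log-odds scan over a search window. The per-position frequency matrix below is a hypothetical stand-in (the paper's trained relative frequencies are not reproduced), so the output is only illustrative of the mechanics.

      import numpy as np

      BASES = "ACGU"
      # Hypothetical per-position frequencies for a 7-nt BPS motif (rows A,C,G,U);
      # the strongly weighted A in column 5 mimics a branch adenosine.
      freq = np.array([
          [0.15, 0.10, 0.10, 0.05, 0.90, 0.15, 0.20],   # A
          [0.30, 0.20, 0.40, 0.05, 0.04, 0.35, 0.30],   # C
          [0.15, 0.10, 0.10, 0.05, 0.03, 0.15, 0.20],   # G
          [0.40, 0.60, 0.40, 0.85, 0.03, 0.35, 0.30],   # U
      ])
      lod = np.log2(freq / 0.25)                        # log-odds vs. uniform background

      def score_bps(seven_mer):
          """Sum the per-position log-odds weights of a candidate 7-mer."""
          return sum(lod[BASES.index(b), i] for i, b in enumerate(seven_mer))

      window = "CUGACUUCUAACAUUGG"                      # hypothetical 3' intron region
      best_score, best_i = max((score_bps(window[i:i + 7]), i)
                               for i in range(len(window) - 6))
      print(f"best 7-mer {window[best_i:best_i + 7]} at offset {best_i}, "
            f"score {best_score:.2f}")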

  12. Dynamic analysis of multiple nuclear-coupled boiling channels based on a multi-point reactor model

    International Nuclear Information System (INIS)

    Lee, J.D.; Pan Chin

    2005-01-01

    This work investigates the non-linear dynamics and stabilities of a multiple nuclear-coupled boiling channel system based on a multi-point reactor model using the Galerkin nodal approximation method. The nodal approximation method for the multiple boiling channels developed by Lee and Pan [Lee, J.D., Pan, C., 1999. Dynamics of multiple parallel boiling channel systems with forced flows. Nucl. Eng. Des. 192, 31-44] is extended to address the two-phase flow dynamics in the present study. The multi-point reactor model, modified from Uehiro et al. [Uehiro, M., Rao, Y.F., Fukuda, K., 1996. Linear stability analysis on instabilities of in-phase and out-of-phase modes in boiling water reactors. J. Nucl. Sci. Technol. 33, 628-635], is employed to study a multiple-channel system with unequal steady-state neutron density distribution. Stability maps, non-linear dynamics and effects of major parameters on the multiple nuclear-coupled boiling channel system subject to a constant total flow rate are examined. This study finds that the void-reactivity feedback and neutron interactions among subcores are coupled and their competing effects may influence the system stability under different operating conditions. For those cases with strong neutron interaction conditions, by strengthening the void-reactivity feedback, the nuclear-coupled effect on the non-linear dynamics may induce two unstable oscillation modes, the supercritical Hopf bifurcation and the subcritical Hopf bifurcation. Moreover, for those cases with weak neutron interactions, by quadrupling the void-reactivity feedback coefficient, period-doubling and complex chaotic oscillations may appear in a three-channel system under some specific operating conditions. A unique type of complex chaotic attractor may evolve from the Rossler attractor because of the coupled channel-to-channel thermal-hydraulic and subcore-to-subcore neutron interactions. Such a complex chaotic attractor has the imbedding dimension of 5 and the

  13. New England observed and predicted August stream/river temperature maximum daily rate of change points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted August stream/river temperature maximum negative rate of change in New England based on a...

  14. Estimation of Multiple Point Sources for Linear Fractional Order Systems Using Modulating Functions

    KAUST Repository

    Belkhatir, Zehor

    2017-06-28

    This paper proposes an estimation algorithm for the characterization of multiple point inputs for linear fractional order systems. First, using polynomial modulating functions method and a suitable change of variables the problem of estimating the locations and the amplitudes of a multi-pointwise input is decoupled into two algebraic systems of equations. The first system is nonlinear and solves for the time locations iteratively, whereas the second system is linear and solves for the input’s amplitudes. Second, closed form formulas for both the time location and the amplitude are provided in the particular case of single point input. Finally, numerical examples are given to illustrate the performance of the proposed technique in both noise-free and noisy cases. The joint estimation of pointwise input and fractional differentiation orders is also presented. Furthermore, a discussion on the performance of the proposed algorithm is provided.

  15. Fuzzy logic prediction of dew point pressure of selected Iranian gas condensate reservoirs

    Energy Technology Data Exchange (ETDEWEB)

    Nowroozi, Saeed [Shahid Bahonar Univ. of Kerman (Iran); Iranian Offshore Oil Company (I.O.O.C.) (Iran); Ranjbar, Mohammad; Hashemipour, Hassan; Schaffie, Mahin [Shahid Bahonar Univ. of Kerman (Iran)

    2009-12-15

    The experimental determination of dew point pressure in a window PVT cell is often difficult, especially in the case of lean retrograde gas condensate. Besides the statistical, graphical and experimental methods, the fuzzy logic method can be useful and more reliable for estimation of reservoir properties. Fuzzy logic can overcome the uncertainty existing in many reservoir properties. Complexity, non-linearity and vagueness are some characteristics of reservoir parameters which can be handled simply by fuzzy logic. The fuzzy logic dew point pressure modeling system used in this study is a multi-input single-output (MISO) Mamdani system. The model was developed using experimentally measured constant volume depletion (CVD) samples from some Iranian fields. The performance of the model is compared against the performance of some of the most accurate and general correlations for dew point pressure calculation. Results show that this novel method is more accurate and reliable, with an average absolute deviation of 1.33% and 2.68% for the developing and checking sets, respectively. (orig.)

  16. A comparative study between the use of artificial neural networks and multiple linear regression for caustic concentration prediction in a stage of alumina production

    Directory of Open Access Journals (Sweden)

    Giovanni Leopoldo Rozza

    2015-09-01

    With the world becoming each day more of a global village, enterprises continuously seek to optimize their internal processes to hold or improve their competitiveness and make better use of natural resources. In this context, decision support tools are an underlying requirement. Such tools are helpful in predicting operational issues, avoiding cost risings, loss of productivity, work-related accident leaves or environmental disasters. This paper focuses on the prediction of the spent liquor caustic concentration in a stage of the Bayer process for alumina production. Caustic concentration measuring is essential to keep it at expected levels, otherwise quality issues might arise. The organization requests caustic concentration from the chemical analysis laboratory once a day; such information is not enough to issue preventive actions to handle process inefficiencies, which will be known only after the new measurement on the next day. Thereby, this paper proposes, using Multiple Linear Regression and Artificial Neural Network techniques, a mathematical model to predict the spent liquor's caustic concentration. Hence preventive actions can occur in real time. Such models were built using a software tool for numerical computation (MATLAB) and a statistical analysis software package (SPSS). The models' outputs (predicted caustic concentration) were compared with the real lab data. We found evidence suggesting superior results with the use of Artificial Neural Networks over the Multiple Linear Regression model. The results demonstrate that replacing laboratory analysis by the forecasting model to support technical staff on decision making could be feasible.

  17. MR Imaging in Monitoring and Predicting Treatment Response in Multiple Sclerosis.

    Science.gov (United States)

    Río, Jordi; Auger, Cristina; Rovira, Àlex

    2017-05-01

    MR imaging is the most sensitive tool for identifying lesions in patients with multiple sclerosis (MS). MR imaging has also acquired an essential role in the detection of complications arising from these treatments and in the assessment and prediction of efficacy. In the future, other radiological measures that have shown prognostic value may be incorporated within the models for predicting treatment response. This article examines the role of MR imaging as a prognostic tool in patients with MS and the recommendations that have been proposed in recent years to monitor patients who are treated with disease-modifying drugs. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Statistical prediction of nanoparticle delivery: from culture media to cell

    Science.gov (United States)

    Rowan Brown, M.; Hondow, Nicole; Brydson, Rik; Rees, Paul; Brown, Andrew P.; Summers, Huw D.

    2015-04-01

    The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase the efficacy of the administered therapeutic dose. However, quantification of the heterogeneity associated with the absolute dose of a nanotherapeutic (NP number), and of how this is trafficked across biological barriers, has proven difficult to achieve. The main issue is the quantitative assessment of NP number at the spatial scale of the individual NP, data which are essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be identified. High spatial resolution, however, reduces the sample frequency, and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provides a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large-population flow cytometry measurements to quantify exact NP dose. Here we extend this concept to link TEM measurements of NP agglomerates in cell culture media to those encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for

  19. Statistical prediction of nanoparticle delivery: from culture media to cell

    International Nuclear Information System (INIS)

    Brown, M Rowan; Rees, Paul; Summers, Huw D; Hondow, Nicole; Brydson, Rik; Brown, Andrew P

    2015-01-01

    The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase the efficacy of the administered therapeutic dose. However, quantification of the heterogeneity associated with the absolute dose of a nanotherapeutic (NP number), and of how this is trafficked across biological barriers, has proven difficult to achieve. The main issue is the quantitative assessment of NP number at the spatial scale of the individual NP, data which are essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be identified. High spatial resolution, however, reduces the sample frequency, and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provides a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large-population flow cytometry measurements to quantify exact NP dose. Here we extend this concept to link TEM measurements of NP agglomerates in cell culture media to those encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for

  20. New England observed and predicted Julian day of maximum growing season stream/river temperature points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted Julian day of maximum growing season stream/river temperatures in New England based on a spatial...

  1. Predicting Fuel Ignition Quality Using 1H NMR Spectroscopy and Multiple Linear Regression

    KAUST Repository

    Abdul Jameel, Abdul Gani; Naser, Nimal; Emwas, Abdul-Hamid M.; Dooley, Stephen; Sarathy, Mani

    2016-01-01

    An improved model for the prediction of ignition quality of hydrocarbon fuels has been developed using 1H nuclear magnetic resonance (NMR) spectroscopy and multiple linear regression (MLR) modeling. Cetane number (CN) and derived cetane number (DCN

  2. Sensitivity of point scale surface runoff predictions to rainfall resolution

    Directory of Open Access Journals (Sweden)

    A. J. Hearman

    2007-01-01

    Full Text Available This paper investigates the effects of using non-linear, high resolution rainfall, compared to time averaged rainfall, on the triggering of hydrologic thresholds and therefore model predictions of infiltration excess and saturation excess runoff at the point scale. The bounded random cascade model, parameterized to three locations in Western Australia, was used to scale rainfall intensities at various time resolutions ranging from 1.875 min to 2 h. A one dimensional, conceptual rainfall partitioning model was used that instantaneously partitioned water into infiltration excess, infiltration, storage, deep drainage, saturation excess and surface runoff, where the fluxes into and out of the soil store were controlled by thresholds. The results of the numerical modelling were scaled by relating soil infiltration properties to soil draining properties, and in turn, relating these to average storm intensities. For all soil types, we related maximum infiltration capacities to average storm intensities (k*) and were able to show where model predictions of infiltration excess were most sensitive to rainfall resolution (ln k* = 0.4) and where using time averaged rainfall data can lead to an under-prediction of infiltration excess and an over-prediction of the amount of water entering the soil (ln k* > 2) for all three rainfall locations tested. For soils susceptible to both infiltration excess and saturation excess, total runoff sensitivity was scaled by relating drainage coefficients to average storm intensities (g*), and parameter ranges where predicted runoff was dominated by infiltration excess or saturation excess, depending on the resolution of rainfall data, were determined (ln g* < 2). Infiltration excess predicted from high resolution rainfall was short and intense, whereas saturation excess produced from low resolution rainfall was more constant and less intense. This has important implications for the accuracy of current hydrological models that use time
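    A minimal sketch of the kind of threshold-controlled partitioning model this record describes, under illustrative assumptions: the parameter names and values (k_max, store_cap, g) and the synthetic storm are not the authors' calibration. It also shows how time-averaging the same storm reduces the predicted infiltration excess:

```python
# Threshold-based rainfall partitioning sketch; all names/values illustrative.
import numpy as np

def partition_rainfall(rain, dt, k_max=10.0, store_cap=50.0, g=2.0):
    """Split a rainfall series (mm/h) into infiltration-excess and
    saturation-excess runoff (mm) using simple thresholds.

    k_max     -- maximum infiltration capacity (mm/h)
    store_cap -- soil storage capacity (mm)
    g         -- drainage coefficient (mm/h)
    """
    store = 0.0
    inf_excess, sat_excess = np.zeros_like(rain), np.zeros_like(rain)
    for i, r in enumerate(rain):
        infil = min(r, k_max)             # infiltration-excess threshold
        inf_excess[i] = (r - infil) * dt
        store += infil * dt
        store = max(store - g * dt, 0.0)  # deep drainage out of the store
        if store > store_cap:             # saturation-excess threshold
            sat_excess[i] = store - store_cap
            store = store_cap
    return inf_excess, sat_excess

# Same storm at two resolutions: averaging intensities hides short bursts
fine = np.array([0.0, 30.0, 0.0, 2.0, 25.0, 0.0, 0.0, 3.0])  # mm/h, 15-min steps
coarse = np.repeat(fine.reshape(-1, 2).mean(axis=1), 2)      # 30-min averages
for label, series in (("fine", fine), ("coarse", coarse)):
    ie, se = partition_rainfall(series, dt=0.25)
    print(label, "infiltration excess =", round(ie.sum(), 2), "mm")
```

    With these numbers, the averaged series produces markedly less infiltration excess than the high-resolution series, which is the qualitative effect the paper quantifies.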

  3. Artificial neural network optimisation for monthly average daily global solar radiation prediction

    International Nuclear Information System (INIS)

    Alsina, Emanuel Federico; Bortolini, Marco; Gamberi, Mauro; Regattieri, Alberto

    2016-01-01

    Highlights: • Prediction of the monthly average daily global solar radiation over Italy. • Multi-location Artificial Neural Network (ANN) model: 45 locations considered. • Optimal ANN configuration with 7 input climatologic/geographical parameters. • Statistical indicators: MAPE, NRMSE, MPBE. - Abstract: The availability of reliable climatologic data is essential for multiple purposes in a wide set of anthropic activities and operative sectors. Frequently, direct measures present spatial and temporal gaps, so that predictive approaches become of interest. This paper focuses on the prediction of the Monthly Average Daily Global Solar Radiation (MADGSR) over Italy using Artificial Neural Networks (ANNs). Data from 45 locations compose the multi-location ANN training and testing sets. For each location, 13 input parameters are considered, including the geographical coordinates and the monthly values of the most frequently adopted climatologic parameters. A subset of 17 locations is used for ANN training, while the testing step is against data from the remaining 28 locations. Furthermore, the Automatic Relevance Determination (ARD) method is used to point out the most relevant inputs for accurate MADGSR prediction. The best ANN configuration includes only 7 parameters: Top of Atmosphere (TOA) radiation, day length, number of rainy days, average rainfall, latitude and altitude. The correlation performances, expressed through statistical indicators such as the Mean Absolute Percentage Error (MAPE), range between 1.67% and 4.25%, depending on the number and type of the chosen inputs, representing a good result compared with current standards.
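    A hedged sketch of this kind of multi-location ANN regression using scikit-learn's MLPRegressor. Six of the seven input names follow the abstract; the network size, the synthetic data and the MAPE computation are illustrative assumptions, not the authors' configuration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Six of the seven inputs named in the abstract; synthetic data stand in for
# the real climatologic records of the 45 Italian locations.
FEATURES = ["toa_radiation", "day_length", "rainy_days",
            "avg_rainfall", "latitude", "altitude"]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(17 * 12, len(FEATURES)))  # 17 training locations
y_train = rng.uniform(5, 25, 17 * 12)                # monthly MADGSR values
X_test = rng.normal(size=(28 * 12, len(FEATURES)))   # 28 testing locations
y_test = rng.uniform(5, 25, 28 * 12)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
mape = 100.0 * np.mean(np.abs((y_test - pred) / y_test))
print(f"MAPE = {mape:.2f}%")  # the paper reports 1.67-4.25% on real data
```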

  4. Nuclear multifragmentation within the framework of different statistical ensembles

    International Nuclear Information System (INIS)

    Aguiar, C.E.; Donangelo, R.; Souza, S.R.

    2006-01-01

    The sensitivity of the statistical multifragmentation model to the underlying statistical assumptions is investigated. We concentrate on its microcanonical, canonical, and isobaric formulations. As far as average values are concerned, our results reveal that all the ensembles make very similar predictions, as long as the relevant macroscopic variables (such as temperature, excitation energy, and breakup volume) are the same in all statistical ensembles. It also turns out that the multiplicity dependence of the breakup volume in the microcanonical version of the model mimics a system at (approximately) constant pressure, at least in the plateau region of the caloric curve. However, in contrast to average values, our results suggest that the distributions of physical observables are quite sensitive to the statistical assumptions. This finding may help in deciding which hypothesis corresponds to the best picture for the freeze-out stage.

  5. Bioluminescence in vivo imaging of autoimmune encephalomyelitis predicts disease

    Directory of Open Access Journals (Sweden)

    Steinman Lawrence

    2008-02-01

    Full Text Available Background: Experimental autoimmune encephalomyelitis is a widely used animal model to understand not only multiple sclerosis but also basic principles of immunity. The disease is scored typically by observing signs of paralysis, which do not always correspond with pathological changes. Methods: Experimental autoimmune encephalomyelitis was induced in transgenic mice expressing an injury-responsive luciferase reporter in astrocytes (GFAP-luc). Bioluminescence in the brain and spinal cord was measured non-invasively in living mice. Mice were sacrificed at different time points to evaluate clinical and pathological changes. The correlation between bioluminescence and clinical and pathological EAE was statistically analyzed by Pearson correlation analysis. Results: Bioluminescence from the brain and spinal cord correlates strongly with severity of clinical disease and a number of pathological changes in the brain in EAE. Bioluminescence at early time points also predicts severity of disease. Conclusion: These results highlight the potential use of bioluminescence imaging to monitor neuroinflammation for rapid drug screening and immunological studies in EAE and suggest that similar approaches could be applied to other animal models of autoimmune and inflammatory disorders.

  6. Evaluation of burst pressure prediction models for line pipes

    Energy Technology Data Exchange (ETDEWEB)

    Zhu, Xian-Kui, E-mail: zhux@battelle.org [Battelle Memorial Institute, 505 King Avenue, Columbus, OH 43201 (United States); Leis, Brian N. [Battelle Memorial Institute, 505 King Avenue, Columbus, OH 43201 (United States)

    2012-01-15

    Accurate prediction of burst pressure plays a central role in engineering design and integrity assessment of oil and gas pipelines. Theoretical and empirical solutions for such prediction are evaluated in this paper relative to a burst pressure database comprising more than 100 tests covering a variety of pipeline steel grades and pipe sizes. Solutions considered include three based on plasticity theory for the end-capped, thin-walled, defect-free line pipe subjected to internal pressure in terms of the Tresca, von Mises, and ZL (or Zhu-Leis) criteria, one based on a cylindrical instability stress (CIS) concept, and a large group of analytical and empirical models previously evaluated by Law and Bowie (International Journal of Pressure Vessels and Piping, 84, 2007: 487-492). It is found that these models can be categorized into either a Tresca family or a von Mises family of solutions, except for the Margetson and Zhu-Leis models. The viability of predictions is measured via statistical analyses in terms of a mean error and its standard deviation. Consistent with an independent parallel evaluation using another large database, the Zhu-Leis solution is found best for predicting burst pressure, including consideration of strain hardening effects, while the Tresca strength solutions, including Barlow, maximum shear stress, Turner, and the ASME boiler code, provide reasonably good predictions for the class of line-pipe steels with intermediate strain hardening response. - Highlights: • This paper evaluates different burst pressure prediction models for line pipes. • The existing models are categorized into two major groups of Tresca and von Mises solutions. • Prediction quality of each model is assessed statistically using a large full-scale burst test database. • The Zhu-Leis solution is identified as the best predictive model.
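    The flow-theory solutions named above can all be written in the closed form P = (4t/D) k^(n+1) σ_uts, where k depends on the yield criterion and n is the strain-hardening exponent. A hedged sketch follows: the Tresca and von Mises factors are standard, while the Zhu-Leis factor is taken here as the average of the other two (the average shear stress idea); consult the cited paper before using any of this in assessment work, and the pipe dimensions below are illustrative only:

```python
import math

def burst_pressure(d_outer, t, sigma_uts, n, criterion="zhu-leis"):
    """Burst pressure (same units as sigma_uts) of a defect-free, end-capped,
    thin-walled pipe under internal pressure.

    d_outer   -- outside diameter
    t         -- wall thickness
    sigma_uts -- ultimate tensile strength
    n         -- strain-hardening exponent of the pipe steel
    """
    factors = {"tresca": 0.5,
               "von-mises": 1.0 / math.sqrt(3.0),
               # average of the Tresca and von Mises yield factors (assumed
               # form of the average shear stress criterion)
               "zhu-leis": 0.5 * (0.5 + 1.0 / math.sqrt(3.0))}
    k = factors[criterion]
    return (4.0 * t / d_outer) * k ** (n + 1.0) * sigma_uts

# X70-like line pipe; numbers are illustrative, not from the database
for c in ("tresca", "zhu-leis", "von-mises"):
    p = burst_pressure(d_outer=610.0, t=12.7, sigma_uts=570.0, n=0.08, criterion=c)
    print(f"{c:>9}: {p:6.1f} MPa")
```

    As expected from the paper's grouping, the Tresca solution gives the lowest (most conservative) prediction, von Mises the highest, and Zhu-Leis an intermediate value.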

  7. Latitude-energy structure of multiple ion beamlets in Polar/TIMAS data in plasma sheet boundary layer and boundary plasma sheet below 6 RE radial distance: basic properties and statistical analysis

    Directory of Open Access Journals (Sweden)

    P. Janhunen

    2005-03-01

    Full Text Available Velocity dispersed ion signatures (VDIS) occurring at the plasma sheet boundary layer (PSBL) are a well-reported feature. Theory has, however, predicted the existence of multiple ion beamlets, similar to VDIS, in the boundary plasma sheet (BPS), i.e. at latitudes below the PSBL. In this study we show evidence for the multiple ion beamlets in Polar/TIMAS ion data and present their basic properties. Statistics of the occurrence frequency of multiple ion beamlets show that they are most common in the midnight MLT sector and at altitudes above 4 RE, while at low altitude (≤3 RE), single beamlets at the PSBL (VDIS) are more common. Distribution functions of ion beamlets in velocity space have recently been shown to correspond to 3-dimensional hollow spheres, containing a large amount of free energy. We also study the correlation with ~100 Hz waves and electron anisotropies and consider the possibility that ion beamlets correspond to stable auroral arcs.

  8. Detection of uterine MMG contractions using a multiple change point estimator and the K-means cluster algorithm.

    Science.gov (United States)

    La Rosa, Patricio S; Nehorai, Arye; Eswaran, Hari; Lowery, Curtis L; Preissl, Hubert

    2008-02-01

    We propose a single-channel, two-stage time-segment discriminator of uterine magnetomyogram (MMG) contractions during pregnancy. We assume that the preprocessed signals are piecewise stationary, having a distribution in a common family with a fixed number of parameters. At the first stage, therefore, we propose a model-based segmentation procedure which detects multiple change points in the parameters of a piecewise-constant time-varying autoregressive model, using a robust formulation of the Schwarz information criterion (SIC) and a binary search approach. In particular, we propose a test statistic that depends on the SIC, derive its asymptotic distribution, and obtain closed-form optimal detection thresholds in the sense of the Neyman-Pearson criterion; we thereby control the probability of false alarm and maximize the probability of change-point detection in each stage of the binary search algorithm. We compute and evaluate the relative energy variation [root mean squares (RMS)] and the dominant frequency component [first-order zero crossing (FOZC)] in discriminating between time segments with and without contractions. The former consistently detects time segments with contractions. Thus, at the second stage, we apply a non-supervised K-means cluster algorithm to classify the detected time segments using the RMS values. We apply our detection algorithm to real MMG records obtained from ten patients admitted to hospital for contractions, with gestational ages between 31 and 40 weeks. We evaluate the performance of our detection algorithm by computing the detection and false-alarm rates, using the patients' feedback as a reference. We also analyze the fusion of the decision signals from all the sensors, as in the parallel distributed detection approach.
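    A simplified sketch of the two-stage idea: SIC-driven binary segmentation, followed by K-means clustering of per-segment RMS values. For compactness the segmentation below assumes a piecewise-constant-variance Gaussian model rather than the paper's time-varying autoregressive model and Neyman-Pearson thresholds; the data and penalty are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def sic(x):
    """Schwarz information criterion of one zero-mean Gaussian segment."""
    n = len(x)
    return n * np.log(np.var(x) + 1e-12) + np.log(n)

def find_change_points(x, start=0, min_len=50, out=None):
    """Recursive binary segmentation: split where the SIC improves most."""
    out = [] if out is None else out
    n = len(x)
    if n <= 2 * min_len:
        return out
    splits = range(min_len, n - min_len)
    scores = [sic(x[:k]) + sic(x[k:]) for k in splits]
    k = splits[int(np.argmin(scores))]
    if min(scores) + np.log(n) < sic(x):   # SIC drop => accept change point
        find_change_points(x[:k], start, min_len, out)
        out.append(start + k)
        find_change_points(x[k:], start + k, min_len, out)
    return out

# Synthetic signal: alternating quiet and high-energy (contraction-like) parts
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, s, 400) for s in (1.0, 4.0, 1.0, 5.0)])

cps = find_change_points(x)
segments = np.split(x, cps)
rms = np.array([np.sqrt(np.mean(s ** 2)) for s in segments]).reshape(-1, 1)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(rms)
print("change points:", cps)
print("segment RMS:", rms.ravel().round(2), "-> cluster labels:", labels)
```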

  9. Forecasting winds over nuclear power plants statistics

    International Nuclear Information System (INIS)

    Marais, Ch.

    1997-01-01

    In the event of an accident at a nuclear power plant, it is essential to forecast the wind velocity at the level where the efflux occurs (about 100 m). At present, meteorologists refine the wind forecast from the coarse grid of numerical weather prediction (NWP) models. The purpose of this study is to improve the forecasts by developing a statistical adaptation method which corrects the NWP forecasts using statistical comparisons between wind forecasts and observations. The Multiple Linear Regression method is used here to forecast the 100 m wind at 12 and 24 hour range for three Electricite de France (EDF) sites. It turns out that this approach gives better forecasts than the NWP model alone and is worthy of operational use. (author)
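    A minimal sketch of this kind of statistical adaptation: a multiple linear regression corrects the raw NWP 100 m wind forecast using past forecast/observation pairs. The predictors (raw wind speed plus wind-direction components) and the synthetic data are illustrative assumptions, not the EDF setup:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
nwp_wind = rng.uniform(2, 20, n)   # raw NWP forecast of the 100 m wind (m/s)
nwp_dir = rng.uniform(0, 360, n)   # forecast wind direction (degrees)
# Synthetic "observed" wind with a systematic model error to be corrected
obs = 0.85 * nwp_wind + 1.2 + rng.normal(0, 1.0, n)

X = np.column_stack([nwp_wind,
                     np.sin(np.radians(nwp_dir)),
                     np.cos(np.radians(nwp_dir))])
mlr = LinearRegression().fit(X[:400], obs[:400])   # training period

corrected = mlr.predict(X[400:])                   # independent test period
raw_rmse = np.sqrt(np.mean((nwp_wind[400:] - obs[400:]) ** 2))
mos_rmse = np.sqrt(np.mean((corrected - obs[400:]) ** 2))
print(f"raw NWP RMSE: {raw_rmse:.2f} m/s, MLR-adapted RMSE: {mos_rmse:.2f} m/s")
```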

  10. Statistical modelling of space-time processes with application to wind power

    DEFF Research Database (Denmark)

    Lenzi, Amanda

    This thesis aims at contributing to the wind power literature by building and evaluating new statistical techniques for producing forecasts at multiple locations and lead times using spatio-temporal information. By exploring the features of a rich portfolio of wind farms in western Denmark, we investigate ... propose spatial models for predicting wind power generation at two different time scales: for annual average wind power generation and for a high temporal resolution (typically wind power averages over 15-min time steps). In both cases, we use a spatial hierarchical statistical model in which spatial...

  11. Wind gust estimation by combining numerical weather prediction model and statistical post-processing

    Science.gov (United States)

    Patlakas, Platon; Drakaki, Eleni; Galanis, George; Spyrou, Christos; Kallos, George

    2017-04-01

    The continuous rise of off-shore and near-shore activities as well as the development of structures, such as wind farms and various offshore platforms, requires the employment of state-of-the-art risk assessment techniques. Such analysis is used to set the safety standards and can be characterized as a climatologically oriented approach. Nevertheless, reliable operational support is also needed in order to minimize cost drawbacks and human danger during the construction and functioning stages, as well as during maintenance activities. One of the most important parameters for this kind of analysis is wind speed intensity and variability. A critical measure associated with this variability is the presence and magnitude of wind gusts, as estimated at the reference level of 10 m. The latter can be attributed to different processes ranging from boundary-layer turbulence to convective activity, mountain waves and wake phenomena. The purpose of this work is the development of a wind gust forecasting methodology combining a numerical weather prediction model and a dynamical statistical tool based on Kalman filtering. To this end, the Wind Gust Estimate parameterization was implemented within the framework of the atmospheric model SKIRON/Dust. The new modeling tool combines the atmospheric model with a statistical local adaptation methodology based on Kalman filters, and has been tested over the offshore west coastline of the United States. The main purpose is to provide a useful tool for wind analysis and prediction and for applications related to offshore wind energy (power prediction, operation and maintenance). The results have been evaluated using observational data from NOAA's buoy network; the predicted output shows good behavior that is further improved by the local adjustment post-processing.
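    A hedged sketch of Kalman-filter post-processing in this spirit: a scalar random-walk state tracks the slowly varying systematic error of the gust forecast and is subtracted from each new forecast. The noise variances q and r, and the synthetic data, are illustrative assumptions:

```python
import numpy as np

def kalman_bias_correction(forecasts, observations, q=0.01, r=1.0):
    """Return bias-corrected forecasts; the bias follows a random walk.

    q -- process noise variance of the bias state
    r -- observation noise variance of the forecast error
    """
    bias, p = 0.0, 1.0                 # bias estimate and its variance
    corrected = np.empty_like(forecasts)
    for t, (f, y) in enumerate(zip(forecasts, observations)):
        corrected[t] = f - bias        # correct before seeing today's obs
        p += q                         # predict step (random-walk state)
        k = p / (p + r)                # Kalman gain
        bias += k * ((f - y) - bias)   # update with the newest error
        p *= 1.0 - k
    return corrected

rng = np.random.default_rng(2)
truth = 15 + 5 * np.sin(np.linspace(0, 8, 200))       # "observed" gusts (m/s)
fcst = truth + 2.5 + rng.normal(0, 1, 200)            # model with +2.5 m/s bias
corr = kalman_bias_correction(fcst, truth)
print("raw bias:", np.mean(fcst - truth).round(2),
      "corrected bias:", np.mean((corr - truth)[20:]).round(2))
```

    Because the state is updated recursively, the correction adapts if the model bias drifts with season or synoptic regime, which is the appeal of this approach over a fixed regression.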

  12. Factors That Predict Marijuana Use and Grade Point Average among Undergraduate College Students

    Science.gov (United States)

    Coco, Marlena B.

    2017-01-01

    The purpose of this study was to analyze factors that predict marijuana use and grade point average among undergraduate college students using the Core Institute national database. The Core Alcohol and Drug Survey was used to collect data on students' attitudes, beliefs, and experiences related to substance use in college. The sample used in this…

  13. Multiple Positive Solutions of a Nonlinear Four-Point Singular Boundary Value Problem with a p-Laplacian Operator on Time Scales

    Directory of Open Access Journals (Sweden)

    Shihuang Hong

    2009-01-01

    Full Text Available We present sufficient conditions for the existence of at least twin or triple positive solutions of a nonlinear four-point singular boundary value problem with a p-Laplacian dynamic equation on a time scale. Our results are obtained via some new multiple fixed point theorems.

  14. Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

    Directory of Open Access Journals (Sweden)

    Fabrizio Pucci

    Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm values of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures; the difference of these energies gives a prediction of the melting temperature. This particular construction, which distinguishes the interactions that contribute most to stability at high temperatures from those that are more stabilizing at low T, gives better performance than the standard approach based on T-independent potentials, which predicts thermal resistance from thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm values is 13.6°C in cross-validation, and decreases to 8.3°C if the 6 worst-predicted proteins are excluded. Possible extensions of our approach are discussed.

  15. A multiple model approach to respiratory motion prediction for real-time IGRT

    International Nuclear Information System (INIS)

    Putra, Devi; Haas, Olivier C L; Burnham, Keith J; Mills, John A

    2008-01-01

    Respiration induces significant movement of tumours in the vicinity of thoracic and abdominal structures. Real-time image-guided radiotherapy (IGRT) aims to adapt radiation delivery to tumour motion during irradiation. One of the main problems in achieving this objective is the time lag between the acquisition of the tumour position and the radiation delivery. Such time lag causes significant beam positioning errors and affects the dose coverage. A method to solve this problem is to employ an algorithm that can predict future tumour positions from available tumour position measurements. This paper presents a multiple model approach to respiratory-induced tumour motion prediction using the interacting multiple model (IMM) filter. A combination of two models, constant velocity (CV) and constant acceleration (CA), is used to capture respiratory-induced tumour motion. A Kalman filter is designed for each of the local models, and the IMM filter is applied to combine the predictions of these Kalman filters to obtain the predicted tumour position. The IMM filter, like the Kalman filter, is a recursive algorithm suitable for real-time applications. In addition, this paper proposes a confidence interval (CI) criterion to evaluate the performance of tumour motion prediction algorithms for IGRT. The proposed CI criterion provides a relevant measure of prediction performance in terms of clinical applications and can be used to specify the margin to accommodate prediction errors. The prediction performance of the IMM filter has been evaluated using 110 traces of 4-minute free-breathing motion collected from 24 lung-cancer patients. The simulation study was carried out for prediction times of 0.1-0.6 s with sampling rates of 3, 5 and 10 Hz. It was found that the prediction of the IMM filter was consistently better than the prediction of the Kalman filter with the CV or CA model. There was no significant difference of prediction errors for the
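    A hedged sketch of an IMM predictor of this type for one-dimensional tumour position. To keep it compact, the CV and CA models share a single 3-state space [position, velocity, acceleration], with the CV transition zeroing the acceleration; the mode transition matrix, noise levels and the two-step lookahead are illustrative assumptions, not the paper's tuning:

```python
import numpy as np

dt = 0.2                                        # 5 Hz sampling
F_cv = np.array([[1.0, dt, 0.0],                # constant velocity: a -> 0
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 0.0]])
F_ca = np.array([[1.0, dt, 0.5 * dt ** 2],      # constant acceleration
                 [0.0, 1.0, dt],
                 [0.0, 0.0, 1.0]])
F = [F_cv, F_ca]
Q = [np.diag([1e-4, 1e-3, 1e-6]),               # process noise, CV model
     np.diag([1e-4, 1e-3, 1e-2])]               # process noise, CA model
H = np.array([[1.0, 0.0, 0.0]])                 # only position is measured
R = 0.05                                        # measurement noise variance
P_trans = np.array([[0.95, 0.05],               # mode transition probabilities
                    [0.05, 0.95]])

def kalman(x, P, F_m, Q_m, z):
    """One predict/update step; returns state, covariance and likelihood."""
    x = F_m @ x
    P = F_m @ P @ F_m.T + Q_m
    r = z - (H @ x)[0]                          # innovation
    S = (H @ P @ H.T)[0, 0] + R
    K = (P @ H.T)[:, 0] / S                     # Kalman gain
    x = x + K * r
    P = (np.eye(3) - np.outer(K, H[0])) @ P
    lik = np.exp(-0.5 * r ** 2 / S) / np.sqrt(2 * np.pi * S)
    return x, P, lik

def imm_step(xs, Ps, mu, z):
    c = P_trans.T @ mu                          # predicted mode probabilities
    w = (P_trans * mu[:, None]) / c[None, :]    # mixing weights
    x_mix = [w[0, j] * xs[0] + w[1, j] * xs[1] for j in range(2)]
    P_mix = [sum(w[i, j] * (Ps[i] + np.outer(xs[i] - x_mix[j], xs[i] - x_mix[j]))
                 for i in range(2)) for j in range(2)]
    out = [kalman(x_mix[j], P_mix[j], F[j], Q[j], z) for j in range(2)]
    xs, Ps = [o[0] for o in out], [o[1] for o in out]
    mu = c * np.array([o[2] for o in out]) + 1e-12
    mu /= mu.sum()                              # updated mode probabilities
    return xs, Ps, mu, mu[0] * xs[0] + mu[1] * xs[1]

rng = np.random.default_rng(3)
t = np.arange(0.0, 60.0, dt)
truth = 1.2 * np.sin(2 * np.pi * t / 4.0)       # ~4 s breathing period (cm)
meas = truth + rng.normal(0.0, np.sqrt(R), t.size)

xs, Ps, mu = [np.zeros(3), np.zeros(3)], [np.eye(3), np.eye(3)], np.array([0.5, 0.5])
pred = np.zeros_like(t)
for k, z in enumerate(meas):
    xs, Ps, mu, x_hat = imm_step(xs, Ps, mu, z)
    pred[k] = (F_ca @ (F_ca @ x_hat))[0]        # 0.4 s (two-step) lookahead
print("0.4 s-ahead RMSE: %.3f cm" % np.sqrt(np.mean((pred[:-2] - truth[2:]) ** 2)))
```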

  16. Handbook of Spatial Statistics

    CERN Document Server

    Gelfand, Alan E

    2010-01-01

    Offers an introduction detailing the evolution of the field of spatial statistics. This title focuses on the three main branches of spatial statistics: continuous spatial variation (point referenced data); discrete spatial variation, including lattice and areal unit data; and, spatial point patterns.

  17. Analyzing Statistical Mediation with Multiple Informants: A New Approach with an Application in Clinical Psychology

    Directory of Open Access Journals (Sweden)

    Lesther Papa

    2015-11-01

    Full Text Available Testing mediation models is critical for identifying potential variables that need to be targeted to effectively change one or more outcome variables. In addition, it is now common practice for clinicians to use multiple informant (MI) data in studies of statistical mediation. By coupling the use of MI data with statistical mediation analysis, clinical researchers can combine the benefits of both techniques. Integrating the information from MIs into a statistical mediation model creates various methodological and practical challenges. The authors review prior methodological approaches to MI mediation analysis in clinical research and propose a new latent variable approach that overcomes some limitations of prior approaches. An application of the new approach to mother, father, and child reports of impulsivity, frustration tolerance, and externalizing problems (N = 454) is presented. The results showed that frustration tolerance mediated the relationship between impulsivity and externalizing problems. Advantages and limitations of the new approach are discussed. The new approach can help clinical researchers overcome limitations of prior techniques. It allows for a more comprehensive and effective use of MI data when testing mediation models.
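    The latent-variable multiple-informant model itself calls for SEM software; the sketch below shows only the core mediation logic the record builds on, i.e. the indirect effect a·b of impulsivity on externalizing problems through frustration tolerance, with a percentile bootstrap confidence interval. The data are simulated, and the single-informant setup is a deliberate simplification:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 454  # sample size matching the abstract; data themselves are synthetic
impulsivity = rng.normal(size=n)
frustration = 0.5 * impulsivity + rng.normal(size=n)                      # path a
externalizing = 0.4 * frustration + 0.1 * impulsivity + rng.normal(size=n)  # b, c'

def indirect_effect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                   # slope of m ~ x
    X = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]  # slope of y ~ m, adjusting for x
    return a * b

boots = []
for _ in range(2000):                            # percentile bootstrap
    idx = rng.integers(0, n, n)
    boots.append(indirect_effect(impulsivity[idx], frustration[idx],
                                 externalizing[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
est = indirect_effect(impulsivity, frustration, externalizing)
print(f"indirect effect = {est:.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```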

  18. Hydrogen-bond coordination in organic crystal structures: statistics, predictions and applications.

    Science.gov (United States)

    Galek, Peter T A; Chisholm, James A; Pidcock, Elna; Wood, Peter A

    2014-02-01

    Statistical models to predict the number of hydrogen bonds that might be formed by any donor or acceptor atom in a crystal structure have been derived using organic structures in the Cambridge Structural Database. This hydrogen-bond coordination behaviour has been defined for more than 70 unique atom types and has led to the development of a methodology for constructing hypothetical hydrogen-bond arrangements. Comparing the constructed hydrogen-bond arrangements with known crystal structures shows promise in the assessment of structural stability, and some initial examples of industrially relevant polymorphs, co-crystals and hydrates are described.

  19. Validity of a simple Internet-based outcome-prediction tool in patients with total hip replacement: a pilot study.

    Science.gov (United States)

    Stöckli, Cornel; Theiler, Robert; Sidelnikov, Eduard; Balsiger, Maria; Ferrari, Stephen M; Buchzig, Beatus; Uehlinger, Kurt; Riniker, Christoph; Bischoff-Ferrari, Heike A

    2014-04-01

    We developed a user-friendly Internet-based tool for patients undergoing total hip replacement (THR) due to osteoarthritis to predict their pain and function after surgery. In the first step, the key questions were identified by statistical modelling in a data set of 375 patients undergoing THR. Based on multiple regression, we identified the two most predictive WOMAC questions for pain and the three most predictive WOMAC questions for functional outcome, while controlling for comorbidity, body mass index, age, gender and specific comorbidities relevant to the outcome. In the second step, a pilot study was performed to validate the resulting tool against the full WOMAC questionnaire among 108 patients undergoing THR. The mean difference between observed (WOMAC) and model-predicted value was -1.1 points (95% confidence interval, CI -3.8, 1.5) for pain and -2.5 points (95% CI -5.3, 0.3) for function. The model-predicted value was within 20% of the observed value in 48% of cases for pain and in 57% of cases for function. The tool demonstrated moderate validity, but performed weakly for patients with extreme levels of pain and extreme functional limitations at 3 months post surgery. This may have been partly due to early complications after surgery. However, the outcome-prediction tool may be useful in helping patients to become better informed about the realistic outcome of their THR.

  20. Towards personalized therapy for multiple sclerosis: prediction of individual treatment response.

    Science.gov (United States)

    Kalincik, Tomas; Manouchehrinia, Ali; Sobisek, Lukas; Jokubaitis, Vilija; Spelman, Tim; Horakova, Dana; Havrdova, Eva; Trojano, Maria; Izquierdo, Guillermo; Lugaresi, Alessandra; Girard, Marc; Prat, Alexandre; Duquette, Pierre; Grammond, Pierre; Sola, Patrizia; Hupperts, Raymond; Grand'Maison, Francois; Pucci, Eugenio; Boz, Cavit; Alroughani, Raed; Van Pesch, Vincent; Lechner-Scott, Jeannette; Terzi, Murat; Bergamaschi, Roberto; Iuliano, Gerardo; Granella, Franco; Spitaleri, Daniele; Shaygannejad, Vahid; Oreja-Guevara, Celia; Slee, Mark; Ampapa, Radek; Verheul, Freek; McCombe, Pamela; Olascoaga, Javier; Amato, Maria Pia; Vucic, Steve; Hodgkinson, Suzanne; Ramo-Tello, Cristina; Flechter, Shlomo; Cristiano, Edgardo; Rozsa, Csilla; Moore, Fraser; Luis Sanchez-Menoyo, Jose; Laura Saladino, Maria; Barnett, Michael; Hillert, Jan; Butzkueven, Helmut

    2017-09-01

    Timely initiation of effective therapy is crucial for preventing disability in multiple sclerosis; however, treatment response varies greatly among patients. Comprehensive predictive models of individual treatment response are lacking. Our aims were: (i) to develop predictive algorithms for individual treatment response using demographic, clinical and paraclinical predictors in patients with multiple sclerosis; and (ii) to evaluate accuracy, and internal and external validity of these algorithms. This study evaluated 27 demographic, clinical and paraclinical predictors of individual response to seven disease-modifying therapies in MSBase, a large global cohort study. Treatment response was analysed separately for disability progression, disability regression, relapse frequency, conversion to secondary progressive disease, change in the cumulative disease burden, and the probability of treatment discontinuation. Multivariable survival and generalized linear models were used, together with the principal component analysis to reduce model dimensionality and prevent overparameterization. Accuracy of the individual prediction was tested and its internal validity was evaluated in a separate, non-overlapping cohort. External validity was evaluated in a geographically distinct cohort, the Swedish Multiple Sclerosis Registry. In the training cohort (n = 8513), the most prominent modifiers of treatment response comprised age, disease duration, disease course, previous relapse activity, disability, predominant relapse phenotype and previous therapy. Importantly, the magnitude and direction of the associations varied among therapies and disease outcomes. Higher probability of disability progression during treatment with injectable therapies was predominantly associated with a greater disability at treatment start and the previous therapy. For fingolimod, natalizumab or mitoxantrone, it was mainly associated with lower pretreatment relapse activity. The probability of

  1. Ensemble-based prediction of RNA secondary structures.

    Science.gov (United States)

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  2. Exploration of machine learning techniques in predicting multiple sclerosis disease course.

    Directory of Open Access Journals (Sweden)

    Yijun Zhao

    Full Text Available To explore the value of machine learning methods for predicting multiple sclerosis disease course. 1693 CLIMB study patients were classified as increased EDSS ≥ 1.5 (worsening) or not (non-worsening) at up to five years after baseline visit. Support vector machines (SVM) were used to build the classifier, and compared to logistic regression (LR), using demographic, clinical and MRI data obtained at years one and two to predict EDSS at five years follow-up. Baseline data alone provided little predictive value. Clinical observation for one year improved overall SVM sensitivity to 62% and specificity to 65% in predicting worsening cases. The addition of one year MRI data improved sensitivity to 71% and specificity to 68%. Use of non-uniform misclassification costs in the SVM model, weighting towards increased sensitivity, improved predictions (up to 86%). Sensitivity, specificity, and overall accuracy improved minimally with additional follow-up data. Predictions improved within specific groups defined by baseline EDSS. LR performed more poorly than SVM in most cases. Race, family history of MS, and brain parenchymal fraction ranked highly as predictors of the non-worsening group. Brain T2 lesion volume ranked highly as predictive of the worsening group. SVM incorporating short-term clinical and brain MRI data, class imbalance corrective measures, and classification costs may be a promising means to predict MS disease course, and for selection of patients suitable for more aggressive treatment regimens.
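    A minimal sketch of the cost-sensitive SVM idea in this record: non-uniform misclassification costs are expressed through class weights so that missed "worsening" cases are penalized more heavily. The features, the weight ratio and the data are simulated stand-ins for the clinical and MRI predictors:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

rng = np.random.default_rng(5)
n = 1693                                          # cohort size from the abstract
X = rng.normal(size=(n, 10))                      # stand-in clinical/MRI features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1.5, n) > 1.2).astype(int)  # worsening

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
for weights in (None, {0: 1, 1: 4}):              # uniform vs. cost-sensitive
    clf = SVC(kernel="rbf", class_weight=weights).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    sens = recall_score(y_te, pred)               # sensitivity for "worsening"
    spec = recall_score(y_te, pred, pos_label=0)  # specificity
    print(f"class_weight={weights}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Weighting the positive class trades specificity for sensitivity, which mirrors the paper's use of misclassification costs to favour detection of worsening patients.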

  3. Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

    Science.gov (United States)

    Qi, D.; Majda, A.

    2017-12-01

    A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in the principal model directions with largest variability in high-dimensional turbulent systems and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve optimal model performance. The reduced-order method derives from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections that replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor that characterizes the higher-order moments in a consistent way and improves model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to demonstrate the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. In addition, the reduced-order models are used to capture the crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial statistical quantities, such as the tracer spectrum and the fat tails in the tracer probability density functions at the most important large scales, can be captured efficiently and accurately using the reduced-order tracer model in various dynamical regimes of the flow field with

  4. On the estimation of multiple random integrals and U-statistics

    CERN Document Server

    Major, Péter

    2013-01-01

    This work starts with the study of those limit theorems in probability theory for which classical methods do not work. In many cases some form of linearization can help to solve the problem, because the linearized version is simpler. But in order to apply such a method we have to show that the linearization causes a negligible error. The estimation of this error leads to some important large deviation type problems, and the main subject of this work is their investigation. We provide sharp estimates of the tail distribution of multiple integrals with respect to a normalized empirical measure and so-called degenerate U-statistics and also of the supremum of appropriate classes of such quantities. The proofs apply a number of useful techniques of modern probability that enable us to investigate the non-linear functionals of independent random variables. This lecture note yields insights into these methods, and may also be useful for those who only want some new tools to help them prove limit theorems when stand...

  5. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    Science.gov (United States)

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
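    A sketch of the standard "flipping" trick that lets right-censored Kaplan-Meier software handle the left-censored, multiply-censored concentrations described above: subtracting every value from a constant larger than the maximum turns "below detection limit" values into right-censored ones. This uses the Python lifelines package and illustrative data, not the S-language routines the record describes:

```python
import numpy as np
from lifelines import KaplanMeierFitter

# Concentrations; detected=False marks "<detection limit" values, where the
# reported number is the detection limit (two different limits here).
conc = np.array([0.5, 1.0, 0.5, 2.3, 4.1, 1.0, 7.9, 3.2, 0.8, 5.5])
detected = np.array([False, False, False, True, True,
                     False, True, True, True, True])

flip = conc.max() + 1.0                      # any constant above the maximum
kmf = KaplanMeierFitter()
kmf.fit(flip - conc, event_observed=detected)  # right-censored after flipping

# The survival function of the flipped data is the CDF of the original
# concentrations, so quantiles can be mapped back by the same subtraction.
print("K-M estimate of median concentration:",
      flip - kmf.median_survival_time_)
```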

  6. Seismic response analysis of structural system subjected to multiple support excitation

    International Nuclear Information System (INIS)

    Wu, R.W.; Hussain, F.A.; Liu, L.K.

    1978-01-01

    In the seismic analysis of a multiply supported structural system subjected to nonuniform excitations at each support point, the single response spectrum, the time history, and the multiple response spectrum are the three commonly employed methods. In the present paper the three methods are developed, evaluated, and the limitations and advantages of each method assessed. A numerical example has been carried out for a typical piping system. Considerably smaller responses have been predicted by the time history method than that by the single response spectrum method. This is mainly due to the fact that the phase and amplitude relations between the support excitations are faithfully retained in the time history method. The multiple response spectrum prediction has been observed to compare favourably with the time history method prediction. Based on the present evaluation, the multiple response spectrum method is the most efficient method for seismic response analysis of structural systems subjected to multiple support excitation. (Auth.)

  7. Lagrangian statistics in weakly forced two-dimensional turbulence.

    Science.gov (United States)

    Rivera, Michael K; Ecke, Robert E

    2016-01-01

    Measurements of Lagrangian single-point and multiple-point statistics in a quasi-two-dimensional stratified layer system are reported. The system consists of a layer of salt water over an immiscible layer of Fluorinert and is forced electromagnetically so that mean-squared vorticity is injected at a well-defined spatial scale r_i. Simultaneous cascades develop in which enstrophy flows predominately to small scales whereas energy cascades, on average, to larger scales. Lagrangian correlations and one- and two-point displacements are measured for random initial conditions and for initial positions within topological centers and saddles. Some of the behavior of these quantities can be understood in terms of the trapping characteristics of long-lived centers, the slow motion near strong saddles, and the rapid fluctuations outside of either centers or saddles. We also present statistics of Lagrangian velocity fluctuations using energy spectra in frequency space and structure functions in real space. We compare with complementary Eulerian velocity statistics. We find that simultaneous inverse energy and enstrophy ranges present in spectra are not directly echoed in real-space moments of velocity difference. Nevertheless, the spectral ranges line up well with features of moment ratios, indicating that although the moments are not exhibiting unambiguous scaling, the behavior of the probability distribution functions is changing over short ranges of length scales. Implications for understanding weakly forced 2D turbulence with simultaneous inverse and direct cascades are discussed.

  8. Predicting hearing thresholds and occupational hearing loss with multiple-frequency auditory steady-state responses.

    Science.gov (United States)

    Hsu, Ruey-Fen; Ho, Chi-Kung; Lu, Sheng-Nan; Chen, Shun-Sheng

    2010-10-01

    An objective investigation is needed to verify the existence and severity of hearing impairments resulting from work-related, noise-induced hearing loss in the arbitration of medicolegal cases. We investigated the accuracy of multiple-frequency auditory steady-state responses (Mf-ASSRs) in subjects with sensorineural hearing loss (SNHL) with and without occupational noise exposure. Cross-sectional study. Tertiary referral medical centre. Pure-tone audiometry and Mf-ASSRs were recorded in 88 subjects (34 patients had occupational noise-induced hearing loss [NIHL], 36 patients had SNHL without noise exposure, and 18 volunteers were normal controls). Inter- and intragroup comparisons were made. A predictive equation was derived using multiple linear regression analysis. ASSRs and pure-tone thresholds (PTTs) showed a strong correlation for all subjects (r = 0.77-0.94); the relationship is described by a multiple linear regression equation. The differences between the ASSR and PTT were significantly higher for the NIHL group than for the subjects with non-noise-induced SNHL. Mf-ASSRs are thus a tool for objectively evaluating hearing thresholds. Predictive value may be lower in subjects with occupational hearing loss. Regardless of carrier frequencies, the severity of hearing loss affects the steady-state response. Moreover, the ASSR may assist in detecting noise-induced injury of the auditory pathway. A multiple linear regression equation that accurately predicts thresholds, taking into consideration all effect factors, was presented.

  9. The neurobiology of uncertainty: implications for statistical learning.

    Science.gov (United States)

    Hasson, Uri

    2017-01-05

    The capacity for assessing the degree of uncertainty in the environment relies on estimating statistics of temporally unfolding inputs. This, in turn, allows calibration of predictive and bottom-up processing, and signalling changes in temporally unfolding environmental features. In the last decade, several studies have examined how the brain codes for and responds to input uncertainty. Initial neurobiological experiments implicated frontoparietal and hippocampal systems, based largely on paradigms that manipulated distributional features of visual stimuli. However, later work in the auditory domain pointed to different systems, whose activation profiles have interesting implications for computational and neurobiological models of statistical learning (SL). This review begins by briefly recapping the historical development of ideas pertaining to the sensitivity to uncertainty in temporally unfolding inputs. It then discusses several issues at the interface of studies of uncertainty and SL, and presents several current treatments of the neurobiology of uncertainty, reviewing recent findings that point to principles that serve as important constraints on future neurobiological theories of uncertainty and, relatedly, SL. This review suggests it may be useful to establish closer links between neurobiological research on uncertainty and SL, considering particularly mechanisms sensitive to local and global structure in inputs, the degree of input uncertainty, the complexity of the system generating the input, learning mechanisms that operate on different temporal scales and the use of learnt information for online prediction. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  10. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    Science.gov (United States)

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
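    A hedged sketch of the combined approach named in the title: a multiple linear regression captures the deterministic drivers of (log) bacteria concentrations, and an ARMA model captures the serial dependence left in the residuals. The predictors, model orders and data are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(6)
n = 300
rain, turbidity = rng.gamma(2, 1, n), rng.gamma(3, 1, n)  # stand-in predictors
ar_noise = np.zeros(n)
for t in range(1, n):                        # AR(1) errors: time dependence
    ar_noise[t] = 0.6 * ar_noise[t - 1] + rng.normal(0, 0.5)
log_ecoli = 1.0 + 0.8 * rain + 0.3 * turbidity + ar_noise

X = np.column_stack([rain, turbidity])
mlr = LinearRegression().fit(X[:250], log_ecoli[:250])    # MLR on training part
resid = log_ecoli[:250] - mlr.predict(X[:250])

arma = ARIMA(resid, order=(1, 0, 1)).fit()   # ARMA(1,1) on the MLR residuals
resid_fcst = arma.forecast(steps=50)
combined = mlr.predict(X[250:]) + resid_fcst # MLR signal + ARMA correction
rmse = np.sqrt(np.mean((combined - log_ecoli[250:]) ** 2))
print(f"combined MLR+ARMA RMSE: {rmse:.3f} log units")
```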

  11. Modeling Rabbit Responses to Single and Multiple Aerosol ...

    Science.gov (United States)

    Journal Article Survival models are developed here to predict response and time-to-response for mortality in rabbits following exposures to single or multiple aerosol doses of Bacillus anthracis spores. Hazard function models were developed for a multiple dose dataset to predict the probability of death through specifying dose-response functions and the time between exposure and the time-to-death (TTD). Among the models developed, the best-fitting survival model (baseline model) has an exponential dose-response model with a Weibull TTD distribution. Alternative models assessed employ different underlying dose-response functions and use the assumption that, in a multiple dose scenario, earlier doses affect the hazard functions of each subsequent dose. In addition, published mechanistic models are analyzed and compared with models developed in this paper. None of the alternative models that were assessed provided a statistically significant improvement in fit over the baseline model. The general approach utilizes simple empirical data analysis to develop parsimonious models with limited reliance on mechanistic assumptions. The baseline model predicts TTDs consistent with reported results from three independent high-dose rabbit datasets. More accurate survival models depend upon future development of dose-response datasets specifically designed to assess potential multiple dose effects on response and time-to-response. The process used in this paper to dev

  12. Distributed Model Predictive Control over Multiple Groups of Vehicles in Highway Intelligent Space for Large Scale System

    Directory of Open Access Journals (Sweden)

    Tang Xiaofeng

    2014-01-01

    Full Text Available The paper presents three time-warning distances for the safe driving of multiple groups of vehicles in a highway tunnel environment, based on a distributed model predictive control approach for large-scale systems. Generally speaking, the system includes two parts. First, the vehicles are divided into multiple groups, and the distributed model predictive control approach is proposed to calculate the information framework of each group. The optimization of each group considers both local performance and that of the neighbouring subgroups, which ensures global optimization performance. Second, the three time-warning distances are studied based on the basic principles used for highway intelligent space (HIS), and the information framework concept is proposed for the multiple groups of vehicles. A mathematical model is built to avoid chain collisions between vehicles. The results demonstrate that the proposed highway intelligent space method can effectively ensure the driving safety of multiple groups of vehicles under fog, rain, or snow conditions.

  13. Multiple Model Predictive Hybrid Feedforward Control of Fuel Cell Power Generation System

    Directory of Open Access Journals (Sweden)

    Long Wu

    2018-02-01

    Full Text Available Solid oxide fuel cells (SOFCs) are widely considered an alternative solution within the family of sustainable distributed generation. Their load flexibility enables the power output to be adjusted to meet the requirements of power grid balance. Although promising, SOFC control is challenging in the face of load changes, during which the output voltage must be kept constant and the fuel utilization rate within a safe range. The control problem is made even more intractable by multivariable coupling and strong nonlinearity across wide-range operating conditions. To this end, this paper develops a multiple model predictive control strategy for reliable SOFC operation. The resistance load is regarded as a measurable disturbance, which is fed to the model predictive controller as feedforward compensation. The coupling is accommodated by the receding-horizon optimization. The nonlinearity is mitigated by multiple linear models, whose weighted sum serves as the final control action. The merits of the proposed control structure are demonstrated by the simulation results.

  14. Statistical methods for QTL mapping and genomic prediction of multiple traits and environments: case studies in pepper

    NARCIS (Netherlands)

    Alimi, Nurudeen Adeniyi

    2016-01-01

    In this thesis we describe the results of a number of quantitative techniques that were used to understand the genetics of yield in pepper, as an example of a complex trait measured in a number of environments. The main objectives were: (i) to propose a number of mixed models to detect QTLs for multiple

  15. Subcritical Multiplicative Chaos for Regularized Counting Statistics from Random Matrix Theory

    Science.gov (United States)

    Lambert, Gaultier; Ostrovsky, Dmitry; Simm, Nick

    2018-05-01

    For an N × N Haar distributed random unitary matrix U_N, we consider the random field defined by counting the number of eigenvalues of U_N in a mesoscopic arc centered at the point u on the unit circle. We prove that after regularizing at a small scale ε_N > 0, the renormalized exponential of this field converges as N → ∞ to a Gaussian multiplicative chaos measure in the whole subcritical phase. We discuss implications of this result for obtaining a lower bound on the maximum of the field. We also show that the moments of the total mass converge to a Selberg-like integral and, by taking a further limit as the size of the arc diverges, we establish part of the conjectures in Ostrovsky (Nonlinearity 29(2):426-464, 2016). By an analogous construction, we prove that the multiplicative chaos measure coming from the sine process has the same distribution, which strongly suggests that this limiting object should be universal. Our approach to the L^1-phase is based on a generalization of the construction in Berestycki (Electron Commun Probab 22(27):12, 2017) to random fields which are only asymptotically Gaussian. In particular, our method could have applications to other random fields coming from either random matrix theory or a different context.

  16. Electron teleportation and statistical transmutation in multiterminal Majorana islands

    Science.gov (United States)

    Michaeli, Karen; Landau, L. Aviad; Sela, Eran; Fu, Liang

    2017-11-01

    We study a topological superconductor island with spatially separated Majorana modes coupled to multiple normal-metal leads by single-electron tunneling in the Coulomb blockade regime. We show that low-temperature transport in such a Majorana island is carried by an emergent charge-e boson composed of a Majorana mode and an electronic excitation in leads. This transmutation from Fermi to Bose statistics has remarkable consequences. For noninteracting leads, the system flows to a non-Fermi-liquid fixed point, which is stable against tunnel couplings anisotropy or detuning away from the charge-degeneracy point. As a result, the system exhibits a universal conductance at zero temperature, which is a fraction of the conductance quantum, and low-temperature corrections with a universal power-law exponent. In addition, we consider Majorana islands connected to interacting one-dimensional leads, and find different stable fixed points near and far from the charge-degeneracy point.

  17. Validation of Kepler's multiple planet candidates. II. Refined statistical framework and descriptions of systems of special interest

    International Nuclear Information System (INIS)

    Lissauer, Jack J.; Bryson, Stephen T.; Rowe, Jason F.; Jontof-Hutter, Daniel; Borucki, William J.; Marcy, Geoffrey W.; Kolbl, Rea; Agol, Eric; Carter, Joshua A.; Torres, Guillermo; Ford, Eric B.; Gilliland, Ronald L.; Star, Kimberly M.; Steffen, Jason H.

    2014-01-01

    We extend the statistical analysis performed by Lissauer et al. in 2012, which demonstrates that the overwhelming majority of Kepler candidate multiple transiting systems (multis) represents true transiting planets, and we develop therefrom a procedure to validate large numbers of planet candidates in multis as bona fide exoplanets. We show that this statistical framework correctly estimates the abundance of false positives already identified around Kepler targets with multiple sets of transit-like signatures based on their abundance around targets with single sets of transit-like signatures. We estimate the number of multis that represent split systems of one or more planets orbiting each component of a binary star system. We use the high reliability rate for multis to validate more than one dozen particularly interesting multi-planet systems herein. Hundreds of additional multi-planet systems are validated in a companion paper by Rowe et al. We note that few very short period (P < 1.6 days) planets orbit within multiple transiting planet systems and discuss possible reasons for their absence. There also appears to be a shortage of planets with periods exceeding a few months in multis.

  18. Validation of Kepler's multiple planet candidates. II. Refined statistical framework and descriptions of systems of special interest

    Energy Technology Data Exchange (ETDEWEB)

    Lissauer, Jack J.; Bryson, Stephen T.; Rowe, Jason F.; Jontof-Hutter, Daniel; Borucki, William J. [NASA Ames Research Center, Moffett Field, CA 94035 (United States); Marcy, Geoffrey W.; Kolbl, Rea [Astronomy Department, University of California, Berkeley, CA 94720 (United States); Agol, Eric [Department of Astronomy, Box 351580, University of Washington, Seattle, WA 98195 (United States); Carter, Joshua A.; Torres, Guillermo [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ford, Eric B.; Gilliland, Ronald L.; Star, Kimberly M. [Department of Astronomy and Astrophysics, 525 Davey Laboratory, The Pennsylvania State University, University Park, PA 16802 (United States); Steffen, Jason H., E-mail: Jack.Lissauer@nasa.gov [Department of Physics and Astronomy/CIERA, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States)

    2014-03-20

    We extend the statistical analysis performed by Lissauer et al. in 2012, which demonstrates that the overwhelming majority of Kepler candidate multiple transiting systems (multis) represents true transiting planets, and we develop therefrom a procedure to validate large numbers of planet candidates in multis as bona fide exoplanets. We show that this statistical framework correctly estimates the abundance of false positives already identified around Kepler targets with multiple sets of transit-like signatures based on their abundance around targets with single sets of transit-like signatures. We estimate the number of multis that represent split systems of one or more planets orbiting each component of a binary star system. We use the high reliability rate for multis to validate more than one dozen particularly interesting multi-planet systems herein. Hundreds of additional multi-planet systems are validated in a companion paper by Rowe et al. We note that few very short period (P < 1.6 days) planets orbit within multiple transiting planet systems and discuss possible reasons for their absence. There also appears to be a shortage of planets with periods exceeding a few months in multis.

  19. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models

    NARCIS (Netherlands)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A.; van t Veld, Aart A.

    2012-01-01

    PURPOSE: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator

  20. Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival

    Directory of Open Access Journals (Sweden)

    Adam Kaplan

    2017-07-01

    Full Text Available Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of ‘omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multisource data, to the prediction of differing response types. We conduct simulations to illustrate the practical advantages and interpretability of our approach. As an application example, we consider predicting survival for patients with glioblastoma multiforme from 3 data sources measuring messenger RNA expression, microRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction and study its theoretical properties; this method is implemented in the R package R.JIVE on CRAN, in the function jive.predict.
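
    The jive.predict step, estimating scores for samples not used in the original decomposition, amounts to a least-squares projection onto fixed loadings. The sketch below is a simplification under stated assumptions: a plain SVD of the concatenated, centered blocks stands in for the full JIVE decomposition (which also separates individual structure), and all matrices (W1, W2, the ranks, the data) are synthetic illustrations rather than the R.JIVE implementation.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p1, p2, k = 100, 50, 40, 3      # samples, features per source, joint rank

# Toy two-source data sharing k joint components (stand-ins for, e.g.,
# mRNA and miRNA blocks); W1 and W2 are hypothetical loading matrices.
scores = rng.normal(size=(n, k))
W1, W2 = rng.normal(size=(p1, k)), rng.normal(size=(p2, k))
X = np.hstack([scores @ W1.T, scores @ W2.T]) + 0.1 * rng.normal(size=(n, p1 + p2))

# "Training": estimate joint loadings by SVD of the concatenated, centered
# sources (a simplification of JIVE, which also removes individual structure).
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
loadings = Vt[:k].T                 # (p1 + p2) x k

# jive.predict-style step: least-squares joint scores for an unseen sample.
s_new = rng.normal(size=k)
x_new = np.hstack([s_new @ W1.T, s_new @ W2.T])
est = np.linalg.lstsq(loadings, x_new - mean, rcond=None)[0]
print("estimated joint scores for the new sample:", est.round(2))
```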

  1. Probabilistic risk assessment framework for structural systems under multiple hazards using Bayesian statistics

    Energy Technology Data Exchange (ETDEWEB)

    Kwag, Shinyoung [North Carolina State University, Raleigh, NC 27695 (United States); Korea Atomic Energy Research Institute, Daejeon 305-353 (Korea, Republic of); Gupta, Abhinav, E-mail: agupta1@ncsu.edu [North Carolina State University, Raleigh, NC 27695 (United States)

    2017-04-15

    Highlights: • This study presents the development of a Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independently of the others. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for a basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as independent can be inappropriate in some cases, as certain hazards are statistically correlated or dependent. Examples of such correlated events include, but are not limited to, flooding-induced fire, seismically induced internal or external flooding, or even seismically induced fire. In current practice, system-level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering newly available data/information at any level, and provides a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.

  2. Probabilistic risk assessment framework for structural systems under multiple hazards using Bayesian statistics

    International Nuclear Information System (INIS)

    Kwag, Shinyoung; Gupta, Abhinav

    2017-01-01

    Highlights: • This study presents the development of a Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independently of the others. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for a basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as independent can be inappropriate in some cases, as certain hazards are statistically correlated or dependent. Examples of such correlated events include, but are not limited to, flooding-induced fire, seismically induced internal or external flooding, or even seismically induced fire. In current practice, system-level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering newly available data/information at any level, and provides a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.
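
    To make the fault-tree-to-Bayesian-network mapping concrete, here is a minimal sketch of exact inference by brute-force enumeration on a three-node network. The events, dependency structure, and probabilities (a seismically induced flooding scenario) are hypothetical illustrations, not values from the paper; a real application would use a dedicated BN library and much larger networks.

```python
import itertools

# Hypothetical network mapped from a fault tree: Earthquake (E) influences
# Flood (F), and SystemFailure (S) depends on both, so the hazards are
# statistically dependent rather than treated separately.
p_e = 0.01                                   # P(earthquake)
p_f_given_e = {True: 0.30, False: 0.02}      # seismically induced flooding
p_s_given = {(True, True): 0.90, (True, False): 0.40,
             (False, True): 0.25, (False, False): 0.001}

def joint(e, f, s):
    """Joint probability P(E=e, F=f, S=s) via the chain rule."""
    pe = p_e if e else 1 - p_e
    pf = p_f_given_e[e] if f else 1 - p_f_given_e[e]
    ps = p_s_given[(e, f)] if s else 1 - p_s_given[(e, f)]
    return pe * pf * ps

# Risk metric: marginal P(system failure), summing over the hidden states.
p_fail = sum(joint(e, f, True)
             for e, f in itertools.product([True, False], repeat=2))

# Bayesian updating: P(earthquake | observed system failure).
p_e_given_fail = sum(joint(True, f, True) for f in [True, False]) / p_fail

print(f"P(failure)              = {p_fail:.5f}")
print(f"P(earthquake | failure) = {p_e_given_fail:.5f}")
```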

  3. A Mixed-Methods Study Investigating the Relationship between Media Multitasking Orientation and Grade Point Average

    Science.gov (United States)

    Lee, Jennifer

    2012-01-01

    The intent of this study was to examine the relationship between media multitasking orientation and grade point average. The study utilized a mixed-methods approach to investigate the research questions. In the quantitative section of the study, the primary method of statistical analyses was multiple regression. The independent variables for the…

  4. No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.

    Science.gov (United States)

    Li, Xuelong; Guo, Qun; Lu, Xiaoqiang

    2016-05-13

    It is an important task to design models for universal no-reference video quality assessment (NR-VQA) in multiple video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often not known in practical applications. A further deficiency is that the spatial and temporal information of videos is hardly considered simultaneously. In this paper, we propose a new NR-VQA metric based on spatiotemporal natural video statistics (NVS) in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted based on the statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are then used to predict the perceived video quality via an efficient linear support vector regression (SVR) model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust across different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with state-of-the-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.
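
    A minimal sketch of such a pipeline (not the authors' exact feature set) might compute simple statistics of 3D-DCT coefficients over space-time blocks and feed them to a linear SVR. The block size, the two per-block statistics, and the synthetic clips and scores below are all illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn
from sklearn.svm import SVR

def block_features(video, bs=4):
    """Crude spatiotemporal NVS features: statistics of 3D-DCT coefficients
    over non-overlapping bs x bs x bs space-time blocks (illustrative only)."""
    t, h, w = (d - d % bs for d in video.shape)
    feats = []
    for z in range(0, t, bs):
        for y in range(0, h, bs):
            for x in range(0, w, bs):
                c = dctn(video[z:z+bs, y:y+bs, x:x+bs], norm="ortho").ravel()[1:]
                feats.append([np.mean(np.abs(c)), np.std(c)])  # drop the DC term
    f = np.asarray(feats)
    return np.r_[f.mean(axis=0), f.std(axis=0)]   # pooled 4-D feature vector

rng = np.random.default_rng(0)
# Hypothetical training set: 40 short clips with known subjective scores.
X = np.array([block_features(rng.normal(size=(8, 32, 32))) for _ in range(40)])
y = rng.uniform(0, 100, size=40)          # stand-in quality scores

model = SVR(kernel="linear").fit(X, y)    # linear SVR maps features to quality
print("predicted quality:", model.predict(X[:3]).round(1))
```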

  5. Recent advances in statistical energy analysis

    Science.gov (United States)

    Heron, K. H.

    1992-01-01

    Statistical Energy Analysis (SEA) has traditionally been developed using a modal summation and averaging approach, which has led to the need for many restrictive SEA assumptions. The assumption of 'weak coupling' is particularly unacceptable when attempts are made to apply SEA to structural coupling. It is now believed that this assumption is more a consequence of the modal formulation than a necessary ingredient of SEA. The present analysis ignores this restriction and describes a wave approach to the calculation of plate-plate coupling loss factors. Predictions based on this method are compared with results obtained from experiments using point excitation on one side of an irregular six-sided box structure. The conclusions show that the use and calculation of infinite transmission coefficients is the way forward for the development of a purely predictive SEA code.

  6. New England observed and predicted August stream/river temperature maximum positive daily rate of change points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted August stream/river temperature maximum positive daily rate of change in New England based on a...

  7. New England observed and predicted July stream/river temperature maximum positive daily rate of change points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted July stream/river temperature maximum positive daily rate of change in New England based on a...

  8. New England observed and predicted July maximum negative stream/river temperature daily rate of change points

    Data.gov (United States)

    U.S. Environmental Protection Agency — The shapefile contains points with associated observed and predicted July stream/river temperature maximum negative daily rate of change in New England based on a...

  9. A PowerPoint®-based guide to assist in choosing the suitable statistical test

    Directory of Open Access Journals (Sweden)

    David Normando

    2010-02-01

    Full Text Available Selecting appropriate methods for statistical analysis may seem complex, especially to graduate students and researchers at the beginning of their scientific careers. On the other hand, PowerPoint presentations are a common tool for students and researchers, so a biostatistics tutorial developed as a PowerPoint presentation could narrow the distance between orthodontists and Biostatistics. This guide provides useful and objective information about several statistical methods, using examples related to dentistry and, more specifically, to orthodontics. The tutorial is intended mainly to help the user answer common questions, such as which test is most appropriate for comparing groups, examining correlations and regressions, or analyzing the error of a method. It also helps with checking the distribution of the data (normal or non-normal) and choosing the most suitable graph for presenting the results. The guide can likewise be of considerable use to journal reviewers for quickly examining the adequacy of the statistical methods presented in a submitted article.

  10. Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

    International Nuclear Information System (INIS)

    Yahya, Noorazrul; Ebert, Martin A.; Bulsara, Max; House, Michael J.; Kennedy, Angel; Joseph, David J.; Denham, James W.

    2016-01-01

    Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2, and longitudinal), with event rates between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized for endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance, with the sample size chosen to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1, with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC > 0.6, while all haematuria endpoints and longitudinal incontinence models produced AUROC < 0.6. Conclusions

  11. Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

    Energy Technology Data Exchange (ETDEWEB)

    Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au [School of Physics, University of Western Australia, Western Australia 6009, Australia and School of Health Sciences, National University of Malaysia, Bangi 43600 (Malaysia); Ebert, Martin A. [School of Physics, University of Western Australia, Western Australia 6009, Australia and Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Bulsara, Max [Institute for Health Research, University of Notre Dame, Fremantle, Western Australia 6959 (Australia); House, Michael J. [School of Physics, University of Western Australia, Western Australia 6009 (Australia); Kennedy, Angel [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008 (Australia); Joseph, David J. [Department of Radiation Oncology, Sir Charles Gairdner Hospital, Western Australia 6008, Australia and School of Surgery, University of Western Australia, Western Australia 6009 (Australia); Denham, James W. [School of Medicine and Public Health, University of Newcastle, New South Wales 2308 (Australia)

    2016-05-15

    Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2, and longitudinal), with event rates between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized for endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance, with the sample size chosen to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1, with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC > 0.6, while all haematuria endpoints and longitudinal incontinence models produced AUROC < 0.6. Conclusions
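
    A cross-validated AUROC comparison of this kind can be sketched with scikit-learn. The snippet below is illustrative only: the data are synthetic stand-ins for dose-surface and clinical features (not RADAR data), MARS is omitted because scikit-learn has no built-in implementation, and the oversampling step is left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for one toxicity endpoint.
X, y = make_classification(n_samples=754, n_features=30, weights=[0.8],
                           random_state=0)

models = {
    "logistic":      make_pipeline(StandardScaler(),
                                   LogisticRegression(max_iter=1000)),
    "elastic-net":   make_pipeline(StandardScaler(), LogisticRegression(
        penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000)),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "SVM":           make_pipeline(StandardScaler(), SVC(probability=True)),
    "neural net":    make_pipeline(StandardScaler(),
                                   MLPClassifier(max_iter=2000, random_state=0)),
}

for name, clf in models.items():
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name:13s} AUROC = {auc.mean():.3f} +/- {auc.std():.3f}")
```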

  12. A Weibull statistics-based lignocellulose saccharification model and a built-in parameter accurately predict lignocellulose hydrolysis performance.

    Science.gov (United States)

    Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu

    2015-09-01

    Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes, and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and of experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and by analysis of the glucose production levels when the λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior, and the λ parameter can be used to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment are discussed.
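
    The fitting step can be sketched with the Weibull-type saccharification curve y(t) = y_max * (1 - exp(-(t/λ)^n)), consistent with the abstract's λ (characteristic time) and n. The time course below is invented for illustration, and scipy's curve_fit stands in for whatever fitting procedure the authors used.

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull_yield(t, y_max, lam, n):
    """Weibull-type saccharification curve: y_max * (1 - exp(-(t/lam)**n)).
    lam is the characteristic time; n shapes the curve."""
    return y_max * (1.0 - np.exp(-(t / lam) ** n))

# Hypothetical glucose-yield time course (hours, fraction of theoretical max).
t = np.array([2, 4, 8, 12, 24, 48, 72], dtype=float)
y = np.array([0.10, 0.18, 0.31, 0.40, 0.55, 0.68, 0.73])

popt, _ = curve_fit(weibull_yield, t, y, p0=[0.8, 20.0, 0.8])
y_max, lam, n = popt
print(f"y_max = {y_max:.3f}, lambda = {lam:.1f} h, n = {n:.2f}")
# A smaller lambda means faster overall saccharification, so lambda can be
# used to rank the performance of different enzyme/substrate systems.
```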

  13. Model for predicting the injury severity score.

    Science.gov (United States)

    Hagiwara, Shuichi; Oshima, Kiyohiro; Murata, Masato; Kaneko, Minoru; Aoki, Makoto; Kanbe, Masahiko; Nakamura, Takuro; Ohyama, Yoshio; Tamura, Jun'ichi

    2015-07-01

    To determine a formula that predicts the injury severity score from parameters obtained in the emergency department on arrival, we reviewed the medical records of trauma patients who were transferred to the emergency department of Gunma University Hospital between January 2010 and December 2010. The injury severity score, age, mean blood pressure, heart rate, Glasgow coma scale, hemoglobin, hematocrit, red blood cell count, platelet count, fibrinogen, international normalized ratio of prothrombin time, activated partial thromboplastin time, and fibrin degradation products were examined in those patients on arrival. To determine the formula that predicts the injury severity score, multiple linear regression analysis was carried out. The injury severity score was set as the dependent variable, and the other parameters were set as candidate objective variables. IBM SPSS Statistics 20 was used for the statistical analysis. Statistical significance was set at P < 0.05. The Durbin-Watson ratio was 2.200. A formula for predicting the injury severity score in trauma patients was developed with ordinary parameters such as fibrin degradation products and mean blood pressure. This formula is useful because we can predict the injury severity score easily in the emergency department.
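
    The regression itself can be sketched with statsmodels on synthetic data. The two predictors mirror the ones the abstract highlights (fibrin degradation products and mean blood pressure), but every number below is invented; this is not the authors' formula.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 120
# Hypothetical emergency-department arrival data (not the Gunma cohort).
fdp = rng.gamma(2.0, 20.0, n)        # fibrin degradation products (ug/mL)
mbp = rng.normal(85.0, 15.0, n)      # mean blood pressure (mmHg)
iss = 5 + 0.15 * fdp - 0.10 * mbp + rng.normal(0, 4, n)   # simulated ISS

X = sm.add_constant(np.column_stack([fdp, mbp]))
fit = sm.OLS(iss, X).fit()

print(fit.params.round(3))           # intercept + coefficients = the formula
print(f"R^2 = {fit.rsquared:.3f}, Durbin-Watson = {durbin_watson(fit.resid):.3f}")
```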

  14. Bridges between multiple-point geostatistics and texture synthesis: Review and guidelines for future research

    Science.gov (United States)

    Mariethoz, Gregoire; Lefebvre, Sylvain

    2014-05-01

    Multiple-Point Simulations (MPS) is a family of geostatistical tools that has received a lot of attention in recent years for the characterization of spatial phenomena in geosciences. It relies on the definition of training images to represent a given type of spatial variability, or texture. We show that the algorithmic tools used are similar in many ways to techniques developed in computer graphics, where there is a need to generate large amounts of realistic textures for applications such as video games and animated movies. Similarly to MPS, these texture synthesis methods use training images, or exemplars, to generate realistic-looking graphical textures. Both domains of multiple-point geostatistics and example-based texture synthesis present similarities in their historic development and share similar concepts. These disciplines have however remained separate, and as a result significant algorithmic innovations in each discipline have not been universally adopted. Texture synthesis algorithms offer drastically better computational efficiency, pattern reproduction, and user control. At the same time, MPS has developed ways to condition models to spatial data and to produce 3D stochastic realizations, which have not been thoroughly investigated in the field of texture synthesis. In this paper we review the possible links between these disciplines and show the potential and limitations of using concepts and approaches from texture synthesis in MPS. We also provide guidelines on how recent developments could benefit both fields of research, and what challenges remain open.

  15. Predicting objective function weights from patient anatomy in prostate IMRT treatment planning

    International Nuclear Information System (INIS)

    Lee, Taewoo; Hammad, Muhannad; Chan, Timothy C. Y.; Craig, Tim; Sharpe, Michael B.

    2013-01-01

    Purpose: Intensity-modulated radiation therapy (IMRT) treatment planning typically combines multiple criteria into a single objective function by taking a weighted sum. The authors propose a statistical model that predicts objective function weights from patient anatomy for prostate IMRT treatment planning. This study provides a proof of concept for geometry-driven weight determination. Methods: A previously developed inverse optimization method (IOM) was used to generate optimal objective function weights for 24 patients using their historical treatment plans (i.e., dose distributions). These IOM weights were around 1% for each of the femoral heads, while bladder and rectum weights varied greatly between patients. A regression model was developed to predict a patient's rectum weight using the ratio of the overlap volume of the rectum and bladder with the planning target volume at a 1 cm expansion as the independent variable. The femoral head weights were fixed to 1% each and the bladder weight was calculated as one minus the rectum and femoral head weights. The model was validated using leave-one-out cross validation. Objective values and dose distributions generated through inverse planning using the predicted weights were compared to those generated using the original IOM weights, as well as an average of the IOM weights across all patients. Results: The IOM weight vectors were on average six times closer to the predicted weight vectors than to the average weight vector, using the l2 distance. Likewise, the bladder and rectum objective values achieved by the predicted weights were more similar to the objective values achieved by the IOM weights. The difference in objective value performance between the predicted and average weights was statistically significant according to a one-sided sign test. For all patients, the difference in rectum V54.3 Gy, rectum V70.0 Gy, bladder V54.3 Gy, and bladder V70.0 Gy values between the dose distributions generated by the
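
    The weight-assignment scheme described (femoral-head weights fixed at 1%, the rectum weight regressed on the overlap-volume ratio, and the bladder weight as the remainder) can be sketched in a few lines. All training values below are hypothetical placeholders, not the fitted model from the paper.

```python
import numpy as np

# Hypothetical training data: overlap-volume ratio (rectum vs. bladder with
# the PTV at a 1 cm expansion) and IOM-derived rectum weights for past plans.
ratio = np.array([0.4, 0.6, 0.8, 1.0, 1.3, 1.7])
rectum_w = np.array([0.30, 0.38, 0.45, 0.52, 0.60, 0.70])

slope, intercept = np.polyfit(ratio, rectum_w, 1)   # simple linear regression

def predict_weights(overlap_ratio, femoral_w=0.01):
    """Objective-function weights for a new patient from anatomy alone."""
    rectum = float(np.clip(slope * overlap_ratio + intercept, 0.0, 0.98))
    bladder = 1.0 - rectum - 2 * femoral_w          # weights sum to one
    return {"rectum": rectum, "bladder": bladder,
            "femoral_left": femoral_w, "femoral_right": femoral_w}

print(predict_weights(0.9))
```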

  16. Curvelet-domain multiple matching method combined with cubic B-spline function

    Science.gov (United States)

    Wang, Tong; Wang, Deli; Tian, Mi; Hu, Bin; Liu, Chengming

    2018-05-01

    Since the large number of surface-related multiples present in marine data can seriously influence the results of data processing and interpretation, many researchers have attempted to develop effective methods to remove them. The most successful surface-related multiple elimination methods are based on data-driven theory. However, the elimination effect can be unsatisfactory due to the existence of amplitude and phase errors. Although the subsequent curvelet-domain multiple-primary separation method achieved better results, poor computational efficiency prevented its application. In this paper, we adopt the cubic B-spline function to improve the traditional curvelet multiple matching method. First, select a small number of unknowns as the basis points of the matching coefficient; second, apply the cubic B-spline function to these basis points to reconstruct the matching array; third, build the constrained solving equation based on the relationships between the predicted multiples, the matching coefficients, and the actual data; finally, use the BFGS algorithm to iterate and realize the fast solution of the sparse-constrained multiple matching algorithm. Moreover, a soft-threshold method is used to further improve performance. With the cubic B-spline function, the differences between the predicted multiples and the original data diminish, which results in less processing time to obtain optimal solutions and fewer iterative loops in the solving procedure based on the L1 norm constraint. Applications to synthetic and field data both validate the practicability and validity of the method.
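
    Steps two through four can be sketched as follows: a cubic spline interpolates the matching coefficients from a few basis points, and BFGS minimizes a smoothed L1 misfit (|r| approximated by sqrt(r^2 + eps), keeping the objective differentiable). scipy's interpolating CubicSpline stands in for the cubic B-spline basis, the soft-threshold refinement is omitted, and all traces are synthetic.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize

rng = np.random.default_rng(2)
ns = 512                                  # samples per trace
true_match = 0.8 + 0.3 * np.sin(np.linspace(0, 3, ns))   # slowly varying error
multiple_pred = rng.normal(size=ns)       # predicted surface-related multiple
data = true_match * multiple_pred + 0.05 * rng.normal(size=ns)  # "actual" data

basis_x = np.linspace(0, ns - 1, 8)       # few basis points -> few unknowns
x_full = np.arange(ns)

def objective(c, eps=1e-3):
    """Smoothed L1 misfit between data and the spline-matched multiple."""
    match = CubicSpline(basis_x, c)(x_full)   # reconstruct the matching array
    r = data - match * multiple_pred
    return np.sum(np.sqrt(r * r + eps))       # differentiable |r| proxy

res = minimize(objective, x0=np.ones(len(basis_x)), method="BFGS")
print("matching coefficients at the basis points:", res.x.round(2))
```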

  17. Learning Combinations of Multiple Feature Representations for Music Emotion Prediction

    DEFF Research Database (Denmark)

    Madsen, Jens; Jensen, Bjørn Sand; Larsen, Jan

    2015-01-01

    Music consists of several structures and patterns evolving through time which greatly influences the human decoding of higher-level cognitive aspects of music like the emotions expressed in music. For tasks, such as genre, tag and emotion recognition, these structures have often been identified...... and used as individual and non-temporal features and representations. In this work, we address the hypothesis whether using multiple temporal and non-temporal representations of different features is beneficial for modeling music structure with the aim to predict the emotions expressed in music. We test...

  18. A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. 1: Theoretical development and application to yearly predictions for selected cities in the United States

    Science.gov (United States)

    Manning, Robert M.

    1986-01-01

    A rain attenuation prediction model is described for use in calculating satellite communication link availability for any specific location in the world that is characterized by an extended record of rainfall. Such a formalism is necessary for the accurate assessment of such availability predictions in the case of the small user-terminal concept of the Advanced Communication Technology Satellite (ACTS) Project. The model employs the theory of extreme value statistics to generate the necessary statistical rainrate parameters from rain data in the form compiled by the National Weather Service. These location dependent rain statistics are then applied to a rain attenuation model to obtain a yearly prediction of the occurrence of attenuation on any satellite link at that location. The predictions of this model are compared to those of the Crane Two-Component Rain Model and some empirical data and found to be very good. The model is then used to calculate rain attenuation statistics at 59 locations in the United States (including Alaska and Hawaii) for the 20 GHz downlinks and 30 GHz uplinks of the proposed ACTS system. The flexibility of this modeling formalism is such that it allows a complete and unified treatment of the temporal aspects of rain attenuation that leads to the design of an optimum stochastic power control algorithm, the purpose of which is to efficiently counter such rain fades on a satellite link.
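
    As a rough illustration of the extreme-value step, the sketch below fits a Gumbel distribution to synthetic annual-maximum rain rates and converts a design rain rate into a fade margin through a power-law specific attenuation of the form gamma = k * R**alpha. The coefficients k and alpha and the path length are placeholders, not ITU-R or ACTS values, and the paper's formalism is considerably richer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Synthetic 30-year record of annual-maximum rain rates (mm/h).
annual_max_rainrate = stats.gumbel_r.rvs(loc=40, scale=12, size=30,
                                         random_state=rng)

loc, scale = stats.gumbel_r.fit(annual_max_rainrate)

# Rain rate exceeded with probability 0.1% per year (illustrative percentile).
r_design = stats.gumbel_r.ppf(0.999, loc=loc, scale=scale)

# Power-law specific attenuation over an effective path (placeholder values).
k, alpha, path_km = 0.1, 1.0, 5.0
attenuation_db = k * r_design ** alpha * path_km
print(f"design rain rate = {r_design:.1f} mm/h, "
      f"fade margin ~ {attenuation_db:.1f} dB")
```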

  19. Performance Analysis of a Threshold-Based Parallel Multiple Beam Selection Scheme for WDM FSO Systems

    KAUST Repository

    Nam, Sung Sik; Alouini, Mohamed-Slim; Ko, Young-Chai

    2018-01-01

    In this paper, we statistically analyze the performance of a threshold-based parallel multiple beam selection scheme for a free-space optical (FSO) based system with wavelength division multiplexing (WDM) in cases where a pointing error has occurred

  20. Assisting People with Multiple Disabilities by Improving Their Computer Pointing Efficiency with an Automatic Target Acquisition Program

    Science.gov (United States)

    Shih, Ching-Hsiang; Shih, Ching-Tien; Peng, Chin-Ling

    2011-01-01

    This study evaluated whether two people with multiple disabilities would be able to improve their pointing performance through an Automatic Target Acquisition Program (ATAP) and a newly developed mouse driver (i.e. a new mouse driver replaces standard mouse driver, and is able to monitor mouse movement and intercept click action). Initially, both…

  1. Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Charles Frank

    2018-03-01

    Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings. The analysis of this study is divided into two parts: In part 1, we use one-way ANOVA analysis with the SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers, which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use precision, recall, F-measure, and accuracy. The results show that the logistic regression algorithm outperformed the four other algorithms with precision, recall, F-measure, and accuracy of 83%, 83.4%, 83.2%, and 83.44%, respectively.
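
    The part-2 comparison can be sketched with scikit-learn stand-ins: a CART decision tree approximates J48, Decision Table has no direct scikit-learn analog and is omitted, and the data are synthetic rather than patient records.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for blood-test and vital-sign features.
X, y = make_classification(n_samples=1000, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"Naive Bayes":     GaussianNB(),
          "MLP":             MLPClassifier(max_iter=2000, random_state=0),
          "Logistic":        LogisticRegression(max_iter=1000),
          "Tree (J48-like)": DecisionTreeClassifier(random_state=0)}

for name, clf in models.items():
    p = clf.fit(X_tr, y_tr).predict(X_te)
    print(f"{name:15s} P={precision_score(y_te, p):.3f} "
          f"R={recall_score(y_te, p):.3f} F1={f1_score(y_te, p):.3f} "
          f"Acc={accuracy_score(y_te, p):.3f}")
```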

  2. On the Transfer of a Number of Concepts of Statistical Radiophysics to the Theory of One-dimensional Point Mappings

    Directory of Open Access Journals (Sweden)

    Agalar M. Agalarov

    2018-01-01

    Full Text Available In this article, the possibility of using a bispectrum in the investigation of regular and chaotic behaviour of one-dimensional point mappings is discussed. The effectiveness of transferring this concept to nonlinear dynamics is demonstrated with the example of the Feigenbaum mapping. The application of the Kullback-Leibler entropy to the theory of point mappings is also considered. It has been shown that this information-like quantity is able to describe the behaviour of statistical ensembles of one-dimensional mappings, and some general properties of its behaviour were established within this framework. The constructiveness of the Kullback-Leibler entropy in the theory of point mappings is shown by means of its direct calculation for the "saw tooth" mapping with a linear initial probability density. Moreover, for this mapping the denumerable set of initial probability densities that reach its stationary probability density after a finite number of steps is pointed out.
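
    The relaxation described can be illustrated numerically: evolve an ensemble drawn from the linear initial density p0(x) = 2x under the "saw tooth" (Bernoulli doubling) map and track the Kullback-Leibler divergence of a histogram estimate from the uniform stationary density. The specific map and the bin count are assumptions made for illustration; the article's treatment is analytic.

```python
import numpy as np

rng = np.random.default_rng(4)

# Ensemble drawn from the linear initial density p0(x) = 2x on [0, 1):
# inverse-CDF sampling, since P(X <= x) = x**2.
x = np.sqrt(rng.uniform(size=200_000))

bins = np.linspace(0.0, 1.0, 51)
uniform_p = np.full(len(bins) - 1, 1.0 / (len(bins) - 1))  # stationary density

def kl_to_uniform(samples):
    """Kullback-Leibler divergence D(p_n || p_stationary) from a histogram."""
    p, _ = np.histogram(samples, bins=bins)
    p = p / p.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / uniform_p[mask])))

for step in range(6):
    print(f"step {step}: D_KL = {kl_to_uniform(x):.4f}")
    x = (2.0 * x) % 1.0          # "saw tooth" (Bernoulli shift) mapping

# D_KL decays toward 0 as the ensemble relaxes to the uniform stationary density.
```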

  3. Statistical approaches to assessing single and multiple outcome measures in dry eye therapy and diagnosis.

    Science.gov (United States)

    Tomlinson, Alan; Hair, Mario; McFadyen, Angus

    2013-10-01

    Dry eye is a multifactorial disease which would require a broad spectrum of test measures in the monitoring of its treatment and diagnosis. However, studies have typically reported improvements in individual measures with treatment. Alternative approaches involve multiple, combined outcomes being assessed by different statistical analyses. In order to assess the effect of various statistical approaches to the use of single and combined test measures in dry eye, this review reanalyzed measures from two previous studies (osmolarity, evaporation, tear turnover rate, and lipid film quality). These analyses assessed the measures as single variables within groups, pre- and post-intervention with a lubricant supplement, by creating combinations of these variables and by validating these combinations with the combined sample of data from all groups of dry eye subjects. The effectiveness of single measures and combinations in diagnosis of dry eye was also considered. Copyright © 2013. Published by Elsevier Inc.

  4. Improved statistical signal detection in pharmacovigilance by combining multiple strength-of-evidence aspects in vigiRank.

    Science.gov (United States)

    Caster, Ola; Juhlin, Kristina; Watson, Sarah; Norén, G Niklas

    2014-08-01

    Detection of unknown risks with marketed medicines is key to securing the optimal care of individual patients and to reducing the societal burden from adverse drug reactions. Large collections of individual case reports remain the primary source of information and require effective analytics to guide clinical assessors towards likely drug safety signals. Disproportionality analysis is based solely on aggregate numbers of reports and naively disregards report quality and content. However, these latter features are the very foundation of the ensuing clinical assessment. Our objective was to develop and evaluate a data-driven screening algorithm for emerging drug safety signals that accounts for report quality and content. vigiRank is a predictive model for emerging safety signals, here implemented with shrinkage logistic regression to identify predictive variables and estimate their respective contributions. The variables considered for inclusion capture different aspects of strength of evidence, including quality and clinical content of individual reports, as well as trends in time and geographic spread. A reference set of 264 positive controls (historical safety signals from 2003 to 2007) and 5,280 negative controls (pairs of drugs and adverse events not listed in the Summary of Product Characteristics of that drug in 2012) was used for model fitting and evaluation; the latter used fivefold cross-validation to protect against over-fitting. All analyses were performed on a reconstructed version of VigiBase(®) as of 31 December 2004, at around which time most safety signals in our reference set were emerging. The following aspects of strength of evidence were selected for inclusion into vigiRank: the numbers of informative and recent reports, respectively; disproportional reporting; the number of reports with free-text descriptions of the case; and the geographic spread of reporting. vigiRank offered a statistically significant improvement in area under the receiver
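
    A shrinkage logistic regression of this kind can be sketched with an L1 penalty, which shrinks uninformative coefficients toward zero. The feature names mirror the aspects listed above, but all data, effect sizes, and the penalty strength are synthetic assumptions, not vigiRank's fitted model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n = 5000
# Hypothetical per-(drug, event)-pair strength-of-evidence features.
X = np.column_stack([
    rng.poisson(3, n),        # number of informative reports
    rng.poisson(1, n),        # number of recent reports
    rng.normal(0, 1, n),      # disproportionality measure
    rng.poisson(1, n),        # reports with free-text case narratives
    rng.integers(1, 10, n),   # number of reporting countries
])
logit = -4 + 0.5 * X[:, 0] + 0.8 * X[:, 2] + 0.3 * X[:, 4]
y = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))   # 1 = historical signal

Xs = StandardScaler().fit_transform(X)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(Xs, y)
print("shrunken coefficients:", model.coef_.round(2))  # some shrink to ~0
score = model.decision_function(Xs)                    # ranking score
print("top-ranked pair index:", int(np.argmax(score)))
```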

  5. Higher moments of net kaon multiplicity distributions at RHIC energies for the search of QCD Critical Point at STAR

    Directory of Open Access Journals (Sweden)

    Sarkar Amal

    2013-11-01

    Full Text Available In this paper we report measurements of the various moments, namely the mean (M), standard deviation (σ), skewness (S), and kurtosis (κ), of the net-kaon multiplicity distribution at midrapidity from Au+Au collisions at √sNN = 7.7 to 200 GeV in the STAR experiment at RHIC, in an effort to locate the critical point in the QCD phase diagram. These moments and their products are related to the thermodynamic susceptibilities of conserved quantities such as net baryon number, net charge, and net strangeness, as well as to the correlation length of the system. Non-monotonic behavior of these variables would indicate the presence of a critical point. In this work we also present the moment products Sσ and κσ2 of the net-kaon multiplicity distribution as a function of collision centrality and energy. The energy and centrality dependence of the higher moments of net kaons and their products have been compared with the Poisson expectation and with simulations from AMPT, which does not include a critical point. From the measurements at all seven available beam energies, we find no evidence for a critical point in the QCD phase diagram for √sNN below 200 GeV.
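
    The moment analysis is straightforward to reproduce on toy data. The sketch below draws independent Poisson K+ and K- yields per event (so the net-kaon number follows a Skellam distribution, the baseline expected without a critical point) and computes M, σ, S, κ and the products Sσ and κσ²; the mean yields are arbitrary illustrative values.

```python
import numpy as np
from scipy.stats import kurtosis, skew

rng = np.random.default_rng(6)
# Toy event sample: net-kaon number per event = N(K+) - N(K-), with
# independent Poisson yields (the no-critical-point baseline).
a, b = 5.0, 4.0                      # arbitrary illustrative mean yields
n_kp = rng.poisson(a, 1_000_000)
n_km = rng.poisson(b, 1_000_000)
net_k = n_kp - n_km                  # Skellam-distributed

M, sigma = net_k.mean(), net_k.std()
S, kappa = skew(net_k), kurtosis(net_k)   # kurtosis() gives excess kurtosis

print(f"M = {M:.3f}, sigma = {sigma:.3f}, S = {S:.4f}, kappa = {kappa:.4f}")
print(f"S*sigma     = {S * sigma:.4f}  "
      f"(Skellam baseline: (a-b)/(a+b) = {(a - b) / (a + b):.4f})")
print(f"kappa*sig^2 = {kappa * sigma**2:.4f}  (Skellam baseline: 1)")
```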

  6. Psychometrics of Multiple Choice Questions with Non-Functioning Distracters: Implications to Medical Education.

    Science.gov (United States)

    Deepak, Kishore K; Al-Umran, Khalid Umran; AI-Sheikh, Mona H; Dkoli, B V; Al-Rubaish, Abdullah

    2015-01-01

    The functionality of distracters in a multiple choice question plays a very important role. We examined the frequency and impact of functioning and non-functioning distracters on the psychometric properties of 5-option items in clinical disciplines. We analyzed item statistics of 1115 multiple choice questions from 15 summative assessments of undergraduate medical students and classified the items into five groups by their number of non-functioning distracters. We analyzed the effect of varying degrees of non-functionality, ranging from 0 to 4 non-functioning distracters, on test reliability, difficulty index, discrimination index, and point biserial correlation. The non-functionality of distracters inversely affected test reliability and item quality in a predictable manner. The non-functioning distracters made the items easier and lowered the discrimination index significantly. Three non-functional distracters in a 5-option MCQ significantly affected all psychometric properties (p < 0.05), whereas items with up to two non-functioning distracters were psychometrically as effective as 5-option items. Our study reveals that a multiple choice question with 3 functional options provides the lowermost limit of an item format that retains adequate psychometric properties. Tests containing items with fewer functioning options have significantly lower reliability. Distracter function analysis and revision of non-functioning distracters can serve as important methods to improve the psychometrics and reliability of assessment.

  7. Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models.

    Science.gov (United States)

    Mahony, Shaun; McInerney, James O; Smith, Terry J; Golden, Aaron

    2004-03-05

    Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. This work explores a new approach to gene prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. While its raw accuracy rate can be less than other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
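
    Relative synonymous codon usage, RescueNet's indicator of coding potential, is simple to compute: each codon's count is divided by the count expected if all synonyms for that amino acid were used equally. The sketch below covers two illustrative codon families only, and the SOM clustering itself is not reproduced.

```python
from collections import Counter

# Codons coding the same amino acid (two illustrative families shown;
# a full table would cover all 61 sense codons).
SYNONYM_FAMILIES = {
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def rscu(sequence):
    """Relative synonymous codon usage: observed codon count divided by the
    count expected if all synonyms of that amino acid were used equally."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYM_FAMILIES.values():
        total = sum(counts[c] for c in family)
        expected = total / len(family) if total else 0.0
        for c in family:
            values[c] = counts[c] / expected if expected else 0.0
    return values

orf = "TTTTTCTTTGGTGGTGGAGGG"   # toy open reading frame
print(rscu(orf))   # per-codon feature vector a SOM could then cluster
```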

  8. Multivariate Regression Analysis and Statistical Modeling for Summer Extreme Precipitation over the Yangtze River Basin, China

    Directory of Open Access Journals (Sweden)

    Tao Gao

    2014-01-01

    Full Text Available Extreme precipitation is likely to be one of the most severe meteorological disasters in China; however, studies of the physical factors affecting precipitation extremes, and the corresponding prediction models, are not readily available. From a new point of view, the sensible heat flux (SHF) and latent heat flux (LHF), which have significant impacts on summer extreme rainfall in the Yangtze River basin (YRB), are quantified, and selections of the impact factors are conducted. First, a regional extreme precipitation index was applied to determine Regions of Significant Correlation (RSC) by analyzing the spatial distribution of correlation coefficients between this index and the SHF, LHF, and sea surface temperature (SST) on the global ocean scale; the time series of SHF, LHF, and SST in the RSCs during 1967–2010 were then selected. Furthermore, other factors that significantly affect variations in precipitation extremes over the YRB were also selected. The methods of multiple stepwise regression and leave-one-out cross-validation (LOOCV) were utilized to analyze and test the influencing factors and the statistical prediction model. The correlation coefficient between the observed regional extreme index and the model simulation result is 0.85, significant at the 99% level. This suggests that the forecast skill is acceptable, although many aspects of the prediction model should be improved.
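
    The leave-one-out cross-validation step can be sketched as follows. Plain least squares stands in for the stepwise-selected regression, and the predictors are random stand-ins for the standardized SHF/LHF/SST series rather than the actual RSC time series.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(7)
years = 44                                    # 1967-2010
# Hypothetical standardized predictors: SHF, LHF, and SST averaged over the
# regions of significant correlation, plus one extra circulation index.
X = rng.normal(size=(years, 4))
extreme_index = X @ np.array([0.6, 0.4, 0.5, 0.2]) + rng.normal(0, 0.5, years)

pred = np.empty(years)
for train, test in LeaveOneOut().split(X):    # one held-out year per fold
    model = LinearRegression().fit(X[train], extreme_index[train])
    pred[test] = model.predict(X[test])

r = np.corrcoef(extreme_index, pred)[0, 1]
print(f"LOOCV correlation between observed and predicted index: r = {r:.2f}")
```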

  9. Prediction of boiling points of some organic compounds to be used in volume reduction of liquid radioactive wastes

    International Nuclear Information System (INIS)

    Helal, N.L.; Ezz el-Din, M.R.

    2004-01-01

    Boiling point determination may help in the evaporation process used to solidify high-level liquid wastes and to reduce the volume of wastes that require disposal. A problem that is always encountered is how to choose an appropriate method to determine the boiling points of the liquid wastes. This work aims to use mathematical descriptors and their applications in predicting the boiling points essential for the evaporation process. The approach was applied to a diverse database of two sets of chemicals that may exist in radioactive wastes. The first set consisted of 59 alcohols and amines (group a) and the second of 11 aniline compounds (group b). The results show that the mathematical descriptors used give a reasonable predictive model for the diverse sets of molecules.

  10. Prediction of interior noise due to random acoustic or turbulent boundary layer excitation using statistical energy analysis

    Science.gov (United States)

    Grosveld, Ferdinand W.

    1990-01-01

    The feasibility of predicting interior noise due to random acoustic or turbulent boundary layer excitation was investigated in experiments in which a statistical energy analysis model (VAPEPS) was used to analyze measurements of the acceleration response and sound transmission of flat aluminum, lucite, and graphite/epoxy plates exposed to random acoustic or turbulent boundary layer excitation. The noise reduction of the plate, when backed by a shallow cavity and excited by a turbulent boundary layer, was predicted using a simplified theory based on the assumption of adiabatic compression of the fluid in the cavity. The predicted plate acceleration response was used as input in the noise reduction prediction. Reasonable agreement was found between the predictions and the measured noise reduction in the frequency range 315-1000 Hz.

  11. Early years of Computational Statistical Mechanics

    Science.gov (United States)

    Mareschal, Michel

    2018-05-01

    Evidence that a model of hard spheres exhibits a first-order solid-fluid phase transition was provided in the late fifties by two new numerical techniques known as Monte Carlo and Molecular Dynamics. This result can be considered as the starting point of computational statistical mechanics: at the time, it was a confirmation of a counter-intuitive (and controversial) theoretical prediction by J. Kirkwood. It necessitated an intensive collaboration between the Los Alamos team, with Bill Wood developing the Monte Carlo approach, and the Livermore group, where Berni Alder was inventing Molecular Dynamics. This article tells how it happened.

  12. A Bayesian Framework for Multiple Trait Colo-calization from Summary Association Statistics.

    Science.gov (United States)

    Giambartolomei, Claudia; Zhenli Liu, Jimmy; Zhang, Wen; Hauberg, Mads; Shi, Huwenbo; Boocock, James; Pickrell, Joe; Jaffe, Andrew E; Pasaniuc, Bogdan; Roussos, Panos

    2018-03-19

    Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g., expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene among the GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression which in turn affect disease risk. In this work we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci. We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease associated genes. moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe, and scenario of interest.

  13. A Bayesian method and its variational approximation for prediction of genomic breeding values in multiple traits

    Directory of Open Access Journals (Sweden)

    Hayashi Takeshi

    2013-01-01

    Full Text Available Abstract Background Genomic selection is an effective tool for animal and plant breeding, allowing effective individual selection without phenotypic records through the prediction of genomic breeding value (GBV). To date, genomic selection has focused on a single trait. However, actual breeding often targets multiple correlated traits, and, therefore, joint analysis taking into consideration the correlation between traits, which might result in more accurate GBV prediction than analyzing each trait separately, is suitable for multi-trait genomic selection. This would require an extension of the prediction model for single-trait GBV to the multi-trait case. As the computational burden of multi-trait analysis is even higher than that of single-trait analysis, an effective computational method for constructing a multi-trait prediction model is also needed. Results We described a Bayesian regression model incorporating variable selection for jointly predicting GBVs of multiple traits and devised both an MCMC iteration and a variational approximation for Bayesian estimation of parameters in this multi-trait model. The proposed Bayesian procedures with MCMC iteration and variational approximation were referred to as MCBayes and varBayes, respectively. Using simulated datasets of SNP genotypes and phenotypes for three traits with high and low heritabilities, we compared the accuracy in predicting GBVs between multi-trait and single-trait analyses as well as between MCBayes and varBayes. The results showed that, compared to single-trait analysis, multi-trait analysis enabled much more accurate GBV prediction for low-heritability traits correlated with high-heritability traits, by utilizing the correlation structure between traits, while the prediction accuracy for uncorrelated low-heritability traits was comparable or less with multi-trait analysis in comparison with single-trait analysis depending on the setting for prior probability that a SNP has zero

  14. Predictive error dependencies when using pilot points and singular value decomposition in groundwater model calibration

    DEFF Research Database (Denmark)

    Christensen, Steen; Doherty, John

    2008-01-01

    super parameters), and that the structural errors caused by using pilot points and super parameters to parameterize the highly heterogeneous log-transmissivity field can be significant. For the test case much effort is put into studying how the calibrated model's ability to make accurate predictions...

  15. The statistical prediction of offshore winds from land-based data for wind-energy applications

    DEFF Research Database (Denmark)

    Walmsley, J.L.; Barthelmie, R.J.; Burrows, W.R.

    2001-01-01

    Land-based meteorological measurements at two locations on the Danish coast are used to predict offshore wind speeds. Offshore wind-speed data are used only for developing the statistical prediction algorithms and for verification. As a first step, the two datasets were separated into nine...... percentile-based bins, with a minimum of 30 data records in each bin. Next, the records were randomly selected with approximately 70% of the data in each bin being used as a training set for development of the prediction algorithms, and the remaining 30% being reserved as a test set for evaluation purposes....... The binning procedure ensured that both training and test sets fairly represented the overall data distribution. To base the conclusions on firmer ground, five permutations of these training and test sets were created. Thus, all calculations were based on five cases, each one representing a different random...
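
    The binning-and-splitting procedure can be sketched directly: nine percentile-based bins over the land-based speeds, a roughly 70/30 split drawn independently within each bin so both sets reproduce the overall distribution, repeated for five random permutations. The wind data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(8)
onshore_speed = rng.weibull(2.0, 3000) * 8.0   # synthetic land-based winds (m/s)

# Nine percentile-based bins over the onshore speeds.
edges = np.quantile(onshore_speed, np.linspace(0, 1, 10))
bin_id = np.clip(np.digitize(onshore_speed, edges[1:-1]), 0, 8)

def stratified_split(bins, train_frac=0.70, seed=0):
    """~70/30 train/test split drawn independently inside every bin, so both
    sets fairly represent the overall wind-speed distribution."""
    rs = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for b in range(bins.max() + 1):
        idx = rs.permutation(np.flatnonzero(bins == b))
        cut = int(round(train_frac * len(idx)))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return np.array(train_idx), np.array(test_idx)

# Five permutations, as in the study, each a different random realization.
splits = [stratified_split(bin_id, seed=s) for s in range(5)]
print([len(tr) for tr, _ in splits], "training records per permutation")
```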

  16. Simultaneous colour visualizations of multiple ALS point cloud attributes for land cover and vegetation analysis

    Science.gov (United States)

    Zlinszky, András; Schroiff, Anke; Otepka, Johannes; Mandlburger, Gottfried; Pfeifer, Norbert

    2014-05-01

    LIDAR point clouds hold valuable information for land cover and vegetation analysis, not only in the spatial distribution of the points but also in their various attributes. However, LIDAR point clouds are rarely used for visual interpretation, since for most users, the point cloud is difficult to interpret compared to passive optical imagery. Meanwhile, point cloud viewing software is available allowing interactive 3D interpretation, but typically only one attribute at a time. This results in a large number of points with the same colour, crowding the scene and often obscuring detail. We developed a scheme for mapping information from multiple LIDAR point attributes to the Red, Green, and Blue channels of a widely used LIDAR data format, which are otherwise mostly used to add information from imagery to create "photorealistic" point clouds. The possible combinations of parameters are therefore represented in a wide range of colours, but relative differences in individual parameter values of points can be well understood. The visualization was implemented in OPALS software, using a simple and robust batch script, and is viewer independent since the information is stored in the point cloud data file itself. In our case, the following colour channel assignment delivered best results: Echo amplitude in the Red, echo width in the Green and normalized height above a Digital Terrain Model in the Blue channel. With correct parameter scaling (but completely without point classification), points belonging to asphalt and bare soil are dark red, low grassland and crop vegetation are bright red to yellow, shrubs and low trees are green and high trees are blue. Depending on roof material and DTM quality, buildings are shown from red through purple to dark blue. Erroneously high or low points, or points with incorrect amplitude or echo width usually have colours contrasting from terrain or vegetation. This allows efficient visual interpretation of the point cloud in planar
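
    The channel assignment is easy to express with numpy: each attribute is linearly rescaled to 0-255 within a chosen range and written to one RGB channel of the point record. The scaling ranges below are illustrative assumptions; as the authors note, they must be tuned to the data set.

```python
import numpy as np

def attributes_to_rgb(amplitude, echo_width, height_above_dtm,
                      amp_range=(0.0, 300.0), width_range=(2.0, 8.0),
                      height_range=(0.0, 30.0)):
    """Map three point attributes to 8-bit RGB channels: amplitude -> Red,
    echo width -> Green, normalized height above the DTM -> Blue.
    The scaling ranges are illustrative and data-set dependent."""
    def scale(v, lo, hi):
        return np.clip((v - lo) / (hi - lo), 0.0, 1.0) * 255.0
    rgb = np.stack([scale(amplitude, *amp_range),
                    scale(echo_width, *width_range),
                    scale(height_above_dtm, *height_range)], axis=1)
    return rgb.astype(np.uint8)   # write back into the point cloud RGB fields

# Toy points: asphalt (strong echo, narrow, low) vs. tall tree (weak, wide, high).
amp = np.array([250.0, 60.0])
width = np.array([2.5, 6.5])
hgt = np.array([0.1, 22.0])
print(attributes_to_rgb(amp, width, hgt))   # dark red vs. green-blue
```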

  17. Hierarchical folding of multiple sequence alignments for the prediction of structures and RNA-RNA interactions

    DEFF Research Database (Denmark)

    Seemann, Ernst Stefan; Richter, Andreas S.; Gorodkin, Jan

    2010-01-01

    of that used for individual multiple alignments. Results: We derived a rather extensive algorithm. One of the advantages of our approach (in contrast to other RNA-RNA interaction prediction methods) is the application of covariance detection and prediction of pseudoknots between intra- and inter-molecular base...... pairs. As a proof of concept, we show an example and discuss the strengths and weaknesses of the approach....

  18. In Silico Perspectives on the Prediction of the PLP's Epitopes involved in Multiple Sclerosis.

    Science.gov (United States)

    Zamanzadeh, Zahra; Ataei, Mitra; Nabavi, Seyed Massood; Ahangari, Ghasem; Sadeghi, Mehdi; Sanati, Mohammad Hosein

    2017-03-01

    Multiple sclerosis (MS) is the most common autoimmune disease of the central nervous system (CNS). The main cause of MS is yet to be revealed, but the most probable theory is based on molecular mimicry, which implicates certain infections in the activation of T cells against brain auto-antigens, initiating the disease cascade. The purpose of this research is the prediction of the auto-antigen potency of the myelin proteolipid protein (PLP) in multiple sclerosis. As no tertiary structure of PLP was available in the Protein Data Bank (PDB), and in order to characterize the structural properties of the protein, we modeled this protein using prediction servers. Meta-prediction, as a new in silico perspective, was performed to find PLP's epitopes. For this purpose, several T cell epitope prediction web servers were used to predict PLP's epitopes against Human Leukocyte Antigens (HLA). The overlapping regions predicted by most web servers were selected as immunogenic epitopes and subjected to BLASTP against microorganisms. Three common regions, AA 58-74, AA 161-177, and AA 238-254, were detected as immunodominant regions through meta-prediction. Investigating peptides with more than 50% similarity to candidate epitope AA 58-74 in bacteria revealed a similar peptide in bacteria (mainly Clostridium and Mycobacterium) and in the spike protein of Alphacoronavirus 1, Canine coronavirus, and Feline coronavirus. These results suggest that cross-reaction of the immune system to PLP may have originated from a bacterial or viral infection, and therefore molecular mimicry might have an important role in the progression of MS. Through reliable and accurate prediction of the consensus epitopes, it is not necessary to synthesize all PLP fragments and examine their immunogenicity experimentally (in vitro). In this study, the best encephalitogenic antigens were predicted based on bioinformatics tools that may provide reliable
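    A minimal sketch of the consensus step described above, assuming each server's output has already been reduced to (start, end) residue ranges; the function name, sequence length, and toy inputs are hypothetical:

        def consensus_epitopes(predictions, seq_len, min_servers):
            """predictions: one list of (start, end) residue ranges
            (1-based, inclusive) per prediction server. Returns the
            regions supported by at least min_servers servers."""
            votes = [0] * (seq_len + 2)
            for server in predictions:
                for start, end in server:
                    for pos in range(start, end + 1):
                        votes[pos] += 1
            regions, run_start = [], None
            for pos in range(1, seq_len + 2):
                if pos <= seq_len and votes[pos] >= min_servers:
                    run_start = run_start or pos
                elif run_start:
                    regions.append((run_start, pos - 1))
                    run_start = None
            return regions

        # Toy check: two of three servers agree on residues 60-70.
        servers = [[(58, 74)], [(60, 70)], [(200, 210)]]
        print(consensus_epitopes(servers, 276, min_servers=2))  # [(60, 70)]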

  19. Statistical significance of theoretical predictions: A new dimension in nuclear structure theories (I)

    International Nuclear Information System (INIS)

    Dudek, J.; Szpak, B.; Fornal, B.; Porquet, M.-G.

    2011-01-01

    In this and the follow-up article, we briefly discuss what we believe represents one of the most serious problems in contemporary nuclear structure: the question of the statistical significance of parametrizations of nuclear microscopic Hamiltonians and the implied predictive power of the underlying theories. In the present Part I, we introduce the main lines of reasoning of the so-called Inverse Problem Theory, an important sub-field of contemporary applied mathematics, illustrated here with the example of the nuclear mean-field approach.
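    A generic sketch of the kind of analysis Inverse Problem Theory supports: fit model parameters to data by weighted least squares and propagate the parameter covariance into prediction uncertainty. This is a textbook linearised formulation, not the authors' nuclear mean-field procedure:

        import numpy as np

        def fit_with_covariance(G, d, sigma):
            """Linear(ised) inverse problem d = G @ m + noise with
            independent errors sigma. Returns the weighted least-squares
            parameters m and their covariance matrix."""
            W = np.diag(1.0 / sigma**2)
            cov = np.linalg.inv(G.T @ W @ G)   # parameter covariance
            m = cov @ (G.T @ W @ d)            # best-fit parameters
            return m, cov

        # The variance of a predicted observable g @ m is g @ cov @ g,
        # which is what makes a "predictive power" claim quantitative.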

  20. A points-based algorithm for prognosticating clinical outcome of Chiari malformation Type I with syringomyelia: results from a predictive model analysis of 82 surgically managed adult patients.

    Science.gov (United States)

    Thakar, Sumit; Sivaraju, Laxminadh; Jacob, Kuruthukulangara S; Arun, Aditya Atal; Aryan, Saritha; Mohan, Dilip; Sai Kiran, Narayanam Anantha; Hegde, Alangar S

    2018-01-01

    OBJECTIVE Although various predictors of postoperative outcome have been previously identified in patients with Chiari malformation Type I (CMI) with syringomyelia, there is no known algorithm for predicting a multifactorial outcome measure in this widely studied disorder. Using one of the largest preoperative variable arrays used so far in CMI research, the authors attempted to generate a formula for predicting postoperative outcome. METHODS Data from the clinical records of 82 symptomatic adult patients with CMI and altered hindbrain CSF flow who were managed with foramen magnum decompression, C-1 laminectomy, and duraplasty over an 8-year period were collected and analyzed. Various preoperative clinical and radiological variables in the 57 patients who formed the study cohort were assessed in a bivariate analysis to determine their ability to predict clinical outcome (as measured on the Chicago Chiari Outcome Scale [CCOS]) and the resolution of syrinx at the last follow-up. The variables that were significant in the bivariate analysis were further analyzed in a multiple linear regression analysis. Different regression models were tested, and the model with the best prediction of CCOS was identified and internally validated in a subcohort of 25 patients. RESULTS There was no correlation between CCOS score and syrinx resolution (p = 0.24) at a mean ± SD follow-up of 40.29 ± 10.36 months. Multiple linear regression analysis revealed that the presence of gait instability, obex position, and the M-line-fourth ventricle vertex (FVV) distance correlated with CCOS score, while the presence of motor deficits was associated with poor syrinx resolution (p ≤ 0.05). The algorithm generated from the regression model demonstrated good diagnostic accuracy (area under the curve 0.81), with a score of more than 128 points demonstrating 100% specificity for clinical improvement (CCOS score of 11 or greater). The model had excellent reliability (κ = 0.85) and was validated with
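    A minimal sketch of how a points-based score can be built from regression coefficients, in the spirit of the algorithm described above; the coefficients and patient values below are invented for illustration, and only the >128-point cutoff comes from the abstract:

        def points_from_coefficients(betas):
            """Scale regression coefficients by the smallest absolute
            coefficient and round: a common way to turn a linear model
            into a bedside point score (the published derivation may differ)."""
            base = min(abs(b) for b in betas.values())
            return {name: round(b / base) for name, b in betas.items()}

        # Hypothetical coefficients for the three predictors named above.
        betas = {"gait_instability": 1.8, "obex_position": 0.9, "m_line_fvv": 0.45}
        points = points_from_coefficients(betas)            # -> 4, 2, 1
        patient = {"gait_instability": 1, "obex_position": 30, "m_line_fvv": 45}
        score = sum(points[k] * patient[k] for k in points)
        improved = score > 128   # cutoff reported in the abstract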