Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus cal...
Statistical Analysis of Regularity of Pesticide Residues in Vegetables Produced in Inner Mongolia
Institute of Scientific and Technical Information of China (English)
Fujin ZHANG; Dekun HOU; Jiang HE; Tianyun GAO; Hong LUO; Songyan LANG; Xinxin ZHANG; Yiping YAO
2013-01-01
[Objective] The paper was to study the regularity of pesticide residues in vegetables produced in Inner Mongolia. [Method] Mathematical statistical analysis was carried out on 6 800 samples of vegetables, fruits, leguminous vegetables, nuts and seeds produced in Inner Mongolia. [Result] The dominant vegetables in Inner Mongolia were green leafy vegetables, solanaceous vegetables and melon vegetables, and their yields accounted for 70% of total vegetable production. Since 2003, with the rapid increase in the number of vegetable samples tested, vegetable quality safety had entered a new stage of sustained stability after a rapid decline and periodic fluctuation, and the differences in safety levels were obvious: the range of the exceeding-standard rate of pesticide residues in solanaceous vegetables, melon vegetables and leguminous vegetables (about 55% of the total vegetables) was under 2%, with average values of 1.1%, 1.6% and 3.1%, respectively; these belonged to the stable type. The exceeding-standard rate of pesticide residues in green leafy vegetables and the Chinese cabbage group (about 30% of total vegetables) presented a decreasing trend year by year, wandering in the range of 7%-10%; these belonged to the main risk type. The time period of exceeding standards for pesticide residues in root vegetables and cole vegetables was unpredictable, and the exceeding-standard rate in some years was over 5% (amplitude of variation over 15 percentage points); these belonged to the random risk type. The kinds of pesticides with relatively concentrated exceeding-standard rates in vegetables varied across vegetable species. 70% of the pesticides belonged to intermittent over-limits, with probability below 5%. About 20% of traditional pesticides often exceeded standards, with probabilities over 30%. The exceeding of standards by organophosphorus and carbamate pesticides in vegetables presented a decreasing trend, while the risk of some new pesticides...
Energy Technology Data Exchange (ETDEWEB)
Golynko, I.N.
1976-10-01
By using the method of principal components, calculations and analyses are made on geochemical parameters describing the mutual relations of chemical products in volcanic materials. It is shown that with spontaneous development of volcanism the degree of order in the mutual relations of chemical products increases in a regular fashion in a series of successively formed associations of volcanic ores. The regularity is independent of the types and scale of volcanism, and is manifested both for petrogenic and for small chemical products. This tendency is shown for examples of the evolution of typical volcanic series, the evolution of basaltoids in zones of riftogenesis, and the evolution of volcano-plutonic associations in the tectonic-magmatic cycle of a large ring structure. A discussion is presented on the theoretical significance of this phenomenon from the point of view of a general theory of evolution, and also of the possibility of its practical use in solving a number of petrological problems and in investigating postvolcanic mineralization. (SJR)
Robust Sparse Analysis Regularization
Vaiter, Samuel; Dossal, Charles; Fadili, Jalal
2011-01-01
This paper studies the properties of L1-analysis regularization for the resolution of linear inverse problems. Most previous works consider sparse synthesis priors where the sparsity is measured as the L1 norm of the coefficients that synthesize the signal in a given dictionary. In contrast, the more general analysis regularization minimizes the L1 norm of the correlations between the signal and the atoms in the dictionary. The corresponding variational problem includes several well-known regularizations such as the discrete total variation and the fused lasso. We first prove that a solution of analysis regularization is a piecewise affine function of the observations. Similarly, it is a piecewise affine function of the regularization parameter. This allows us to compute the degrees of freedom associated to sparse analysis estimators. Another contribution gives a sufficient condition to ensure that a signal is the unique solution of the analysis regularization when there is no noise in the observations. The s...
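The discrete total variation mentioned in the abstract is a concrete instance of L1-analysis regularization, with the analysis operator D taken as the finite-difference matrix. A minimal sketch follows; the signal, the ADMM solver, and all parameter choices (`lam`, `rho`, iteration count) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def tv_denoise_admm(y, lam, rho=1.0, iters=300):
    """1D total-variation denoising, min_x 0.5*||x - y||^2 + lam*||D x||_1,
    an instance of L1-analysis regularization with D the finite-difference
    operator (the 'fused lasso' / discrete TV case). Solved with ADMM."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)            # (n-1) x n difference matrix
    Q = np.eye(n) + rho * D.T @ D             # x-update system matrix
    x = y.copy()
    z = np.zeros(n - 1)
    u = np.zeros(n - 1)
    for _ in range(iters):
        x = np.linalg.solve(Q, y + rho * D.T @ (z - u))
        v = D @ x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)  # soft-threshold
        u = u + D @ x - z
    return x

rng = np.random.default_rng(10)
clean = np.repeat([0.0, 1.0, 0.0], 50)        # piecewise-constant signal
noisy = clean + 0.1 * rng.standard_normal(150)
denoised = tv_denoise_admm(noisy, lam=0.3)
# The L1 penalty on the differences D x favours a piecewise-constant estimate.
```

Note the analysis prior penalizes correlations of the signal with the difference atoms, rather than synthesis coefficients in a dictionary.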
Directory of Open Access Journals (Sweden)
Philippe Andrey
Full Text Available In eukaryotes, the interphase nucleus is organized in morphologically and/or functionally distinct nuclear "compartments". Numerous studies highlight functional relationships between the spatial organization of the nucleus and gene regulation. This raises the question of whether nuclear organization principles exist and, if so, whether they are identical in the animal and plant kingdoms. We addressed this issue through the investigation of the three-dimensional distribution of the centromeres and chromocenters. We investigated five very diverse populations of interphase nuclei at different differentiation stages in their physiological environment, belonging to rabbit embryos at the 8-cell and blastocyst stages, differentiated rabbit mammary epithelial cells during lactation, and differentiated cells of Arabidopsis thaliana plantlets. We developed new tools based on the processing of confocal images and a new statistical approach based on G- and F-distance functions used in spatial statistics. Our original computational scheme takes into account both size and shape variability by comparing, for each nucleus, the observed distribution against a reference distribution estimated by Monte-Carlo sampling over the same nucleus. This implicit normalization allowed similar data processing and extraction of rules in the five differentiated nuclei populations of the three studied biological systems, despite differences in chromosome number, genome organization and heterochromatin content. We showed that centromeres/chromocenters form significantly more regularly spaced patterns than expected under a completely random situation, suggesting that repulsive constraints or spatial inhomogeneities underlie the spatial organization of heterochromatic compartments. The proposed technique should be useful for identifying further spatial features in a wide range of cell types.
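The G-function and Monte-Carlo reference idea can be sketched in a few lines: compare the empirical nearest-neighbour distance distribution of an observed point pattern against the average over random patterns in the same domain. Everything here is a toy illustration (a 2D unit square instead of a segmented 3D nucleus; the jittered-grid "observed" pattern is assumed), not the paper's pipeline.

```python
import numpy as np

def g_function(points, radii):
    """Empirical G-function: CDF of nearest-neighbour distances."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.min(axis=1)
    return np.array([(nn <= r).mean() for r in radii])

rng = np.random.default_rng(0)
radii = np.linspace(0, 0.5, 25)

# "Observed" pattern: a jittered grid, i.e. more regular than random.
grid = np.stack(np.meshgrid(np.linspace(0.1, 0.9, 5),
                            np.linspace(0.1, 0.9, 5)), -1).reshape(-1, 2)
observed = grid + rng.normal(0, 0.01, grid.shape)
g_obs = g_function(observed, radii)

# Reference: Monte-Carlo average over completely random (uniform) patterns
# in the same domain with the same number of points.
g_ref = np.mean([g_function(rng.uniform(0, 1, observed.shape), radii)
                 for _ in range(200)], axis=0)
# A regular pattern has fewer short nearest-neighbour distances than random,
# so g_obs stays below g_ref at small radii.
```

In the paper the reference is resampled within each individual nucleus, which is what normalizes away size and shape variability.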
Regularized Generalized Structured Component Analysis
Hwang, Heungsun
2009-01-01
Generalized structured component analysis (GSCA) has been proposed as a component-based approach to structural equation modeling. In practice, GSCA may suffer from multicollinearity, i.e., high correlations among exogenous variables. GSCA does not yet have a remedy for this problem. Thus, a regularized extension of GSCA is proposed that integrates a ridge…
Statistical regularities attract attention when task-relevant
Directory of Open Access Journals (Sweden)
Andrea eAlamia
2016-02-01
Full Text Available Visual attention seems essential for learning the statistical regularities in our environment, a process known as statistical learning. However, how attention is allocated when exploring a novel visual scene whose statistical structure is unknown remains unclear. In order to address this question, we investigated visual attention allocation during a task in which we manipulated the conditional probability of occurrence of colored stimuli, unbeknownst to the subjects. Participants were instructed to detect a target colored dot among two dots moving along separate circular paths. We evaluated implicit statistical learning, i.e. the effect of color predictability on reaction times (RT), and recorded eye position concurrently. Attention allocation was indexed by comparing the Mahalanobis distance between the position, velocity and acceleration of the eyes and the 2 colored dots. We found that learning the conditional probabilities occurred very early during the course of the experiment as shown by the fact that, starting already from the first block, predictable stimuli were detected with shorter RT than unpredictable ones. In terms of attentional allocation, we found that the predictive stimulus attracted gaze only when it was informative about the occurrence of the target but not when it predicted the occurrence of a task-irrelevant stimulus. This suggests that attention allocation was influenced by regularities only when they were instrumental in performing the task. Moreover, we found that the attentional bias towards task-relevant predictive stimuli occurred at a very early stage of learning, concomitantly with the first effects of learning on RT. In conclusion, these results show that statistical regularities capture visual attention only after a few occurrences, provided these regularities are instrumental to perform the task.
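The Mahalanobis-distance index used above is easy to sketch: compare a gaze feature vector (position, velocity, acceleration) against the feature vectors of each dot, under a covariance describing feature scales. The feature values and covariance below are invented for illustration, not taken from the study.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of feature vector x from a reference point,
    under covariance cov: sqrt((x - mean)^T cov^-1 (x - mean))."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Hypothetical 6D features: (px, py, vx, vy, ax, ay), with assumed scales.
cov = np.diag([4.0, 4.0, 1.0, 1.0, 0.5, 0.5])
dot_a = np.array([10., 0., 1., 0., 0., 0.1])
dot_b = np.array([-10., 0., -1., 0., 0., -0.1])
gaze  = np.array([8., 1., 0.9, 0.1, 0., 0.1])

d_a = mahalanobis(gaze, dot_a, cov)
d_b = mahalanobis(gaze, dot_b, cov)
# The smaller distance indexes which dot the eyes are tracking: here dot A.
```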
Statistical regularities in the return intervals of volatility
Wang, F.; Weber, P.; Yamasaki, K.; Havlin, S.; Stanley, H. E.
2007-01-01
We discuss recent results concerning statistical regularities in the return intervals of volatility in financial markets. In particular, we show how the analysis of volatility return intervals, defined as the time between two volatilities larger than a given threshold, can help to get a better understanding of the behavior of financial time series. We find scaling in the distribution of return intervals for thresholds ranging over a factor of 25, from 0.6 to 15 standard deviations, and also for various time windows from one minute up to 390 min (an entire trading day). Moreover, these results are universal for different stocks, commodities, interest rates as well as currencies. We also analyze the memory in the return intervals, which relates to the memory in the volatility, and find two scaling regimes, ℓ < ℓ* and ℓ > ℓ*, with α2=0.92±0.04 in the latter; these exponent values are similar to results of Liu et al. for the volatility. As an application, we use the scaling and memory properties of the return intervals to suggest a possibly useful method for estimating risk.
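The basic quantity, the return interval between threshold exceedances, can be computed in a couple of lines. The i.i.d. "volatility" series below is a deliberately crude stand-in (real volatility has long memory, which is exactly what makes the intervals interesting); thresholds and series length are assumptions.

```python
import numpy as np

def return_intervals(volatility, q):
    """Times between consecutive exceedances of threshold q, in units of
    the sampling interval, as used in return-interval scaling analyses."""
    exceed = np.flatnonzero(volatility > q)
    return np.diff(exceed)

rng = np.random.default_rng(2)
vol = np.abs(rng.standard_normal(100_000))     # toy i.i.d. volatility proxy
for q in (1.0, 1.5, 2.0):
    tau = return_intervals(vol, q)
    # For i.i.d. data the mean interval is 1/P(v > q); in real markets the
    # whole interval distribution scales with this mean, and shows memory.
    print(q, tau.mean())
```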
Statistical regularities in the rank-citation profile of scientists
Petersen, Alexander M.; Stanley, H. Eugene; Succi, Sauro
2011-12-01
Recent science of science research shows that scientific impact measures for journals and individual articles have quantifiable regularities across both time and discipline. However, little is known about the scientific impact distribution at the scale of an individual scientist. We analyze the aggregate production and impact using the rank-citation profile ci(r) of 200 distinguished professors and 100 assistant professors. For the entire range of paper rank r, we fit each ci(r) to a common distribution function. Since two scientists with equivalent Hirsch h-index can have significantly different ci(r) profiles, our results demonstrate the utility of the βi scaling parameter in conjunction with hi for quantifying individual publication impact. We show that the total number of citations Ci tallied from a scientist's Ni papers scales as [Formula: see text]. Such statistical regularities in the input-output patterns of scientists can be used as benchmarks for theoretical models of career progress.
Statistical regularities in the rank-citation profile of scientists.
Petersen, Alexander M; Stanley, H Eugene; Succi, Sauro
2011-01-01
Recent science of science research shows that scientific impact measures for journals and individual articles have quantifiable regularities across both time and discipline. However, little is known about the scientific impact distribution at the scale of an individual scientist. We analyze the aggregate production and impact using the rank-citation profile c(i)(r) of 200 distinguished professors and 100 assistant professors. For the entire range of paper rank r, we fit each c(i)(r) to a common distribution function. Since two scientists with equivalent Hirsch h-index can have significantly different c(i)(r) profiles, our results demonstrate the utility of the β(i) scaling parameter in conjunction with h(i) for quantifying individual publication impact. We show that the total number of citations C(i) tallied from a scientist's N(i) papers scales as [Formula: see text]. Such statistical regularities in the input-output patterns of scientists can be used as benchmarks for theoretical models of career progress.
Mathematical and statistical analysis
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Using volcano plots and regularized-chi statistics in genetic association studies.
Li, Wentian; Freudenberg, Jan; Suh, Young Ju; Yang, Yaning
2014-02-01
Labor intensive experiments are typically required to identify the causal disease variants from a list of disease associated variants in the genome. For designing such experiments, candidate variants are ranked by their strength of genetic association with the disease. However, the two commonly used measures of genetic association, the odds-ratio (OR) and the p-value, may rank variants in different orders. To integrate these two measures into a single analysis, here we transfer the volcano plot methodology from gene expression analysis to genetic association studies. In its original setting, volcano plots are scatter plots of fold-change and t-test statistic (or -log of the p-value), with the latter being more sensitive to sample size. In genetic association studies, the OR and Pearson's chi-square statistic (or equivalently its square root, chi; or the standardized log(OR)) can be analogously used in a volcano plot, allowing for their visual inspection. Moreover, the geometric interpretation of these plots leads to an intuitive method for filtering results by a combination of both OR and chi-square statistic, which we term "regularized-chi". This method selects associated markers by a smooth curve in the volcano plot instead of the right-angled lines which correspond to independent cutoffs for OR and chi-square statistic. The regularized-chi incorporates relatively more signals from variants with lower minor-allele frequencies than the chi-square test statistic does. As rare variants tend to have stronger functional effects, regularized-chi is better suited to the task of prioritization of candidate genes.
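The two volcano-plot axes can be computed directly from a 2x2 contingency table. The counts below are hypothetical, and the elliptical "smooth curve" cutoff at the end is an assumed stand-in for the paper's regularized-chi formula, which is not reproduced here; only the OR and chi computations are standard.

```python
import numpy as np

def or_and_chi(a, b, c, d):
    """Odds ratio and signed square root of Pearson's chi-square for a
    2x2 table [[a, b], [c, d]] (e.g. allele counts in cases vs controls)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds = (a * d) / (b * c)
    return odds, np.sign(np.log(odds)) * np.sqrt(chi2)

# Hypothetical counts: carriers / non-carriers in cases vs controls.
odds, chi = or_and_chi(30, 70, 15, 85)

# A volcano-plot filter combining both axes with one smooth curve instead
# of two right-angled cutoffs; this elliptical criterion is an assumption:
keep = (np.log(odds) / np.log(2)) ** 2 + (chi / 2.0) ** 2 > 1
```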
Statistical regularities in the rank-citation profile of scientists
Petersen, Alexander M; Succi, Sauro
2011-01-01
Recent "science of science" research shows common regularities in the publication patterns of scientific papers across time and discipline. Here we analyze the complete publication careers of 300 scientists and find remarkable regularity in the functional form of the rank-citation profile c_{i}(r) for each scientist i =1...300. We find that the rank-ordered citation distribution c_{i}(r) can be approximated by a discrete generalized beta distribution (DGBD) over the entire range of ranks r, which allows for the characterization and comparison of c_{i}(r) using a common framework. The functional form of the DGBD has two scaling exponents, beta_i and gamma_i, which determine the scaling behavior of c_{i}(r) for both small and large rank r. The crossover between two scaling regimes suggests a complex reinforcement or positive-feedback relation between the impact of a scientist's most famous papers and the impact of his/her other papers. Moreover, since two scientists with equivalent Hirsch h-index values may hav...
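The DGBD form named in this abstract, c(r) = A (N+1-r)^gamma / r^beta, is linear in log space, so its exponents can be recovered by least squares. The sketch below fits a synthetic profile with known exponents; the citation data and parameter values are invented for illustration.

```python
import numpy as np

def fit_dgbd(citations):
    """Fit the discrete generalized beta distribution (DGBD)
    c(r) = A * (N + 1 - r)**gamma / r**beta to a rank-citation profile
    by linear least squares in log space."""
    c = np.sort(np.asarray(citations, float))[::-1]   # rank-order descending
    N = len(c)
    r = np.arange(1, N + 1)
    mask = c > 0                                      # log needs c(r) > 0
    X = np.column_stack([np.ones(mask.sum()),
                         np.log(N + 1 - r[mask]),
                         -np.log(r[mask])])
    coef, *_ = np.linalg.lstsq(X, np.log(c[mask]), rcond=None)
    logA, gamma, beta = coef
    return np.exp(logA), gamma, beta

# Synthetic profile with known parameters (A=100, gamma=0.3, beta=0.8):
N = 200
r = np.arange(1, N + 1)
c = 100 * (N + 1 - r) ** 0.3 / r ** 0.8
A, gamma, beta = fit_dgbd(c)
# beta governs the high-rank (famous-paper) tail, gamma the low-citation tail.
```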
Deconstructing Statistical Analysis
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article, or of making a new product or service appear legitimate, needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis and research, the fewer learned readers can understand it. This adds a…
Prior knowledge regularization in statistical medical image tasks
DEFF Research Database (Denmark)
Crimi, Alessandro; Sporring, Jon; de Bruijne, Marleen
2009-01-01
The estimation of the covariance matrix is a pivotal step in several statistical tasks. In particular, the estimation becomes challenging for high-dimensional representations of data when few samples are available. Using the standard Maximum Likelihood estimation (MLE) when the number of samples ar...
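The few-samples problem is easy to demonstrate: with n samples in p > n dimensions the MLE covariance is singular. A simple shrinkage toward a scaled identity, shown below as a generic stand-in for the paper's prior-knowledge regularization (the paper's specific prior is not reproduced), restores invertibility; the shrinkage weight `alpha` is assumed.

```python
import numpy as np

def shrinkage_covariance(X, alpha):
    """Regularize the MLE covariance by shrinking toward a scaled identity:
    (1 - alpha) * S + alpha * (tr(S)/p) * I."""
    S = np.cov(X, rowvar=False, bias=True)            # MLE covariance
    p = S.shape[0]
    target = np.trace(S) / p * np.eye(p)
    return (1 - alpha) * S + alpha * target

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 50))     # 10 samples, 50 dimensions: n << p
S_mle = np.cov(X, rowvar=False, bias=True)
S_reg = shrinkage_covariance(X, alpha=0.3)
# S_mle has rank at most n-1 < p and cannot be inverted;
# S_reg has all eigenvalues bounded away from zero.
```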
Regularized canonical correlation analysis with unlabeled data
Institute of Scientific and Technical Information of China (English)
Xi-chuan ZHOU; Hai-bin SHEN
2009-01-01
In standard canonical correlation analysis (CCA), the data from definite datasets are used to estimate their canonical correlation. In real applications, for example in bilingual text retrieval, a great portion of the data may not be known to belong to either set. This part of the data is called unlabeled data, while the rest, from definite datasets, is called labeled data. We propose a novel method called regularized canonical correlation analysis (RCCA), which makes use of both labeled and unlabeled samples. Specifically, we learn to approximate the canonical correlation as if all data were labeled. Then, we describe a generalization of RCCA for the multi-set situation. Experiments on four real world datasets, Yeast, Cloud, Iris, and Haberman, demonstrate that, by incorporating the unlabeled data points, the accuracy of correlation coefficients can be improved by over 30%.
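For orientation, here is a minimal ridge-regularized CCA on synthetic data sharing one latent signal. This shows the common regularized-CCA eigenproblem only; the paper's semi-supervised use of unlabeled samples is not reproduced, and the data, dimensions, and `reg` value are assumptions.

```python
import numpy as np

def rcca(X, Y, reg=0.1):
    """Canonical correlations with ridge regularization of the within-set
    covariances: eigenvalues of (Cxx+rI)^-1 Cxy (Cyy+rI)^-1 Cyx."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    A = np.linalg.solve(Cxx, Cxy)
    B = np.linalg.solve(Cyy, Cxy.T)
    corr2 = np.linalg.eigvals(A @ B)          # squared canonical correlations
    return np.sqrt(np.sort(np.abs(corr2))[::-1])

rng = np.random.default_rng(4)
z = rng.standard_normal((500, 1))                       # shared latent signal
X = np.hstack([z, rng.standard_normal((500, 2))])
Y = np.hstack([z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 2))])
rho = rcca(X, Y, reg=0.01)
# The leading canonical correlation recovers the shared latent variable;
# the remaining correlations are near chance level.
```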
Statistical Descriptors of Ocean Regimes From the Geometric Regularity of SST Observations
Ba, Sileye O.; Autret, Emmanuelle; Chapron, Bertrand; FABLET, Ronan
2012-01-01
In this letter, we evaluate to which extent the activity of ocean fronts can be retrieved from the geometric regularity of ocean tracer observations. Applied to sea surface temperature (SST), we propose a method for the characterization of this geometric regularity from curvature-based statistics along temperature level lines in front regions. To assess the effectiveness of the proposed descriptors, we used six years (from 2003 to 2008) of daily SST observations of the regions of Agulhas in t...
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Error analysis for matrix elastic-net regularization algorithms.
Li, Hong; Chen, Na; Li, Luoqing
2012-05-01
Elastic-net regularization is a successful approach in statistical modeling. It can avoid large variations which occur in estimating complex models. In this paper, elastic-net regularization is extended to a more general setting, the matrix recovery (matrix completion) setting. Based on a combination of the nuclear-norm minimization and the Frobenius-norm minimization, we consider the matrix elastic-net (MEN) regularization algorithm, which is an analog to the elastic-net regularization scheme from compressive sensing. Some properties of the estimator are characterized by the singular value shrinkage operator. We estimate the error bounds of the MEN regularization algorithm in the framework of statistical learning theory. We compute the learning rate by estimates of the Hilbert-Schmidt operators. In addition, an adaptive scheme for selecting the regularization parameter is presented. Numerical experiments demonstrate the superiority of the MEN regularization algorithm.
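The singular value shrinkage operator named in this abstract has a closed form: for the penalty tau*||X||_* + (mu/2)*||X||_F^2, soft-threshold the singular values by tau and rescale by 1/(1+mu). The sketch below applies it to a noisy low-rank matrix; matrix sizes, rank, and the (tau, mu) values are assumptions for illustration.

```python
import numpy as np

def matrix_elastic_net_prox(Z, tau, mu):
    """Proximal operator of tau*||X||_* + (mu/2)*||X||_F^2:
    argmin_X 0.5*||X - Z||_F^2 + tau*||X||_* + (mu/2)*||X||_F^2,
    computed by soft-thresholding singular values and scaling by 1/(1+mu)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0) / (1.0 + mu)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(5)
L = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))  # rank 3
Z = L + 0.1 * rng.standard_normal((30, 30))                      # noisy, full rank
X = matrix_elastic_net_prox(Z, tau=2.0, mu=0.1)
# Thresholding kills the small noise singular values: X is low rank again.
```

The Frobenius part (mu) adds the elastic-net-style uniform shrinkage on top of the nuclear-norm thresholding.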
Associative Analysis in Statistics
Directory of Open Access Journals (Sweden)
Mihaela Muntean
2015-03-01
Full Text Available In the last years, the interest in technologies such as in-memory analytics and associative search has increased. This paper explores how you can use in-memory analytics and an associative model in statistics. The word “associative” puts the emphasis on understanding how datasets relate to one another. The paper presents the main characteristics of “associative” data model. Also, the paper presents how to design an associative model for labor market indicators analysis. The source is the EU Labor Force Survey. Also, this paper presents how to make associative analysis.
Air-chemistry "turbulence": power-law scaling and statistical regularity
Directory of Open Access Journals (Sweden)
H.-m. Hsu
2011-08-01
Full Text Available With the intent to gain further knowledge on the spectral structures and statistical regularities of surface atmospheric chemistry, the chemical gases (NO, NO_{2}, NO_{x}, CO, SO_{2}, and O_{3} and aerosol (PM_{10} measured at 74 air quality monitoring stations over the island of Taiwan are analyzed for the year of 2004 at hourly resolution. They represent a range of surface air quality with a mixed combination of geographic settings, and include urban/rural, coastal/inland, plain/hill, and industrial/agricultural locations. In addition to the well-known semi-diurnal and diurnal oscillations, weekly, and intermediate (20 ~ 30 days peaks are also identified with the continuous wavelet transform (CWT. The spectra indicate power-law scaling regions for the frequencies higher than the diurnal and those lower than the diurnal with the average exponents of −5/3 and −1, respectively. These dual-exponents are corroborated with those with the detrended fluctuation analysis in the corresponding time-lag regions. These exponents are mostly independent of the averages and standard deviations of time series measured at various geographic settings, i.e., the spatial inhomogeneities. In other words, they possess dominant universal structures. After spectral coefficients from the CWT decomposition are grouped according to the spectral bands, and inverted separately, the PDFs of the reconstructed time series for the high-frequency band demonstrate the interesting statistical regularity, −3 power-law scaling for the heavy tails, consistently. Such spectral peaks, dual-exponent structures, and power-law scaling in heavy tails are important structural information, but their relations to turbulence and mesoscale variability require further investigations. This could lead to a better understanding of the processes controlling air quality.
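The detrended fluctuation analysis (DFA) used to corroborate the spectral exponents can be sketched compactly: integrate the series, detrend it linearly in windows of each scale, and read the exponent off the log-log slope of the fluctuation function. The white-noise input and window scales below are assumptions; DFA gives alpha ~ 0.5 for uncorrelated data.

```python
import numpy as np

def dfa_exponent(x, scales):
    """Detrended fluctuation analysis: slope of log F(s) vs log s,
    with linear detrending in each non-overlapping window."""
    y = np.cumsum(x - np.mean(x))              # integrated profile
    F = []
    for s in scales:
        n = len(y) // s
        segs = y[:n * s].reshape(n, s)
        t = np.arange(s)
        resid = []
        for seg in segs:                       # per-segment linear detrend
            coef = np.polyfit(t, seg, 1)
            resid.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        F.append(np.sqrt(np.mean(resid)))
    slope, _ = np.polyfit(np.log(scales), np.log(F), 1)
    return slope

rng = np.random.default_rng(6)
white = rng.standard_normal(20_000)
alpha = dfa_exponent(white, [16, 32, 64, 128, 256])
# Uncorrelated noise: alpha near 0.5; long-range correlated data gives
# alpha > 0.5, mirroring the dual power-law regimes found in the spectra.
```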
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
This RS code is to do Object-by-Object analysis of each Object's sub-objects, e.g. statistical analysis of an object's individual image data pixels. Statistics, such as percentiles (so-called "quartiles"), are derived by the process, but the return of that can only be a Scene Variable, not an Object Variable. This procedure was developed in order to be able to export objects as ESRI shape data with the 90-percentile of the Hue of each object's pixels as an item in the shape attribute table. This procedure uses a sub-level single pixel chessboard segmentation, loops for each of the objects of a specific class in turn, and uses a pair of PPO stages to derive the statistics and then assign them to the objects' Object Variables. It may be that this could all be done in some other, simpler way, but several other ways that were tried did not succeed. The procedure output has been tested against...
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Tillmann, Barbara; McAdams, Stephen
2004-01-01
The present study investigated the influence of acoustical characteristics on the implicit learning of statistical regularities (transition probabilities) in sequences of musical timbres. The sequences were constructed in such a way that the acoustical dissimilarities between timbres potentially created segmentations that either supported (S1) or…
Statistical image reconstruction for low-dose CT using nonlocal means-based regularization.
Zhang, Hao; Ma, Jianhua; Wang, Jing; Liu, Yan; Lu, Hongbing; Liang, Zhengrong
2014-09-01
Low-dose computed tomography (CT) imaging without sacrifice of clinical tasks is desirable due to the growing concerns about excessive radiation exposure to the patients. One common strategy to achieve low-dose CT imaging is to lower the milliampere-second (mAs) setting in data scanning protocol. However, the reconstructed CT images by the conventional filtered back-projection (FBP) method from the low-mAs acquisitions may be severely degraded due to the excessive noise. Statistical image reconstruction (SIR) methods have shown potentials to significantly improve the reconstructed image quality from the low-mAs acquisitions, wherein the regularization plays a critical role and an established family of regularizations is based on the Markov random field (MRF) model. Inspired by the success of nonlocal means (NLM) in image processing applications, in this work, we propose to explore the NLM-based regularization for SIR to reconstruct low-dose CT images from low-mAs acquisitions. Experimental results with both digital and physical phantoms consistently demonstrated that SIR with the NLM-based regularization can achieve more gains than SIR with the well-known Gaussian MRF regularization or the generalized Gaussian MRF regularization and the conventional FBP method, in terms of image noise reduction and resolution preservation.
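The core of an NLM-based regularization is the patch-similarity weight w(i, j) between pixels; the penalty then couples pixels with similar neighbourhoods. A 1D toy version of the weight computation is sketched below; the full SIR objective, the CT system model, and the parameter choices (`patch`, `search`, `h`) are not reproduced and are assumptions here.

```python
import numpy as np

def nlm_weights_1d(x, i, patch=3, search=10, h=0.5):
    """Nonlocal-means weights: similarity of the patch around pixel i to
    patches around candidate pixels j in a search window; in NLM-based
    regularization these weights enter a penalty sum w_ij * (x_i - x_j)^2."""
    p = patch // 2
    pi = x[i - p:i + p + 1]
    js = range(max(p, i - search), min(len(x) - p, i + search + 1))
    w = np.array([np.exp(-np.sum((x[j - p:j + p + 1] - pi) ** 2) / h ** 2)
                  for j in js])
    return np.fromiter(js, int), w / w.sum()

rng = np.random.default_rng(7)
clean = np.repeat([0.0, 1.0, 0.0], 50)          # piecewise-constant "image"
noisy = clean + 0.1 * rng.standard_normal(150)
js, w = nlm_weights_1d(noisy, 75)
estimate = float(w @ noisy[js])
# The weighted average pulls pixel 75 toward pixels with similar
# neighbourhoods, suppressing noise while respecting edges.
```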
Stability Analysis for Regularized Least Squares Regression
Rudin, Cynthia
2005-01-01
We discuss stability for a class of learning algorithms with respect to noisy labels. The algorithms we consider are for regression, and they involve the minimization of regularized risk functionals, such as L(f) := 1/N sum_i (f(x_i)-y_i)^2+ lambda ||f||_H^2. We shall call the algorithm `stable' if, when y_i is a noisy version of f*(x_i) for some function f* in H, the output of the algorithm converges to f* as the regularization term and noise simultaneously vanish. We consider two flavors of...
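The regularized risk functional quoted in the abstract has a closed-form minimizer in the linear case, which makes the stability notion easy to probe numerically: shrink the noise and the regularization together and watch the estimate approach the true function. The linear model, sample sizes, and (noise, lambda) schedule below are assumptions for illustration.

```python
import numpy as np

def ridge(X, y, lam):
    """Minimizer of (1/N) * sum_i (w @ x_i - y_i)^2 + lam * ||w||^2
    for linear f(x) = w @ x, via the normal equations."""
    N, d = X.shape
    return np.linalg.solve(X.T @ X / N + lam * np.eye(d), X.T @ y / N)

rng = np.random.default_rng(8)
w_true = np.array([2.0, -1.0])
X = rng.standard_normal((2000, 2))
for noise, lam in [(1.0, 1e-1), (0.1, 1e-2), (0.01, 1e-3)]:
    y = X @ w_true + noise * rng.standard_normal(2000)
    w = ridge(X, y, lam)
    # As noise and regularization vanish together, the estimate converges
    # to the true function: the stability notion discussed in the paper.
    print(lam, np.linalg.norm(w - w_true))
```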
Material analysis on engineering statistics
Energy Technology Data Exchange (ETDEWEB)
Lee, Seung Hun
2008-03-15
This book is about material analysis in engineering statistics using Minitab, and includes technical statistics and the seven tools of QC, probability distributions, estimation and testing, regression analysis, time series analysis, control charts, process capability analysis, measurement system analysis, sampling inspection, design of experiments, response surface analysis, compound experiments, the Taguchi method, and nonparametric statistics. It is suitable for both university and industry use, because it presents theory first and then Minitab-based analysis for Six Sigma BB and MBB.
Directory of Open Access Journals (Sweden)
Shkvarko Yuriy
2006-01-01
Full Text Available We address a new approach to solve the ill-posed nonlinear inverse problem of high-resolution numerical reconstruction of the spatial spectrum pattern (SSP of the backscattered wavefield sources distributed over the remotely sensed scene. An array or synthesized array radar (SAR that employs digital data signal processing is considered. By exploiting the idea of combining the statistical minimum risk estimation paradigm with numerical descriptive regularization techniques, we address a new fused statistical descriptive regularization (SDR strategy for enhanced radar imaging. Pursuing such an approach, we establish a family of the SDR-related SSP estimators, that encompass a manifold of existing beamforming techniques ranging from traditional matched filter to robust and adaptive spatial filtering, and minimum variance methods.
Benvenuto, Federico
2012-01-01
In this paper we propose a new statistical stopping rule for constrained maximum likelihood iterative algorithms applied to ill-posed inverse problems. To this aim we extend the definition of Tikhonov regularization in a statistical framework and prove that the application of the proposed stopping rule to the Iterative Space Reconstruction Algorithm (ISRA) in the Gaussian case and Expectation Maximization (EM) in the Poisson case leads to well-defined regularization methods according to the given definition. We also prove that, if an inverse problem is genuinely ill-posed in the sense of Tikhonov, the same definition is not satisfied when ISRA and EM are optimized by classical stopping rules such as Morozov's discrepancy principle, Pearson's test and the Poisson discrepancy principle. The stopping rule is illustrated in the case of image reconstruction from data recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI). First, by using a simulated image consisting of structures analogous to those ...
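For context, the classical baseline the paper argues against can be sketched in a few lines: an iterative scheme whose iteration count acts as the regularization parameter, stopped by Morozov's discrepancy principle ||A x - y|| <= delta. The sketch uses Landweber iteration on a toy ill-conditioned operator rather than ISRA or EM, and the operator, noise level, and iteration budget are all assumptions.

```python
import numpy as np

def landweber(A, y, delta, max_iter=10_000):
    """Landweber iteration x_{k+1} = x_k + omega * A^T (y - A x_k),
    stopped early by Morozov's discrepancy principle ||A x - y|| <= delta.
    The number of iterations plays the role of the regularization parameter."""
    omega = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for k in range(max_iter):
        r = y - A @ x
        if np.linalg.norm(r) <= delta:        # discrepancy reached: stop
            break
        x = x + omega * (A.T @ r)
    return x, k

rng = np.random.default_rng(9)
# Toy forward operator with decaying singular values (mildly ill-posed).
U, _ = np.linalg.qr(rng.standard_normal((40, 40)))
V, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = U[:, :20] @ np.diag(1.0 / np.arange(1, 21)) @ V.T
x_true = rng.standard_normal(20)
noise = 1e-3 * rng.standard_normal(40)
y = A @ x_true + noise
x_hat, k = landweber(A, y, delta=np.linalg.norm(noise))
# Iterating past the discrepancy level would start fitting the noise.
```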
Tuan, P H; Yu, Y T; Chiang, P Y; Liang, H C; Huang, K F; Chen, Y F
2012-02-01
We thoroughly analyze the level statistics and eigenfunctions in concentric as well as nonconcentric square torus billiards. We confirm the characteristics of quantum and classical correspondence and the existence of scarred and superscarred modes in concentric square torus billiards. Furthermore, we not only verify that the transition from regular to chaotic behaviors can be manifested in nonconcentric square torus billiards, but also develop an analytical distribution that excellently fits the numerical level statistics. Finally, we intriguingly observe that numerous eigenstates commonly exhibit wave patterns corresponding to an ensemble of classical diamond trajectories when the effective wavelengths are considerably shorter than the size of the internal hole.
Basic analysis of regularized series and products
Jorgenson, Jay A
1993-01-01
Analytic number theory and part of the spectral theory of operators (differential, pseudo-differential, elliptic, etc.) are being merged under a more general analytic theory of regularized products of certain sequences satisfying a few basic axioms. The most basic examples consist of the sequence of natural numbers, the sequence of zeros with positive imaginary part of the Riemann zeta function, and the sequence of eigenvalues, say of a positive Laplacian on a compact or certain cases of non-compact manifolds. The resulting theory is applicable to ergodic theory and dynamical systems; to the zeta and L-functions of number theory or representation theory and modular forms; to Selberg-like zeta functions; and to the theory of regularized determinants familiar in physics and other parts of mathematics. Aside from presenting a systematic account of widely scattered results, the theory also provides new results. One part of the theory deals with complex analytic properties, and another part deals with Fourier analys...
Dang, H.; Stayman, J. W.; Xu, J.; Sisniega, A.; Zbijewski, W.; Wang, X.; Foos, D. H.; Aygun, N.; Koliatsos, V. E.; Siewerdsen, J. H.
2016-03-01
Intracranial hemorrhage (ICH) is associated with pathologies such as hemorrhagic stroke and traumatic brain injury. Multi-detector CT is the current front-line imaging modality for detecting ICH (fresh blood contrast 40-80 HU, down to 1 mm). Flat-panel detector (FPD) cone-beam CT (CBCT) offers a potential alternative with a smaller scanner footprint, greater portability, and lower cost, potentially well suited to deployment at the point of care outside standard diagnostic radiology and emergency room settings. Previous studies have suggested reliable detection of ICH down to 3 mm in CBCT using high-fidelity artifact correction and penalized weighted least-squares (PWLS) image reconstruction with a post-artifact-correction noise model. However, ICH reconstructed by traditional image regularization exhibits nonuniform spatial resolution and noise due to interaction between the statistical weights and regularization, which potentially degrades the detectability of ICH. In this work, we propose three regularization methods designed to overcome these challenges. The first two compute spatially varying certainty for uniform spatial resolution and noise, respectively. The third computes spatially varying regularization strength to achieve uniform "detectability," combining both spatial resolution and noise in a manner analogous to a delta-function detection task. Experiments were conducted on a CBCT test-bench, and image quality was evaluated for simulated ICH in different regions of an anthropomorphic head. The first two methods improved the uniformity in spatial resolution and noise compared to traditional regularization. The third exhibited the highest uniformity in detectability among all methods and best overall image quality. The proposed regularization provides a valuable means to achieve uniform image quality in CBCT of ICH and is being incorporated in a CBCT prototype for ICH imaging.
Zhang, Hao; Ma, Jianhua; Lu, Hongbing; Liang, Zhengrong
2014-01-01
Statistical image reconstruction (SIR) methods have shown potential to substantially improve the image quality of low-dose X-ray computed tomography (CT) as compared to the conventional filtered back-projection (FBP) method for various clinical tasks. According to the maximum a posteriori (MAP) estimation, the SIR methods can be typically formulated by an objective function consisting of two terms: (1) a data-fidelity (or equivalently, data-fitting or data-mismatch) term modeling the statistics of projection measurements, and (2) a regularization (or equivalently, prior or penalty) term reflecting prior knowledge or expectation on the characteristics of the image to be reconstructed. Existing SIR methods for low-dose CT can be divided into two groups: (1) those that use calibrated transmitted photon counts (before log-transform) with the penalized maximum likelihood (pML) criterion, and (2) those that use calibrated line-integrals (after log-transform) with the penalized weighted least-squares (PWLS) criterion. Accurate s...
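The PWLS criterion described above can be sketched with toy matrices; the system `A`, the weights `w`, and the first-difference roughness penalty below are illustrative assumptions, not a CT system model:

```python
import numpy as np

def pwls_quadratic(A, y, w, beta):
    """Penalized weighted least-squares with a quadratic roughness
    penalty R(x) = ||D x||^2, D being a first-difference matrix.
    Solves  argmin_x (y - A x)^T W (y - A x) + beta * ||D x||^2,
    which has the closed form (A^T W A + beta D^T D)^{-1} A^T W y."""
    n = A.shape[1]
    W = np.diag(w)                             # statistical weights
    D = (np.eye(n) - np.eye(n, k=1))[:-1]      # first differences
    H = A.T @ W @ A + beta * (D.T @ D)
    return np.linalg.solve(H, A.T @ W @ y)

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 6))
x_true = np.linspace(0.0, 1.0, 6)
w = np.full(20, 2.0)                           # inverse-variance weights
y = A @ x_true                                 # noiseless data
x0 = pwls_quadratic(A, y, w, beta=0.0)         # no penalty: plain WLS
xr = pwls_quadratic(A, y, w, beta=10.0)        # penalized: smoother
```

With noiseless data the unpenalized solve recovers `x_true` exactly, while increasing `beta` trades data fit for smoothness, which is the interaction with the statistical weights that the entry above discusses.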
Trace Formulae and Spectral Statistics for Discrete Laplacians on Regular Graphs (II)
Oren, Idan
2010-01-01
Following the derivation of the trace formulae in the first paper in this series, we establish here a connection between the spectral statistics of random regular graphs and the predictions of Random Matrix Theory (RMT). This follows from the known Poisson distribution of cycle counts in regular graphs, in the limit that the cycle periods are kept constant and the number of vertices increases indefinitely. The result is analogous to the so-called "diagonal approximation" in Quantum Chaos. We also show that by assuming that the spectral correlations are given by RMT to all orders, we can compute the leading deviations from the Poisson distribution for cycle counts. We provide numerical evidence which supports this conjecture.
Qi, Peng; Hu, Xiaolin
2014-04-01
It is well known that there exist nonlinear statistical regularities in natural images. Existing approaches for capturing such regularities always model the image intensities by assuming a parameterized distribution for the intensities and learning the parameters. In this letter, we propose to model the outer product of image intensities by assuming a Gaussian distribution for it. A two-layer structure is presented, where the first layer is nonlinear and the second layer is linear. Trained on natural images, the first-layer bases resemble the receptive fields of simple cells in the primary visual cortex (V1), while the second-layer units exhibit some properties of the complex cells in V1, including phase invariance and the masking effect. The model can be seen as an approximation of the covariance model proposed in Karklin and Lewicki (2009) but has more robust and efficient learning algorithms.
Statistical Analysis and validation
Hoefsloot, H.C.J.; Horvatovich, P.; Bischoff, R.
2013-01-01
In this chapter guidelines are given for the selection of a few biomarker candidates from a large number of compounds with a relatively low number of samples. The main concepts concerning the statistical validation of the search for biomarkers are discussed. These complicated methods and concepts are
Current Redistribution in Resistor Networks: Fat-Tail Statistics in Regular and Small-World Networks
Lehmann, Jörg
2016-01-01
The redistribution of electrical currents in resistor networks after single-bond failures is analyzed in terms of current-redistribution factors that are shown to depend only on the topology of the network and on the values of the bond resistances. We investigate the properties of these current-redistribution factors for regular network topologies (e.g. $d$-dimensional hypercubic lattices) as well as for small-world networks. In particular, we find that the statistics of the current redistribution factors exhibits a fat-tail behavior, which reflects the long-range nature of the current redistribution as determined by Kirchhoff's circuit laws.
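A minimal sketch of the current-redistribution computation via Kirchhoff's laws, on a 4-node ring with unit conductances. The example network, injection pattern, and redistribution-factor definition below are our assumptions for illustration, not taken from the paper:

```python
import numpy as np

def bond_currents(edges, g, inject, extract, n):
    """Node potentials and bond currents for a resistor network via the
    graph Laplacian (Kirchhoff's laws); g[e] is the conductance of edge e,
    and a unit current flows from node `inject` to node `extract`."""
    L = np.zeros((n, n))
    for (i, j), gij in zip(edges, g):
        L[i, i] += gij; L[j, j] += gij
        L[i, j] -= gij; L[j, i] -= gij
    b = np.zeros(n); b[inject] = 1.0; b[extract] = -1.0
    v = np.linalg.pinv(L) @ b                    # potentials (unit current)
    return np.array([gij * (v[i] - v[j]) for (i, j), gij in zip(edges, g)])

# 4-node ring; fail edge (0, 1) and watch the surviving path pick up load.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
g = [1.0, 1.0, 1.0, 1.0]
I_before = bond_currents(edges, g, inject=0, extract=2, n=4)
g_failed = [0.0] + g[1:]                         # remove bond (0, 1)
I_after = bond_currents(edges, g_failed, inject=0, extract=2, n=4)
# Change in current on each bond per unit current lost on the failed bond.
f = (I_after - I_before) / I_before[0]
```

On the ring the unit current initially splits evenly over the two paths; after the failure the surviving path 0-3-2 carries the full current, which is the redistribution the factors quantify.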
Inverse problems with Poisson data: statistical regularization theory, applications and algorithms
Hohage, Thorsten; Werner, Frank
2016-09-01
Inverse problems with Poisson data arise in many photonic imaging modalities in medicine, engineering and astronomy. The design of regularization methods and estimators for such problems has been studied intensively over the last two decades. In this review we give an overview of statistical regularization theory for such problems, the most important applications, and the most widely used algorithms. The focus is on variational regularization methods in the form of penalized maximum likelihood estimators, which can be analyzed in a general setup. Complementing a number of recent convergence rate results we will establish consistency results. Moreover, we discuss estimators based on a wavelet-vaguelette decomposition of the (necessarily linear) forward operator. As the most prominent applications we briefly introduce positron emission tomography, inverse problems in fluorescence microscopy, and phase retrieval problems. The computation of a penalized maximum likelihood estimator involves the solution of a (typically convex) minimization problem. We also review several efficient algorithms which have been proposed for such problems over the last five years.
Energy Technology Data Exchange (ETDEWEB)
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques.
Regularized Multiple-Set Canonical Correlation Analysis
Takane, Yoshio; Hwang, Heungsun; Abdi, Herve
2008-01-01
Multiple-set canonical correlation analysis (Generalized CANO or GCANO for short) is an important technique because it subsumes a number of interesting multivariate data analysis techniques as special cases. More recently, it has also been recognized as an important technique for integrating information from multiple sources. In this paper, we…
DEFF Research Database (Denmark)
Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi
2015-01-01
This is the analysis plan for the multicentre randomised controlled study examining the effect of training and exercises in chronic neck pain patients, conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected.
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Statistical Testing of Segment Homogeneity in Classification of Piecewise–Regular Objects
Directory of Open Access Journals (Sweden)
Savchenko Andrey V.
2015-12-01
Full Text Available The paper is focused on the problem of multi-class classification of composite (piecewise-regular) objects (e.g., speech signals, complex images, etc.). We propose a mathematical model of composite object representation as a sequence of independent segments. Each segment is represented as a random sample of independent identically distributed feature vectors. Based on this model and a statistical approach, we reduce the task to a problem of composite hypothesis testing of segment homogeneity. Several nearest-neighbor criteria are implemented, and for some of them the well-known special cases (e.g., the Kullback-Leibler minimum information discrimination principle, the probabilistic neural network) are highlighted. It is experimentally shown that the proposed approach improves the accuracy when compared with contemporary classifiers.
Havens, Timothy C.; Cummings, Ian; Botts, Jonathan; Summers, Jason E.
2017-05-01
The linear ordered statistic (LOS) is a parameterized ordered statistic (OS) that is a weighted average of a rank-ordered sample. LOS operators are useful generalizations of aggregation as they can represent any linear aggregation, from minimum to maximum, including conventional aggregations, such as mean and median. In the fuzzy logic field, these aggregations are called ordered weighted averages (OWAs). Here, we present a method for learning LOS operators from training data, viz., data for which you know the output of the desired LOS. We then extend the learning process with regularization, such that a lower complexity or sparse LOS can be learned. Hence, we discuss what 'lower complexity' means in this context and how to represent that in the optimization procedure. Finally, we apply our learning methods to the well-known constant-false-alarm-rate (CFAR) detection problem, specifically for the case of background levels modeled by long-tailed distributions, such as the K-distribution. These backgrounds arise in several pertinent imaging problems, including the modeling of clutter in synthetic aperture radar and sonar (SAR and SAS) and in wireless communications.
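A minimal sketch of an LOS/OWA operator as described above: a weighted average of the rank-ordered sample. The weight vectors below are the standard special cases (maximum, minimum, mean, median), not weights learned from training data:

```python
import numpy as np

def los(x, w):
    """Linear ordered statistic (OWA): weighted average of the sorted
    sample, largest value first. Weights are assumed to sum to 1."""
    x_sorted = np.sort(np.asarray(x, dtype=float))[::-1]
    return float(np.dot(w, x_sorted))

x = [3.0, 9.0, 1.0, 5.0]
print(los(x, [1.0, 0.0, 0.0, 0.0]))      # maximum: 9.0
print(los(x, [0.0, 0.0, 0.0, 1.0]))      # minimum: 1.0
print(los(x, [0.25, 0.25, 0.25, 0.25]))  # mean: 4.5
print(los(x, [0.0, 0.5, 0.5, 0.0]))      # median (even n): 4.0
```

Learning an LOS, as in the paper, amounts to fitting the weight vector `w` to input/output pairs, with regularization encouraging sparse or low-complexity weights.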
Godoy-Lorite, Antonia; Guimerà, Roger; Sales-Pardo, Marta
2016-01-01
In social networks, individuals constantly drop ties and replace them by new ones in a highly unpredictable fashion. This highly dynamical nature of social ties has important implications for processes such as the spread of information or of epidemics. Several studies have demonstrated the influence of a number of factors on the intricate microscopic process of tie replacement, but the macroscopic long-term effects of such changes remain largely unexplored. Here we investigate whether, despite the inherent randomness at the microscopic level, there are macroscopic statistical regularities in the long-term evolution of social networks. In particular, we analyze the email network of a large organization with over 1,000 individuals throughout four consecutive years. We find that, although the evolution of individual ties is highly unpredictable, the macro-evolution of social communication networks follows well-defined statistical patterns, characterized by exponentially decaying log-variations of the weight of social ties and of individuals' social strength. At the same time, we find that individuals have social signatures and communication strategies that are remarkably stable over the scale of several years.
Statistical Analysis with Missing Data
Little, Roderick J A
2002-01-01
Praise for the First Edition of Statistical Analysis with Missing Data: "An important contribution to the applied statistics literature.... I give the book high marks for unifying and making accessible much of the past and current work in this important area." - William E. Strawderman, Rutgers University. "This book...provide[s] interesting real-life examples, stimulating end-of-chapter exercises, and up-to-date references. It should be on every applied statistician's bookshelf." - The Statistician. "The book should be studied in the statistical methods d
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
Analysis of Logic Programs Using Regular Tree Languages
DEFF Research Database (Denmark)
Gallagher, John Patrick
2012-01-01
The field of finite tree automata provides fundamental notations and tools for reasoning about sets of terms called regular or recognizable tree languages. We consider two kinds of analysis using regular tree languages, applied to logic programs. The first approach is to try to discover automatically a tree automaton from a logic program, approximating its minimal Herbrand model. In this case the input for the analysis is a program, and the output is a tree automaton. The second approach is to expose or check properties of the program that can be expressed by a given tree automaton. The input to the analysis is a program and a tree automaton, and the output is an abstract model of the program. These two contrasting abstract interpretations can be used in a wide range of analysis and verification problems.
Common pitfalls in statistical analysis: Clinical versus statistical significance
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In clinical research, study results which are statistically significant are often interpreted as being clinically important. While statistical significance indicates the reliability of the study results, clinical significance reflects its impact on clinical practice. The third article in this series exploring pitfalls in statistical analysis clarifies the importance of differentiating between statistical significance and clinical significance. PMID:26229754
Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis
Mahoney, Michael W
2012-01-01
Database theory and database practice are typically the domain of computer scientists who adopt what may be termed an algorithmic perspective on their data. This perspective is very different from the more statistical perspective adopted by statisticians, scientific computing researchers, machine learners, and others who work on what may be broadly termed statistical data analysis. In this article, I will address fundamental aspects of this algorithmic-statistical disconnect, with an eye to bridging the gap between these two very different approaches. A concept that lies at the heart of this disconnect is that of statistical regularization, a notion that has to do with how robust the output of an algorithm is to the noise properties of the input data. Although it is nearly completely absent from computer science, which historically has taken the input data as given and modeled algorithms discretely, regularization in one form or another is central to nearly every application domain that applies algorithms to noisy data. B...
Statistical methods for bioimpedance analysis
Directory of Open Access Journals (Sweden)
Christian Tronstad
2014-04-01
Full Text Available This paper gives a basic overview of relevant statistical methods for the analysis of bioimpedance measurements, with an aim to answer questions such as: How do I begin with planning an experiment? How many measurements do I need to take? How do I deal with large amounts of frequency sweep data? Which statistical test should I use, and how do I validate my results? Beginning with the hypothesis and the research design, the methodological framework for making inferences based on measurements and statistical analysis is explained. This is followed by a brief discussion on correlated measurements and data reduction before an overview is given of statistical methods for comparison of groups, factor analysis, association, regression and prediction, explained in the context of bioimpedance research. The last chapter is dedicated to the validation of a new method by different measures of performance. A flowchart is presented for selection of statistical method, and a table is given for an overview of the most important terms of performance when evaluating new measurement technology.
Statistical regularities in the rank-citation profile of individual scientists
Petersen, Alexander; Stanley, H. Eugene; Succi, Sauro
2011-03-01
Citation counts and paper tallies are ubiquitous in the achievement ratings of individual scientists. As a result, there have been many recent studies which propose measures for scientific impact (e.g. the h-index) and the distribution of impact measures among scientists. However, being just a single number, the h-index cannot account for the full impact information contained in an author's set of publications. Alternative "single-number" indices are also frequently proposed, but they too suffer from the shortfall of not being comprehensive. In this talk I will discuss an alternative approach, which is to analyze the fundamental properties of the entire rank-citation profile (from which all single-value indices are derived). Using the complete publication careers of 200 highly-cited physicists and 100 assistant professors, I will demonstrate remarkable statistical regularity in the functional form of the rank-citation profile c_i(r) for each physicist i = 1...300. We find that c_i(r) can be approximated by a discrete generalized beta distribution over the entire range of ranks r, which allows for the characterization and comparison of c_i(r) using a common framework. Since two scientists can have equivalent h_i values while having different c_i(r), our results demonstrate the utility of a scaling parameter, β_i, in conjunction with h_i, to quantify a scientist's publication impact.
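The discrete generalized beta form mentioned above can be sketched as follows. The functional form c(r) = A r^(-β) (N+1-r)^γ is the one commonly used for rank-ordering distributions; the parameter values and the h-index extraction below are illustrative assumptions, not fitted values from the study:

```python
import numpy as np

def dgbd(r, A, beta, gamma, N):
    """Discrete generalized beta distribution (DGBD) for a
    rank-citation profile: c(r) = A * r**(-beta) * (N + 1 - r)**gamma,
    where r = 1..N ranks the author's papers by citation count."""
    r = np.asarray(r, dtype=float)
    return A * r ** (-beta) * (N + 1.0 - r) ** gamma

N = 100                                # number of papers (toy value)
r = np.arange(1, N + 1)
c = dgbd(r, A=50.0, beta=0.6, gamma=0.4, N=N)
# h-index of this profile: the largest rank r with c(r) >= r.
h = int(np.sum(c >= r))
```

Because the profile is monotone decreasing in r, single-number indices such as h fall out of it directly, while the parameters (A, β, γ) retain shape information that h alone discards.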
Kutnink, Timothy; Santrach, Amelia; Hockett, Sarah; Barcus, Scott; Petridis, Athanasios
2016-09-01
The time-dependent electromagnetically self-coupled Dirac equation is solved numerically by means of the staggered-leap-frog algorithm with reflecting boundary conditions. The stability region of the method versus the interaction strength and the spatial-grid size over time-step ratio is established. The expectation values of several dynamic operators are then evaluated as functions of time. These include the fermion and electromagnetic energies and the fermion dynamic mass, as the self-interacting spinors are no longer mass-eigenfunctions. There is a characteristic, non-exponential, oscillatory dependence leading to asymptotic constants of these expectation values. In the case of the fermion mass this amounts to renormalization. The dependence of the expectation values on the spatial-grid size is evaluated in detail. Statistical regularization, employing a canonical ensemble whose temperature is the inverse of the grid size, is used to remove the grid-size dependence and produce a finite result in the continuum limit.
Sparks, Rachel; Madabhushi, Anant
2012-03-01
Gleason patterns of prostate cancer histopathology, characterized primarily by morphological and architectural attributes of histological structures (glands and nuclei), have been found to be highly correlated with disease aggressiveness and patient outcome. Gleason patterns 4 and 5 are highly correlated with more aggressive disease and poorer patient outcome, while Gleason patterns 1-3 tend to reflect more favorable patient outcome. Because Gleason grading is done manually by a pathologist visually examining glass (or digital) slides, subtle morphologic and architectural differences of histological attributes may result in grading errors and hence cause high inter-observer variability. Recently some researchers have proposed computerized decision support systems to automatically grade Gleason patterns by using features pertaining to nuclear architecture, gland morphology, as well as tissue texture. Automated characterization of gland morphology has been shown to distinguish between intermediate Gleason patterns 3 and 4 with high accuracy. Manifold learning (ML) schemes attempt to generate a low dimensional manifold representation of a higher dimensional feature space while simultaneously preserving nonlinear relationships between object instances. Classification can then be performed in the low dimensional space with high accuracy. However ML is sensitive to the samples contained in the dataset; changes in the dataset may alter the manifold structure. In this paper we present a manifold regularization technique to constrain the low dimensional manifold to a specific range of possible manifold shapes, the range being determined via a statistical shape model of manifolds (SSMM). In this work we demonstrate applications of the SSMM in (1) identifying samples on the manifold which contain noise, defined as those samples which deviate from the SSMM, and (2) accurate out-of-sample extrapolation (OSE) of newly acquired samples onto a manifold constrained by the SSMM. We
Statistical Analysis for Performance Comparison
Directory of Open Access Journals (Sweden)
Priyanka Dutta
2013-07-01
Full Text Available Performance responsiveness and scalability is a make-or-break quality for software. Nearly everyone runs into performance problems at one time or another. This paper discusses performance issues faced during the Pre Examination Process Automation System (PEPAS) implemented in Java technology. It describes the challenges faced during the life cycle of the project and the mitigation actions performed. It compares three Java technologies and shows how improvements are made through statistical analysis in the response time of the application. The paper concludes with result analysis.
Tikhonov regularization-based operational transfer path analysis
Cheng, Wei; Lu, Yingying; Zhang, Zhousuo
2016-06-01
To overcome ill-posed problems in operational transfer path analysis (OTPA) and improve the stability of solutions, this paper proposes a novel OTPA based on Tikhonov regularization, which considers both the fitting degree and the stability of solutions. Firstly, the fundamental theory of Tikhonov regularization-based OTPA is presented, and comparative studies are provided to validate the effectiveness on ill-posed problems. Secondly, transfer path analysis and source contribution evaluations for numerical case studies on spherical radiating acoustical sources are comparatively studied. Finally, transfer path analysis and source contribution evaluations for experimental case studies on a test bed with thin shell structures are provided. This study provides more accurate transfer path analysis for mechanical systems, which can benefit vibration reduction by structural path optimization. Furthermore, with accurate evaluation of source contributions, vibration monitoring and control by actively controlling vibration sources can be effectively carried out.
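Tikhonov regularization as used here amounts to the familiar damped least-squares solve. Below is a minimal sketch on a deliberately ill-conditioned toy system; the matrices and the value of the regularization parameter are assumptions for illustration, not an OTPA measurement model:

```python
import numpy as np

def tikhonov(A, y, lam):
    """Tikhonov-regularized least squares:
    argmin_x ||A x - y||^2 + lam * ||x||^2
    = (A^T A + lam I)^{-1} A^T y."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

# Ill-conditioned toy system (nearly collinear columns).
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
y = np.array([2.0, 2.0, 2.0])
x_ls = np.linalg.lstsq(A, y, rcond=None)[0]    # unregularized solution
x_tk = tikhonov(A, y, lam=1e-3)                # regularized, smaller norm
```

The regularized solution trades a small increase in residual for a much better-behaved (smaller-norm) solution, which is exactly the stability/fitting-degree trade-off the entry describes.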
Iterated Process Analysis over Lattice-Valued Regular Expressions
DEFF Research Database (Denmark)
Midtgaard, Jan; Nielson, Flemming; Nielson, Hanne Riis
2016-01-01
We present an iterated approach to statically analyze programs of two processes communicating by message passing. Our analysis operates over a domain of lattice-valued regular expressions, and computes increasingly better approximations of each process's communication behavior. Overall the work extends traditional semantics-based program analysis techniques to automatically reason about message passing in a manner that can simultaneously analyze both values of variables as well as message order, message content, and their interdependencies.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Tools for Basic Statistical Analysis
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curve-fit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
Facial Affect Recognition Using Regularized Discriminant Analysis-Based Algorithms
Directory of Open Access Journals (Sweden)
Cheng-Yuan Shih
2010-01-01
Full Text Available This paper presents a novel and effective method for facial expression recognition including happiness, disgust, fear, anger, sadness, surprise, and the neutral state. The proposed method utilizes a regularized discriminant analysis-based boosting algorithm (RDAB) with effective Gabor features to recognize the facial expressions. An entropy criterion is applied to select the effective Gabor features, a subset of informative and nonredundant Gabor features. The proposed RDAB algorithm uses RDA as a learner in the boosting algorithm. RDA combines the strengths of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). It solves the small sample size and ill-posed problems suffered by QDA and LDA through a regularization technique. Additionally, this study uses the particle swarm optimization (PSO) algorithm to estimate optimal parameters in RDA. Experimental results demonstrate that our approach can accurately and robustly recognize facial expressions.
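A minimal sketch of the covariance regularization at the heart of RDA (a Friedman-style blend between class and pooled covariances, then shrinkage toward a scaled identity). The blending parameters below are illustrative; in the paper they are estimated by PSO:

```python
import numpy as np

def rda_covariance(S_class, S_pooled, lam, gamma):
    """Regularized discriminant analysis covariance estimate:
    blend the class covariance with the pooled one (QDA <-> LDA as
    lam goes 0 -> 1), then shrink toward a scaled identity to fix
    the ill-conditioning caused by small sample sizes."""
    S = (1.0 - lam) * S_class + lam * S_pooled
    p = S.shape[0]
    return (1.0 - gamma) * S + gamma * (np.trace(S) / p) * np.eye(p)

# A singular class covariance (small-sample case) made invertible.
S_class = np.array([[1.0, 1.0], [1.0, 1.0]])   # rank 1: plain QDA fails
S_pooled = np.eye(2)
S_reg = rda_covariance(S_class, S_pooled, lam=0.5, gamma=0.1)
```

The regularized matrix is positive definite even though the raw class covariance is singular, which is how RDA sidesteps the small-sample-size problem the abstract mentions.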
Optimal analysis of structures by concepts of symmetry and regularity
Kaveh, Ali
2013-01-01
Optimal analysis is defined as an analysis that creates and uses sparse, well-structured and well-conditioned matrices. The focus is on efficient methods for eigensolution of matrices involved in static, dynamic and stability analyses of symmetric and regular structures, or those general structures containing such components. Powerful tools are also developed for configuration processing, which is an important issue in the analysis and design of space structures and finite element models. Different mathematical concepts are combined to make the optimal analysis of structures feasible. Canonical forms from matrix algebra, product graphs from graph theory and symmetry groups from group theory are some of the concepts involved in the variety of efficient methods and algorithms presented. The algorithms elucidated in this book enable analysts to handle large-scale structural systems by lowering their computational cost, thus fulfilling the requirement for faster analysis and design of future complex systems. The ...
Local Behavior of Sparse Analysis Regularization: Applications to Risk Estimation
Vaiter, Samuel; Peyré, Gabriel; Dossal, Charles; Fadili, Jalal
2012-01-01
This paper studies the recovery of an unknown signal $x_0$ from low dimensional noisy observations $y = \Phi x_0 + w$, where $\Phi$ is an ill-posed linear operator and $w$ accounts for some noise. We focus our attention on sparse analysis regularization. The recovery is performed by minimizing the sum of a quadratic data fidelity term and the $\ell_1$-norm of the correlations between the sought-after signal and atoms in a given (generally overcomplete) dictionary. The $\ell_1$ prior is weighted by a regularization parameter $\lambda > 0$ that accounts for the noise level. In this paper, we prove that minimizers of this problem are piecewise-affine functions of the observations $y$ and the regularization parameter $\lambda$. As a byproduct, we exploit these properties to get an objectively guided choice of $\lambda$. More precisely, we propose an extension of the Generalized Stein Unbiased Risk Estimator (GSURE) and show that it is an unbiased estimator of an appropriately defined risk. This encompasses special ca...
François, Clément; Schön, Daniele
2014-02-01
There is increasing evidence that humans and other nonhuman mammals are sensitive to the statistical structure of auditory input. Indeed, neural sensitivity to statistical regularities seems to be a fundamental biological property underlying auditory learning. In the case of speech, statistical regularities play a crucial role in the acquisition of several linguistic features, from phonotactic to more complex rules such as morphosyntactic rules. Interestingly, a similar sensitivity has been shown with non-speech streams: sequences of sounds changing in frequency or timbre can be segmented on the sole basis of conditional probabilities between adjacent sounds. We recently ran a set of cross-sectional and longitudinal experiments showing that merging music and speech information in song facilitates stream segmentation and, further, that musical practice enhances sensitivity to statistical regularities in speech at both neural and behavioral levels. Based on recent findings showing the involvement of a fronto-temporal network in speech segmentation, we defend the idea that enhanced auditory learning observed in musicians originates via at least three distinct pathways: enhanced low-level auditory processing, enhanced phono-articulatory mapping via the left Inferior Frontal Gyrus and Pre-Motor cortex and increased functional connectivity within the audio-motor network. Finally, we discuss how these data predict a beneficial use of music for optimizing speech acquisition in both normal and impaired populations.
Statistical Analysis of Protein Ensembles
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
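As a rough illustration of the barcode idea, the sketch below computes only the 0-dimensional (connected-component) part of a Vietoris-Rips barcode using a union-find structure over edges sorted by length; the paper's pipeline uses full computational-topology algorithms on real protein conformations, none of which is reproduced here.

```python
import itertools, math

def h0_barcode(points):
    """0-dimensional persistence barcode of a point cloud under the
    Vietoris-Rips filtration: every point is born at scale 0, and each
    merge of two connected components kills one bar; the death time is
    the length of the merging edge."""
    parent = list(range(len(points)))

    def find(i):                      # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in itertools.combinations(range(len(points)), 2))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                  # two components merge at scale d
            parent[ri] = rj
            deaths.append(d)
    return [(0.0, d) for d in deaths] + [(0.0, math.inf)]

# three collinear points: a tight pair plus an outlier -> one long finite bar
bars = h0_barcode([(0, 0), (1, 0), (10, 0)])
```

Two barcodes can then be compared by a distance on their bar multisets, which is the statistical comparison step the abstract describes.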
Statistical Analysis of Protein Ensembles
Directory of Open Access Journals (Sweden)
Gabriell Máté
2014-04-01
Full Text Available As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially since the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Interactive facades analysis and synthesis of semi-regular facades
AlHalawani, Sawsan
2013-05-01
Urban facades regularly contain interesting variations due to allowed deformations of repeated elements (e.g., windows in different open or close positions) posing challenges to state-of-the-art facade analysis algorithms. We propose a semi-automatic framework to recover both repetition patterns of the elements and their individual deformation parameters to produce a factored facade representation. Such a representation enables a range of applications including interactive facade images, improved multi-view stereo reconstruction, facade-level change detection, and novel image editing possibilities. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd.
Parametric statistical change point analysis
Chen, Jie
2000-01-01
This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models. Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several. The exposition is clear and systematic, with a great deal of introductory material included. Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature. Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models. Extensive examples throughout the text emphasize key concepts, and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches. A comprehensive bibliography and two indices complete the study.
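A minimal sketch of the single change-point idea for a mean shift in normal data: with a common known variance, the likelihood-ratio criterion reduces to picking the split that minimizes the pooled within-segment sum of squares. The data below are made up for illustration.

```python
import statistics

def change_point(x):
    """MLE of a single mean-shift change point for normal data with
    common variance: choose the split minimizing the pooled
    within-segment sum of squared errors."""
    best_k, best_sse = None, float("inf")
    for k in range(1, len(x)):
        left, right = x[:k], x[k:]
        sse = (sum((v - statistics.fmean(left)) ** 2 for v in left)
               + sum((v - statistics.fmean(right)) ** 2 for v in right))
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k          # index of the first observation after the change

# synthetic series: mean jumps from about 0 to about 5 at index 5
data = [0.1, -0.2, 0.0, 0.2, -0.1, 5.1, 4.9, 5.0, 5.2, 4.8]
k = change_point(data)
```

The gamma, exponential, and regression models covered in the book replace the SSE criterion with the corresponding model likelihood, but the argmax-over-splits structure is the same.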
Extended -Regular Sequence for Automated Analysis of Microarray Images
Directory of Open Access Journals (Sweden)
Jin Hee-Jeong
2006-01-01
Full Text Available Microarray study enables us to obtain hundreds of thousands of expressions of genes or genotypes at once, and it is an indispensable technology for genome research. The first step is the analysis of scanned microarray images. This is the most important procedure for obtaining biologically reliable data. Currently most microarray image processing systems require burdensome manual block/spot indexing work. Since the amount of experimental data is increasing very quickly, automated microarray image analysis software becomes important. In this paper, we propose two automated methods for analyzing microarray images. First, we propose the extended -regular sequence to index blocks and spots, which enables a novel automatic gridding procedure. Second, we provide a methodology, hierarchical metagrid alignment, to allow reliable and efficient batch processing for a set of microarray images. Experimental results show that the proposed methods are more reliable and convenient than the commercial tools.
Statistical Analysis of Tsunami Variability
Zolezzi, Francesca; Del Giudice, Tania; Traverso, Chiara; Valfrè, Giulio; Poggi, Pamela; Parker, Eric J.
2010-05-01
similar to that seen in ground motion attenuation correlations used for seismic hazard assessment. The second issue was intra-event variability. This refers to the differences in tsunami wave run-up along a section of coast during a single event. Intra-event variability was investigated directly by considering field observations. The tsunami events used in the statistical evaluation were selected on the basis of the completeness and reliability of the available data. Tsunamis considered for the analysis included the recent and well-surveyed tsunami of Boxing Day 2004 (Great Indian Ocean Tsunami), Java 2006, Okushiri 1993, Kocaeli 1999, Messina 1908, and a case study of several historic events in Hawaii. Basic statistical analysis was performed on the field observations from these tsunamis. For events with very wide survey regions, the run-up heights were grouped in order to maintain a homogeneous distance from the source. Where more than one survey was available for a given event, the original datasets were maintained separately to avoid combination of non-homogeneous data. The observed run-up measurements were used to evaluate the minimum, maximum, average, standard deviation, and coefficient of variation for each data set. The minimum coefficient of variation was 0.12, measured for the 2004 Boxing Day tsunami at Nias Island (7 data points), while the maximum was 0.98 for the Okushiri 1993 event (93 data points). The average coefficient of variation is of the order of 0.45.
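The per-dataset summary statistics described above can be sketched as follows; the run-up heights are hypothetical, not the surveyed values from the study.

```python
import statistics

def run_up_stats(heights):
    """Summary statistics used in the run-up analysis: minimum, maximum,
    mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.fmean(heights)
    sd = statistics.stdev(heights)        # sample (n-1) standard deviation
    return {"min": min(heights), "max": max(heights),
            "mean": mean, "sd": sd, "cv": sd / mean}

# hypothetical run-up measurements (metres) along one coastal segment
stats = run_up_stats([2.1, 2.8, 3.5, 2.4, 3.0, 2.6, 3.2])
```

The coefficient of variation is the natural dimensionless measure here because run-up magnitudes differ by an order of magnitude between events.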
Kostenko, Yuri T.; Shkvarko, Yuri V.
1994-06-01
The aim of this presentation is to address a new theoretical approach to the development of remote sensing imaging (RSI) nonlinear techniques that exploit the idea of fusing experiment design and statistical regularization theory-based methods for inverse problem solution, optimal/suboptimal in the mixed Bayesian-regularization setting. The basic purpose of such an information fusion-based methodology is twofold: namely, to design the appropriate system-oriented finite-dimensional model of the RSI experiment in terms of projection schemes for wavefield inversion problems, and to derive the two-stage estimation techniques that provide the optimal/suboptimal restoration of the power distribution in the environment from a limited number of wavefield measurements. We also discuss issues concerning the available control of some additional degrees of freedom while such an RSI experiment is conducted.
DEFF Research Database (Denmark)
Missov, Trifon I.; Schöley, Jonas
Missov and Finkelstein (2011) prove an Abelian and its corresponding Tauberian theorem regarding distributions for modeling unobserved heterogeneity in fixed-frailty mixture models. The main property of such distributions is the regular variation at zero of their densities. According to this criterion admissible distributions are, for example, the gamma, the beta, the truncated normal, the log-logistic and the Weibull, while distributions like the log-normal and the inverse Gaussian do not satisfy this condition. In this article we show that models with admissible frailty distributions and a Gompertz baseline provide a better fit to adult human mortality data than the corresponding models with non-admissible frailty distributions. We implement estimation procedures for mixture models with a Gompertz baseline and frailty that follows a gamma, truncated normal, log-normal, or inverse Gaussian...
Directory of Open Access Journals (Sweden)
Tamara Melmer
2013-04-01
Full Text Available The spatial characteristics of letters and their influence on readability and letter identification have been intensely studied during the last decades. There have been few studies, however, on statistical image properties that reflect more global aspects of text, for example properties that may relate to its aesthetic appeal. It has been shown that natural scenes and a large variety of visual artworks possess a scale-invariant Fourier power spectrum that falls off linearly with increasing frequency in log-log plots. We asked whether images of text share this property. As expected, the Fourier spectrum of images of regular typed or handwritten text is highly anisotropic, i.e., the spectral image properties in vertical, horizontal, and oblique orientations differ. Moreover, the spatial frequency spectra of text images are not scale-invariant in any direction. The decline is shallower in the low-frequency part of the spectrum for text than for aesthetic artworks, whereas, in the high-frequency part, it is steeper. These results indicate that, in general, images of regular text contain less global structure (low spatial frequencies) relative to fine detail (high spatial frequencies) than images of aesthetic artworks. Moreover, we studied images of text with artistic claim (ornate print and calligraphy) and ornamental art. For some measures, these images assume average values intermediate between regular text and aesthetic artworks. Finally, to answer the question of whether the statistical properties measured by us are universal amongst humans or are subject to intercultural differences, we compared images from three different cultural backgrounds (Western, East Asian, and Arabic). Results for different categories (regular text, aesthetic writing, ornamental art, and fine art) were similar across cultures.
Melmer, Tamara; Amirshahi, Seyed A; Koch, Michael; Denzler, Joachim; Redies, Christoph
2013-01-01
The spatial characteristics of letters and their influence on readability and letter identification have been intensely studied during the last decades. There have been few studies, however, on statistical image properties that reflect more global aspects of text, for example properties that may relate to its aesthetic appeal. It has been shown that natural scenes and a large variety of visual artworks possess a scale-invariant Fourier power spectrum that falls off linearly with increasing frequency in log-log plots. We asked whether images of text share this property. As expected, the Fourier spectrum of images of regular typed or handwritten text is highly anisotropic, i.e., the spectral image properties in vertical, horizontal, and oblique orientations differ. Moreover, the spatial frequency spectra of text images are not scale-invariant in any direction. The decline is shallower in the low-frequency part of the spectrum for text than for aesthetic artworks, whereas, in the high-frequency part, it is steeper. These results indicate that, in general, images of regular text contain less global structure (low spatial frequencies) relative to fine detail (high spatial frequencies) than images of aesthetic artworks. Moreover, we studied images of text with artistic claim (ornate print and calligraphy) and ornamental art. For some measures, these images assume average values intermediate between regular text and aesthetic artworks. Finally, to answer the question of whether the statistical properties measured by us are universal amongst humans or are subject to intercultural differences, we compared images from three different cultural backgrounds (Western, East Asian, and Arabic). Results for different categories (regular text, aesthetic writing, ornamental art, and fine art) were similar across cultures.
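A rough sketch of the kind of measurement described: the slope of the radially averaged Fourier power spectrum in log-log coordinates. The orientation-resolved analysis of the study is omitted, and synthetic white noise stands in for an image; its flat spectrum gives a slope near zero, whereas scale-invariant natural images give a markedly negative slope.

```python
import numpy as np

def spectrum_slope(img):
    """Slope of the radially averaged Fourier power spectrum in log-log
    coordinates; scale-invariant images give a roughly linear fall-off."""
    f = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    power = np.abs(f) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    # average the power over rings of equal radial spatial frequency
    radial = np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())
    freqs = np.arange(1, min(h, w) // 2)       # skip DC, stay below Nyquist
    slope, _ = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.standard_normal((128, 128))        # stand-in "image"
```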
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
Statistical analysis of management data
Gatignon, Hubert
2013-01-01
This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.
Li, B
1995-01-01
We look at the high-lying eigenstates (from the 10,001st to the 13,000th) in the Robnik billiard (defined as a quadratic conformal map of the unit disk) with the shape parameter $\lambda=0.15$. All 3,000 eigenstates have been numerically calculated and examined in the configuration space and in the phase space, which - in comparison with the classical phase space - enabled a clear-cut classification of energy levels into regular and irregular. This is the first successful separation of energy levels based on purely dynamical rather than special geometrical symmetry properties. We calculate the fractional measure of regular levels as $\rho_1=0.365\pm 0.01$, which is in remarkable agreement with the classical estimate $\rho_1=0.360\pm 0.001$. This finding confirms Percival's (1973) classification scheme, the assumption in the Berry-Robnik (1984) theory and the rigorous result by Lazutkin (1981, 1991). The regular levels obey Poissonian statistics quite well, whereas the irregular sequence exhibits the fractional...
Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.
Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun
2017-01-01
Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
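For orientation, a plain fuzzy k-means sketch is given below; it is not the authors' regularized, MCA-combined method, only the underlying membership-update idea in which each observation belongs partially to every cluster.

```python
import numpy as np

def fuzzy_kmeans(X, k, m=2.0, iters=100, seed=0):
    """Plain fuzzy k-means: U[i, j] in [0, 1] is the membership of point i
    in cluster j, and each row of U sums to 1. The fuzzifier m > 1
    controls how soft the memberships are (m -> 1 recovers hard k-means)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]    # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))                  # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

# two well-separated pairs of points -> near-crisp memberships
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
U, centers = fuzzy_kmeans(X, 2)
```

The paper's regularized variant adds an entropy-type penalty so the degree of fuzziness is chosen automatically rather than fixed by m.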
Godoy-Lorite, Antonia; Sales-Pardo, Marta
2016-01-01
In social networks, individuals constantly drop ties and replace them by new ones in a highly unpredictable fashion. This highly dynamical nature of social ties has important implications for processes such as the spread of information or of epidemics. Several studies have demonstrated the influence of a number of factors on the intricate microscopic process of tie replacement, but the macroscopic long-term effects of such changes remain largely unexplored. Here we investigate whether, despite the inherent randomness at the microscopic level, there are macroscopic statistical regularities in the long-term evolution of social networks. In particular, we analyze the email network of a large organization with over 1,000 individuals throughout four consecutive years. We find that, although the evolution of individual ties is highly unpredictable, the macro-evolution of social communication networks follows well-defined statistical laws, characterized by exponentially decaying log-variations of the weight of socia...
Simulating the particle size distribution of rockfill materials based on its statistical regularity
Institute of Scientific and Technical Information of China (English)
YAN Zongling; QIU Xiande; YU Yongqiang
2003-01-01
The particle size distribution of rockfill is studied by using granular mechanics, mesomechanics and probability statistics to reveal the relationship of the distribution of particle size to that of the potential energy intensity before fragmentation. It is found that the potential energy density has a linear relation to the logarithm of particle size, from which it is deduced that the logarithm of particle size follows a normal distribution because the potential energy density does so. Based on this finding, and by invoking the energy principle of rock fragmentation, a logarithmic distribution model of particle size is formulated, which uncovers the natural statistical characteristics of particle sizes. Exploring the properties of the average value, the expectation, and the unbiased variance of particle size indicates that the expectation does not equal the average value, but increases with increasing particle size and non-uniformity, and is always larger than the average value; the unbiased variance increases as the non-uniformity and geometric average value increase. A case study proves that the results simulated by the proposed logarithmic distribution model accord with the actual data. It is concluded that the logarithmic distribution model and the Kuz-Ram model can be used to forecast the particle-size distribution of natural rockfill, while for blasted rockfill the Kuz-Ram model is an option; in combined application of the two models, it is necessary to do field tests to adjust some parameters of the model.
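The core claim, that log particle size is normally distributed so the expectation exceeds the geometric mean (the exponential of the mean log size), can be checked numerically; the parameters below are hypothetical, not fitted to any rockfill data.

```python
import math, random, statistics

# If log(size) ~ Normal(mu, sigma), size is lognormal with
#   E[X] = exp(mu + sigma**2 / 2) > exp(mu) = geometric mean,
# matching the paper's point that the expectation exceeds the average
# geometric size and grows with the spread (non-uniformity) sigma.
random.seed(1)
mu, sigma = math.log(50.0), 0.6            # hypothetical parameters (mm)
sizes = [random.lognormvariate(mu, sigma) for _ in range(100_000)]

geometric_mean = math.exp(statistics.fmean(math.log(s) for s in sizes))
arithmetic_mean = statistics.fmean(sizes)
theoretical_mean = math.exp(mu + sigma ** 2 / 2)
```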
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model of stock markets based on the contact process of statistical physics systems is presented in this paper. The contact model is a continuous-time Markov process; one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of the SSE Composite Index and the data of the SZSE Component Index are analyzed, and the corresponding simulations are carried out by computer. Further, we investigate the statistical properties, fat-tail phenomena, the power-law distributions, and the long memory of returns for these indices. The techniques of the skewness-kurtosis test, the Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characteristics of the stock price returns.
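Of the techniques mentioned, R/S analysis is the least standard; a bare-bones sketch of the rescaled-range Hurst estimate is given below, applied to synthetic uncorrelated returns rather than SSE/SZSE data. The classical estimator is known to be biased somewhat above 0.5 for short windows.

```python
import numpy as np

def rs_hurst(x, min_chunk=8):
    """Rescaled-range (R/S) estimate of the Hurst exponent: slope of
    log(R/S) against log(window size). Uncorrelated series give a value
    near 0.5; long memory pushes it above 0.5."""
    x = np.asarray(x, dtype=float)
    sizes, rs = [], []
    n = min_chunk
    while n <= len(x) // 2:
        vals = []
        for i in range(0, len(x) - n + 1, n):
            c = x[i:i + n]
            z = np.cumsum(c - c.mean())      # demeaned cumulative profile
            s = c.std()
            if s > 0:
                vals.append((z.max() - z.min()) / s)   # rescaled range
        sizes.append(n)
        rs.append(np.mean(vals))
        n *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs), 1)
    return slope

rng = np.random.default_rng(42)
h = rs_hurst(rng.standard_normal(4096))      # synthetic i.i.d. "returns"
```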
Vapor Pressure Data Analysis and Statistics
2016-12-01
Vapor Pressure Data Analysis and Statistics, ECBC-TR-1422, Ann Brozena, Research and Technology Directorate; final report covering Nov 2015 - Apr 2016. Introduction: knowledge of the vapor pressure of materials as a function of temperature is...
A Statistical Analysis of Cryptocurrencies
Directory of Open Access Journals (Sweden)
Stephen Chan
2017-05-01
Full Text Available We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization), of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal; however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
Asymptotic modal analysis and statistical energy analysis
Dowell, Earl H.
1988-07-01
Statistical Energy Analysis (SEA) is defined by considering the asymptotic limit of Classical Modal Analysis, an approach called Asymptotic Modal Analysis (AMA). The general approach is described for both structural and acoustical systems. The theoretical foundation is presented for structural systems, and experimental verification is presented for a structural plate responding to a random force. Work accomplished subsequent to the grant initiation focuses on the acoustic response of an interior cavity (i.e., an aircraft or spacecraft fuselage) with a portion of the wall vibrating in a large number of structural modes. First results were presented at the ASME Winter Annual Meeting in December 1987, and accepted for publication in the Journal of Vibration, Acoustics, Stress and Reliability in Design. It is shown that asymptotically as the number of acoustic modes excited becomes large, the pressure level in the cavity becomes uniform except at the cavity boundaries. However, the mean square pressure at the cavity corner, edge and wall is, respectively, 8, 4, and 2 times the value in the cavity interior. Also it is shown that when the portion of the wall which is vibrating is near a cavity corner or edge, the response is significantly higher.
Statistical regularities of Carbon emission trading market: Evidence from European Union allowances
Zheng, Zeyu; Xiao, Rui; Shi, Haibo; Li, Guihong; Zhou, Xiaofeng
2015-05-01
As an emerging financial market, the carbon emission trading market has seen its trading value increase markedly. In recent years, carbon emission allowances have become a vehicle for investment: they are bought and sold not only by carbon emitters but also by investors. In this paper, we analyzed the price fluctuations of European Union Allowances (EUA) futures in the European Climate Exchange (ECX) market from 2007 to 2011. The symmetric, power-law probability density function of the return time series was displayed. We found that there are only short-range correlations in price changes (returns), while there are long-range correlations in the absolute value of price changes (volatility). Further, the detrended fluctuation analysis (DFA) approach was applied with a focus on long-range autocorrelations and the Hurst exponent. We observed long-range power-law autocorrelations in the volatility, which quantifies risk, and found that they decay much more slowly than the autocorrelations of the return time series. Our analysis also showed that significant cross-correlations exist between the EUA return time series and many other returns, in a wide range of fields including stock markets, energy-related commodity futures, and financial futures. The significant cross-correlations between energy-related futures and EUA indicate the physical relationship between carbon emission and the energy production process. Additionally, the cross-correlations between financial futures and EUA indicate that speculation may have become an important factor affecting the price of EUA. Finally, we modeled the long-range volatility time series of EUA with a particular version of the GARCH process, and the result also suggests long-range volatility autocorrelations.
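A compact sketch of DFA-1 as used for this kind of scaling estimate: integrate the demeaned series, linearly detrend it in windows of increasing size, and read the exponent off the log-log slope of the fluctuation function. Synthetic white noise replaces the EUA series; uncorrelated data give an exponent near 0.5, long-range persistence gives values above 0.5.

```python
import numpy as np

def dfa_exponent(x, scales=(16, 32, 64, 128, 256)):
    """Detrended fluctuation analysis (DFA-1): root-mean-square
    fluctuation F(s) of the linearly detrended profile vs window size s.
    The returned slope is ~0.5 for no long-range correlation."""
    profile = np.cumsum(x - np.mean(x))
    fluct = []
    for s in scales:
        f2 = []
        for w in range(len(profile) // s):
            seg = profile[w * s:(w + 1) * s]
            t = np.arange(s)
            coef = np.polyfit(t, seg, 1)          # linear detrending
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        fluct.append(np.sqrt(np.mean(f2)))
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope

rng = np.random.default_rng(7)
alpha = dfa_exponent(rng.standard_normal(8192))   # synthetic "returns"
```

Applying the same function to absolute returns rather than signed returns is how the contrast between short-range return memory and long-range volatility memory is exposed.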
Common pitfalls in statistical analysis: Logistic regression.
Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh
2017-01-01
Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
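A minimal one-predictor logistic regression fitted by gradient ascent on the log-likelihood, on made-up data; a real analysis would use an established statistics package, but the sketch shows what the fitted coefficients mean.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """One-predictor logistic regression by gradient ascent on the
    log-likelihood; returns (intercept, slope). exp(slope) is the odds
    ratio per unit increase in the predictor."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p               # gradient of the log-likelihood
            g1 += (y - p) * x
        b0 += lr * g0 / len(xs)
        b1 += lr * g1 / len(xs)
    return b0, b1

# hypothetical data: a binary outcome that becomes likelier as x grows
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

The limitations the article discusses (small event counts, separation, overfitting with many predictors) all show up as unstable or divergent coefficient estimates in exactly this fitting step.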
Towards a Judgement-Based Statistical Analysis
Gorard, Stephen
2006-01-01
There is a misconception among social scientists that statistical analysis is somehow a technical, essentially objective, process of decision-making, whereas other forms of data analysis are judgement-based, subjective and far from technical. This paper focuses on the former part of the misconception, showing, rather, that statistical analysis…
Multistructure Statistical Model Applied To Factor Analysis
Bentler, Peter M.
1976-01-01
A general statistical model for the multivariate analysis of mean and covariance structures is described. Matrix calculus is used to develop the statistical aspects of one new special case in detail. This special case separates the confounding of principal components and factor analysis. (DEP)
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
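The analytical power for a two-sample mean-difference test mentioned above can be sketched with a normal approximation; the effect size d, group size n, and alpha below are illustrative.

```python
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for a
    standardized mean difference d with n observations per group."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ncp = d * (n_per_group / 2) ** 0.5        # noncentrality parameter
    return (1 - NormalDist().cdf(z_crit - ncp)
            + NormalDist().cdf(-z_crit - ncp))

# classic benchmark: d = 0.5 with 64 per group gives roughly 80% power
p = power_two_sample(0.5, 64)
```

The simulated power a study like this examines is obtained by replacing the normal formula with repeated resampling, which exposes the discrepancies the abstract refers to.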
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
BER analysis of regularized least squares for BPSK recovery
Ben Atitallah, Ismail
2017-06-20
This paper investigates the problem of recovering an n-dimensional BPSK signal x
Explorations in Statistics: The Analysis of Change
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Local regularity analysis of strata heterogeneities from sonic logs
Directory of Open Access Journals (Sweden)
S. Gaci
2010-09-01
Full Text Available Borehole logs provide geological information about the rocks crossed by the wells. Several properties of rocks can be interpreted in terms of lithology, type and quantity of the fluid filling the pores and fractures.
Here, the logs are assumed to be nonhomogeneous Brownian motions (nhBms), which are generalized fractional Brownian motions (fBms) indexed by depth-dependent Hurst parameters H(z). Three techniques, the local wavelet approach (LWA), the average-local wavelet approach (ALWA), and the Peltier Algorithm (PA), are suggested to estimate the Hurst functions (or the regularity profiles) from the logs.
First, two synthetic sonic logs with different parameters, shaped by the successive random additions (SRA) algorithm, are used to demonstrate the potential of the proposed methods. The obtained Hurst functions are close to the theoretical Hurst functions. Besides, the transitions between the modeled layers are marked by discontinuities in the Hurst values. It is also shown that PA leads to the best Hurst value estimations.
Second, we investigate the multifractional property of sonic log data recorded at two scientific deep boreholes: the pilot hole VB and the ultra-deep main hole HB, drilled for the German Continental Deep Drilling Program (KTB). All the regularity profiles independently obtained for the logs provide a clear correlation with lithology, and from each regularity profile we derive a similar segmentation in terms of lithological units. The lithological discontinuities (strata bounds and fault contacts) are located at the local extrema of the Hurst functions. Moreover, the regularity profiles are compared with the KTB estimated porosity logs, showing a significant relation between the local extrema of the Hurst functions and the fluid-filled fractures. The Hurst function may then constitute a tool to characterize underground heterogeneities.
Analysis of Preference Data Using Intermediate Test Statistic
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
The intermediate statistic is a link between the Friedman test statistic and the multinomial statistic.
Hypothesis testing and statistical analysis of microbiome
Directory of Open Access Journals (Sweden)
Yinglin Xia
2017-09-01
Full Text Available After the initiation of the Human Microbiome Project in 2008, various biostatistical and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the currently available statistical tools and highlight recent progress in newly developed statistical methods and models. Given the current challenges and limitations of biostatistical approaches and tools, we discuss future directions in developing statistical methods and models for microbiome studies.
Statistical Analysis of English Noun Suffixes
Institute of Scientific and Technical Information of China (English)
王惠灵
2014-01-01
This study discusses the origin of English noun suffixes, including Old English, Latin, French and Greek. A statistical analysis of some typical noun suffixes follows in two corpora, focusing on their frequency and distribution.
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
Statistical Analysis Of Reconnaissance Geochemical Data From Orle District, ... The univariate methods used include frequency distribution and cumulative ... The possible mineral potential of the area include base metals (Pb, Zn, Cu, Mo, etc.) ...
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded `Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes a system for making reproducible statistical analyses. It differs from other systems for reproducible analysis in several ways. The two main differences are: (1) several statistics programs can be used in the same document; (2) documents can be prepared using OpenOffice or \LaTeX. The main part of this paper is an example showing how to use the system in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics.
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Onishi, Akinari; Natsume, Kiyohisa
2013-01-01
This paper demonstrates a better classification performance of an ensemble classifier using a regularized linear discriminant analysis (LDA) for a P300-based brain-computer interface (BCI). The ensemble classifier with an LDA is sensitive to the lack of training data because the covariance matrices are estimated imprecisely. One solution to the lack of training data is to employ a regularized LDA. We therefore employed the regularized LDA for the ensemble classifier of the P300-based BCI. Principal component analysis (PCA) was used for dimension reduction. As a result, the ensemble regularized LDA classifier showed significantly better classification performance than an ensemble un-regularized LDA classifier. The proposed ensemble regularized LDA classifier is therefore robust against the lack of training data.
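The effect described here, that shrinking a poorly estimated covariance stabilizes LDA when training samples are scarce relative to the dimension, can be sketched as follows. This is a generic shrinkage-regularized Fisher discriminant on synthetic Gaussian data, not the authors' P300 pipeline; the dimensions, sample sizes, and fixed shrinkage level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Two classes in 30 dimensions with only 20 training samples each:
# the sample covariance is ill-conditioned, so plain LDA is unstable.
d, n = 30, 20
mu0, mu1 = np.zeros(d), np.full(d, 0.7)
X0 = rng.standard_normal((n, d)) + mu0
X1 = rng.standard_normal((n, d)) + mu1

def lda_weights(X0, X1, shrinkage=0.5):
    """Fisher discriminant w = S^-1 (m1 - m0), with the pooled covariance
    shrunk toward a scaled identity (a common regularized-LDA recipe;
    the shrinkage level here is fixed, not tuned)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    S = 0.5 * (np.cov(X0.T) + np.cov(X1.T))
    S_reg = (1 - shrinkage) * S \
        + shrinkage * np.trace(S) / S.shape[0] * np.eye(S.shape[0])
    return np.linalg.solve(S_reg, m1 - m0)

w = lda_weights(X0, X1)

# Score held-out samples from each class against the midpoint threshold.
test0 = rng.standard_normal((200, d)) + mu0
test1 = rng.standard_normal((200, d)) + mu1
threshold = 0.5 * (X0.mean(axis=0) + X1.mean(axis=0)) @ w
acc = 0.5 * (np.mean(test0 @ w < threshold) + np.mean(test1 @ w > threshold))
print(round(acc, 2))
```

Setting `shrinkage=0` reproduces the unstable, un-regularized estimate and typically lowers the held-out accuracy in this regime.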
Schmidt decomposition and multivariate statistical analysis
Bogdanov, Yu. I.; Bogdanova, N. A.; Fastovets, D. V.; Luckichev, V. F.
2016-12-01
A new method of multivariate data analysis, based on complementing a classical probability distribution to a quantum state and on the Schmidt decomposition, is presented. We consider the application of the Schmidt formalism to problems of statistical correlation analysis. The correlation of photons in the beam splitter output channels is examined for the case where the input photon statistics is given by a compound Poisson distribution. The developed formalism allows us to analyze multidimensional systems, and we obtain analytical formulas for the Schmidt decomposition of multivariate Gaussian states. It is shown that the mathematical tools of quantum mechanics can significantly improve classical statistical analysis. The presented formalism is a natural approach for the analysis of both classical and quantum multivariate systems and can be applied in various tasks associated with the study of dependences.
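The link between the Schmidt decomposition and standard linear algebra is direct: for a bipartite pure state, the Schmidt coefficients are the singular values of the state's coefficient matrix. A minimal sketch (the example state is ours, not from the paper):

```python
import numpy as np

# A pure state of a bipartite (2 x 2) system, written as a coefficient
# matrix C with |psi> = sum_ij C[i, j] |i>|j>.  The Schmidt decomposition
# is just the SVD of C: the singular values are the Schmidt coefficients.
C = np.array([[0.6, 0.0],
              [0.0, 0.8]])

U, s, Vh = np.linalg.svd(C)

# Schmidt coefficients (descending); their squares sum to 1 for a
# normalized state.  A single nonzero coefficient would mean a product
# (unentangled) state; here there are two, so the state is entangled.
print(s)
print(round(float(np.sum(s ** 2)), 6))
```

The same decomposition applied to a discretized joint probability amplitude is what lets the paper carry correlation analysis of classical multivariate data over to the quantum formalism.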
The Analysis of Two-Way Functional Data Using Two-Way Regularized Singular Value Decompositions
Huang, Jianhua Z.
2009-12-01
Two-way functional data consist of a data matrix whose row and column domains are both structured, for example, temporally or spatially, as when the data are time series collected at different locations in space. We extend one-way functional principal component analysis (PCA) to two-way functional data by introducing regularization of both left and right singular vectors in the singular value decomposition (SVD) of the data matrix. We focus on a penalization approach and solve the nontrivial problem of constructing proper two-way penalties from one-way regression penalties. We introduce conditional cross-validated smoothing parameter selection, whereby left singular vectors are cross-validated conditional on right singular vectors, and vice versa. The concept can be realized as part of an alternating optimization algorithm. In addition to the penalization approach, we briefly consider two-way regularization with basis expansion. The proposed methods are illustrated with one simulated and two real data examples. Supplemental materials available online show that several "natural" approaches to penalized SVDs are flawed and explain why. © 2009 American Statistical Association.
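A stripped-down version of the alternating idea, one smoothed rank-1 component with fixed second-difference penalties and no cross-validation (unlike the paper's penalty construction and selection scheme), might look like:

```python
import numpy as np

def second_diff(n):
    # Second-difference operator; D.T @ D is a roughness penalty.
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def two_way_penalized_svd(X, lam_u=1.0, lam_v=1.0, n_iter=50):
    """Rank-1 regularized SVD sketch: alternate ridge-type smoothing of
    the left and right singular vectors.  The penalty form and fixed
    smoothing parameters are illustrative assumptions."""
    m, n = X.shape
    Pu = np.eye(m) + lam_u * second_diff(m).T @ second_diff(m)
    Pv = np.eye(n) + lam_v * second_diff(n).T @ second_diff(n)
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        u = np.linalg.solve(Pu, X @ v)
        u /= np.linalg.norm(u)
        v = np.linalg.solve(Pv, X.T @ u)
        v /= np.linalg.norm(v)
    return u, v

# Smooth rank-1 signal (sine x cosine) plus noise: the penalized SVD
# should recover smooth singular vectors despite the noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 40)
X = np.outer(np.sin(2 * np.pi * t), np.cos(2 * np.pi * t))
X_noisy = X + 0.1 * rng.standard_normal(X.shape)
u, v = two_way_penalized_svd(X_noisy, lam_u=5.0, lam_v=5.0)
```

The roughness penalty barely shrinks the smooth sinusoidal mode while strongly damping high-frequency noise, which is why the recovered `u` stays close to the true left singular vector.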
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
Instant Replay: Investigating Statistical Analysis in Sports
Sidhu, Gagan
2011-01-01
Technology has had an unquestionable impact on the way people watch sports. As technology has evolved, so too has the knowledge of a casual sports fan. A direct result of this evolution is the amount of statistical analysis in sport. The goal of statistical analysis in sports is a simple one: to eliminate subjective analysis. Over the past four decades, statistics have slowly pervaded the viewing experience of sports. In this paper, we analyze previous work that proposed metrics and models that seek to evaluate various aspects of sports. The unifying goal of these works is an accurate representation of either the player or the sport. We also look at work that investigates certain situations and their impact on the outcome of a game. We conclude this paper with a discussion of potential future work in certain areas of sport.
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2-parameter Weibull and 3-parameter Weibull. The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Squares Technique have been used to estimate the statistical parameters of the selected distributions. The results show that the 2-parameter Weibull distribution gives the best fits to the data available, especially if tail fits are used, whereas the Lognormal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors
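A maximum-likelihood fit of the 2-parameter Weibull, the distribution the study finds best for timber strengths, can be sketched with synthetic data; the shape and scale values below are illustrative assumptions, and scipy's generic MLE stands in for the study's full-data and tail-fitting variants.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic "bending strength" sample (MPa): a 2-parameter Weibull with
# shape 4 and scale 40 stands in for real timber test data.
strengths = stats.weibull_min.rvs(4.0, scale=40.0, size=500,
                                  random_state=rng)

# Maximum-likelihood fit of the 2-parameter Weibull (location fixed at 0).
shape, loc, scale = stats.weibull_min.fit(strengths, floc=0)
print(round(shape, 1), round(scale, 1))

# Lower 5th percentile: the characteristic value used in timber design.
x05 = stats.weibull_min.ppf(0.05, shape, loc=0, scale=scale)
```

Fitting only the lower tail, as the study does, would weight exactly the region around `x05` that drives reliability calculations and partial safety factors.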
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...
About Statistical Analysis of Qualitative Survey Data
Directory of Open Access Journals (Sweden)
Stefan Loehnert
2010-01-01
Full Text Available Gathered data are frequently not in a numerical form that allows immediate application of quantitative mathematical-statistical methods. This paper examines some basic aspects of how quantitative-based statistical methodology can be utilized in the analysis of qualitative data sets. The transformation of qualitative data into numeric values is considered the entrance point to quantitative analysis. Related publications and the impacts of scale transformations are discussed along the way. Subsequently, it is shown how correlation coefficients are usable in conjunction with data aggregation constraints to construct relationship modelling matrices. For illustration, a case study is referenced in which ordinal-type ordered qualitative survey answers are allocated to process-defining procedures as aggregation levels. Finally, options for measuring the adherence of the gathered empirical data to such derived aggregation models are introduced, and a statistically based reliability check approach to evaluate the reliability of the chosen model specification is outlined.
Analysis of Variance with Summary Statistics in Microsoft® Excel®
Larson, David A.; Hsu, Ko-Cheng
2010-01-01
Students are regularly asked to solve Single Factor Analysis of Variance problems given only the sample summary statistics (number of observations per category, category means, and corresponding category standard deviations). Most undergraduate students today use Excel for data analysis of this type. However, Excel, like all other statistical…
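The computation itself is straightforward to express directly: one-way ANOVA needs only the group sizes, means, and standard deviations, with no raw data. A sketch with made-up summary statistics:

```python
import numpy as np
from scipy import stats

# One-way ANOVA from summary statistics only.
n  = np.array([12, 12, 12])           # observations per category
m  = np.array([10.0, 12.0, 15.0])     # category means
sd = np.array([2.0, 2.5, 2.2])        # category standard deviations

grand_mean = np.sum(n * m) / np.sum(n)
ss_between = np.sum(n * (m - grand_mean) ** 2)
ss_within  = np.sum((n - 1) * sd ** 2)
df_between = len(n) - 1
df_within  = np.sum(n) - len(n)

F = (ss_between / df_between) / (ss_within / df_within)
p = stats.f.sf(F, df_between, df_within)
print(round(F, 2), round(p, 5))
```

The between-group and within-group sums of squares come straight from the summary quantities, which is exactly what makes the problem solvable without the raw observations Excel's built-in ANOVA tool expects.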
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
Statistical Analysis of Random Simulations : Bootstrap Tutorial
Deflandre, D.; Kleijnen, J.P.C.
2002-01-01
The bootstrap is a simple but versatile technique for the statistical analysis of random simulations. This tutorial explains the basics of that technique, and applies it to the well-known M/M/1 queuing simulation. In that numerical example, different responses are studied. For some responses, bootstrap
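The basic bootstrap loop can be sketched as follows; the exponential sample below merely stands in for waiting-time output from an M/M/1 run, and the interval level is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(3)

# Observed waiting times from one simulation run (synthetic exponential
# data standing in for M/M/1 output).
waits = rng.exponential(scale=2.0, size=200)

# Nonparametric bootstrap: resample with replacement, recompute the mean.
B = 2000
boot_means = np.array([
    rng.choice(waits, size=waits.size, replace=True).mean()
    for _ in range(B)
])

# Percentile 90% confidence interval for the mean waiting time.
lo, hi = np.percentile(boot_means, [5, 95])
print(round(lo, 2), round(hi, 2))
```

The same resampling loop works unchanged for other responses (quantiles, proportions), which is the versatility the tutorial emphasizes.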
Bayesian Statistics for Biological Data: Pedigree Analysis
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
The use of Bayes' formula is applied to the biological problem of pedigree analysis to show that Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First-year college students of biology can be introduced to Bayesian statistics this way.
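A standard pedigree calculation of this type (our example, not necessarily the one in the paper): an unaffected person with two heterozygous (Aa) parents has prior carrier probability 2/3; observing three unaffected children with a known-carrier mate updates that probability via Bayes' formula, whereas a "classical" reading would leave it at 2/3.

```python
from fractions import Fraction

# Prior: unaffected child of Aa x Aa parents is Aa with probability 2/3.
prior_carrier = Fraction(2, 3)

# Likelihoods of "three unaffected children" with a known carrier (Aa) mate:
# Aa x Aa children are unaffected with probability 3/4 each;
# AA x Aa children are always unaffected.
lik_carrier = Fraction(3, 4) ** 3
lik_noncarrier = Fraction(1, 1)

posterior = (prior_carrier * lik_carrier) / (
    prior_carrier * lik_carrier + (1 - prior_carrier) * lik_noncarrier)
print(posterior)  # 27/59
```

The exact-fraction arithmetic makes the contrast easy to show in class: the posterior 27/59 (about 0.46) is well below the prior 2/3.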
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
Statistical Analysis of Thermal Analysis Margin
Garrison, Matthew B.
2011-01-01
NASA Goddard Space Flight Center requires that each project demonstrate a minimum 5 °C margin between temperature predictions and hot and cold flight operational limits. The bounding temperature predictions include worst-case environment and thermal optical properties. The purpose of this work is to assess how current missions are performing against their pre-launch bounding temperature predictions and to suggest possible changes to the thermal analysis margin rules.
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
Graph Laplacian Regularization for Image Denoising: Analysis in the Continuous Domain.
Pang, Jiahao; Cheung, Gene
2017-04-01
Inverse imaging problems are inherently underdetermined, and hence it is important to employ appropriate image priors for regularization. One recently popular prior, the graph Laplacian regularizer, assumes that the target pixel patch is smooth with respect to an appropriately chosen graph. However, the mechanisms and implications of imposing the graph Laplacian regularizer on the original inverse problem are not well understood. To address this problem, in this paper, we interpret neighborhood graphs of pixel patches as discrete counterparts of Riemannian manifolds and perform analysis in the continuous domain, providing insights into several fundamental aspects of graph Laplacian regularization for image denoising. Specifically, we first show the convergence of the graph Laplacian regularizer to a continuous-domain functional, integrating a norm measured in a locally adaptive metric space. Focusing on image denoising, we derive an optimal metric space assuming non-local self-similarity of pixel patches, leading to an optimal graph Laplacian regularizer for denoising in the discrete domain. We then interpret graph Laplacian regularization as an anisotropic diffusion scheme to explain its behavior during iterations, e.g., its tendency to promote piecewise smooth signals under certain settings. To verify our analysis, an iterative image denoising algorithm is developed. Experimental results show that our algorithm performs competitively with state-of-the-art denoising methods, such as BM3D for natural images, and outperforms them significantly for piecewise smooth images.
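The quadratic denoising step the graph Laplacian prior induces has a closed form: minimizing ||x - y||^2 + lam * x'Lx gives the linear system (I + lam*L)x = y. A sketch on a 1-D signal with a simple path graph; a real denoiser would build the graph from patch similarity, as the paper discusses.

```python
import numpy as np

# Piecewise-constant signal plus noise.
n = 100
clean = np.concatenate([np.zeros(50), np.ones(50)])
rng = np.random.default_rng(4)
noisy = clean + 0.3 * rng.standard_normal(n)

# Path-graph Laplacian L = D - W (each sample linked to its neighbors).
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W

# Denoise: x* = argmin ||x - y||^2 + lam * x' L x  =>  (I + lam L) x = y.
lam = 3.0
denoised = np.linalg.solve(np.eye(n) + lam * L, noisy)
```

On this path graph the regularizer acts as a low-pass filter, which damps the noise but also blurs the step; adapting the graph weights to the signal, the paper's central theme, is what preserves such edges.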
Statistical Tools for Forensic Analysis of Toolmarks
Energy Technology Data Exchange (ETDEWEB)
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a "degree of association" between matches of similarly produced toolmarks. The basis for statistical method development relies on "discriminating criteria" that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
EEG based Autism Diagnosis Using Regularized Fisher Linear Discriminant Analysis
Directory of Open Access Journals (Sweden)
Mahmoud I. Kamel
2012-04-01
Full Text Available Diagnosis of autism is one of the difficult problems facing researchers. Revealing the discriminative pattern between autistic and normal children via electroencephalogram (EEG) analysis is a big challenge. Features are extracted by an averaged Fast Fourier Transform (FFT), and classification uses the Regularized Fisher Linear Discriminant (RFLD). The Gaussianity condition for the optimality of the RFLD has been achieved by well-conditioned, appropriate preprocessing of the data, as well as an optimal shrinkage technique for the Lambda parameter. Winsorised filtered data gave the best result.
Statistical Analysis of Iberian Peninsula Megaliths Orientations
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, still lack a sound interpretation. A way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is made with simple statistical variables and a mere comparison between the different areas. In order to minimise the subjectivity in the process, a further, more elaborate analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
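Orientation data such as these call for circular statistics rather than ordinary averages. A minimal sketch of the standard summaries (mean direction and resultant length) on made-up azimuths, not the survey's actual measurements:

```python
import numpy as np

# Monument azimuths in degrees (synthetic example data).
az = np.radians([88.0, 95.0, 102.0, 90.0, 110.0, 85.0, 99.0])

# Circular mean direction and mean resultant length R:
# R near 1 means the orientations cluster tightly; R near 0, no
# preferred direction (the basis of Rayleigh-type uniformity tests).
C, S = np.cos(az).sum(), np.sin(az).sum()
mean_dir = np.degrees(np.arctan2(S, C)) % 360
R = np.hypot(C, S) / az.size
print(round(mean_dir, 1), round(R, 2))
```

Comparing `mean_dir` and `R` across regions, or against solar and lunar rising arcs, is the kind of area-by-area comparison the abstract describes.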
Analysis of regular structures third degree based on chordal rings
DEFF Research Database (Denmark)
Bujnowski, Slawomir; Dubalski, Bozydar; Pedersen, Jens Myrup
2009-01-01
This paper presents an analysis of modified chordal rings of third degree (CR3m) and a modified double ring structure (N2Rm), which can be used as models of real networks. The proposed solutions are novel and different from the ones currently used, since they have two chords of different lengths. In the first part of the paper, formulas for the basic parameters, diameter and average path length, were derived using optimal/ideal graphs and used for indicating transmission properties of the structures. These analytical results were confirmed by comparison to a large number of computations on real graphs. In the second part, these parameters were compared to the parameters of standard topologies, showing that the distances are shorter when having two different chord lengths.
Multivariate analysis: A statistical approach for computations
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, and, more recently, in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include the various attacks on the network, such as DDoS attacks and network scanning.
Statistical quality control through overall vibration analysis
Carnero, M.ª Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearing manufacturing processes. Defects in the products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence
L0 Regularized Stationary-time Estimation for Crowd Analysis.
Yi, Shuai; Wang, Xiaogang; Lu, Cewu; Jia, Jiaya; Li, Hongsheng
2016-04-29
In this paper, we tackle the problem of stationary crowd analysis, which is as important as modeling mobile groups in crowd scenes and finds many important applications in crowd surveillance. Our key contribution is to propose a robust algorithm for estimating how long a foreground pixel stays stationary. It is much more challenging than background subtraction alone, because failure at a single frame due to local movement of objects, lighting variation, and occlusion could lead to large errors in stationary-time estimation. To achieve robust and accurate estimation, sparse constraints along the spatial and temporal dimensions are jointly added via mixed partials (second-order gradients) to shape a 3D stationary-time map. This is formulated as an L0 optimization problem. Besides background subtraction, the method distinguishes among different foreground objects, which may be close to or overlap one another in the spatio-temporal space, by using a locally shared foreground codebook. The proposed technologies are further demonstrated through three applications. 1) Based on the results of stationary-time estimation, twelve descriptors are proposed to detect four types of stationary crowd activities. 2) The averaged stationary-time map is estimated to analyze crowd scene structures. 3) The result of stationary-time estimation is also used to study the influence of stationary crowd groups on traffic patterns.
The L(1/2) regularization approach for survival analysis in the accelerated failure time model.
Chai, Hua; Liang, Yong; Liu, Xiao-Ying
2015-09-01
The analysis of high-dimensional, low-sample-size microarray data for survival analysis of cancer patients is an important problem. It is a huge challenge to select the significantly relevant biomarkers from microarray gene expression datasets, in which the number of genes far exceeds the number of samples. In this article, we develop a robust approach for predicting patient survival time using an L(1/2) regularization estimator with the accelerated failure time (AFT) model. The L(1/2) regularization can be seen as a typical representative of the L(q) (0 < q < 1) regularization methods, and it has shown many attractive features. In order to optimize the problem of relevant gene selection in high-dimensional biological data, we implemented the L(1/2) regularized AFT model by the coordinate descent algorithm with a renewed half thresholding operator. The results of the simulation experiment showed that we could obtain more accurate and sparse predictors for survival analysis with the L(1/2) regularized AFT model than with other L1-type regularization methods. The proposed procedures are applied to five real DNA microarray datasets to efficiently predict the survival time of patients based on a set of clinical prognostic factors and gene signatures.
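The half thresholding operator used in such coordinate-descent schemes has a closed form (following Xu et al.'s L(1/2) thresholding representation; treat the exact constants below as an assumption to verify against the paper): coordinates below a threshold are set to zero, and the rest are shrunk through a cosine formula rather than the constant shift of soft thresholding.

```python
import numpy as np

def half_threshold(z, lam):
    """Half thresholding operator for L(1/2) regularization (closed form
    as given by Xu et al., 2012; constants are quoted from memory and
    should be checked against the reference)."""
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    # Coordinates below this threshold are set exactly to zero.
    t = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    mask = np.abs(z) > t
    phi = np.arccos((lam / 8.0) * (np.abs(z[mask]) / 3.0) ** -1.5)
    out[mask] = (2.0 / 3.0) * z[mask] * (
        1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out

# Small coordinates are zeroed; large ones are shrunk only slightly,
# which is what yields sparser solutions than L1 soft thresholding.
print(half_threshold([0.1, -0.5, 2.0], lam=0.5))
```

As `lam` tends to zero the operator tends to the identity, a useful sanity check on the formula.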
Statistical mechanics analysis of sparse data.
Habeck, Michael
2011-03-01
Inferential structure determination uses Bayesian theory to combine experimental data with prior structural knowledge into a posterior probability distribution over protein conformational space. The posterior distribution encodes everything one can say objectively about the native structure in the light of the available data and additional prior assumptions and can be searched for structural representatives. Here an analogy is drawn between the posterior distribution and the canonical ensemble of statistical physics. A statistical mechanics analysis assesses the complexity of a structure calculation globally in terms of ensemble properties. Analogs of the free energy and density of states are introduced; partition functions evaluate the consistency of prior assumptions with data. Critical behavior is observed with dwindling restraint density, which impairs structure determination with too sparse data. However, prior distributions with improved realism ameliorate the situation by lowering the critical number of observations. An in-depth analysis of various experimentally accessible structural parameters and force field terms will facilitate a statistical approach to protein structure determination with sparse data that avoids bias as much as possible.
Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
Analysis of a Regularized Bingham Model with Pressure-Dependent Yield Stress
El Khouja, Nazek; Roquet, Nicolas; Cazacliu, Bogdan
2015-12-01
The goal of this article is to provide some essential results for the solution of a regularized viscoplastic frictional flow model, adapted from the extensive mathematical analysis of the Bingham model. The Bingham model is a standard for the description of viscoplastic flows and is widely used in many application areas. However, wet granular viscoplastic flows necessitate the introduction of additional non-linearities and of coupling between the velocity and stress fields. This article proposes a step toward a frictional coupling, characterized by a dependence of the yield stress on the pressure field. A regularized version of this viscoplastic frictional model is analysed in the framework of stationary flows. Existence, uniqueness and regularity are investigated, as well as finite-dimensional and algorithmic approximations. It is shown that the model can be solved and approximated provided that a frictional parameter is small enough. Obtaining similar results for the non-regularized model remains an open issue. Numerical investigations are deferred to future work.
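The regularization idea can be illustrated with the well-known Papanastasiou form, in which the unbounded effective viscosity of the ideal Bingham law is smoothed by an exponential term. The abstract does not state the authors' specific regularization or frictional law, so both functions below are illustrative assumptions (including the linear, Coulomb-like yield stress tau_y = c + mu_f * p).

```python
import math

def papanastasiou_viscosity(rate, mu, tau_y, m):
    """Effective viscosity of a Papanastasiou-regularized Bingham fluid.

    eta(rate) = mu + tau_y * (1 - exp(-m * rate)) / rate,
    which stays bounded (tending to mu + tau_y * m) as the shear rate -> 0,
    unlike the ideal Bingham law whose effective viscosity blows up.
    """
    if rate < 1e-12:
        return mu + tau_y * m          # small-rate limit of the exponential term
    return mu + tau_y * (1.0 - math.exp(-m * rate)) / rate

def frictional_yield_stress(pressure, mu_friction, cohesion=0.0):
    """Hypothetical pressure-dependent yield stress: tau_y = c + mu_f * p."""
    return cohesion + mu_friction * max(pressure, 0.0)
```

The bounded viscosity at vanishing shear rate is what makes the regularized problem amenable to the existence and approximation results mentioned above.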
Statistical analysis of sleep spindle occurrences.
Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr
2013-01-01
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Spindles are commonly classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles, and the spindle occurrence times were used for statistical analysis. Gamma distribution parameters were fitted to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but that this dissociation is neither spindle-type specific nor topographically specific; a single generator for all spindle types therefore appears unlikely.
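Fitting a gamma distribution to inter-spindle intervals can be done quickly with method-of-moments estimates (shape = mean²/variance, scale = variance/mean). The sketch below uses synthetic intervals with illustrative parameters, not the paper's data.

```python
import random
from statistics import mean, pvariance

def fit_gamma_moments(intervals):
    """Method-of-moments gamma shape/scale estimates for inter-event intervals."""
    m = mean(intervals)
    v = pvariance(intervals)
    return m * m / v, v / m   # (shape, scale)

# Synthetic inter-spindle intervals (seconds); parameters are illustrative only.
random.seed(1)
intervals = [random.gammavariate(2.0, 1.5) for _ in range(4000)]
shape_hat, scale_hat = fit_gamma_moments(intervals)
```

Maximum-likelihood fitting (as typically done in practice) gives slightly different estimates, but the moment estimates above are a good starting point and are exact in expectation.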
Statistical error analysis of reactivity measurement
Energy Technology Data Exchange (ETDEWEB)
Thammaluckan, Sithisak; Hah, Chang Joo [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)
2013-10-15
The point kinetics model has been used to measure control rod worth without 3D spatial information on the neutron flux or power distribution, which causes inaccurate results. Dynamic Control rod Reactivity Measurement (DCRM) was employed to take the 3D spatial information of the flux into account in the point kinetics model. The measured bank worth probably contains some uncertainty, such as methodology uncertainty and measurement uncertainty, and those uncertainties may vary with the size of the core and the magnitude of reactivity. The goal of this research is to investigate the effect of core size and control rod worth magnitude on the error of reactivity measurements using statistics. After statistical analysis, it was confirmed that each group was sampled from the same population. It is observed in Table 7 that the mean error decreases as core size increases; application of the bias factor obtained from this research reduces the mean error further.
On quantum statistics in data analysis
Pavlovic, Dusko
2008-01-01
Originally, quantum probability theory was developed to analyze statistical phenomena in quantum systems, where classical probability theory does not apply, because the lattice of measurable sets is not necessarily distributive. On the other hand, it is well known that the lattices of concepts, that arise in data analysis, are in general also non-distributive, albeit for completely different reasons. In his recent book, van Rijsbergen argues that many of the logical tools developed for quantum systems are also suitable for applications in information retrieval. I explore the mathematical support for this idea on an abstract vector space model, covering several forms of data analysis (information retrieval, data mining, collaborative filtering, formal concept analysis...), and roughly based on an idea from categorical quantum mechanics. It turns out that quantum (i.e., noncommutative) probability distributions arise already in this rudimentary mathematical framework. Moreover, a Bell-type inequality is formula...
von Larcher, Thomas; Blome, Therese; Klein, Rupert; Schneider, Reinhold; Wolf, Sebastian; Huber, Benjamin
2016-04-01
Handling high-dimensional data sets, such as those that occur in turbulent flows or in certain types of multiscale behaviour in the Geosciences, is one of the big challenges in numerical analysis and scientific computing. A suitable solution is to represent those large data sets in an appropriate compact form. In this context, tensor product decomposition methods currently emerge as an important tool. One reason is that these methods often enable one to attack high-dimensional problems successfully; another is that they allow for very compact representations of large data sets. We follow the novel Tensor-Train (TT) decomposition method to support the development of improved understanding of the multiscale behaviour and the development of compact storage schemes for solutions of such problems. One long-term goal of the project is the construction of a self-consistent closure for Large Eddy Simulations (LES) of turbulent flows that explicitly exploits the tensor product approach's capability of capturing self-similar structures. Secondly, we focus on a mixed deterministic-stochastic subgrid-scale modelling strategy currently under development for application in Finite Volume LES codes. Advanced methods of time series analysis for the data-based construction of stochastic models with inherently non-stationary statistical properties, and concepts of information theory based on a modified Akaike information criterion and on the Bayesian information criterion for model discrimination, are used to construct surrogate models for the non-resolved flux fluctuations. Vector-valued auto-regressive models with external influences form the basis for the modelling approach [1], [2], [4]. Here, we present the reconstruction capabilities of the two modelling approaches tested against 3D turbulent channel flow data computed by direct numerical simulation (DNS) for an incompressible, isothermal fluid at Reynolds number Reτ = 590 (computed by [3]). References [1] I
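The core of the TT approach is the TT-SVD algorithm: unfold the tensor into a matrix, take a truncated SVD, fold the left factor into a TT core, and repeat on the remainder. A minimal sketch (not the authors' code) is:

```python
import numpy as np

def tt_svd(tensor, eps=1e-10):
    """Decompose a d-way tensor into TT cores via sequential truncated SVDs."""
    shape = tensor.shape
    d = len(shape)
    cores, r_prev = [], 1
    C = tensor.reshape(r_prev * shape[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        # keep singular values above a relative tolerance
        r = max(1, int(np.sum(s > eps * s[0])))
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * shape[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, shape[d - 1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    out = cores[0]                                   # shape (1, n0, r1)
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])
```

With `eps` set to machine-precision level the decomposition is exact; larger tolerances trade accuracy for the compact storage discussed above.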
Multiscale statistical analysis of coronal solar activity
Gamborino, Diana; Martinell, Julio J
2016-01-01
Multi-filter images of the solar corona are used to obtain temperature maps, which are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multiscale behavior presents distinct statistical properties for each case, which can be used to characterize the level of activity in a region. Information about the nature of heat transport can also be extracted from the analysis.
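POD of a set of snapshots amounts to an SVD of the (mean-subtracted) snapshot matrix, with squared singular values giving the "energy" captured by each mode. A minimal sketch on synthetic one-dimensional "temperature maps" (illustrative data, not coronal observations) is:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "temperature maps": 50 snapshots built from 2 spatial modes.
x = np.linspace(0.0, 1.0, 64)
modes = np.stack([np.sin(np.pi * x), np.sin(2 * np.pi * x)])   # (2, 64)
coeffs = rng.normal(size=(50, 2))                              # time coefficients
snapshots = coeffs @ modes                                     # (50, 64)

# POD: SVD of the mean-subtracted snapshot matrix.
X = snapshots - snapshots.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
energy = s**2 / np.sum(s**2)     # fraction of variance captured by each POD mode
```

For genuinely low-dimensional dynamics, the energy spectrum drops sharply after a few modes, which is what makes POD useful for distinguishing active from quiet regions at various scales.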
Statistical Hot Channel Analysis for the NBSR
Energy Technology Data Exchange (ETDEWEB)
Cuadra A.; Baek J.
2014-05-27
A statistical analysis of thermal limits has been carried out for the research reactor (NBSR) at the National Institute of Standards and Technology (NIST). The objective of this analysis was to update the uncertainties of the hot channel factors with respect to previous analysis for both high-enriched uranium (HEU) and low-enriched uranium (LEU) fuels. Although uncertainties in key parameters which enter into the analysis are not yet known for the LEU core, the current analysis uses reasonable approximations instead of conservative estimates based on HEU values. Cumulative distribution functions (CDFs) were obtained for critical heat flux ratio (CHFR), and onset of flow instability ratio (OFIR). As was done previously, the Sudo-Kaminaga correlation was used for CHF and the Saha-Zuber correlation was used for OFI. Results were obtained for probability levels of 90%, 95%, and 99.9%. As an example of the analysis, the results for both the existing reactor with HEU fuel and the LEU core show that CHFR would have to be above 1.39 to assure with 95% probability that there is no CHF. For the OFIR, the results show that the ratio should be above 1.40 to assure with a 95% probability that OFI is not reached.
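The percentile statements above can be reproduced in spirit by Monte Carlo propagation of hot channel factor uncertainties through a nominal thermal limit ratio. The factors, their magnitudes, and the multiplicative error model below are illustrative assumptions, not the values used in the NBSR analysis.

```python
import random

def simulated_ratio_samples(nominal, factor_sds, n=20000, seed=42):
    """Propagate independent multiplicative uncertainties through a nominal ratio.

    Each hot channel factor is modeled (illustratively) as 1 + eps_k with
    eps_k ~ N(0, sd_k); the sampled ratio is nominal / prod(1 + eps_k).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        f = 1.0
        for sd in factor_sds:
            f *= 1.0 + rng.gauss(0.0, sd)
        samples.append(nominal / f)
    return sorted(samples)

def percentile(sorted_samples, q):
    """Empirical q-quantile from pre-sorted samples."""
    idx = min(len(sorted_samples) - 1, int(q * len(sorted_samples)))
    return sorted_samples[idx]
```

A statement such as "the ratio should be above 1.39 to assure with 95% probability that there is no CHF" corresponds to reading off a low percentile of such a sampled distribution.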
Statistical energy analysis of similarly coupled systems
Institute of Scientific and Technical Information of China (English)
ZHANG Jian
2002-01-01
Based on the principle of Statistical Energy Analysis (SEA) for non-conservatively coupled dynamical systems under non-correlated or correlated excitations, an energy relationship between two similar SEA systems is established in this paper. The energy relationship is verified theoretically and experimentally on two similar SEA systems, i.e., a coupled panel-beam structure and a coupled panel-sideframe structure, in the cases of conservative coupling and non-conservative coupling, respectively. As an application of the method, the relationship between the noise power radiated from two similar cutting systems is studied. Results show good agreement between theory and experiment, and the method is valuable for analysing dynamical problems of a complicated system by way of a simple one.
STATISTICAL ANALYSIS OF PUBLIC ADMINISTRATION PAY
Directory of Open Access Journals (Sweden)
Elena I. Dobrolyubova
2014-01-01
Full Text Available This article reviews the progress achieved in improving the pay system in public administration and outlines the key issues to be resolved. The cross-country comparisons presented in the article suggest high differentiation in pay levels depending on position held; in fact, this differentiation in Russia exceeds that in the OECD almost twofold. The analysis of the internal pay structure demonstrates that the low share of base pay leads to a perverse nature of the 'stimulation elements' of the pay system, which in fact appear to be used mostly for compensation purposes. The analysis of regional statistical data demonstrates that, despite high differentiation among regions in terms of their revenue potential, average public official pay is strongly correlated with the average regional pay.
Wavelet and statistical analysis for melanoma classification
Nimunkar, Amit; Dhawan, Atam P.; Relue, Patricia A.; Patwardhan, Sachin V.
2002-05-01
The present work focuses on the spatial/frequency analysis of epiluminescence images of dysplastic nevi and melanoma. A three-level wavelet decomposition was performed on skin-lesion images to obtain coefficients in the wavelet domain. A total of 34 features were obtained by computing ratios of the mean, variance, energy and entropy of the wavelet coefficients, along with the mean and standard deviation of image intensity. An unpaired t-test for normally distributed features and the Wilcoxon rank-sum test for non-normally distributed features were performed to select statistically significant features. For our data set, the statistical analysis reduced the feature set from 34 to 5 features. For classification, the discriminant functions were computed in the feature space using the Mahalanobis distance. ROC curves were generated and evaluated for false-positive fractions from 0.1 to 0.4. Most of the discriminant functions provided a true-positive rate for melanoma of 93% with a false-positive rate of up to 21%.
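A discriminant based on the Mahalanobis distance assigns a sample to the class whose mean is closest under the (pooled) covariance metric. A minimal two-class sketch on synthetic features (not the lesion data) is:

```python
import numpy as np

def mahalanobis_classifier(train_a, train_b):
    """Two-class discriminant using Mahalanobis distance with a pooled covariance."""
    mu_a, mu_b = train_a.mean(axis=0), train_b.mean(axis=0)
    cov = (np.cov(train_a.T) + np.cov(train_b.T)) / 2.0   # simple pooled estimate
    cov_inv = np.linalg.inv(cov)

    def classify(x):
        # squared Mahalanobis distance to each class mean
        da = (x - mu_a) @ cov_inv @ (x - mu_a)
        db = (x - mu_b) @ cov_inv @ (x - mu_b)
        return "A" if da < db else "B"

    return classify
```

Sweeping a threshold on the distance difference, rather than taking the sign, is what generates the ROC curves mentioned above.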
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
Full Text Available The growing relevance of the tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds, and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process, which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and of the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery) by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1,200 Italian tourists interviewed in 2007 (October-December). Data analysis shows that the selection probability for a destination included in the consideration set is not proportional to its share of inclusion, because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists' preferences underlines the existence of a latent demand for foreign holidays, which points to a risk of market-share reduction for the Italian tourism system in the global market. We also find a snowball effect which helps the most popular destinations, mainly in the northern Italian regions.
Multivariate statistical analysis of wildfires in Portugal
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal through multivariate statistical analysis of the time series of the number of fires and area burned in Portugal during the 1980-2009 period. The data used in the analysis are an extended version of the Portuguese Rural Fire Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information on more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analysis of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology, 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Programme through FUME (contract number 243888).
Non-negative matrix analysis in x-ray spectromicroscopy: choosing regularizers
Mak, Rachel; Wild, Stefan M.; Jacobsen, Chris
2016-01-01
In x-ray spectromicroscopy, a set of images can be acquired across an absorption edge to reveal chemical speciation. We previously described the use of non-negative matrix approximation methods for improved classification and analysis of these types of data. Here we present an approach for finding appropriate values of the regularization parameters in this optimization method. PMID:27041779
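As an illustration of the role a regularization parameter plays, the sketch below runs multiplicative updates for a non-negative factorization with a simple L2 (Frobenius) penalty weighted by `lam`. The penalty form and update rule are standard textbook choices, not necessarily those studied in the paper.

```python
import numpy as np

def regularized_nmf(V, rank, lam=0.1, n_iter=200, seed=0):
    """Non-negative approximation V ~ W @ H with an L2 penalty on W and H.

    Multiplicative updates for ||V - W H||_F^2 + lam * (||W||_F^2 + ||H||_F^2);
    lam is the regularization parameter whose choice is the issue at hand.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 0.1   # positive initialization keeps updates valid
    H = rng.random((rank, n)) + 0.1
    eps = 1e-12                        # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + lam * H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + lam * W + eps)
    return W, H
```

Larger `lam` biases the factors toward smaller norms (smoother, less overfit components) at the cost of reconstruction error, which is exactly the trade-off the parameter-selection procedure has to balance.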
Statistical analysis of sleep spindle occurrences.
Directory of Open Access Journals (Sweden)
Dagmara Panas
Full Text Available Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Spindles are commonly classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles, and the spindle occurrence times were used for statistical analysis. Gamma distribution parameters were fitted to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but that this dissociation is neither spindle-type specific nor topographically specific; a single generator for all spindle types therefore appears unlikely.
Statistical Analysis of Bus Networks in India
Chatterjee, Atanu; Ramadurai, Gitakrishnan
2015-01-01
Through the past decade, the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies, motivating researchers to model almost every physical system as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains obscure. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze their topological structure and answer some of the basic questions on their evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in L-space and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topologies with the common underlying feature of the small-world property. We observe tha...
Blog Content and User Engagement - An Insight Using Statistical Analysis.
Directory of Open Access Journals (Sweden)
Apoorva Vikrant Kulkarni
2013-06-01
Full Text Available Over the past few years, organizations have increasingly realized the value of social media in positioning, propagating and marketing the product/service and the organization itself. Today every organization, be it small or big, has realized the essence of creating a space on the World Wide Web. Social media, through its multifaceted platforms, has enabled organizations to propagate their brands. There are a number of social media networks which are helpful in spreading the message to customers. Many organizations have full-time web analytics teams that regularly try to ensure that prospective customers are visiting their organization through various forms of social media. Web analytics is seen by organizations as a tool for Business Intelligence, and there are a large number of analytics tools available for monitoring the visibility of a particular brand on the web; for example, Google has its own widely used analytics tool, and a number of free as well as paid analytical tools are available on the internet. The objective of this paper is to study what content in a blog present on social media creates a greater impact on user engagement. The study statistically analyzes the relation between blog content and user engagement. The statistical analysis was carried out on a blog of a reputed management institute in Pune to arrive at conclusions.
On two methods of statistical image analysis
Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, KL
1999-01-01
The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition, smooth
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
Full Text Available This study refers to the cooling tower of Megalopolis (constructed in 1975) and its protection from a corrosive environment. The maintenance of the cooling tower took place in 2008; the cooling tower had been badly damaged by corrosion of the reinforcement. The parabolic cooling towers (of an electrical power plant) are a typical example of a construction exposed to a particularly aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external sides of the cooling tower, different paint application systems are required. The present study refers to the damage caused by the corrosion process. The corrosive environments, the application of the painting, the quality control process, the measurements and statistical analysis, and the results are discussed in this study. In the process of quality control, the following measurements were taken into consideration: (1) examination of the adhesion with the cross-cut test, (2) examination of the film thickness, and (3) control of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations between measurements and the analysis of failures in relation to the quality of repair and rehabilitation of the cooling tower. This study also makes a first attempt to apply specific corrosion inhibitors in such a large structure.
An R package for statistical provenance analysis
Vermeesch, Pieter; Resentini, Alberto; Garzanti, Eduardo
2016-05-01
This paper introduces provenance, a software package within the statistical programming environment R, which aims to facilitate the visualisation and interpretation of large amounts of sedimentary provenance data, including mineralogical, petrographic, chemical and isotopic provenance proxies, or any combination of these. provenance comprises functions to: (a) calculate the sample size required to achieve a given detection limit; (b) plot distributional data such as detrital zircon U-Pb age spectra as Cumulative Age Distributions (CADs) or adaptive Kernel Density Estimates (KDEs); (c) plot compositional data as pie charts or ternary diagrams; (d) correct the effects of hydraulic sorting on sandstone petrography and heavy mineral composition; (e) assess the settling equivalence of detrital minerals and grain-size dependence of sediment composition; (f) quantify the dissimilarity between distributional data using the Kolmogorov-Smirnov and Sircombe-Hazelton distances, or between compositional data using the Aitchison and Bray-Curtis distances; (g) interpret multi-sample datasets by means of (classical and nonmetric) Multidimensional Scaling (MDS) and Principal Component Analysis (PCA); and (h) simplify the interpretation of multi-method datasets by means of Generalised Procrustes Analysis (GPA) and 3-way MDS. All these tools can be accessed through an intuitive query-based user interface, which does not require knowledge of the R programming language. provenance is free software released under the GPL-2 licence and will be further expanded based on user feedback.
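Two of the plots mentioned, CADs and KDEs, are easy to sketch outside R as well. The Python sketch below computes both for a toy age list in Ma; note that the provenance package itself uses adaptive kernel bandwidths, whereas a fixed bandwidth is used here for simplicity.

```python
import math

def gaussian_kde(ages, bandwidth):
    """Fixed-bandwidth Gaussian KDE of a detrital age spectrum (ages in Ma)."""
    n = len(ages)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - a) / bandwidth) ** 2) for a in ages)
    return density

def cad(ages):
    """Cumulative Age Distribution: fraction of grains with age <= t."""
    s = sorted(ages)
    def F(t):
        return sum(1 for a in s if a <= t) / len(s)
    return F
```

Evaluating `density` on a grid of ages reproduces the familiar age-spectrum curve, while `F` is the step function plotted as a CAD.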
Development of statistical models for data analysis
Energy Technology Data Exchange (ETDEWEB)
Downham, D.Y.
2000-07-01
Incidents that cause, or could cause, injury to personnel, and that satisfy specific criteria, are reported to the Offshore Safety Division (OSD) of the Health and Safety Executive (HSE). The underlying purpose of this report is to improve ways of quantifying risk, a recommendation in Lord Cullen's report into the Piper Alpha disaster. Records of injuries and hydrocarbon releases from 1 January 1991 to 31 March 1996 are analysed, because the reporting of incidents was standardised after 1990. Models are identified for risk assessment and some are applied. The appropriate analyses of one or two factors (or variables) are tests of uniformity or of independence. Radar graphs are used to represent some temporal variables. Cusums are applied for the analysis of incident frequencies over time, and could be applied for regular monitoring. Log-linear models for Poisson-distributed data are identified as being suitable for identifying 'non-random' combinations of more than two factors. Some questions cannot be addressed with the available data: for example, more data are needed to assess the risk of injury per employee in a given time interval. If these questions are considered sufficiently important, resources could be assigned to obtain the data. Some of the main results from the analyses are as follows: the cusum analyses identified a change-point at the end of July 1993, when the reported number of injuries reduced by 40%. Injuries were more likely to occur between 8am and 12am or between 2pm and 5pm than at other times: between 2pm and 3pm the number of injuries was almost twice the average and more than threefold the smallest. No seasonal effects in the numbers of injuries were identified. Three-day injuries occurred more frequently on the 5th, 6th and 7th days of a tour of duty than on other days, and less frequently on the 13th and 14th days. An injury classified as 'lifting or craning' was
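The cusum monitoring mentioned above accumulates deviations from a target rate and signals when the cumulative sum crosses a decision threshold. The sketch below detects a downward shift in synthetic incident counts, mirroring in spirit the 40% reduction found at the July 1993 change-point; the counts, target, slack and threshold are all illustrative.

```python
def cusum_lower(counts, target, slack):
    """One-sided CUSUM accumulating evidence of a *drop* below a target rate."""
    s, path = 0.0, []
    for x in counts:
        s = max(0.0, s + (target - slack - x))
        path.append(s)
    return path

def first_alarm(path, threshold):
    """Index of the first CUSUM excursion above the decision threshold."""
    for i, s in enumerate(path):
        if s > threshold:
            return i
    return None
```

The slack parameter makes the chart insensitive to small fluctuations around the target, so only a sustained shift accumulates toward an alarm.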
H(2) regularity properties of singular parameterizations in isogeometric analysis.
Takacs, T; Jüttler, B
2012-11-01
Isogeometric analysis (IGA) is a numerical simulation method which is directly based on the NURBS-based representation of CAD models. It exploits the tensor-product structure of 2- or 3-dimensional NURBS objects to parameterize the physical domain. Hence the physical domain is parameterized with respect to a rectangle or to a cube. Consequently, singularly parameterized NURBS surfaces and NURBS volumes are needed in order to represent non-quadrangular or non-hexahedral domains without splitting, thereby producing a very compact and convenient representation. The Galerkin projection introduces finite-dimensional spaces of test functions in the weak formulation of partial differential equations. In particular, the test functions used in isogeometric analysis are obtained by composing the inverse of the domain parameterization with the NURBS basis functions. In the case of singular parameterizations, however, some of the resulting test functions do not necessarily fulfill the required regularity properties. Consequently, numerical methods for the solution of partial differential equations cannot be applied properly. We discuss the regularity properties of the test functions. For one- and two-dimensional domains we consider several important classes of singularities of NURBS parameterizations. For specific cases we derive additional conditions which guarantee the regularity of the test functions. In addition we present a modification scheme for the discretized function space in case of insufficient regularity. It is also shown how these results can be applied for computational domains in higher dimensions that can be parameterized via sweeping.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Full Text Available Using the short-run statistical indicators is a compulsory requirement of current analysis. Therefore, a system of short-run EUROSTAT indicators has been set up in this respect and is recommended for use by the member countries. On the basis of these indicators, regular (usually monthly) analyses are carried out in respect of: the determination of production dynamics; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution; employment; the price indexes and the consumer price index (inflation); and the volume of exports and imports, the extent to which imports are covered by exports, and the balance of trade. The EUROSTAT system of conjuncture indicators is conceived as an open system, so that it can at any moment be extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users' requirements as well as on the specific requirements of harmonization and integration. For short-run analysis there is also the World Bank system of conjuncture indicators, which relies on the data sources offered by the World Bank, the World Institute for Resources and the statistics of other international organizations. The system comprises indicators of social and economic development and focuses on indicators for the following three fields: human resources, environment and economic performance. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Mitrut
2006-03-01
Full Text Available Using the short-run statistical indicators is a compulsory requirement of current analysis. Therefore, a system of short-run EUROSTAT indicators has been set up in this respect and is recommended for use by the member countries. On the basis of these indicators, regular (usually monthly) analyses are carried out in respect of: the determination of production dynamics; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution; employment; the price indexes and the consumer price index (inflation); and the volume of exports and imports, the extent to which imports are covered by exports, and the balance of trade. The EUROSTAT system of conjuncture indicators is conceived as an open system, so that it can at any moment be extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users' requirements as well as on the specific requirements of harmonization and integration. For short-run analysis there is also the World Bank system of conjuncture indicators, which relies on the data sources offered by the World Bank, the World Institute for Resources and the statistics of other international organizations. The system comprises indicators of social and economic development and focuses on indicators for the following three fields: human resources, environment and economic performance. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined the Statistical Package for the Social Sciences (SPSS) and the Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs: mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
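As a rough illustration of what such packages compute for the simplest related design, here is a stdlib-only sketch of a one-way fixed-effects ANOVA F statistic. The groups below are invented for illustration; this is not the authors' SPSS/SAS comparison, which concerns mixed-effects, ANCOVA and nested designs.

```python
# One-way fixed-effects ANOVA F statistic, computed by hand (illustrative
# sketch only; the three groups below are made-up numbers).

def one_way_anova_f(groups):
    """Return the F statistic for a one-way fixed-effects ANOVA."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # between-group sum of squares (k - 1 degrees of freedom)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # within-group sum of squares (n - k degrees of freedom)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[4.1, 3.9, 4.3], [5.0, 5.2, 4.8], [6.1, 5.9, 6.0]]
print(round(one_way_anova_f(groups), 2))
```

A benchmark of the kind the article describes would compare this hand-computed F against each package's output for the same data.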
Statistical Analysis of Bus Networks in India
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study of the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commuting, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of the small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns of the networks. We also observe that these networks, although robust and resilient to random attacks, are particularly degree-sensitive. Unlike real-world networks such as the Internet, WWW and airline networks, which are virtual, bus networks are physically constrained. Our findings therefore throw light on the evolution of such geographically constrained networks, which will help us design more efficient bus networks in the future. PMID:27992590
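The degree statistics at the heart of such an analysis can be sketched in a few lines of Python. The toy adjacency list below is invented and merely stands in for a stop-to-stop route graph; the paper's actual city networks are not reproduced here.

```python
# Degree distribution of a small, hand-made undirected "route" graph
# (hypothetical stops A-E; not data from the paper).
from collections import Counter

adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D", "E"},
    "D": {"B", "C"},
    "E": {"C"},
}

degrees = {node: len(nbrs) for node, nbrs in adj.items()}
degree_distribution = Counter(degrees.values())   # degree -> number of nodes
avg_degree = sum(degrees.values()) / len(degrees)

print(sorted(degree_distribution.items()))
print(avg_degree)
```

On a real network, the shape of `degree_distribution` (e.g. heavy-tailed versus peaked) is what distinguishes the topologies the abstract mentions, and removing the highest-degree nodes first is how degree sensitivity is probed.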
A statistical trend analysis of ozonesonde data
Tiao, G. C.; Pedrick, J. H.; Allenby, G. M.; Reinsel, G. C.; Mateer, C. L.
1986-01-01
A detailed statistical analysis of monthly averages of ozonesonde readings is performed to assess trends in ozone in the troposphere and the lower to mid-stratosphere. Regression time series models, which include seasonal and trend factors, are estimated for 13 stations located mainly in the midlatitudes of the Northern Hemisphere. At each station, trend estimates are calculated for 14 'fractional' Umkehr layers covering the altitude range from 0 to 33 km. For the 1970-1982 period, the main findings indicate an overall negative trend in ozonesonde data in the lower stratosphere (15-21 km) of about -0.5 percent per year, and some evidence of a positive trend in the troposphere (0-5 km) of about 0.8 percent per year. An in-depth sensitivity study of the trend estimates is performed with respect to various correction procedures used to normalize ozonesonde readings to Dobson total ozone measurements. The main results indicate that the negative trend findings in the 15- to 21-km altitude region are robust to the normalization procedures considered.
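A simplified, stdlib-only sketch of this kind of trend estimation follows. The series is synthetic, not ozonesonde data, and where the paper's regression models fit seasonal and trend factors jointly, here the monthly means are removed first and the trend fitted afterwards, for brevity.

```python
# Trend estimation on a synthetic monthly series: subtract monthly
# (seasonal) means, then fit a linear trend to the residuals by
# closed-form ordinary least squares.  All numbers are made up.
import math
import random

random.seed(0)
months = 120                           # ten years of monthly values
seasonal = [math.sin(2 * math.pi * m / 12) for m in range(12)]
trend_per_month = -0.04                # "true" trend built into the data
y = [seasonal[t % 12] + trend_per_month * t + random.gauss(0, 0.1)
     for t in range(months)]

# 1) estimate and subtract the seasonal component (mean per calendar month)
month_means = [sum(y[t] for t in range(m, months, 12)) / (months // 12)
               for m in range(12)]
resid = [y[t] - month_means[t % 12] for t in range(months)]

# 2) closed-form OLS slope of the deseasonalised residuals on time
t_mean = (months - 1) / 2
r_mean = sum(resid) / months
slope = (sum((t - t_mean) * (r - r_mean) for t, r in enumerate(resid))
         / sum((t - t_mean) ** 2 for t in range(months)))
print(round(slope, 3))                 # close to the built-in -0.04
```

The recovered slope is close to the trend built into the data; a per-station, per-layer version of this fit, with the paper's joint seasonal-plus-trend model, underlies the percent-per-year figures quoted above.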
Asymptotic analysis of a pile-up of regular edge dislocation walls
Hall, Cameron L.
2011-12-01
The idealised problem of a pile-up of regular dislocation walls (that is, of planes each containing an infinite number of parallel, identical and equally spaced dislocations) was presented by Roy et al. [A. Roy, R.H.J. Peerlings, M.G.D. Geers, Y. Kasyanyuk, Materials Science and Engineering A 486 (2008) 653-661] as a prototype for understanding the importance of discrete dislocation interactions in dislocation-based plasticity models. They noted that analytic solutions for the dislocation wall density are available for a pile-up of regular screw dislocation walls, but that numerical methods seem to be necessary for investigating regular edge dislocation walls. In this paper, we use the techniques of discrete-to-continuum asymptotic analysis to obtain a detailed description of a pile-up of regular edge dislocation walls. To leading order, we find that the dislocation wall density is governed by a simple differential equation and that boundary layers are present at both ends of the pile-up. © 2011 Elsevier B.V.
Web-Based Statistical Sampling and Analysis
Quinn, Anne; Larson, Karen
2016-01-01
Consistent with the Common Core State Standards for Mathematics (CCSSI 2010), the authors write that they have asked students to do statistics projects with real data. To obtain real data, their students use the free Web-based app, Census at School, created by the American Statistical Association (ASA) to help promote civic awareness among school…
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and ...
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
. The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs...... has much to offer in analyzing the policy process....
Freud, Erez; Ganel, Tzvi; Avidan, Galia
2015-11-15
fMRI adaptation (fMRIa), the attenuation of fMRI signal which follows repeated presentation of a stimulus, is a well-documented phenomenon. Yet, the underlying neural mechanisms supporting this effect are not fully understood. Recently, short-term perceptual expectations, induced by specific experimental settings, were shown to play an important modulating role in fMRIa. Here we examined the role of long-term expectations, based on 3D structural statistical regularities, in the modulation of fMRIa. To this end, human participants underwent fMRI scanning while performing a same-different task on pairs of possible (regular, expected) objects and spatially impossible (irregular, unexpected) objects. We hypothesized that given the spatial irregularity of impossible objects in relation to real-world visual experience, the visual system would always generate a prediction which is biased to the possible version of the objects. Consistently, fMRIa effects in the lateral occipital cortex (LOC) were found for possible, but not for impossible objects. Additionally, in alternating trials the order of stimulus presentation modulated LOC activity. That is, reduced activation was observed in trials in which the impossible version of the object served as the prime object (i.e. first object) and was followed by the possible version compared to the reverse order. These results were also supported by the behavioral advantage observed for trials that were primed by possible objects. Together, these findings strongly emphasize the importance of perceptual expectations in object representation and provide novel evidence for the role of real-world statistical regularities in eliciting fMRIa.
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
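A minimal sketch of the running-statistic idea follows. The abstract is truncated before describing its Monte Carlo normalization, so this sketch substitutes the standard large-sample normal approximation for converting U to Z; the window layout and series are invented for illustration.

```python
# "Running" Mann-Whitney Z series: at each position, compare the window
# before and after a candidate change point.  The original method
# normalises U via Monte Carlo; the usual normal approximation is used
# here instead, as a stand-in.
import math

def mann_whitney_u(x, y):
    """U statistic of sample x versus sample y (no tie correction)."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u

def u_to_z(u, n1, n2):
    """Normal approximation: standardise U by its null mean and SD."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u

def running_z(series, window):
    zs = []
    for i in range(window, len(series) - window + 1):
        before = series[i - window:i]
        after = series[i:i + window]
        zs.append(u_to_z(mann_whitney_u(after, before), window, window))
    return zs

# a step change halfway through a synthetic series
series = [1, 2, 1, 2, 1, 2, 5, 6, 5, 6, 5, 6]
zs = running_z(series, 4)
print(max(zs))
```

The Z series peaks where the two windows straddle the step, which is how such a scan localizes a change in level.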
Directory of Open Access Journals (Sweden)
Golugula Abhishek
2011-12-01
Full Text Available Abstract Background Multimodal data, especially imaging and non-imaging data, is being routinely acquired in the context of disease diagnostics; however, computational challenges have limited the ability to quantitatively integrate imaging and non-imaging data channels with different dimensionalities and scales. To the best of our knowledge, relatively few attempts have been made to quantitatively fuse such data to construct classifiers, and none have attempted to quantitatively combine histology (imaging) and proteomic (non-imaging) measurements for making diagnostic and prognostic predictions. The objective of this work is to create a common subspace to simultaneously accommodate both the imaging and non-imaging data (and hence data corresponding to different scales and dimensionalities), called a metaspace. This metaspace can be used to build a meta-classifier that produces better classification results than a classifier that is based on a single modality alone. Canonical Correlation Analysis (CCA) and Regularized CCA (RCCA) are statistical techniques that extract correlations between two modes of data to construct a homogeneous, uniform representation of heterogeneous data channels. In this paper, we present a novel modification of CCA and RCCA, Supervised Regularized Canonical Correlation Analysis (SRCCA), that (1) enables the quantitative integration of data from multiple modalities using a feature selection scheme, (2) is regularized, and (3) is computationally cheap. We leverage this SRCCA framework towards the fusion of proteomic and histologic image signatures for identifying prostate cancer patients at risk of 5-year biochemical recurrence following radical prostatectomy. Results A cohort of 19 grade- and stage-matched prostate cancer patients, all of whom had radical prostatectomy, including 10 who had biochemical recurrence within 5 years of surgery and 9 who did not, were considered in this study. The aim was to construct a lower
Regular Functions with Values in Ternary Number System on the Complex Clifford Analysis
Directory of Open Access Journals (Sweden)
Ji Eun Kim
2013-01-01
Full Text Available We define a new modified basis î, which is an association of two bases, e1 and e2. We give an expression of the form z = x0 + î z̄0, where x0 is a real number and z̄0 is a complex number, on the three-dimensional real skew field, and we study the properties of regular functions with values in the ternary field and in the reduced quaternions by Clifford analysis.
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2015-01-01
We propose a general technique for obtaining sparse solutions to generalized eigenvalue problems, and call it Regularized Generalized Eigen-Decomposition (RGED). For decades, Fisher's discriminant criterion has been applied in supervised feature extraction and discriminant analysis...... techniques, for instance, 2D-Linear Discriminant Analysis (2D-LDA). Furthermore, an iterative algorithm based on the alternating direction method of multipliers is developed. The algorithm approximately solves RGED with monotonically decreasing convergence and at an acceptable speed for results of modest accuracy. Numerical experiments based on four data sets of different types of images show that RGED has competitive classification performance with existing multidimensional and sparse techniques of discriminant analysis.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Hoffmeyer, P.
Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distribution types have been investigated: Normal, Lognormal, 2-parameter Weibull and 3-parameter Weibull. The statistical fits have generally been made using all data (100%) and the lower tail (30%) of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. 8 different databases are analysed. The results show that 2-parameter Weibull (and Normal) distributions give the best fits to the data available, especially if tail fits are used, whereas the LogNormal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used.
Directory of Open Access Journals (Sweden)
Aphichat Chamratrithirong
Full Text Available OBJECTIVE: This study aims to determine factors associated with levels of condom use among heterosexual Thai males in sex with regular partners and in sex with casual partners. METHODS: The data used in this study are from the national probability sample of the 2006 National Sexual Behavior Study, the third nationally representative cross-sectional survey in Thailand. A total of 2,281 men were analyzed in the study, including young (18-24) and older (25-59) adults who were residents of rural areas of Thailand, non-Bangkok urban areas, and Bangkok. The two outcomes of interest for this analysis are reported condom use in the past 12 months by males in relationships with the most recent regular and casual partners who were not sex workers. Chi-square statistics, bivariate regressions and proportional odds regression models are used in the analysis. RESULTS: Condom use for men with their regular partner is revealed to be positively related to education, knowledge of condom effectiveness, and a pro-condom strategy, and negatively related to non-professional employment, status of registered marriage, and short relationship duration. Condom use with a casual partner is positively determined by education, condom knowledge, non-professional occupation, short relationship duration, and lack of a history of paid sex. CONCLUSION: The national survey emphasized the importance of risk perceptions and condom motivation variables in explaining condom use among men in Thailand. These factors include not only education and knowledge of condom effectiveness and pro-condom strategy but also types of partners and their relationship context and characteristics. Program interventions to promote condom use in Thailand, in this new era of predominantly casual sex rather than sex with sex workers, have to take into account more dynamic partner-based strategies than in the past history of the epidemics in Thailand.
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Analysis of Regularly and Irregularly Sampled Spatial, Multivariate, and Multi-temporal Data
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
1994-01-01
analysis techniques in this context. Geostatistics is described in Chapter 1. Tools as the semivariogram, the cross-semivariogram and different types of kriging are described. As an independent re-invention 2-D sample semivariograms, cross-semivariograms and cova functions, and modelling of 2-D sample semi...... maximize the variance represented by each component, MAFs maximize the spatial autocorrelation represented by each component, and MNFs maximize a measure of signal-to-noise ratio represented by each component. In the literature MAF/MNF analysis is described for regularly gridded data only. Here...
An off-shell I.R. regularization strategy in the analysis of collinear divergences
Becchi, Carlo M
2011-01-01
We present a method for the analysis of singularities of Feynman amplitudes based on the Speer sector decomposition of the Schwinger parametric integrals combined with the Mellin-Barnes transform. The sector decomposition method is described in some detail. We suggest the idea of applying the method to the analysis of collinear singularities in inclusive QCD cross sections in the massless limit, regularizing the forward amplitudes by an off-shell choice of the initial particle momenta. It is shown how the suggested strategy works in the well-known case of the one-loop corrections to Deep Inelastic Scattering.
Statistical Analysis Of Data Sets Legislative Type
Directory of Open Access Journals (Sweden)
Gheorghe Săvoiu
2013-06-01
Full Text Available This paper identifies some characteristic statistical aspects of the dynamics and structure of annual legislation in the socio-economic system that has defined Romania over the last two decades. After a brief introduction devoted to the concepts of the social and economic system (SES) and societal computerized management (SCM) in Romania, the first section describes the indicators, the specific database and the investigative method, and a second section presents some descriptive statistics on the suggestive abnormality of the data series on the legislation of the last 20 years. A final remark underlines the difficult context of Romania's legislative adjustment to EU requirements.
Statistical analysis of protein kinase specificity determinants
DEFF Research Database (Denmark)
Kreegipuu, Andres; Blom, Nikolaj; Brunak, Søren;
1998-01-01
The site and sequence specificity of protein kinase, as well as the role of the secondary structure and surface accessibility of the phosphorylation sites on substrate proteins, was statistically analyzed. The experimental data were collected from the literature and are available on the World Wide...
Statistical Analysis in Dental Research Papers.
1983-08-08
Clinical Trials of Agents used in the Prevention and Treatment of Periodontal Diseases (5). Standards: summary data must be included; statistical methods...situations under clinical conditions. Research: qualitative research, as in joint tomography, where the results and conclusions are not amenable to
Statistical analysis of medical data using SAS
Der, Geoff
2005-01-01
An Introduction to SAS; Describing and Summarizing Data; Basic Inference; Scatterplots, Correlation, Simple Regression and Smoothing; Analysis of Variance and Covariance; Multiple Regression; Logistic Regression; The Generalized Linear Model; Generalized Additive Models; Nonlinear Regression Models; The Analysis of Longitudinal Data I; The Analysis of Longitudinal Data II: Models for Normal Response Variables; The Analysis of Longitudinal Data III: Non-Normal Response; Survival Analysis; Analysis of Multivariate Data: Principal Components and Cluster Analysis; References
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper.
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the 'P' value, explain the importance of 'confidence intervals' and clarify the importance of including both values in a paper. PMID:25878958
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
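The classic failure such benchmarks probe is numerical, not statistical: the one-pass "textbook" variance formula loses precision catastrophically on data with a large mean and small spread, while the two-pass formula does not. The series below merely mimics the style of the NASTY/ANASTY data (large constant offset, tiny variation); the actual reference data set is not reproduced here.

```python
# Numerical-reliability sketch: naive one-pass variance vs. stable
# two-pass variance on data with a huge mean and tiny spread.
# (Synthetic series in the spirit of NASTY-type benchmark data.)
data = [1e9 + d for d in (1.0, 2.0, 3.0, 4.0, 5.0)]

n = len(data)

# naive one-pass: var = (sum(x^2) - (sum x)^2 / n) / (n - 1)
naive = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

# stable two-pass: subtract the mean first
mean = sum(data) / n
two_pass = sum((x - mean) ** 2 for x in data) / (n - 1)

print(naive, two_pass)   # the exact sample variance here is 2.5
```

Under IEEE double precision the one-pass result is completely wrong (the squared terms swamp the spread), while the two-pass result is exact; comparing a package's output against known values like this is precisely what the benchmark in the report does.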
Institute of Scientific and Technical Information of China (English)
Yibin XIAO; Guoji TANG; Xianjun LONG; Nanjing HUANG
2015-01-01
This paper studies the Browder-Tikhonov regularization of a second-order evolution hemivariational inequality (SOEHVI) with non-coercive operators. With duality mapping, the regularized formulations and a derived first-order evolution hemivariational inequality (FOEHVI) for the problem considered are presented. By applying the Browder-Tikhonov regularization method to the derived FOEHVI, a sequence of regularized solutions to the regularized SOEHVI is constructed, and the strong convergence of the whole sequence of regularized solutions to a solution to the problem is proved.
Fundamentals of statistical experimental design and analysis
Easterling, Robert G
2015-01-01
Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
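Mistake (1), collecting more data until the test comes out "significant", is easy to demonstrate by simulation. In the sketch below (a stdlib-only toy, not from the article) both groups are drawn from the same distribution, so every rejection is a false positive; peeking after each added pair of observations inflates the error rate well past the nominal 5%.

```python
# Simulation of P-hacking by optional stopping.  Both groups come from
# the SAME distribution; a z-test with known sigma = 1 keeps this
# stdlib-only.  Every "significant" result is a false positive.
import random
from statistics import NormalDist

random.seed(42)
norm = NormalDist()

def p_value(a, b):
    """Two-sided z-test for equal means, known sigma = 1."""
    na, nb = len(a), len(b)
    z = (sum(a) / na - sum(b) / nb) / (1 / na + 1 / nb) ** 0.5
    return 2 * (1 - norm.cdf(abs(z)))

def peeking_experiment(n_start=10, n_max=50):
    """Test after every added pair of observations; stop when 'significant'."""
    a = [random.gauss(0, 1) for _ in range(n_start)]
    b = [random.gauss(0, 1) for _ in range(n_start)]
    while True:
        if p_value(a, b) < 0.05:
            return True                     # declared "significant"
        if len(a) >= n_max:
            return False
        a.append(random.gauss(0, 1))
        b.append(random.gauss(0, 1))

trials = 1000
false_positives = sum(peeking_experiment() for _ in range(trials)) / trials
print(false_positives)   # well above the nominal 0.05
```

A single test at a pre-specified sample size would reject about 5% of the time; peeking raises that several-fold, which is exactly why reanalyzing "until you get the result you want" fools investigators.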
Directory of Open Access Journals (Sweden)
Areej M. Abduldaim
2013-01-01
Full Text Available We introduced and studied -regular modules as a generalization of -regular rings to modules as well as regular modules (in the sense of Fieldhouse. An -module is called -regular if for each and , there exist and a positive integer such that . The notion of -pure submodules was introduced to generalize pure submodules and proved that an -module is -regular if and only if every submodule of is -pure iff is a -regular -module for each maximal ideal of . Many characterizations and properties of -regular modules were given. An -module is -regular iff is a -regular ring for each iff is a -regular ring for finitely generated module . If is a -regular module, then .
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in different ways using statistics. A variety of statistical tests are used to make decisions about the significance and validity of experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, paired t test and Chi-square test, to (a) test the optimum value of the process pH, (b) verify the success of the experiment and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ2 showed the results in favour of the data collected from the experiment, and this has been shown on probability charts. The K value obtained for the Langmuir isotherm was 0.8582 and the m value for the Freundlich adsorption isotherm was 0.725. Pearson's correlation coefficients for the Langmuir and Freundlich adsorption isotherms were 0.99 and 0.95 respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
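The hypothesis-testing step described above reduces to computing a test statistic and comparing it with a tabulated critical value. Here is a stdlib-only sketch of a one-sample t test; the removal percentages and the null value are invented for illustration and are not the paper's measurements.

```python
# One-sample t test computed by hand (illustrative sketch; hypothetical
# zinc-removal percentages, not the study's data).
from statistics import mean, stdev

removal = [88.2, 90.1, 89.5, 91.0, 90.4, 89.8, 90.7, 89.9]  # % Zn removed
claimed_mean = 88.0        # H0: true mean removal is 88 %

n = len(removal)
t = (mean(removal) - claimed_mean) / (stdev(removal) / n ** 0.5)

# two-sided 5 % critical value for df = n - 1 = 7, from t tables
t_critical = 2.365
print(abs(t) > t_critical)   # reject H0 at the 5 % level?
```

Comparing the calculated t with the tabulated critical value, exactly as in the abstract's "comparison of calculated and tabulated values", is what decides whether the null hypothesis is rejected.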
Synthesis and structural analysis of a regular Cu-Mg-Al hydrotalcite-like compound
WU, Jian-Song; XIAO, Ying-Kai; Liu, Yu-ping; XU, Wan-Bang
2011-01-01
A fine-quality, regular Cu-Mg-Al hydrotalcite-like compound was synthesized via the glycothermal method using CuCl2·2H2O, MgCl2·6H2O, AlCl3·6H2O, and Na2CO3 as raw materials and sodium hydroxide as the precipitant. Hydrotalcite samples were characterized by X-ray diffraction, scanning electron microscopy, transmission electron microscopy, Fourier transform infrared spectroscopy, thermogravimetric-differential thermal analysis, and Brunauer-Emmett-Teller N2 surface area...
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Statistical analysis of life history calendar data.
Eerola, Mervi; Helske, Satu
2016-04-01
The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, we estimate, instead of transition hazards, the cumulative prediction probabilities of life events over the entire trajectory. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis; sequence analysis can effectively find typical and atypical life patterns, while event history analysis is needed for causal inquiries.
Statistical analysis of Contact Angle Hysteresis
Janardan, Nachiketa; Panchagnula, Mahesh
2015-11-01
We present the results of a new statistical approach to determining Contact Angle Hysteresis (CAH) by studying the nature of the triple line. A statistical distribution of local contact angles on a random three-dimensional drop is used as the basis for this approach. Drops with randomly shaped triple lines but of fixed volumes were deposited on a substrate and their triple line shapes were extracted by imaging. Using a solution developed by Prabhala et al. (Langmuir, 2010), the complete three dimensional shape of the sessile drop was generated. A distribution of the local contact angles for several such drops but of the same liquid-substrate pairs is generated. This distribution is a result of several microscopic advancing and receding processes along the triple line. This distribution is used to yield an approximation of the CAH associated with the substrate. This is then compared with measurements of CAH by means of a liquid infusion-withdrawal experiment. Static measurements are shown to be sufficient to measure quasistatic contact angle hysteresis of a substrate. The approach also points towards the relationship between microscopic triple line contortions and CAH.
Book review: Statistical Analysis and Modelling of Spatial Point Patterns
DEFF Research Database (Denmark)
Møller, Jesper
2009-01-01
Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912.
Statistical Modelling of Wind Profiles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.
Statistical methods for categorical data analysis
Powers, Daniel
2008-01-01
This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/
Statistical analysis: the need, the concept, and the usage
Directory of Open Access Journals (Sweden)
Naduvilath Thomas
1998-01-01
Full Text Available In general, a better understanding of the need for and usage of statistics would benefit the medical community in India. This paper explains why statistical analysis is needed and what its conceptual basis is. Ophthalmic data are used as examples. The concept of sampling variation is explained to further corroborate the need for statistical analysis in medical research. Statistical estimation and testing of hypotheses, which form the major components of statistical inference, are explained. Commonly reported univariate and multivariate statistical tests are described in order to equip the ophthalmologist with a basic knowledge of statistics for better understanding of research data. It is felt that this understanding would facilitate well-designed investigations, ultimately leading to a higher quality of ophthalmology practice in our country.
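The sampling variation mentioned above can be made concrete with a short stdlib-only simulation: repeatedly sampling the same population gives different sample means, and their spread shrinks in proportion to the square root of the sample size. The population parameters below are invented (loosely styled as a hypothetical ophthalmic measurement), not taken from the paper.

```python
# Demonstration of sampling variation: the spread (standard error) of
# repeated sample means shrinks like sigma / sqrt(n).
# Population parameters are hypothetical illustration values.
import random
from statistics import stdev

random.seed(1)
population_mean, population_sd = 15.0, 3.0   # e.g. a hypothetical IOP in mmHg

def spread_of_sample_means(n, repeats=2000):
    means = [sum(random.gauss(population_mean, population_sd)
                 for _ in range(n)) / n
             for _ in range(repeats)]
    return stdev(means)

small, large = spread_of_sample_means(10), spread_of_sample_means(100)
print(round(small, 2), round(large, 2))   # roughly 3/sqrt(10) and 3/sqrt(100)
```

This is the quantitative core of why inference is needed: any single sample's mean deviates from the population mean by an amount governed by this standard error, and statistical tests account for exactly that deviation.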
Hamada, Daisuke; Yamamoto, Hiroki; Saiki, Jun
2017-08-01
Grapheme-color synesthesia is a neurological phenomenon where visual perception of letters and numbers stimulates perception of a specific color. Grapheme-color correspondences have been shown to be systematically associated with grapheme properties, including visual shape difference, ordinality, and frequency. However, the contributions of grapheme factors differ across individuals. In this study, we applied multilevel analysis to test whether individual differences in regularities of grapheme-color associations could be explained by individual styles of processing grapheme properties. These processing styles are reflected by the type of synesthetic experience. Specifically, we hypothesized that processing focusing on shape differences would be associated with projector synesthetes, while processing focusing on ordinality or familiarity would be associated with associator synesthetes. The analysis revealed that ordinality and familiarity factors were expressed more strongly among associators than among projectors. This finding suggests that grapheme-color associations are partly determined by the type of synesthetic experience. Copyright © 2017 Elsevier Inc. All rights reserved.
Statistical analysis of concrete quality testing results
Directory of Open Access Journals (Sweden)
Jevtić Dragica
2014-01-01
Full Text Available This paper statistically investigates the testing results of compressive strength and density of control concrete specimens tested in the Laboratory for Materials, Faculty of Civil Engineering, University of Belgrade, during 2012. A total of 4420 concrete specimens were tested, sampled at different locations - either at the concrete production site (concrete plant) or at the concrete placement location (construction site). To be exact, these samples were made of concrete which was produced at 15 concrete plants, i.e. placed in 50 different reinforced concrete structures, built during 2012 by 22 different contractors. It is a known fact that the achieved values of concrete compressive strength are very important, both for quality and durability assessment of concrete inside the structural elements, and for calculation of their load-bearing capacity limit. Together with the compressive strength testing results, the data concerning the requested (designed) concrete class, the matching between the designed and the achieved concrete quality, concrete density values, and the frequency of execution of concrete works during 2012 were analyzed.
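The kind of summary described above can be sketched in a few lines; this is a minimal illustration assuming normally distributed strengths and a 5%-fractile factor k = 1.645, with made-up sample values, not the Belgrade data or the exact conformity criterion of the paper:

```python
import statistics

def strength_summary(samples_mpa, k=1.645):
    """Summarize compressive-strength results and estimate the 5% fractile.

    The characteristic strength is estimated as mean - k*std, with
    k = 1.645 for a normal lower 5% fractile (an illustrative choice).
    """
    mean = statistics.fmean(samples_mpa)
    std = statistics.stdev(samples_mpa)
    return {"mean": mean, "std": std, "characteristic": mean - k * std}

# Hypothetical cube-strength results in MPa
results = strength_summary([38.5, 41.2, 36.8, 40.1, 39.4, 37.9])
```

A real conformity check would follow the applicable standard's acceptance rules rather than this single fractile estimate.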
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Statistical Smoothing Methods and Image Analysis
1988-12-01
83 - 111. Rosenfeld, A. and Kak, A.C. (1982). Digital Picture Processing. Academic Press, Orlando. Serra, J. (1982). Image Analysis and Mathematical ...hypothesis testing. IEEE Trans. Med. Imaging, MI-6, 313-319. Wicksell, S.D. (1925). The corpuscle problem. A mathematical study of a biometric problem
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
2002-01-01
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
The Statistical Analysis of Failure Time Data
Kalbfleisch, John D
2011-01-01
Contains additional discussion and examples on left truncation, as well as material on more general censoring and truncation patterns. Introduces the martingale and counting process formulation in a new chapter. Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi-Markov formulations. Presents new examples and applications of data analysis.
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results, because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS, including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprises 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS-specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high-throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology
Structural meta-analysis of regular human insulin in pharmaceutical formulations.
Fávero-Retto, Maely P; Palmieri, Leonardo C; Souza, Tatiana A C B; Almeida, Fábio C L; Lima, Luís Mauricio T R
2013-11-01
We have studied regular-acting, wild-type human insulin at a potency of 100 U/mL from four different pharmaceutical products, directly from their final finished formulation, by the combined use of mass spectrometry (MS), dynamic light scattering (DLS), small-angle X-ray scattering (SAXS), nuclear magnetic resonance (NMR), and single-crystal protein crystallography (PX). All products showed similar oligomeric assembly in solution as judged by DLS and SAXS measurements. The NMR spectra were compatible with well-folded proteins, showing close conformational identity for the human insulin in the four products. Crystallographic assays conducted with the final formulated products resulted in all insulin crystals belonging to the R3 space group with a dimer in the asymmetric unit, both monomers with the B-chain in the T configuration. Meta-analysis of the 24 crystal structures solved from the four distinct insulin products revealed close similarity between them regardless of variables such as biological origin, product batch, country of origin of the product, and analytical approach, revealing a low conformational variability for the converging insulin structural ensemble. We propose the use of MS, SAXS, NMR fingerprint, and PX as a precise chemical and structural proof of folding identity of regular insulin in the final, formulated product.
Statistical analysis of millions of digital photos
Wueller, Dietmar; Fageth, Reiner
2008-02-01
The analysis of images has always been an important aspect of the quality enhancement of photographs and photographic equipment. Due to the lack of metadata, it was mostly limited to images taken by experts under predefined conditions, and the analysis was also done by experts or required psychophysical tests. With digital photography and the EXIF metadata stored in the images, a lot of information can be gained from a semiautomatic or automatic image analysis if one has access to a large number of images. Although home printing is becoming more and more popular, the European market still has a few photofinishing companies who have access to a large number of images. All printed images are stored for a certain period of time, adding up to several million images on servers every day. We have utilized these images to answer numerous questions and think that the answers are useful for increasing image quality by optimizing the image processing algorithms. Test methods can be modified to fit typical user conditions, and future developments can be pointed in ideal directions.
EXTREME PROGRAMMING PROJECT PERFORMANCE MANAGEMENT BY STATISTICAL EARNED VALUE ANALYSIS
Wei Lu; Li Lu
2013-01-01
As an important project type of Agile Software Development, performance evaluation and prediction for an eXtreme Programming project has significant meaning. Targeting the short release life cycle and concurrent multitask features, a statistical earned value analysis model is proposed. Based on the traditional concept of earned value analysis, the statistical earned value analysis model introduces an Elastic Net regression function and a Laplacian hierarchical model to construct a Bayesian El...
An analysis of radio pulsar nulling statistics
Biggs, James D.
1992-01-01
Survival analysis methods are used to seek correlations between the fraction of null pulses and other pulsar characteristics for an ensemble of 72 radio pulsars. The strongest correlation is found between the null fraction and the pulse period, suggesting that nulling is a manifestation of a faltering emission mechanism. Correlations are also found between the fraction of null pulses and other parameters that have a strong dependence on the pulse period. The results presented here suggest that nulling is broad-band and may ultimately be explained in terms of polar cap models of pulsar emission.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools makes it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORSSA. It also describes its structure and contents.
Improved statistics for genome-wide interaction analysis.
Ueki, Masao; Cordell, Heather J
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al
A statistical analysis of UK financial networks
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using the same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
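The distribution-fitting step named above can be sketched for the lognormal case, where maximum-likelihood estimates are simply the mean and standard deviation of the log-degrees; the degree values below are hypothetical, not the UK market data:

```python
import math

def fit_lognormal(degrees):
    """MLE for a lognormal: mu and sigma are the mean and (population)
    std of log(degree). A sketch of one of the four fits in the paper."""
    logs = [math.log(d) for d in degrees if d > 0]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    return mu, sigma

def lognormal_pdf(x, mu, sigma):
    """Density of the fitted lognormal, for goodness-of-fit plots."""
    return math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma ** 2)) / (
        x * sigma * math.sqrt(2 * math.pi))

# Hypothetical vertex degrees from a market graph
degrees = [1, 2, 2, 3, 5, 8, 13, 21, 34]
mu, sigma = fit_lognormal(degrees)
```

Goodness of fit would then be assessed by comparing `lognormal_pdf` against the empirical degree density, e.g. with a Kolmogorov-Smirnov statistic.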
STATISTICAL BAYESIAN ANALYSIS OF EXPERIMENTAL DATA.
Directory of Open Access Journals (Sweden)
AHLAM LABDAOUI
2012-12-01
Full Text Available The Bayesian researcher should know the basic ideas underlying Bayesian methodology and the computational tools used in modern Bayesian econometrics. Some of the most important methods of posterior simulation are Monte Carlo integration, importance sampling, Gibbs sampling and the Metropolis-Hastings algorithm. The Bayesian should also be able to put the theory and computational tools together in the context of substantive empirical problems. We focus primarily on recent developments in Bayesian computation. Then we focus on particular models. Inevitably, we combine theory and computation in the context of particular models. Although we have tried to be reasonably complete in terms of covering the basic ideas of Bayesian theory and the computational tools most commonly used by the Bayesian, there is no way we can cover all the classes of models used in econometrics. We offer the user applications to analysis of variance and the linear regression model.
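Of the posterior simulators listed above, the Metropolis-Hastings algorithm is the easiest to sketch. The toy example below samples the mean of normal data with known unit variance and a flat prior; the data, step size, and iteration count are all illustrative, not from the paper:

```python
import math
import random

def metropolis_mean(data, n_iter=5000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings for the mean of N(mu, 1) data
    under a flat prior; a minimal sketch of posterior simulation."""
    rng = random.Random(seed)

    def log_post(mu):
        # log posterior up to a constant: sum of Gaussian log-likelihoods
        return -0.5 * sum((x - mu) ** 2 for x in data)

    mu, samples = 0.0, []
    for _ in range(n_iter):
        prop = mu + rng.gauss(0, step)          # symmetric proposal
        if math.log(rng.random()) < log_post(prop) - log_post(mu):
            mu = prop                            # accept
        samples.append(mu)                       # else keep current state
    return samples

data = [1.2, 0.8, 1.5, 0.9, 1.1]
draws = metropolis_mean(data)[1000:]   # discard burn-in
posterior_mean = sum(draws) / len(draws)
```

With a flat prior the posterior is N(x̄, 1/n), so the chain's mean should settle near the sample mean 1.1.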
Methods for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques-- now updated and revised In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
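The central quantity in ROC analysis, the area under the curve, can be computed without drawing the curve at all via its rank (Mann-Whitney) identity: the AUC is the probability that a randomly chosen positive case scores above a randomly chosen negative one, counting ties as one half. A minimal sketch with made-up scores:

```python
def roc_auc(labels, scores):
    """Empirical AUC via the Mann-Whitney identity (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for 2 negatives and 3 positives
auc = roc_auc([0, 0, 1, 1, 1], [0.1, 0.4, 0.35, 0.8, 0.9])
```

This brute-force pairwise count is O(n·m); rank-based implementations achieve the same result in O(n log n).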
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of complex relationships in data. Rich varieties of advanced and recent statistical modelling are mostly available in open source software (one of them is R). However, these advanced statistical models are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and Shiny), so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We have previously made an interface in the form of an e-tutorial for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and easier to compare, in order to find the most appropriate model for the data.
Mishra, Spandan; Vanli, O. Arda; Huffer, Fred W.; Jung, Sungmoon
2016-04-01
In this study we propose a regularized linear discriminant analysis approach for damage detection which does not require an intermediate feature extraction step and is therefore more efficient in handling data with high dimensionality. A robust discriminant model is obtained by shrinking the covariance matrix toward a diagonal matrix and thresholding redundant predictors without hurting the predictive power of the model. The shrinkage and threshold parameters of the discriminant function (decision boundary) are estimated to minimize the classification error. Furthermore, it is shown how the damage classification achieved by the proposed method can be extended to multiple sensors by following a Bayesian decision-fusion formulation. The detection probability of each sensor is used as a prior condition to estimate the posterior detection probability of the entire network, and the posterior detection probability is used as a quantitative basis to make the final decision about the damage.
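The shrinkage idea can be illustrated on a two-feature toy problem: the pooled covariance is pulled toward its diagonal, cov_s = (1-alpha)·cov + alpha·diag(cov), before forming the linear discriminant. This is a sketch of the regularization concept only, with an arbitrary alpha and made-up "healthy"/"damaged" data, not the authors' estimator:

```python
def shrunk_lda_2d(class0, class1, alpha=0.5):
    """Two-feature LDA with the pooled covariance shrunk toward its
    diagonal. Returns a scoring function; score > 0 means class 1."""
    def mean(xs):
        n = len(xs)
        return [sum(r[0] for r in xs) / n, sum(r[1] for r in xs) / n]

    m0, m1 = mean(class0), mean(class1)
    acc = [0.0, 0.0, 0.0]                     # [sum dx*dx, sum dx*dy, sum dy*dy]
    for xs, m in ((class0, m0), (class1, m1)):
        for x, y in xs:
            dx, dy = x - m[0], y - m[1]
            acc[0] += dx * dx
            acc[1] += dx * dy
            acc[2] += dy * dy
    n = len(class0) + len(class1) - 2
    sxx, sxy, syy = acc[0] / n, acc[1] / n, acc[2] / n
    sxy *= (1 - alpha)                        # shrink off-diagonal toward zero
    det = sxx * syy - sxy * sxy
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [(syy * d[0] - sxy * d[1]) / det,     # w = cov_s^{-1} (m1 - m0)
         (-sxy * d[0] + sxx * d[1]) / det]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return lambda x, y: w[0] * x + w[1] * y + b

healthy = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2)]
damaged = [(1.0, 1.1), (0.9, 0.8), (1.2, 1.0), (1.1, 1.3)]
score = shrunk_lda_2d(healthy, damaged)
```

With alpha = 1 the covariance becomes fully diagonal, which is the extreme the paper's shrinkage interpolates toward.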
QRS DETECTION OF ECG - A STATISTICAL ANALYSIS
Directory of Open Access Journals (Sweden)
I.S. Siva Rao
2015-03-01
Full Text Available Electrocardiogram (ECG is a graphical representation generated by heart muscle. ECG plays an important role in diagnosis and monitoring of heart’s condition. The real time analyzer based on filtering, beat recognition, clustering, classification of signal with maximum few seconds delay can be done to recognize the life threatening arrhythmia. ECG signal examines and study of anatomic and physiologic facets of the entire cardiac muscle. The inceptive task for proficient scrutiny is the expulsion of noise. It is attained by the use of wavelet transform analysis. Wavelets yield temporal and spectral information concurrently and offer stretchability with a possibility of wavelet functions of different properties. This paper is concerned with the extraction of QRS complexes of ECG signals using Discrete Wavelet Transform based algorithms aided with MATLAB. By removing the inconsistent wavelet transform coefficient, denoising is done in ECG signal. In continuation, QRS complexes are identified and in which each peak can be utilized to discover the peak of separate waves like P and T with their derivatives. Here we put forth a new combinatory algorithm builded on using Pan-Tompkins' method and multi-wavelet transform.
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method: Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
Explorations in Statistics: The Analysis of Ratios and Normalized Data
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
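A standard device for the analysis of ratios that the installment explores is the log transform: log-ratios are more nearly symmetric, and their mean back-transforms to a geometric-mean fold change. A minimal sketch with hypothetical before/after measurements:

```python
import math
import statistics

# Hypothetical paired measurements before and after a treatment
before = [10.0, 12.0, 9.5, 11.0, 10.5]
after = [12.5, 13.8, 11.0, 14.2, 12.9]

# Analyze the ratio on the log scale, then back-transform
log_ratios = [math.log(a / b) for a, b in zip(after, before)]
mean_log = statistics.fmean(log_ratios)
fold_change = math.exp(mean_log)   # geometric-mean ratio, not arithmetic
```

A confidence interval computed on `log_ratios` and exponentiated gives an interval for the fold change, which is the usual way such normalized data are reported.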
The Importance of Statistical Modeling in Data Analysis and Inference
Rollins, Derrick, Sr.
2017-01-01
Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…
Tóth, L Fejes; Ulam, S; Stark, M
1964-01-01
Regular Figures concerns the systematology and genetics of regular figures. The first part of the book deals with the classical theory of the regular figures. This topic includes descriptions of plane ornaments, spherical arrangements, hyperbolic tessellations, polyhedra, and regular polytopes. The problems of the geometry of the sphere and of two-dimensional hyperbolic space are considered. Classical theory is explained as describing all possible symmetrical groupings in different spaces of constant curvature. The second part deals with the genetics of the regular figures and the inequalities fo
Practical application and statistical analysis of titrimetric monitoring ...
African Journals Online (AJOL)
Practical application and statistical analysis of titrimetric monitoring of water and ... The resulting raw data were further processed with an Excel-based program. ... As such the type of component and the concentration can be determined.
Statistical Analysis of the Exchange Rate of Bitcoin: e0133678
National Research Council Canada - National Science Library
Jeffrey Chu; Saralees Nadarajah; Stephen Chan
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar...
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
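The matching step at the heart of PSA can be sketched as greedy 1:1 nearest-neighbor matching on the estimated scores, without replacement and within a caliper. The scores and IDs below are made up; a real analysis would first estimate the scores (e.g. by logistic regression) and check covariate balance afterwards:

```python
def nearest_neighbor_match(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor propensity-score matching without
    replacement; treated/control are lists of (unit_id, score)."""
    available = dict(control)
    pairs = []
    for tid, ps in sorted(treated, key=lambda t: t[1]):
        if not available:
            break
        cid = min(available, key=lambda c: abs(available[c] - ps))
        if abs(available[cid] - ps) <= caliper:   # enforce the caliper
            pairs.append((tid, cid))
            del available[cid]                    # no replacement
    return pairs

treated = [("t1", 0.31), ("t2", 0.62), ("t3", 0.90)]
control = [("c1", 0.30), ("c2", 0.58), ("c3", 0.35), ("c4", 0.20)]
matches = nearest_neighbor_match(treated, control)
```

Note that `t3` goes unmatched because no control falls within its caliper; discarding such units is exactly the trade-off between bias reduction and sample size that distinguishes PSA from ANOVA/ANCOVA on the full intact groups.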
Meta analysis a guide to calibrating and combining statistical evidence
Kulinskaya, Elena; Staudte, Robert G
2008-01-01
Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence acts as a source of basic methods for scientists wanting to combine evidence from different experiments. The authors aim to promote a deeper understanding of the notion of statistical evidence. The book comprises two parts - The Handbook, and The Theory. The Handbook is a guide for combining and interpreting experimental evidence to solve standard statistical problems. This section allows someone with a rudimentary knowledge of general statistics to apply the methods. The Theory provides the motivation, theory and results of simulation experiments to justify the methodology. This is a coherent introduction to the statistical concepts required to understand the authors' thesis that evidence in a test statistic can often be calibrated when transformed to the right scale.
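One classical way to combine evidence across experiments, in the spirit of transforming statistics to a common scale, is Stouffer's Z method: each one-sided p-value is mapped to a z-score, the z's are summed and renormalized. This is a standard textbook method, not necessarily the calibration scale the authors advocate; the p-values are illustrative:

```python
import math
from statistics import NormalDist

def stouffer(p_values, weights=None):
    """Combine one-sided p-values with Stouffer's (weighted) Z method."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z = sum(w * nd.inv_cdf(1 - p) for w, p in zip(weights, p_values))
    z /= math.sqrt(sum(w * w for w in weights))   # renormalize to N(0,1)
    return 1 - nd.cdf(z)                          # combined one-sided p

combined = stouffer([0.04, 0.10, 0.07])
```

Three individually modest p-values combine to strong joint evidence, which is precisely the phenomenon meta-analysis is designed to quantify.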
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanatory frameworks, but become powerfu...
Institute of Scientific and Technical Information of China (English)
HU Chang-sheng; ZHAO Wei-min; MA Qiang
2009-01-01
To analyze the stress of the guiding & positioning board and the effectiveness of the guiding & positioning device, according to the device's operational principle and structure, the board's motion regularity was analyzed by a diagrammatical method based on 2 postulated conditions. Considering changes in the working conditions, simulations in 5 different kinds of working conditions were done to check the correctness of the motion regularities obtained by the diagrammatical method. Simulation results prove that the motion regularities are right and that the postulated conditions have no effect on them. According to the simulation results, the characteristics of the motion process were drawn out at the same time.
Statistical multiresolution analysis in amplitude-frequency domain
Institute of Scientific and Technical Information of China (English)
SUN Hong; GUAN Bao; Henri Maitre
2004-01-01
A concept of statistical multiresolution analysis in the amplitude-frequency domain is proposed, which employs the wavelet transform on the statistical character of a signal in the amplitude domain. In terms of the theorem of generalized ergodicity, an algorithm to estimate the transform coefficients based on the amplitude statistical multiresolution analysis (AMA) is presented. The principle of applying the AMA to Synthetic Aperture Radar (SAR) image processing is described, and the good experimental results imply that the AMA is an efficient tool for the processing of speckled signals modeled by multiplicative noise.
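The amplitude-domain idea, applying a wavelet transform to a statistical description of the signal rather than to its time samples, can be illustrated with one level of the Haar transform on an amplitude histogram. The histogram values are made up, and the Haar filter is chosen only for simplicity, not because it is the wavelet used in the paper:

```python
def haar_step(values):
    """One level of the Haar transform: pairwise averages (approximation
    coefficients) and pairwise half-differences (detail coefficients)."""
    approx = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return approx, detail

# Hypothetical 8-bin amplitude histogram of a speckled signal
hist = [2, 6, 14, 22, 20, 12, 5, 1]
approx, detail = haar_step(hist)
```

Recursing `haar_step` on `approx` yields the full multiresolution decomposition of the amplitude statistics.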
Basic statistical tools in research and data analysis
Ali, Zulfiqar; Bhaskar, S Bala
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
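Among the parametric tests such an article typically surveys, the two-sample t-test is the workhorse. The sketch below computes the Welch (unequal-variance) statistic and its degrees of freedom; the sample values are made up for illustration:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees
    of freedom (no equal-variance assumption)."""
    ma, mb = statistics.fmean(sample_a), statistics.fmean(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    na, nb = len(sample_a), len(sample_b)
    se2 = va / na + vb / nb
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

t, df = welch_t([5.1, 4.9, 5.6, 5.2], [4.2, 4.0, 4.5, 4.1])
```

A p-value would then come from the t distribution with `df` degrees of freedom; when the normality assumption is doubtful, the article's non-parametric alternatives (e.g. the Mann-Whitney test) apply instead.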
Botelho, Ana; Canto, Ana; Leão, Célia; Cunha, Mónica V
2015-01-01
Typical CRISPR (clustered, regularly interspaced, short palindromic repeat) regions are constituted by short direct repeats (DRs), interspersed with similarly sized non-repetitive spacers, derived from transmissible genetic elements, acquired when the cell is challenged with foreign DNA. The analysis of the structure, in number and nature, of CRISPR spacers is a valuable tool for molecular typing, since these loci are polymorphic among strains, originating characteristic signatures. The existence of CRISPR structures in the genome of the members of the Mycobacterium tuberculosis complex (MTBC) enabled the development of a genotyping method based on the analysis of the presence or absence of 43 oligonucleotide spacers separated by conserved DRs. This method, called spoligotyping, consists of PCR amplification of the DR chromosomal region and recognition, after hybridization, of the spacers that are present. The workflow behind this methodology implies that the PCR products are brought onto a membrane containing synthetic oligonucleotides that have sequences complementary to the spacer sequences. Lack of hybridization of the PCR products to a specific oligonucleotide sequence indicates absence of the corresponding spacer sequence in the examined strain. Spoligotyping gained wide recognition as a robust identification and typing tool for members of the MTBC, enabling multiple epidemiological studies on human and animal tuberculosis.
Design and motion analysis of a novel coupled mechanism based on a regular triangular bipyramid
Directory of Open Access Journals (Sweden)
Huifang Gao
2016-11-01
Full Text Available Traditional methods and theories for synthesizing parallel mechanisms are not applicable to research on hybrid mechanisms, thus hampering the design of innovative coupled mechanisms. Polyhedrons, with attractive appearance and particular geometrical construction, provide many choices for coupled-mechanism invention. A novel mechanism with one translational degree of freedom based on a regular triangular bipyramid is proposed in this article. First, the basic equivalent geometrical model is spliced with newly designed components substituting revolute joints (R-pairs) only for vertexes and edges. The expected motion for the basic coupled model can be achieved by adding links to modify the constraint sets and arrange the spatial allocation of an elementary loop based on screw theory. Then, the mobility of one branch is calculated to investigate the movability of the novel structure, and a Denavit–Hartenberg (D-H) model with properties of symmetry is implemented for the inverse kinematic analysis. Furthermore, a numerical example is given to verify the correctness of the analysis results, and a related motion simulation is conducted to illustrate the potential application of the proposed novel system as an executing manipulator for mobile robots.
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis.
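The DOE-plus-regression idea can be sketched on a toy 2^2 factorial design; the responses are invented, and the point is only that one regression recovers main effects and the interaction that one-factor-at-a-time experimentation misses.

```python
# Sketch of the DOE idea: a 2^2 factorial design fitted with linear
# regression instead of changing one factor at a time. Simulated I/O data.
import numpy as np

# Coded factor levels (-1/+1) for two simulation inputs.
x1 = np.array([-1, -1, 1, 1], dtype=float)
x2 = np.array([-1, 1, -1, 1], dtype=float)
y = 10 + 3 * x1 + 2 * x2 + 0.5 * x1 * x2   # hypothetical noise-free responses

# Design matrix: intercept, two main effects, and the interaction.
X = np.column_stack([np.ones(4), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)   # recovers intercept, both main effects, and the interaction
```

Because the coded columns are orthogonal, every coefficient is estimated from all four runs at once.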
Statistical Analysis of Processes of Bankruptcy in Ukraine
Berest Marina Nikolaevna
2012-01-01
A statistical analysis of bankruptcy processes in Ukraine is conducted. Quantitative and qualitative indices characterizing the efficiency of functioning of the institute of enterprise bankruptcy are analyzed; an analysis of processes related to the bankruptcy of enterprises under state administration is conducted.
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
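The control/signal-region fitting concept the abstract describes can be sketched generically; this is not the HistFitter API, just a simultaneous Poisson likelihood fit in SciPy where an (invented) control region constrains the background entering an (invented) signal region.

```python
# Conceptual sketch (NOT the HistFitter API): a simultaneous Poisson fit in
# which a control region (CR) constrains the background extrapolated into a
# signal region (SR). All counts and the transfer factor are invented.
import numpy as np
from scipy.optimize import minimize

n_cr, n_sr = 100, 30          # observed counts in CR and SR
tf = 0.2                      # assumed background transfer factor CR -> SR

def nll(params):
    mu_sig, b_cr = params     # signal yield in SR, background yield in CR
    exp_cr = max(b_cr, 1e-9)
    exp_sr = max(mu_sig + tf * b_cr, 1e-9)
    # Negative log Poisson likelihood (factorial constants dropped).
    return (exp_cr - n_cr * np.log(exp_cr)) + (exp_sr - n_sr * np.log(exp_sr))

res = minimize(nll, x0=[10.0, 90.0], method="Nelder-Mead")
mu_hat, b_hat = res.x
print(mu_hat, b_hat)   # background pinned by the CR, signal read off the SR
```

Here the fit drives the control-region background to its observed count and attributes the signal-region excess over the extrapolated background to the signal.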
A Divergence Statistics Extension to VTK for Performance Analysis.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13]), where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit (VTK) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a statistical manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new divergence statistics engine is illustrated by means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
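A minimal stand-in for such a divergence statistic (not the VTK engine itself) is the Kullback-Leibler divergence between a binned empirical distribution and a theoretical reference:

```python
# Sketch of a divergence statistic: KL divergence between an observed
# empirical distribution (histogram) and a theoretical "ideal" one.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, 10_000)         # e.g. measured timings

edges = np.linspace(-4, 4, 41)
observed, _ = np.histogram(samples, bins=edges)
p = observed / observed.sum()                  # empirical bin probabilities

cdf = stats.norm.cdf(edges)
q = np.diff(cdf) / (cdf[-1] - cdf[0])          # theoretical bin masses

kl = stats.entropy(p, q)                       # D_KL(p || q), in nats
print(kl)                                      # near zero: distributions agree
```

A small value indicates the observed distribution is close to the ideal one; larger values flag a discrepancy worth investigating.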
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint models, and incomplete data.
A novel statistic for genome-wide interaction analysis.
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
2010-09-23
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR < 0.001. Genome-wide interaction analysis is a valuable tool for finding the remaining missing heritability unexplained by current GWAS, and the developed novel statistic is able to search for significant interactions between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.
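The classical baseline the paper compares against can be sketched as a likelihood-ratio test for an interaction term in a logistic model; the genotypes below are simulated, and this is plain logistic regression, not the paper's novel statistic.

```python
# Sketch of the classical approach: likelihood-ratio test for a SNP-SNP
# interaction term in a logistic model, on simulated genotype data.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(2)
n = 2000
g1 = rng.integers(0, 3, n)            # genotype at locus 1 (0/1/2 copies)
g2 = rng.integers(0, 3, n)            # genotype at locus 2
logit = -1.0 + 0.2 * g1 + 0.2 * g2 + 0.4 * g1 * g2
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

def fitted_nll(X):
    # Minimized negative log-likelihood of a logistic regression on X.
    def nll(b):
        z = X @ b
        return np.sum(np.logaddexp(0.0, z) - y * z)
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

X_full = np.column_stack([np.ones(n), g1, g2, g1 * g2])
X_null = X_full[:, :3]                # drop the interaction column
lrt = 2.0 * (fitted_nll(X_null) - fitted_nll(X_full))  # ~ chi2(1) under H0
p_value = chi2.sf(lrt, df=1)
print(lrt, p_value)
```

With a genuine interaction in the simulated model, the test rejects the no-interaction null; genome-wide use repeats this over all locus pairs, which is what makes power and multiple testing so demanding.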
Theoretical analysis and experimental study of oxygen transfer under regular and non-breaking waves
Institute of Scientific and Technical Information of China (English)
尹则高; 梁丙臣; 王乐
2013-01-01
The dissolved oxygen concentration is an important index of water quality, and the atmosphere is one of the important sources of the dissolved oxygen. In this paper, the mass conservation law and the dimensional analysis method are employed to study the oxygen transfer under regular and non-breaking waves, and a unified oxygen transfer coefficient equation is obtained with consideration of the effect of kinetic energy and wave period. An oxygen transfer experiment for the intermediate depth water wave is performed to measure the wave parameters and the dissolved oxygen concentration. The experimental data and the least squares method are used to determine the constant in the oxygen transfer coefficient equation. The experimental data and the previously reported data are also used to further validate the oxygen transfer coefficient, and the agreement is satisfactory. The unified equation shows that the oxygen transfer coefficient increases with the increase of a parameter coupled with the wave height and the wave length, but it decreases with the increase of the wave period, which has a much greater influence on the oxygen transfer coefficient than the coupled parameter.
[The physiological analysis of cross adaptation to regular cold exposure and physical activities].
Son'kin, V D; Iakushkin, A V; Akimov, E B; Andreev, R S; Kalenov, Iu N; Kozlov, A V
2014-01-01
Research is devoted to the comparative analysis of the results of cold adaptation and physical training. The adaptive shifts occurring in an organism under the influence of hardening (a cold shower twice a day, 2 minutes long, for 6 weeks) and running training on the treadmill (30 minutes at 70-80% of individual VO2max, 3 times a week, for 6 weeks) were compared in the same 6 subjects. The interval between the two cycles of training was no less than 3 months. The indicators registered during a ramp test and a standard cold exposure test before and after each cycle of training were compared. It is shown that the patterns of adaptive shifts under adaptation to factors of various modality differ strongly. Shifts under adaptation to physical activity were on the whole more pronounced than under adaptation to regular cold exposure. The individual variety of adaptive reactions suggests the feasibility of developing new approaches to the theory of adaptation, connected with the study of physiological individuality.
Perturbative analysis of the Schwinger model (QED{sub 2}) for gauge non-invariant regularizations
Energy Technology Data Exchange (ETDEWEB)
Sifuentes, Rodolfo Casana; Silva Neto, Marcelo Barbosa da; Dias, Sebastiao Alves [Centro Brasileiro de Pesquisas Fisicas, CBPF, Rio de Janeiro, RJ (Brazil). Dept. de Teorias de Campos e Particulas
1997-12-31
In this article we consider the Schwinger model for gauge non-invariant regularization and study the perturbative behaviour of some relevant correlation functions. (author) 6 refs.; e-mail: casana, silvanet, tiao at cbpfsu1.cat.cbpf.br
Complexity of software trustworthiness and its dynamical statistical analysis methods
Institute of Scientific and Technical Information of China (English)
ZHENG ZhiMing; MA ShiLong; LI Wei; JIANG Xin; WEI Wei; MA LiLi; TANG ShaoTing
2009-01-01
Developing trusted software has become an important trend and a natural choice in the development of software technology and applications. At present, methods of measurement and assessment of software trustworthiness cannot guarantee safe and reliable operation of software systems completely and effectively. Based on the study of dynamical systems, this paper interprets the characteristics of the behaviors of software systems and the basic scientific problems of software trustworthiness complexity, analyzes the characteristics of the complexity of software trustworthiness, and proposes to study software trustworthiness measurement in terms of the complexity of software trustworthiness. Using dynamical statistical analysis methods, the paper advances an invariant-measure based assessment method of software trustworthiness by statistical indices, and thereby provides a dynamical criterion for the untrustworthiness of software systems. By an example, the feasibility of the proposed dynamical statistical analysis method in software trustworthiness measurement is demonstrated using numerical simulations and theoretical analysis.
Towards proper sampling and statistical analysis of defects
Directory of Open Access Journals (Sweden)
Cetin Ali
2014-06-01
Full Text Available Advancements in applied statistics with great relevance to defect sampling and analysis are presented. Three main issues are considered: (i) proper handling of multiple defect types, (ii) relating sample data originating from polished inspection surfaces (2D) to finite material volumes (3D), and (iii) application of advanced extreme value theory in statistical analysis of block maximum data. Original and rigorous, but practical mathematical solutions are presented. Finally, these methods are applied to make predictions regarding defect sizes in a steel alloy containing multiple defect types.
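Issue (iii), extreme value analysis of block maxima, can be sketched with a generalized extreme value (GEV) fit; the "defect sizes" below are synthetic, not the paper's steel-alloy data.

```python
# Sketch: fitting a generalized extreme value (GEV) distribution to block
# maximum "defect sizes" (synthetic), as in extreme value analysis of the
# largest defect per inspection block.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# 200 inspection blocks, each contributing the maximum of 50 defect sizes.
block_max = rng.exponential(scale=10.0, size=(200, 50)).max(axis=1)

shape, loc, scale = stats.genextreme.fit(block_max)
# Size exceeded by roughly one block in a hundred (99th percentile).
size_99 = stats.genextreme.ppf(0.99, shape, loc=loc, scale=scale)
print(shape, loc, scale, size_99)
```

Since exponential parent distributions give Gumbel-type maxima, the fitted shape parameter comes out near zero here; real defect data would decide the domain of attraction.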
Adaptive strategy for the statistical analysis of connectomes.
Directory of Open Access Journals (Sweden)
Djalel Eddine Meskaldji
Full Text Available We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes). Our approach is at a middle level between a global analysis and single-connection analysis, considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, one that characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with one based on individual measures in terms of power. We show that this strategy has great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with 22q11.2 deletion syndrome, distinguished by their IQ scores.
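The false-discovery correction step of such a strategy can be sketched with the Benjamini-Hochberg procedure over per-subnetwork p-values; the p-values below are made up for illustration.

```python
# Sketch of the correction step: Benjamini-Hochberg control of the false
# discovery rate over per-subnetwork p-values (values here are invented).
import numpy as np

p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.27, 0.60, 0.74, 0.95])
alpha = 0.05

m = len(p_values)
order = np.argsort(p_values)
thresholds = alpha * np.arange(1, m + 1) / m        # BH step-up thresholds
passed = p_values[order] <= thresholds
# Reject every hypothesis up to the largest passing rank.
k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
rejected = np.zeros(m, dtype=bool)
rejected[order[:k]] = True
print(rejected)
```

Only the subnetworks surviving this step would then be opened up for the local, single-connection investigation.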
Statistical Error analysis of Nucleon-Nucleon phenomenological potentials
Perez, R Navarro; Arriola, E Ruiz
2014-01-01
Nucleon-Nucleon potentials are commonplace in nuclear physics and are determined from a finite number of experimental data with limited precision sampling the scattering process. We study the statistical assumptions implicit in the standard least squares fitting procedure and apply, along with more conventional tests, a tail-sensitive quantile-quantile test as a simple and confident tool to verify the normality of residuals. We show that the fulfilment of normality tests is linked to a judicious and consistent selection of a nucleon-nucleon database. These considerations prove crucial to a proper statistical error analysis and uncertainty propagation. We illustrate these issues by analyzing about 8000 published proton-proton and neutron-proton scattering data. This enables the construction of potentials meeting all statistical requirements necessary for statistical uncertainty estimates in nuclear structure calculations.
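The residual-normality check underlying this analysis can be sketched on a toy least-squares fit; the data are synthetic, and Shapiro-Wilk stands in for the paper's tail-sensitive quantile-quantile test.

```python
# Sketch of the residual check: after a least-squares fit, test whether the
# standardized residuals are consistent with a normal distribution.
# (Shapiro-Wilk here; the paper uses a tail-sensitive Q-Q test.)
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 120)
y = 2.0 + 3.0 * x + rng.normal(0, 0.1, x.size)   # synthetic "scattering data"

coef = np.polyfit(x, y, 1)
residuals = y - np.polyval(coef, x)
z = residuals / residuals.std(ddof=2)            # standardized residuals

w_stat, p_normal = stats.shapiro(z)
print(w_stat, p_normal)   # large p: no evidence against normality
```

If the residuals fail such a test, the quoted parameter uncertainties inherit no probabilistic meaning, which is why the database selection matters.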
Data analysis using the Gnu R system for statistical computation
Energy Technology Data Exchange (ETDEWEB)
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
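The report's chi-square minimization fits of n-point correlators can be sketched generically; this uses SciPy rather than the report's R packages, the correlator data are synthetic, and the single-exponential model C(t) = A exp(-m t) is the usual leading-order ansatz.

```python
# Sketch of a chi-square minimization fit for a lattice two-point correlator
# C(t) ~ A * exp(-m t). Data and errors below are synthetic.
import numpy as np
from scipy.optimize import curve_fit

t = np.arange(1, 16, dtype=float)
A_true, m_true = 1.5, 0.4
sigma = 0.01 * np.exp(-m_true * t)                 # assumed 1% errors
rng = np.random.default_rng(5)
C = A_true * np.exp(-m_true * t) + rng.normal(0, sigma)

def model(t, A, m):
    return A * np.exp(-m * t)

popt, pcov = curve_fit(model, t, C, p0=[1.0, 0.5], sigma=sigma)
chi2 = np.sum(((C - model(t, *popt)) / sigma) ** 2)
print(popt, chi2 / (len(t) - 2))                   # chi^2 per dof near 1
```

A chi-square per degree of freedom far from 1 would signal either mis-estimated errors or excited-state contamination requiring a multi-exponential model.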
Energy Technology Data Exchange (ETDEWEB)
Fargnoli, H.G.; Sampaio, Marcos; Nemes, M.C. [Federal University of Minas Gerais, ICEx, Physics Department, P.O. Box 702, Belo Horizonte, MG (Brazil); Hiller, B. [Coimbra University, Faculty of Science and Technology, Physics Department, Center of Computational Physics, Coimbra (Portugal); Baeta Scarpelli, A.P. [Setor Tecnico-Cientifico, Departamento de Policia Federal, Lapa, Sao Paulo (Brazil)
2011-05-15
We present both an ultraviolet and an infrared regularization independent analysis in a symmetry preserving framework for the N=1 Super Yang-Mills beta function to two loop order. We show explicitly that off-shell infrared divergences as well as the overall two loop ultraviolet divergence cancel out, whilst the beta function receives contributions from infrared modes. (orig.)
How little data is enough? Phase-diagram analysis of sparsity-regularized X-ray computed tomography
DEFF Research Database (Denmark)
Jørgensen, Jakob Sauer; Sidky, E. Y.
2015-01-01
We introduce phase-diagram analysis, a standard tool in compressed sensing (CS), to the X-ray computed tomography (CT) community as a systematic method for determining how few projections suffice for accurate sparsity-regularized reconstruction. In CS, a phase diagram is a convenient way to study...
Regularized discriminant analysis for breast mass detection on full field digital mammograms
Wei, Jun; Sahiner, Berkman; Zhang, Yiheng; Chan, Heang-Ping; Hadjiiski, Lubomir M.; Zhou, Chuan; Ge, Jun; Wu, Yi-Ta
2006-03-01
In computer-aided detection (CAD) applications, an important step is to design a classifier for the differentiation of the abnormal from the normal structures. We have previously developed a stepwise linear discriminant analysis (LDA) method with simplex optimization for this purpose. In this study, our goal was to investigate the performance of a regularized discriminant analysis (RDA) classifier in combination with a feature selection method for classification of the masses and normal tissues detected on full field digital mammograms (FFDM). The feature selection scheme combined a forward stepwise feature selection process and a backward stepwise feature elimination process to obtain the best feature subset. An RDA classifier and an LDA classifier in combination with this new feature selection method were compared to an LDA classifier with stepwise feature selection. A data set of 130 patients containing 260 mammograms with 130 biopsy-proven masses was used. All cases had two mammographic views. The true locations of the masses were identified by experienced radiologists. To evaluate the performance of the classifiers, we randomly divided the data set into two independent sets of approximately equal size for training and testing. The training and testing were performed using the 2-fold cross validation method. The detection performance of the CAD system was assessed by free response receiver operating characteristic (FROC) analysis. The average test FROC curve was obtained by averaging the FP rates at the same sensitivity along the two corresponding test FROC curves from the 2-fold cross validation. At the case-based sensitivities of 90%, 80% and 70% on the test set, our RDA classifier with the new feature selection scheme achieved an FP rate of 1.8, 1.1, and 0.6 FPs/image, respectively, compared to 2.1, 1.4, and 0.8 FPs/image with stepwise LDA with simplex optimization. Our results indicate that RDA in combination with the sequential forward inclusion
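The core RDA idea, shrinking each class covariance toward the pooled covariance before applying the Gaussian classifier, can be sketched on synthetic two-class features; the data, the shrinkage weight, and the feature dimension are all invented for illustration.

```python
# Sketch of regularized discriminant analysis (RDA): shrink each class
# covariance toward the pooled covariance, then classify with the Gaussian
# discriminant score. Two synthetic "mass" vs "normal tissue" classes.
import numpy as np

rng = np.random.default_rng(7)
X0 = rng.normal(0.0, 1.0, (60, 4))          # class 0: normal-tissue features
X1 = rng.normal(1.0, 1.2, (60, 4))          # class 1: mass features

def cov(X):
    return np.cov(X, rowvar=False)

pooled = 0.5 * (cov(X0) + cov(X1))
lam = 0.5                                   # assumed regularization weight
S0 = (1 - lam) * cov(X0) + lam * pooled
S1 = (1 - lam) * cov(X1) + lam * pooled
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)

def score(x, mu, S):
    d = x - mu
    return -0.5 * (d @ np.linalg.solve(S, d) + np.log(np.linalg.det(S)))

def predict(x):
    return int(score(x, mu1, S1) > score(x, mu0, S0))

acc = np.mean([predict(x) == 0 for x in X0] + [predict(x) == 1 for x in X1])
print(acc)
```

With lam = 1 this reduces to LDA and with lam = 0 to QDA; the interpolation is what stabilizes covariance estimates when training samples are scarce.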
Local Regularity Analysis with Wavelet Transform in Gear Tooth Failure Detection
Nissilä, Juhani
2017-09-01
Diagnosing gear tooth and bearing failures in industrial power transmission applications has been studied extensively, but challenges remain. This study aims to look at the problem from a more theoretical perspective. Our goal is to find out if the local regularity, i.e. smoothness, of the measured signal can be estimated from the vibrations of epicyclic gearboxes, and if the regularity can be linked to the meshing events of the gear teeth. Previously it has been shown that the decreasing local regularity of measured acceleration signals can reveal inner race faults in slowly rotating bearings. The local regularity is estimated from the modulus maxima ridges of the signal's wavelet transform. In this study, the measurements come from the epicyclic gearboxes of the Kelukoski water power station (WPS). The very stable rotational speed of the WPS makes it possible to deduce that the gear mesh frequencies of the WPS and a frequency related to the rotation of the turbine blades are the most significant components in the spectra of the estimated local regularity signals.
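The modulus-maxima estimate of local regularity can be sketched on a synthetic signal: the wavelet modulus along a maxima ridge scales roughly like s^(alpha + 1/2) at a point of Hölder regularity alpha, so the log-log slope across scales separates a smooth point from a step-like fault signature. The hand-rolled Ricker transform and the window widths below are assumptions of this sketch, not the paper's implementation.

```python
# Sketch of local-regularity estimation: compare the log-log slope of the
# wavelet modulus (maximum over a small window, mimicking a maxima ridge)
# across scales at a smooth point vs at a step discontinuity.
import numpy as np

def ricker(points, a):
    t = np.arange(points) - (points - 1) / 2
    return (1 - (t / a) ** 2) * np.exp(-t ** 2 / (2 * a ** 2))

n = 1024
x = np.sin(2 * np.pi * np.arange(n) / 256.0)
x[n // 2:] += 1.0                        # step discontinuity, like an impact

scales = np.arange(2, 20)

def ridge_modulus(sig, center):
    out = []
    for s in scales:
        w = np.convolve(sig, ricker(10 * s, s) / np.sqrt(s), mode="same")
        out.append(np.max(np.abs(w[center - 2 * s:center + 2 * s])))
    return np.array(out)

slope_step = np.polyfit(np.log(scales),
                        np.log(ridge_modulus(x, n // 2)), 1)[0]
slope_smooth = np.polyfit(np.log(scales),
                          np.log(ridge_modulus(x, 192)), 1)[0]
print(slope_step, slope_smooth)          # smaller slope => lower regularity
```

The step location yields a markedly smaller slope (lower regularity) than the smooth location, which is the signature such a method looks for at gear-mesh events.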
Original position statistic distribution analysis study of low alloy steel continuous casting billet
Institute of Scientific and Technical Information of China (English)
WANG; Haizhou; ZHAO; Pei; CHEN; Jiwen; LI; Meiling; YANG; Z
2005-01-01
The homogeneity of low alloy steel continuous casting billets obtained under different technological conditions has been investigated by the original position statistic distribution analysis technique. On the basis of systematic analysis of ten thousand primary optical signals at the corresponding original positions, quantitative statistical distribution information for each element was obtained. The biggest degrees of segregation of the low alloy steel continuous casting billet were calculated accurately according to the quantitative distribution maps of the contents. It was suggested that the weight ratio in a certain content range be used to judge the homogeneity of the materials, and two models -- the total weight ratio of contents (the degree of statistic homogeneity, H) within the permissive content range (C0±R) and the median value confidence extension ratio (the degree of statistic segregation, S) at the 95% confidence limit of weight ratio -- were put forward. The two models reflect the composition and state distribution regularity of metal materials over a large region. The difference between a sample with high columnar crystal content and a sample with high equiaxed crystal content has been studied using the two models.
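The homogeneity index H, the weight fraction of signals falling within the permitted range C0±R, can be sketched directly; the content values are simulated, and the spread measure below is a simplified stand-in for the paper's median-based confidence extension ratio S.

```python
# Sketch of the homogeneity index on synthetic line-scan data: H is the
# fraction of content readings within the permitted range C0 +/- R. The
# spread measure S here is a simplified stand-in (central 95% width / C0),
# not the paper's exact median-confidence definition.
import numpy as np

rng = np.random.default_rng(13)
c0, r = 0.45, 0.03                       # nominal content and permitted range
contents = rng.normal(c0, 0.02, 10_000)  # simulated original-position signals

H = np.mean(np.abs(contents - c0) <= r)             # degree of homogeneity
lo_q, hi_q = np.quantile(contents, [0.025, 0.975])
S = (hi_q - lo_q) / c0                              # crude segregation degree
print(H, S)
```

A billet with strong columnar-zone segregation would show a lower H and a wider S than an equiaxed one.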
Processing and statistical analysis of soil-root images
Razavi, Bahar S.; Hoang, Duyen; Kuzyakov, Yakov
2016-04-01
The importance of hotspots such as the rhizosphere, the small soil volume that surrounds and is influenced by plant roots, calls for spatially explicit methods to visualize the distribution of microbial activities in this active site (Kuzyakov and Blagodatskaya, 2015). The zymography technique has previously been adapted to visualize the spatial dynamics of enzyme activities in the rhizosphere (Spohn and Kuzyakov, 2014). Following further development of soil zymography - to obtain a higher resolution of enzyme activities - we aimed to 1) quantify the images, 2) determine whether the pattern (e.g. distribution of hotspots in space) is clumped (aggregated) or regular (dispersed). To this end, we incubated soil-filled rhizoboxes with maize Zea mays L. and without maize (control box) for two weeks. In situ soil zymography was applied to visualize the enzymatic activity of β-glucosidase and phosphatase at the soil-root interface. The spatial resolution of fluorescent images was improved by direct application of a substrate-saturated membrane to the soil-root system. Furthermore, we applied spatial point pattern analysis to determine whether the pattern is clumped or regular. Our results demonstrated that the distribution of hotspots in the rhizosphere is clumped (aggregated), compared to the control box without plants, which showed a regular (dispersed) pattern. These patterns were similar in all three replicates and for both enzymes. We conclude that improved zymography is a promising in situ technique to identify, analyze, visualize and quantify the spatial distribution of enzyme activities in the rhizosphere. Moreover, such different patterns should be considered in assessments and modeling of rhizosphere extension and the corresponding effects on soil properties and functions. Key words: rhizosphere, spatial point pattern, enzyme activity, zymography, maize.
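One standard spatial point pattern statistic for the clumped-versus-regular question is the Clark-Evans ratio of the observed mean nearest-neighbour distance to its expectation under complete spatial randomness; the hotspot coordinates below are simulated, not the zymography data.

```python
# Sketch of a spatial point pattern test: the Clark-Evans ratio R of observed
# mean nearest-neighbour distance to the value expected under complete
# spatial randomness. R < 1 suggests clumped hotspots, R > 1 regular spacing.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(8)
# Hypothetical hotspot coordinates clustered around a few "root" centers.
centers = rng.uniform(0, 10, (5, 2))
points = np.vstack([c + rng.normal(0, 0.15, (30, 2)) for c in centers])

tree = cKDTree(points)
d, _ = tree.query(points, k=2)           # k=2: nearest neighbour != itself
observed = d[:, 1].mean()

area = 10.0 * 10.0
density = len(points) / area
expected = 0.5 / np.sqrt(density)        # CSR expectation for mean NN distance
R = observed / expected
print(R)                                 # R well below 1 => aggregated
```

A significance test would compare R against its sampling distribution, but the ratio alone already separates the planted and control patterns described in the abstract.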
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
be conducted using the χ² distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...
Advanced Statistical and Data Analysis Tools for Astrophysics
Kashyap, V.; Scargle, Jeffrey D. (Technical Monitor)
2001-01-01
The goal of the project is to obtain, derive, and develop statistical and data analysis tools that would be of use in the analyses of high-resolution, high-sensitivity data that are becoming available with new instruments. This is envisioned as a cross-disciplinary effort with a number of collaborators.
A New Statistic for Variable Selection in Questionnaire Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Jun-hua; FANG Wei-wu
2001-01-01
In this paper, a new statistic is proposed for variable selection which is one of the important problems in analysis of questionnaire data. Contrasting to other methods, the approach introduced here can be used not only for two groups of samples but can also be easily generalized to the multi-group case.
Measures of radioactivity: a tool for understanding statistical data analysis
Montalbano, Vera
2012-01-01
A learning path on radioactivity in the last class of high school is presented. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources are monitored for days or weeks. The data are analyzed in order to understand the importance of statistical analysis in modern physics.
Statistical Analysis of Hypercalcaemia Data related to Transferability
DEFF Research Database (Denmark)
Frølich, Anne; Nielsen, Bo Friis
2005-01-01
In this report we describe statistical analyses related to a study of hypercalcaemia carried out in the Copenhagen area in the ten-year period from 1984 to 1994. Results from the study have previously been published in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...
AstroStat - A VO Tool for Statistical Analysis
Kembhavi, Ajit K; Kale, Tejas; Jagade, Santosh; Vibhute, Ajay; Garg, Prerak; Vaghmare, Kaustubh; Navelkar, Sharmad; Agrawal, Tushar; Nandrekar, Deoyani; Shaikh, Mohasin
2015-01-01
AstroStat is an easy-to-use tool for performing statistical analysis on data. It has been designed to be compatible with Virtual Observatory (VO) standards, thus enabling it to become an integral part of the currently available collection of VO tools. A user can load data in a variety of formats into AstroStat and perform various statistical tests using a menu-driven interface. Behind the scenes, all analysis is done using the public domain statistical software R, and the output returned is presented in a neatly formatted form to the user. The analyses performable include exploratory tests, visualizations, distribution fitting, correlation & causation, hypothesis testing, multivariate analysis and clustering. The tool is available in two versions with identical interface and features - as a web service that can be run using any standard browser and as an offline application. AstroStat will provide an easy-to-use interface which can allow for both fetching data and performing powerful statistical analysis on ...
Introduction to Statistics and Data Analysis With Computer Applications I.
Morris, Carl; Rolph, John
This document consists of unrevised lecture notes for the first half of a 20-week in-house graduate course at Rand Corporation. The chapter headings are: (1) Histograms and descriptive statistics; (2) Measures of dispersion, distance and goodness of fit; (3) Using JOSS for data analysis; (4) Binomial distribution and normal approximation; (5)…
Investigation of Weibull statistics in fracture analysis of cast aluminum
Holland, F. A., Jr.; Zaretsky, E. V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodology based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
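A two-parameter Weibull fit of the kind used to compare the specimen batches can be sketched as follows; the strength values are synthetic, not the A357-T6 data.

```python
# Sketch: two-parameter Weibull fit to (synthetic) fracture strengths, the
# same distribution family used to compare the cast-aluminum batches.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
shape_true, scale_true = 8.0, 300.0           # hypothetical MPa-like values
strengths = scale_true * rng.weibull(shape_true, 200)

# Two-parameter fit: location fixed at zero (floc=0).
shape, loc, scale = stats.weibull_min.fit(strengths, floc=0)
# B10 strength: the stress at which 10% of specimens are expected to fail.
b10 = scale * (-np.log(0.9)) ** (1.0 / shape)
print(shape, scale, b10)
```

The fitted shape (Weibull modulus) quantifies strength scatter between batches, and percentile strengths like B10 feed directly into the pressure-vessel design margin.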
Multivariate statistical analysis of precipitation chemistry in Northwestern Spain
Energy Technology Data Exchange (ETDEWEB)
Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)
1993-07-01
A total of 149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. First, principal component analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact of pollution from these sources on the immediate environment of the stations is studied using factorial discriminant analysis. 8 refs., 7 figs., 11 tabs.
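Principal component analysis as used above can be sketched via the singular value decomposition of the standardized data matrix. The data here are synthetic, built from two hypothetical latent "marine" and "acid" factors loosely inspired by the study; the ion names and loadings are assumptions, not the published measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 149  # number of rainwater samples, as in the study

# Two latent pollution factors driving the measured ion concentrations
marine = rng.lognormal(0.0, 0.5, n)
acid = rng.lognormal(0.0, 0.5, n)
X = np.column_stack([
    2.0 * marine + rng.normal(0, 0.1, n),  # Na+  (marine-driven)
    1.8 * marine + rng.normal(0, 0.1, n),  # Cl-  (marine-driven)
    1.5 * acid + rng.normal(0, 0.1, n),    # SO4-- (acid-driven)
    1.2 * acid + rng.normal(0, 0.1, n),    # NO3-  (acid-driven)
])

# PCA via SVD of the centred and standardised data matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, sing, Vt = np.linalg.svd(Z, full_matrices=False)
explained = sing**2 / np.sum(sing**2)
print(explained.round(2))  # two dominant components, one per latent source
```

With two latent sources, the first two principal components carry almost all of the variance, which is exactly the kind of structure PCA exposes in precipitation chemistry data.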
An analysis of electrical impedance tomography with applications to Tikhonov regularization
Jin, Bangti
2012-01-16
This paper analyzes the continuum model/complete electrode model in the electrical impedance tomography inverse problem of determining the conductivity parameter from boundary measurements. The continuity and differentiability of the forward operator with respect to the conductivity parameter in L p-norms are proved. These analytical results are applied to several popular regularization formulations, which incorporate a priori information of smoothness/sparsity on the inhomogeneity through Tikhonov regularization, for both linearized and nonlinear models. Some important properties, e.g., existence, stability, consistency and convergence rates, are established. This provides some theoretical justifications of their practical usage. © EDP Sciences, SMAI, 2012.
Regularization in kernel learning
Mendelson, Shahar; 10.1214/09-AOS728
2010-01-01
Under mild assumptions on the kernel, we obtain the best known error rates in a regularized learning scenario taking place in the corresponding reproducing kernel Hilbert space (RKHS). The main novelty in the analysis is a proof that one can use a regularization term that grows significantly slower than the standard quadratic growth in the RKHS norm.
HistFitter software framework for statistical data analysis
Baak, M.; Besjes, G. J.; Côté, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-04-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface.
Numerical analysis of regular waves over an onshore oscillating water column
Energy Technology Data Exchange (ETDEWEB)
Davyt, D.P.; Teixeira, P.R.F. [Universidade Federal do Rio Grande (FURG), RS (Brazil)], E-mail: pauloteixeira@furg.br; Ramalhais, R. [Universidade Nova de Lisboa, Caparica (Portugal). Fac. de Ciencias e Tecnologia; Didier, E. [Laboratorio Nacional de Engenharia Civil, Lisboa (Portugal)], E-mail: edidier@lnec.pt
2010-07-01
The potential of wave energy along coastal areas is a particularly attractive option in regions of high latitude, such as the coasts of northern Europe, North America, New Zealand, Chile and Argentina, where high densities of annual average wave energy are found (typically between 40 and 100 kW/m of wave front). The power estimated in the south of Brazil is 30 kW/m, making wave energy a possible alternative energy source in the region. Many types and designs of equipment to capture energy from waves are under analysis, such as the oscillating water column (OWC) type, which was one of the first to be developed and installed at sea. Despite being one of the most analyzed wave energy converter devices, there are few case studies using numerical simulation. In this context, the numerical analysis of regular waves over an onshore OWC is the main objective of this paper. The numerical models FLUINCO and FLUENT are used for achieving this goal. The FLUINCO model is based on RANS equations which are discretized using the two-step semi-implicit Taylor-Galerkin method. An arbitrary Lagrangian-Eulerian formulation is used to enable the solution of problems involving free surface movements. The FLUENT code (version 6.3.26) is based on the finite volume method to solve RANS equations. The Volume of Fluid (VOF) method is used for modeling free surface flows. Time integration is achieved by a second-order implicit scheme, momentum equations are discretized using the MUSCL scheme, and the HRIC (High Resolution Interface Capturing) scheme is used for the convective term of the VOF transport equation. The case study consists of a 10 m deep channel with a 10 m wide chamber at its end. One-meter-high waves with different periods are simulated. Comparisons between FLUINCO and FLUENT results are presented: free surface elevation inside the chamber; velocity distribution and streamlines; amplification factor (relation between wave height inside the chamber and incident wave height); phase angle (angular
Common pitfalls in statistical analysis: Linear regression analysis.
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
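A minimal worked example of the linear regression discussed above, on synthetic data: fit a line by ordinary least squares, then run two of the quick assumption checks the article alludes to (residuals centred on zero and uncorrelated with the predictor). All values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 2.5 * x + 1.0 + rng.normal(0, 1.0, 100)  # true slope 2.5, intercept 1.0

# Ordinary least-squares fit of y on x
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Assumption checks: OLS residuals (with an intercept) are exactly centred
# on zero and orthogonal to the predictor, up to floating-point error.
print(round(slope, 2), round(intercept, 2))
print(abs(resid.mean()) < 1e-9, abs(np.corrcoef(x, resid)[0, 1]) < 1e-6)
```

A residual-versus-predictor plot (or the correlation check above) is one of the simplest guards against the pitfall of fitting a straight line to a clearly nonlinear relationship.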
Common pitfalls in statistical analysis: Linear regression analysis
Directory of Open Access Journals (Sweden)
Rakesh Aggarwal
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
Statistical analysis of absorptive laser damage in dielectric thin films
Energy Technology Data Exchange (ETDEWEB)
Budgor, A.B.; Luria-Budgor, K.F.
1978-09-11
The Weibull distribution arises as an example of the theory of extreme events. It is commonly used to fit statistical data arising in the failure analysis of electrical components and in DC breakdown of materials. This distribution is employed to analyze time-to-damage and intensity-to-damage statistics obtained when irradiating thin-film-coated samples of SiO₂, ZrO₂, and Al₂O₃ with tightly focused laser beams. The data used is furnished by Milam. The fit to the data is excellent, and least-squares correlation coefficients greater than 0.9 are often obtained.
Comparison of Statistical Models for Regional Crop Trial Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Qun-yuan; KONG Fan-ling
2002-01-01
Based on a review and comparison of the main statistical analysis models for estimating variety-environment cell means in regional crop trials, a new statistical model, the LR-PCA composite model, was proposed, and the predictive precision of these models was compared by cross validation on example data. Results showed that the order of model precision was LR-PCA model > AMMI model > PCA model > Treatment Means (TM) model > Linear Regression (LR) model > Additive Main Effects ANOVA model. The precision gain factor of the LR-PCA model was 1.55, an increase of 8.4% compared with AMMI.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Banerjee, Anirban
2016-01-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
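The spectral comparison described above can be sketched in a few lines: compute each graph's Laplacian eigenvalue distribution and compare distributions with the Jensen-Shannon divergence. The graphs below are synthetic Erdős-Rényi graphs standing in for earthquake networks; all sizes and densities are illustrative assumptions.

```python
import numpy as np

def spectral_distribution(adj, bins):
    """Normalised histogram of the Laplacian eigenvalues of an undirected graph."""
    lap = np.diag(adj.sum(axis=1)) - adj
    lam = np.linalg.eigvalsh(lap)
    hist, _ = np.histogram(lam, bins=bins)
    return hist / hist.sum()

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = np.random.default_rng(3)

def random_graph(n, p):
    """Erdős-Rényi graph as a symmetric 0/1 adjacency matrix."""
    a = (rng.random((n, n)) < p).astype(float)
    a = np.triu(a, 1)
    return a + a.T

bins = np.linspace(0.0, 60.0, 31)
g1 = random_graph(50, 0.2)  # two sparse graphs of similar density...
g2 = random_graph(50, 0.2)
g3 = random_graph(50, 0.8)  # ...and one much denser graph
d_similar = jensen_shannon(spectral_distribution(g1, bins),
                           spectral_distribution(g2, bins))
d_different = jensen_shannon(spectral_distribution(g1, bins),
                             spectral_distribution(g3, bins))
print(d_similar < d_different)  # structurally similar graphs are spectrally closer
```

The divergence is bounded between 0 and 1 (in bits), which makes it a convenient distance-like quantity to feed into the hierarchical clustering of regions.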
[Some basic aspects in statistical analysis of visual acuity data].
Ren, Ze-Qin
2007-06-01
All visual acuity charts currently in use have their own shortcomings, which makes it difficult for ophthalmologists to evaluate visual acuity data. Many problems are present in the use of statistical methods for handling visual acuity data in clinical research. The quantitative relationship between visual acuity and visual angle varies among visual acuity charts, and the types of visual acuity and visual angle data differ from each other. Therefore, different statistical methods should be used for different data sources. A correct understanding and analysis of visual acuity data can be obtained only after these aspects have been elucidated.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2017-09-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
Do Labour Market Programmes Necessarily Crowd out Regular Employment? A Matching Model Analysis
Miller, J.G.
1996-01-01
It is often claimed that the usage of labour market programmes will necessarily crowd out regular employment (see, for example, Holmlund & Lindén (1993)). As a result, it could be argued that, despite their probable negative impact on unemployment, the overall benefits of using labour market program
Intelligent analysis of chaos roughness in regularity of walk for a two legged robot
Energy Technology Data Exchange (ETDEWEB)
Kaygisiz, Burak H. [Guidance and Control Division, TUBITAK-SAGE, 684 Sokak anka Evleri No. 67/30, Cayyolu, Ankara (Turkey)]. E-mail: burak.kaygisiz@gmail.com; Erkmen, Ismet [Electrical and Electronics Engineering, Middle East Technical University (Turkey)]. E-mail: erkmen@metu.edu.tr; Erkmen, Aydan M. [Electrical and Electronics Engineering, Middle East Technical University (Turkey)]. E-mail: aydan@metu.edu.tr
2006-07-15
We describe in this paper a new approach to the identification of the chaotic boundaries of regular (periodic and quasiperiodic) regions in nonlinear systems, using cell mapping equipped with measures of fractal dimension and rough sets. The proposed fractal-rough set approach considers a state space divided into cells where cell trajectories are determined using cell to cell mapping technique. All image cells in the state space, equipped with their individual fractal dimension are then classified as being members of lower approximation, upper approximation or boundary region of regular regions with the help of rough set theory. The rough set with fractal dimension as its attribute is used to model the uncertainty of the regular regions, treated as sets of cells in this paper. This uncertainty is then smoothed by a reinforcement learning algorithm in order to enrich regular regions that are used for control. Our approach is applied to the walking control of a two legged robot, which fails very frequently due to chaotic behavior.
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-01
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
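The core tool for the compositional data analysis described above is the centred log-ratio (clr) transform, which maps compositions from the simplex into ordinary Euclidean space where means, covariances and interpolation are well defined. A minimal sketch with hypothetical atomic concentrations (the alloy values are illustrative, not from the paper):

```python
import numpy as np

def closure(x):
    """Rescale each composition so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform: maps the simplex to Euclidean space,
    where ordinary statistics (mean, covariance, interpolation) are valid."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

# Hypothetical atomic concentrations (at.%) of a ternary material system
comps = np.array([[60.0, 30.0, 10.0],
                  [55.0, 40.0, 5.0],
                  [50.0, 25.0, 25.0]])
z = clr(comps)
print(np.allclose(z.sum(axis=1), 0.0))  # clr coordinates always sum to zero
```

Because clr depends only on ratios between parts, it is invariant to the overall scale of a measurement, which is exactly the property needed when measured compositions are subcompositions of a larger parent composition.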
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
Statistical Analysis of SAR Sea Clutter for Classification Purposes
Directory of Open Access Journals (Sweden)
Jaime Martín-de-Nicolás
2014-09-01
Statistical analysis of radar clutter has always been one of the topics where most effort has been invested over the last few decades. These studies usually focused on finding the statistical models that best fitted the clutter distribution; however, the goal of this work is not the modeling of the clutter, but the study of the suitability of statistical parameters for carrying out sea state classification. In order to achieve this objective and give relevance to this study, an important set of maritime and coastal Synthetic Aperture Radar data is considered. Due to the nature of data acquisition by SAR sensors, speckle noise is inherent to these data, and a specific study of how this noise affects the clutter distribution is also performed in this work. For the sake of completeness, a thorough study of the most suitable statistical parameters, as well as the most adequate classifier, is carried out, achieving excellent results in terms of classification success rates. These concluding results confirm that sea state classification is not only viable but also successful using statistical parameters different from those of the best modeling distribution and applying a speckle filter, which allows a better characterization of the parameters used to distinguish between different sea states.
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-06
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from the original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied to different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise, which are nevertheless the methods currently used in the vast majority of wavelet applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods such as the hidden Markov model algorithm and the 'beta-surrogate' method should be used.
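The red-noise null model mentioned above can be sketched as AR(1) surrogate generation: fit the series' lag-1 autocorrelation and variance, then simulate many AR(1) series with those properties. This is a generic illustration of the idea on a toy seasonal series, not the authors' specific testing procedure.

```python
import numpy as np

def red_noise_surrogates(series, n_surr, rng):
    """AR(1) ('red noise') surrogates matching the series' lag-1
    autocorrelation and variance -- a common null model for significance
    testing of wavelet spectra."""
    x = series - series.mean()
    r1 = np.corrcoef(x[:-1], x[1:])[0, 1]     # lag-1 autocorrelation
    sigma = x.std() * np.sqrt(1.0 - r1**2)    # innovation standard deviation
    out = np.empty((n_surr, len(x)))
    for k in range(n_surr):
        s = np.empty(len(x))
        s[0] = rng.normal(0.0, x.std())       # start from stationary distribution
        for t in range(1, len(x)):
            s[t] = r1 * s[t - 1] + rng.normal(0.0, sigma)
        out[k] = s
    return out

rng = np.random.default_rng(4)
# Toy "epidemiological" series: a weekly-resolved annual cycle plus noise
t = np.arange(300)
series = np.sin(2 * np.pi * t / 52) + rng.normal(0, 0.5, 300)
surr = red_noise_surrogates(series, 100, rng)

# Surrogates preserve the lag-1 autocorrelation but destroy the phase
# of the periodic signal
r1_orig = np.corrcoef(series[:-1], series[1:])[0, 1]
r1_surr = np.mean([np.corrcoef(s[:-1], s[1:])[0, 1] for s in surr])
print(round(r1_orig, 2), round(r1_surr, 2))
```

Wavelet power observed in the original series is then declared significant only where it exceeds a high quantile (e.g. the 95th percentile) of the power obtained from the surrogate ensemble.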
Salvat, I; Zaldivar, P; Monterde, S; Montull, S; Miralles, I; Castel, A
2017-03-01
Multidisciplinary treatments have been shown to be effective for fibromyalgia. We report detailed functional outcomes of patients with fibromyalgia who attended a 3-month Multidisciplinary treatment program. The hypothesis was that patients would have increased functional status, physical activity level, and exercise regularity after attending this program. We performed a retrospective analysis of a randomized, single-blind clinical trial. The inclusion criteria consisted of female sex, a diagnosis of fibromyalgia, age 18-60 and 3-8 years of schooling. Measures from the Fibromyalgia Impact Questionnaire (FIQ) and the COOP/WONCA Functional Health Assessment Charts (WONCA) were obtained before and at the end of the treatment and at 3-, 6-, and 12-month follow-ups. Patients recorded their number of steps per day with pedometers. They performed the six-minute walk test (6 MW) before and after treatment. In total, 155 women participated in the study. Their median (interquartile interval) FIQ score was 68.0 (53.0-77.0) at the beginning of the treatment, and the difference between the Multidisciplinary and Control groups was statistically and clinically significant in all of the measures (except at the 6-month follow-up). The WONCA charts showed significant clinical improvements in the Multidisciplinary group, with physical fitness in the normal range across almost all values. In that group, steps/day showed more regularity, and the 6 MW results showed improvement of -33.00 (-59.8 to -8.25) m, and the differences from the Control group were statistically significant. The patients who underwent the Multidisciplinary treatment had improved functional status, physical activity level, and exercise regularity. The functional improvements were maintained 1 year after treatment completion.
Statistical analysis of the precision of the Match method
Directory of Open Access Journals (Sweden)
R. Lehmann
2005-05-01
The Match method quantifies chemical ozone loss in the polar stratosphere. The basic idea consists in calculating the forward trajectory of an air parcel that has been probed by an ozone measurement (e.g., by an ozone sonde or satellite) and finding a second ozone measurement close to this trajectory. Such an event is called a "match". A rate of chemical ozone destruction can be obtained by a statistical analysis of several tens of such match events. Information on the uncertainty of the calculated rate can be inferred from the scatter of the ozone mixing ratio difference (second measurement minus first measurement) associated with individual matches. A standard analysis would assume that the errors of these differences are statistically independent. However, this assumption may be violated because different matches can share a common ozone measurement, so that the errors associated with these match events become statistically dependent. Taking this effect into account, we present an analysis of the uncertainty of the final Match result. It has been applied to Match data from the Arctic winters 1995, 1996, 2000, and 2003. For these ozone-sonde Match studies the effect of the error correlation on the uncertainty estimates is rather small: compared to a standard error analysis, the uncertainty estimates increase by 15% on average. However, the effect is more pronounced for typical satellite Match analyses: for an Antarctic satellite Match study (2003), the uncertainty estimates increase by 60% on average.
Pashkevich, Anatoly; Chablat, Damien
2007-01-01
The Orthoglide is a Delta-type PKM dedicated to 3-axis rapid machining applications that was originally developed at IRCCyN in 2000-2001 to combine the advantages of both serial 3-axis machines (regular workspace and homogeneous performances) and parallel kinematic architectures (good dynamic performances and stiffness). This machine has three fixed parallel linear joints that are mounted orthogonally. The geometric parameters of the Orthoglide were defined as a function of the size of a prescribed cubic Cartesian workspace that is free of singularities and internal collisions. The interesting features of the Orthoglide are a regular Cartesian workspace shape, uniform performances in all directions and good compactness. In this paper, a new method is proposed to analyze the stiffness of overconstrained Delta-type manipulators, such as the Orthoglide. The Orthoglide is then benchmarked according to geometric, kinematic and stiffness criteria: workspace to footprint ratio, velocity and force transmission factors, sen...
Towards Advanced Data Analysis by Combining Soft Computing and Statistics
Gil, María; Sousa, João; Verleysen, Michel
2013-01-01
Soft computing, as an engineering science, and statistics, as a classical branch of mathematics, emphasize different aspects of data analysis. Soft computing focuses on obtaining working solutions quickly, accepting approximations and unconventional approaches. Its strength lies in its flexibility to create models that suit the needs arising in applications. In addition, it emphasizes the need for intuitive and interpretable models, which are tolerant to imprecision and uncertainty. Statistics is more rigorous and focuses on establishing objective conclusions based on experimental data by analyzing the possible situations and their (relative) likelihood. It emphasizes the need for mathematical methods and tools to assess solutions and guarantee performance. Combining the two fields enhances the robustness and generalizability of data analysis methods, while preserving the flexibility to solve real-world problems efficiently and intuitively.
Statistical Analysis of Ship Collisions with Bridges in China Waterway
Institute of Scientific and Technical Information of China (English)
DAI Tong-yu; NIE Wu; LIU Ying-jie; WANG Li-ping
2002-01-01
Based on investigations of ship collision accidents with bridges in Chinese waterways, a database of ship collisions with bridges (SCB) is developed in this paper. It includes detailed information about more than 200 accidents near shipping waterways over the last four decades in which ships collided with bridges. A tentative statistical analysis of this information is presented. The frequency of ship collisions with bridges is increasing, and accidents involving barge systems outnumber those involving single ships. The main cause of ship-bridge collisions among all factors is human error, which accounts for 70% of cases. The number of accidents occurring during the flood season is 3 to 6 times that of the period from March to June. According to the statistical analysis, the probability follows a normal distribution. Visibility and the span between piers also affect the frequency of the accidents.
Collagen morphology and texture analysis: from statistics to classification
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-07-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage.
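Second-order texture statistics of the kind used above rest on the gray-level co-occurrence matrix (GLCM): the joint frequency of gray-level pairs at a fixed pixel displacement. A minimal numpy sketch on synthetic 8-level textures (not SHG collagen images); a production pipeline would typically use a library implementation such as scikit-image's `graycomatrix`.

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Symmetric, normalised gray-level co-occurrence matrix for one
    displacement (dx, dy), with dx, dy >= 0."""
    h, w = image.shape
    src = image[:h - dy, :w - dx]
    dst = image[dy:, dx:]
    m = np.zeros((levels, levels))
    for a, b in zip(src.ravel(), dst.ravel()):
        m[a, b] += 1.0
        m[b, a] += 1.0  # count both orderings -> symmetric matrix
    return m / m.sum()

def contrast(p):
    """GLCM contrast: expected squared gray-level difference of pixel pairs."""
    i, j = np.indices(p.shape)
    return np.sum(p * (i - j) ** 2)

rng = np.random.default_rng(5)
# Two synthetic 8-level textures: smooth horizontal stripes vs pure noise
stripes = np.tile(np.arange(64) // 8 % 8, (64, 1))
noise = rng.integers(0, 8, (64, 64))

c_stripes = contrast(glcm(stripes, 1, 0, 8))
c_noise = contrast(glcm(noise, 1, 0, 8))
print(c_stripes < c_noise)  # locally smooth texture -> much lower contrast
```

Features such as contrast, homogeneity and correlation derived from the GLCM, combined with first-order statistics, form the feature vector fed to the classifier.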
GNSS Spoofing Detection Based on Signal Power Measurements: Statistical Analysis
Directory of Open Access Journals (Sweden)
V. Dehghanian
2012-01-01
A threat to GNSS receivers is posed by a spoofing transmitter that emulates authentic signals but with randomized code phase and Doppler values over a small range. Such spoofing signals can result in large navigational solution errors that are passed onto the unsuspecting user with potentially dire consequences. An effective spoofing detection technique is developed in this paper, based on signal power measurements and that can be readily applied to present consumer grade GNSS receivers with minimal firmware changes. An extensive statistical analysis is carried out based on formulating a multihypothesis detection problem. Expressions are developed to devise a set of thresholds required for signal detection and identification. The detection processing methods developed are further manipulated to exploit incidental antenna motion arising from user interaction with a GNSS handheld receiver to further enhance the detection performance of the proposed algorithm. The statistical analysis supports the effectiveness of the proposed spoofing detection technique under various multipath conditions.
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), yield usually high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to receive tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
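Comparing parametric fits to log-returns, as above, comes down to maximizing each candidate's likelihood and ranking the fits by an information criterion. The sketch below compares only two closed-form cases (normal vs. the heavier-tailed Laplace) by AIC on synthetic returns; the paper itself fits fifteen distributions, including the generalized hyperbolic, which needs numerical optimization.

```python
import numpy as np

def aic_normal(x):
    """AIC of a maximum-likelihood normal fit (2 parameters)."""
    mu, sigma = x.mean(), x.std()
    ll = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                - (x - mu) ** 2 / (2 * sigma**2))
    return 2 * 2 - 2 * ll

def aic_laplace(x):
    """AIC of a maximum-likelihood Laplace fit (2 parameters)."""
    mu = np.median(x)                 # MLE of the Laplace location
    b = np.mean(np.abs(x - mu))       # MLE of the Laplace scale
    ll = np.sum(-np.log(2 * b) - np.abs(x - mu) / b)
    return 2 * 2 - 2 * ll

rng = np.random.default_rng(6)
# Hypothetical heavy-tailed "log-returns" (Laplace for illustration; real
# log-returns would be computed as np.diff(np.log(prices)))
returns = rng.laplace(0.0, 0.02, 1000)
print(aic_laplace(returns) < aic_normal(returns))  # heavier-tailed fit wins
```

Lower AIC means a better fit after penalizing parameter count, which is how a generalized hyperbolic distribution can beat simpler alternatives on genuinely heavy-tailed exchange-rate returns.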
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.
2012-04-30
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
Statistical analysis on reliability and serviceability of caterpillar tractor
Institute of Scientific and Technical Information of China (English)
WANG Jinwu; LIU Jiafu; XU Zhongxiang
2007-01-01
To further understand the reliability and serviceability of tractors and to furnish scientific and technical support for their promotion and application, experiments and statistical analysis of the reliability (reliability and MTBF) and serviceability (serviceability and MTTR) of the Dongfanghong-1002 and Dongfanghong-802 were conducted. The results showed that the mean times between failures of the two tractors were 182.62 h and 160.2 h, respectively, and that the weakest assembly of both was the engine.
Common pitfalls in statistical analysis: Odds versus risk
Ranganathan, Priya; Aggarwal, Rakesh; Pramesh, C. S.
2015-01-01
In biomedical research, we are often interested in quantifying the relationship between an exposure and an outcome. “Odds” and “Risk” are the most common terms which are used as measures of association between variables. In this article, which is the fourth in the series of common pitfalls in statistical analysis, we explain the meaning of risk and odds and the difference between the two. PMID:26623395
Statistical Analysis of the Exchange Rate of Bitcoin.
Directory of Open Access Journals (Sweden)
Jeffrey Chu
Full Text Available Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Price, Erin P; Smith, Helen; Huygens, Flavia; Giffard, Philip M
2007-05-01
A novel method for genotyping the clustered regularly interspaced short palindromic repeat (CRISPR) locus of Campylobacter jejuni is described. Following real-time PCR, CRISPR products were subjected to high-resolution melt (HRM) analysis, a new technology that allows precise melt profile determination of amplicons. This investigation shows that the CRISPR HRM assay provides a powerful addition to existing C. jejuni genotyping methods and emphasizes the potential of HRM for genotyping short sequence repeats in other species.
Statistical methods of SNP data analysis with applications
Bulinski, Alexander; Shashkin, Alexey; Yaskov, Pavel
2011-01-01
Various statistical methods important for genetic analysis are considered and developed. Namely, we concentrate on multifactor dimensionality reduction, logic regression, random forests and stochastic gradient boosting. These methods and their new modifications, e.g., the MDR method with an "independent rule", are used to study the risk of complex diseases such as cardiovascular ones. The roles of certain combinations of single nucleotide polymorphisms and external risk factors are examined. To perform the data analysis concerning ischemic heart disease and myocardial infarction, the supercomputer SKIF "Chebyshev" of Lomonosov Moscow State University was employed.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2009-01-01
An All-in-One Resource for Using SAS and R to Carry out Common TasksProvides a path between languages that is easier than reading complete documentationSAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat...
Constrained and regularized system identification
Directory of Open Access Journals (Sweden)
Tor A. Johansen
1998-04-01
Full Text Available Prior knowledge can be introduced into system identification problems in terms of constraints on the parameter space, or regularizing penalty functions in a prediction error criterion. The contribution of this work is mainly an extension of the well known FPE (Final Prediction Error) statistic to the case when the system identification problem is constrained and contains a regularization penalty. The FPECR statistic (Final Prediction Error with Constraints and Regularization) is of potential interest as a criterion for selection of both regularization parameters and structural parameters such as order.
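The regularized prediction-error criterion that the FPECR statistic evaluates can be illustrated on a toy identification problem. This sketch (model, data, and the penalty weight lam are all invented for illustration; the FPECR statistic itself is not computed) fits a first-order ARX model with a ridge (Tikhonov) penalty via the regularized normal equations.

```python
import random

random.seed(1)

# Simulate y[t] = a*y[t-1] + b*u[t-1] + noise from a known system.
a_true, b_true = 0.8, 0.5
u = [random.gauss(0, 1) for _ in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(a_true * y[t-1] + b_true * u[t-1] + random.gauss(0, 0.05))

# Ridge-regularized least squares: solve (X^T X + lam*I) theta = X^T y
# for regressor rows (y[t-1], u[t-1]); the 2x2 system is solved in closed form.
lam = 0.1
sxx = sum(y[t-1] * y[t-1] for t in range(1, 200)) + lam
suu = sum(u[t-1] * u[t-1] for t in range(1, 200)) + lam
sxu = sum(y[t-1] * u[t-1] for t in range(1, 200))
sxy = sum(y[t-1] * y[t] for t in range(1, 200))
suy = sum(u[t-1] * y[t] for t in range(1, 200))
det = sxx * suu - sxu * sxu
a_hat = (sxy * suu - suy * sxu) / det
b_hat = (suy * sxx - sxy * sxu) / det
print(f"a={a_hat:.3f}, b={b_hat:.3f}")
```

A criterion such as FPECR would then be used to choose lam (and the model order) by trading off fit against the effective number of parameters.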
Statistical Analysis of 30 Years Rainfall Data: A Case Study
Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.
2017-07-01
Rainfall is a prime input for various engineering designs such as hydraulic structures, bridges and culverts, canals, storm water sewers and road drainage systems. A detailed statistical analysis of each region is essential to estimate the relevant input values for the design and analysis of engineering structures and also for crop planning. A rain gauge station located in Trichy district, where agriculture is the prime occupation, is selected for statistical analysis. The daily rainfall data for a period of 30 years are used to understand the normal rainfall, deficit rainfall, excess rainfall and seasonal rainfall of the selected circle headquarters. Further, the various plotting position formulae available are used to evaluate the return periods of monthly, seasonal and annual rainfall. This analysis will provide useful information for water resources planners, farmers and urban engineers to assess the availability of water and create storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall were calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best-fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of the monsoon, which is useful for land preparation and sowing.
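The summary statistics and return-period calculation described above can be sketched with the Weibull plotting position, one of the "various plotting position formulae" mentioned. The annual totals below are invented for illustration, not the Trichy record.

```python
# Hypothetical annual rainfall totals (mm).
annual = [820, 1110, 950, 1290, 760, 1010, 880, 1430, 990, 1170]

n = len(annual)
mean = sum(annual) / n
std = (sum((x - mean) ** 2 for x in annual) / (n - 1)) ** 0.5
cv = std / mean  # coefficient of variation, used to judge rainfall variability
print(f"mean={mean:.0f} mm, std={std:.0f} mm, CV={cv:.2f}")

# Weibull plotting position: exceedance probability P = m/(n+1) for the
# m-th largest value, giving return period T = 1/P years.
for m, x in enumerate(sorted(annual, reverse=True), start=1):
    p = m / (n + 1)
    print(f"{x:5d} mm  P={p:.2f}  T={1 / p:.1f} yr")
```

With 10 years of data the largest observed total is assigned a return period of (n+1)/1 = 11 years; longer records sharpen these estimates.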
HistFitter: a flexible framework for statistical data analysis
Besjes, G J; Côté, D; Koutsman, A; Lorenz, J M; Short, D
2015-01-01
HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms. HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configura...
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismicity is studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes in different magnitude levels. It is known that the interoccurrence time statistics is well approximated by the Weibull distribution, but more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. Firstly, we propose the Embedding Equation Theory (EET), where the conditional probability is described by two kinds of correlation coefficients; one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from numerical data analysis carried out with the Preliminary Determination of Epicenters (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Secondly, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics, and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results reproduce all numerical data in our analysis well, where several common features or invariant aspects are clearly observed. Especially in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve; furthermore, in the case of non-stationary (moving time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving time. It is emphasized that the multi-fractal relation plays an important role in unifying the statistical laws of seismicity: actually the Gutenberg-Richter law and the Weibull distribution are unified in the multi-fractal relation, and some universality conjectures regarding seismicity are briefly discussed.
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. Highlights: The paper discusses the validation of creep rupture models derived from statistical analysis. It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
The Effects of Statistical Analysis Software and Calculators on Statistics Achievement
Christmann, Edwin P.
2009-01-01
This study compared the effects of microcomputer-based statistical software and hand-held calculators on the statistics achievement of university males and females. The subjects, 73 graduate students enrolled in univariate statistics classes at a public comprehensive university, were randomly assigned to groups that used either microcomputer-based…
DEFF Research Database (Denmark)
Hansen, Lars Kai; Rasmussen, Carl Edward; Svarer, C.
1994-01-01
Regularization, e.g., in the form of weight decay, is important for training and optimization of neural network architectures. In this work the authors provide a tool based on asymptotic sampling theory, for iterative estimation of weight decay parameters. The basic idea is to do a gradient descent in the estimated generalization error with respect to the regularization parameters. The scheme is implemented in the authors' Designer Net framework for network training and pruning, i.e., is based on the diagonal Hessian approximation. The scheme does not require essential computational overhead in addition to what is needed for training and pruning. The viability of the approach is demonstrated in an experiment concerning prediction of the chaotic Mackey-Glass series. The authors find that the optimized weight decays are relatively large for densely connected networks in the initial pruning phase, while...
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution depending on the data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommendations ...
Self-Contained Statistical Analysis of Gene Sets
Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher
2016-01-01
Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
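Fisher's self-contained method described above combines the per-gene p-values of a gene set into one test statistic. The sketch below (the floor value and p-values are invented for illustration) applies the authors' modification of capping each gene's p-value so that no single gene dominates, and evaluates the chi-square survival function in closed form, which exists for even degrees of freedom.

```python
import math

# Fisher's combined probability test: X = -2 * sum(ln p_i) is chi-squared
# with 2k degrees of freedom under the null hypothesis.
def fisher_combined(pvals, floor=1e-4):
    capped = [max(p, floor) for p in pvals]  # limit any one gene's influence
    stat = -2.0 * sum(math.log(p) for p in capped)
    k = len(capped)
    # For 2k degrees of freedom the survival function has the closed form
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = stat / 2.0
    term, acc = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        acc += term
    return math.exp(-half) * acc

pvals = [0.001, 0.20, 0.03, 0.50, 0.08]
print(f"combined p = {fisher_combined(pvals):.4f}")
```

With a single gene the combined p-value reduces to that gene's (capped) p-value, which is a useful sanity check.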
Agriculture, population growth, and statistical analysis of the radiocarbon record.
Zahid, H Jabran; Robinson, Erick; Kelly, Robert L
2016-01-26
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.
Statistical wind analysis for near-space applications
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18-30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data, such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated that, on a yearly average, there is a lull or a "knee" in the wind between 65 000 and 72 000 ft AGL (20-22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
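The 50%/95%/99% design winds mentioned above follow from inverting the Weibull cumulative function. The shape and scale values below are illustrative assumptions, not the fitted Akron or White Sands parameters.

```python
import math

# Weibull wind model: CDF F(v) = 1 - exp(-(v/c)^k). Inverting gives the
# p-quantile v_p = c * (-ln(1 - p))^(1/k).
k, c = 1.8, 22.0  # shape (dimensionless) and scale (m/s) -- assumed values

for p in (0.50, 0.95, 0.99):
    v = c * (-math.log(1.0 - p)) ** (1.0 / k)
    print(f"{p:.0%} wind: {v:.1f} m/s")
```

The 99% wind sizes the station-keeping power requirement, which is why the paper cautions against relying on the mean and standard deviation alone.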
Statistical methods for the detection and analysis of radioactive sources
Klumpp, John
We consider four topics from areas of radioactive statistical analysis in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
A Statistical Analysis of Lunisolar-Earthquake Connections
Rüegg, Christian Michael-André
2012-11-01
Despite over a century of study, the relationship between lunar cycles and earthquakes remains controversial and difficult to quantitatively investigate. Perhaps as a consequence, major earthquakes around the globe are frequently followed by "prediction claims" based on lunar cycles, which generate media furore and pressure scientists to provide resolute answers. The 2010-2011 Canterbury earthquakes in New Zealand were no exception; significant media attention was given to lunar-derived earthquake predictions by non-scientists, even though the predictions were merely "opinions" and were not based on any statistically robust temporal or causal relationships. This thesis provides a framework for studying lunisolar earthquake temporal relationships by developing replicable statistical methodology based on peer-reviewed literature. Notable in the methodology is a high-accuracy ephemeris, called ECLPSE, designed specifically by the author for use on earthquake catalogs, and a model for performing phase angle analysis.
Statistical analysis of subjective preferences for video enhancement
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
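The binary logistic analysis of pairwise preferences described above can be sketched as follows. All data here are simulated (enhancement levels, choices, and the slope beta are invented), and the model is a minimal one-parameter stand-in: the probability of preferring one clip is a logistic function of the difference in enhancement levels.

```python
import math, random

random.seed(2)

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Simulated paired comparisons: an observer views clips with enhancement
# levels (a, b) and picks one; P(prefer a) = logistic(beta * (a - b)).
beta_true = 1.5
pairs = [(random.uniform(0, 3), random.uniform(0, 3)) for _ in range(400)]
wins = [1 if random.random() < logistic(beta_true * (a - b)) else 0
        for a, b in pairs]

# Fit beta by gradient ascent on the Bernoulli log-likelihood.
beta = 0.0
for _ in range(1000):
    grad = sum((y - logistic(beta * (a - b))) * (a - b)
               for (a, b), y in zip(pairs, wins))
    beta += 0.1 * grad / len(pairs)
print(f"estimated beta = {beta:.2f}")
```

Unlike Thurstone scaling, the logistic fit comes with standard errors, so the significance of differences between enhancement levels can be tested directly.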
ATP binding to a multisubunit enzyme: statistical thermodynamics analysis
Zhang, Yunxin
2012-01-01
Due to inter-subunit communication, multisubunit enzymes usually hydrolyze ATP in a concerted fashion. However, so far the principle of this process remains poorly understood. In this study, from the viewpoint of statistical thermodynamics, a simple model is presented. In this model, we assume that the binding of ATP will change the potential of the corresponding enzyme subunit, and the degree of this change depends on the state of its adjacent subunits. The probability of the enzyme being in a given state satisfies the Boltzmann distribution. Although quite simple, this model fits the recent experimental data on chaperonin TRiC/CCT well. From this model, the dominant state of TRiC/CCT can be obtained. This study provides a new way to understand biophysical processes by statistical thermodynamics analysis.
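A toy version of such a model can be enumerated exactly. The sketch below (ring size and energy parameters are invented, not the paper's fitted values) assigns each configuration of an 8-subunit ring a Boltzmann weight with a per-binding term and a nearest-neighbour coupling, then reads off the dominant state.

```python
import math
from itertools import product

# Each subunit is empty (0) or ATP-bound (1); energies are in units of kT.
N = 8
eps = -1.0  # energy gain per bound ATP (assumed)
J = -0.5    # extra gain when two adjacent subunits are both bound (assumed)

def energy(state):
    e = eps * sum(state)
    e += J * sum(state[i] * state[(i + 1) % N] for i in range(N))
    return e

# Boltzmann distribution over all 2^8 configurations.
weights = {s: math.exp(-energy(s)) for s in product((0, 1), repeat=N)}
Z = sum(weights.values())
probs = {s: w / Z for s, w in weights.items()}

dominant = max(probs, key=probs.get)
print("dominant state:", dominant, f"p={probs[dominant]:.3f}")
```

With both parameters favouring binding, the fully occupied ring dominates; fitting eps and J to occupancy data is what connects such a model to experiment.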
Statistical analysis of effective singular values in matrix rank determination
Konstantinides, Konstantinos; Yao, Kung
1988-01-01
A major problem in using SVD (singular-value decomposition) as a tool for determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values. To this end, confidence regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theories of perturbations of singular values and statistical significance testing. Threshold bounds for perturbation due to finite-precision and i.i.d. random models are evaluated. In random models, the threshold bounds depend on the dimension of the matrix, the noise variance, and a predefined statistical level of significance. Results are applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.
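The thresholding idea can be illustrated on a tiny example. The matrix, noise level, and threshold rule below are invented for illustration (the paper derives sharper bounds); for a two-column matrix the singular values are obtained in closed form from the eigenvalues of the 2x2 matrix A^T A.

```python
import math

# Singular values of a tall matrix A with 2 columns: square roots of the
# eigenvalues of the symmetric 2x2 matrix A^T A.
def singular_values_2col(A):
    a = sum(r[0] * r[0] for r in A)
    b = sum(r[0] * r[1] for r in A)
    d = sum(r[1] * r[1] for r in A)
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return [math.sqrt(max((tr + disc) / 2, 0.0)),
            math.sqrt(max((tr - disc) / 2, 0.0))]

# Rank-1 matrix (second column = 2 * first) plus a small perturbation.
A = [[1.0, 2.001], [2.0, 3.999], [3.0, 6.002]]
svals = singular_values_2col(A)

# Effective rank: count singular values above a noise-dependent threshold
# (a simple sigma * sqrt(rows) rule is used here as one possible choice).
noise_sigma = 0.01
threshold = noise_sigma * math.sqrt(len(A))
eff_rank = sum(s > threshold for s in svals)
print(f"singular values = {svals}, effective rank = {eff_rank}")
```

The perturbation pushes the second singular value slightly above zero, and the threshold correctly classifies it as noise, recovering rank 1.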
Statistical methods for data analysis in particle physics
Lista, Luca
2015-01-01
This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as a reminder from advanced undergraduate studies, and also with a view to clearly distinguishing the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits, as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as eventually to distinguish between competing hypotheses or to rule some of them out altogether. Many worked examples help newcomers to the field and graduate students to understand the pitfalls of applying theoretical concepts to actual data.
[Statistical analysis of DNA sequences nearby splicing sites].
Korzinov, O M; Astakhova, T V; Vlasov, P K; Roĭtberg, M A
2008-01-01
Recognition of coding regions within eukaryotic genomes is one of the oldest problems in bioinformatics, and it is still not fully solved. New high-accuracy methods of splicing site recognition are needed to solve this problem. A question of current interest is to identify specific features of nucleotide sequences near splicing sites and to recognize sites in sequence context. We performed a statistical analysis of a database of human gene fragments and revealed some characteristics of nucleotide sequences in the neighborhood of splicing sites. Frequencies of all nucleotides and dinucleotides in the splicing site environment were computed, and nucleotides and dinucleotides with extremely high/low occurrences were identified. The statistical information obtained in this work can be used in further development of methods for splicing site annotation and exon-intron structure recognition.
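The frequency counting described above is simple to sketch. The sequence windows below are invented for illustration (real analyses use curated gene-fragment databases); the canonical donor-site "GT" dinucleotide shows up as over-represented.

```python
from collections import Counter

# Toy windows around hypothetical donor splice sites.
windows = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "TTGGTAAGC",
]

nuc = Counter()
dinuc = Counter()
for w in windows:
    nuc.update(w)                                    # single-nucleotide counts
    dinuc.update(w[i:i + 2] for i in range(len(w) - 1))  # overlapping dinucleotides

total_n = sum(nuc.values())
total_d = sum(dinuc.values())
print("nucleotide freqs:", {b: round(nuc[b] / total_n, 3) for b in "ACGT"})
print("top dinucleotides:", dinuc.most_common(3))
print("GT frequency:", round(dinuc["GT"] / total_d, 3))
```

Comparing such observed frequencies against genome-wide background frequencies is what identifies the extremely over- or under-represented words.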
The NIRS Analysis Package: noise reduction and statistical inference.
Fekete, Tomer; Rubin, Denis; Carlson, Joshua M; Mujica-Parodi, Lilianne R
2011-01-01
Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise (e.g., systemic physiological noise, motion-related artifacts, and serial autocorrelations) upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.
On Statistical Analysis of Neuroimages with Imperfect Registration
Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas
2016-01-01
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic: they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yields strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by an UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
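The UCL95% described above, calculated from the number of samples, the average, and the standard deviation, is the standard one-sided t-interval. The sketch below uses invented concentration values (not Tank 19F results); the t-quantile 2.015 is the tabulated one-sided 95% value for 5 degrees of freedom.

```python
import math, statistics

# Upper 95% confidence limit on a mean concentration from n = 6 samples:
# UCL95 = mean + t_{0.95, n-1} * s / sqrt(n)
samples = [12.1, 10.8, 13.4, 11.9, 12.6, 11.2]  # hypothetical mg/kg results
n = len(samples)
mean = statistics.fmean(samples)
s = statistics.stdev(samples)
t_95_5df = 2.015  # one-sided 95% t-quantile, 5 degrees of freedom
ucl95 = mean + t_95_5df * s / math.sqrt(n)
print(f"mean={mean:.2f}, s={s:.2f}, UCL95={ucl95:.2f}")
```

The UCL95% exceeds the sample mean by an amount that shrinks as more samples are pooled, which is why the statistical compatibility tests in the report matter: pooling all six scrape samples tightens the limit.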
Directory of Open Access Journals (Sweden)
Anne de la Hunty
2013-03-01
Full Text Available Objective: To review systematically the evidence on breakfast cereal consumption and obesity in children and adolescents, and to assess whether the regular consumption of breakfast cereals could help to prevent excessive weight gain. Methods: A systematic review and meta-analysis of studies relating breakfast cereal consumption to BMI, BMI z-scores and prevalence of obesity as the outcomes. Results: 14 papers met the inclusion criteria. The computed effect size for mean BMI between high consumers and low or non-consumers over all 25 study subgroups was -1.13 kg/m2 (95% CI -1.46, -0.81; p ... Conclusion: Overall, the evidence reviewed suggests that regular consumption of breakfast cereals results in a lower BMI and a reduced likelihood of being overweight in children and adolescents. However, more evidence from long-term trials and investigations into mechanisms is needed to eliminate possible confounding factors and determine causality.
Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.
Barrett, Julie A; Siegel, Jay A; Goodpaster, John V
2011-01-01
This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.
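The pipeline described (PCA retaining 95% of variance, followed by classification) can be sketched with plain NumPy on synthetic spectra; nearest-centroid classification in PC space stands in here for discriminant analysis, and all data below are fabricated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic "spectra": 3 shade classes, 20 samples each, 100 wavelengths
class_means = rng.normal(size=(3, 100))
X = np.vstack([m + 0.1 * rng.normal(size=(20, 100)) for m in class_means])
y = np.repeat([0, 1, 2], 20)

# PCA via SVD of the mean-centered data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
ratio = s**2 / (s**2).sum()                            # explained-variance ratios
k = int(np.searchsorted(np.cumsum(ratio), 0.95)) + 1   # PCs covering 95% variance
scores = Xc @ Vt[:k].T

# nearest-centroid classification in PC space (a simple stand-in for DA)
centroids = np.array([scores[y == c].mean(axis=0) for c in range(3)])
pred = np.argmin(((scores[:, None, :] - centroids) ** 2).sum(-1), axis=1)
accuracy = float((pred == y).mean())
print(k, accuracy)
```

With well-separated classes, a handful of principal components carries essentially all of the class structure, which is why the paper can select components by cumulative variance before running DA.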
Managing Performance Analysis with Dynamic Statistical Projection Pursuit
Energy Technology Data Exchange (ETDEWEB)
Vetter, J.S.; Reed, D.A.
2000-05-22
Computer systems and applications are growing more complex. Consequently, performance analysis has become more difficult due to the complex, transient interrelationships among runtime components. To diagnose these types of performance issues, developers must use detailed instrumentation to capture a large number of performance metrics. Unfortunately, this instrumentation may actually influence the performance analysis, leading the developer to an ambiguous conclusion. In this paper, we introduce a technique for focusing a performance analysis on interesting performance metrics. This technique, called dynamic statistical projection pursuit, identifies interesting performance metrics that the monitoring system should capture across some number of processors. By reducing the number of performance metrics, projection pursuit can limit the impact of instrumentation on the performance of the target system and can reduce the volume of performance data.
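A toy version of projection pursuit, using negative excess kurtosis as the projection index (one common choice for flagging clustered or bimodal structure); the data, metric layout, and search strategy are fabricated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
# 200 observations of 10 performance metrics; metric 3 hides a bimodal
# ("interesting") structure that per-metric variance alone would not flag
X = rng.normal(size=(200, 10))
X[:, 3] = np.where(rng.random(200) < 0.5, -3.0, 3.0) + 0.3 * rng.normal(size=200)
X = (X - X.mean(0)) / X.std(0)

def interestingness(z):
    """Negative excess kurtosis: large for bimodal/clustered projections."""
    z = (z - z.mean()) / z.std()
    return -(np.mean(z**4) - 3.0)

# crude pursuit: score the coordinate axes plus random unit directions
candidates = list(np.eye(10)) + [v / np.linalg.norm(v)
                                 for v in rng.normal(size=(2000, 10))]
best_w = max(candidates, key=lambda w: interestingness(X @ w))
print(int(np.argmax(np.abs(best_w))))      # dominant metric in the best projection
```

The pursuit singles out the metric carrying the non-Gaussian structure, which is the sense in which projection pursuit lets a monitoring system keep only "interesting" metrics.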
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Directory of Open Access Journals (Sweden)
Walaa Ibrahim Gabr
2015-03-01
Full Text Available The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, the handling of fuzzy matrices covered determinants of fuzzy matrices, the eigenvalues of fuzzy matrices, and solving least-squares fuzzy linear equations. The approach demonstrated to be also applicable in a systematic way in handling new fuzzy probabilistic and statistical problems. This included extending the conventional probabilistic and statistical analysis for handling fuzzy random data. Application also covered the consolidity of fuzzy optimization problems. Various numerical examples solved have demonstrated that the new consolidity concept is highly effective in solving in a compact form the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the implementation of the suggested fuzzy mathematics can be easily embedded within normal mathematics through building special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.
GIS-BASED SPATIAL STATISTICAL ANALYSIS OF COLLEGE GRADUATES EMPLOYMENT
Directory of Open Access Journals (Sweden)
R. Tang
2012-07-01
Full Text Available It is urgently necessary to be aware of the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in distribution and employment status of college graduates based on the data from 2004–2008 Wuhan Municipal Human Resources and Social Security Bureau, China. Spatio-temporal distribution of employment unit were analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method via SPSS software was used to predict the employment and to identify spatially associated enterprise and professionals demand in the future. The results show that the enterprises in Wuhan east lake high and new technology development zone increased dramatically from 2004 to 2008, and tended to distributed southeastward. Furthermore, the models built by statistical analysis suggest that the specialty of graduates major in has an important impact on the number of the employment and the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis which helps to simulate the spatial distribution of the employment status is a potential tool for human resource development research.
The features of Drosophila core promoters revealed by statistical analysis
Directory of Open Access Journals (Sweden)
Trifonov Edward N
2006-06-01
Abstract Background Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between the basal machinery and the promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. Results Using Drosophila-specific position-weight matrices, we identified promoters containing the TATA box, Initiator (Inr), Downstream Promoter Element (DPE), and Motif Ten Element (MTE), as well as core elements discovered in human (TFIIB Recognition Element (BRE) and Downstream Core Element (DCE)). Promoters utilizing known synergistic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE) were identified. We also establish the existence of promoters with potentially novel synergistic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s). Comparison of human and Drosophila promoters showed consistent percentages of promoters with TATA, Inr, DPE, and synergistic combinations thereof, as well as mostly the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in human was found. Distinct nucleosome positioning in particular promoter classes was revealed. Conclusion We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation has been experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be applied to other species.
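The position-weight-matrix scanning underlying this kind of analysis can be sketched as follows; the matrix below is an illustrative TATA-box-like toy, not one of the paper's Drosophila-specific matrices:

```python
import numpy as np

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

# toy base counts per motif position (columns: A, C, G, T); illustrative only
counts = np.array([[ 1,  1,  1, 16],   # T
                   [16,  1,  1,  1],   # A
                   [ 1,  1,  1, 16],   # T
                   [16,  1,  1,  1],   # A
                   [16,  1,  1,  1],   # A
                   [16,  1,  1,  1]])  # A
freqs = counts / counts.sum(axis=1, keepdims=True)
pwm = np.log2(freqs / 0.25)            # log-odds vs. a uniform background

def scan(seq):
    """Score every window of the sequence; return (best position, best score)."""
    L = pwm.shape[0]
    scores = [sum(pwm[i, BASES[seq[p + i]]] for i in range(L))
              for p in range(len(seq) - L + 1)]
    return int(np.argmax(scores)), max(scores)

pos, score = scan("GGCGC" + "TATAAA" + "GCGGCGTTAC")
print(pos, round(score, 2))
```

A promoter is then called as "containing" an element when its best window score exceeds a matrix-specific cutoff; the scan above finds the embedded motif at offset 5.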
Statistical design and analysis of RNA sequencing data.
Auer, Paul L; Doerge, R W
2010-06-01
Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.
Statistical Analysis of Designed Experiments Theory and Applications
Tamhane, Ajit C
2012-01-01
An indispensable guide to understanding and designing modern experiments The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the
Statistical energy analysis of complex structures, phase 2
Trudell, R. W.; Yano, L. I.
1980-01-01
A method for estimating the structural vibration properties of complex systems in high frequency environments was investigated. The structure analyzed was the Materials Experiment Assembly (MEA), which is a portion of the OST-2A payload for the space transportation system. Statistical energy analysis (SEA) techniques were used to model the structure and predict the structural element response to acoustic excitation. A comparison of the initial response predictions and measured acoustic test data is presented. The conclusions indicate that SEA predicted the response of the primary structure to acoustic excitation over a wide range of frequencies, and that the contribution of mechanically induced random vibration to the total MEA response is not significant.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2014-01-01
An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent Tasks. The first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferentia
Multi-scale statistical analysis of coronal solar activity
Gamborino, Diana; del-Castillo-Negrete, Diego; Martinell, Julio J.
2016-07-01
Multi-filter images from the solar corona are used to obtain temperature maps that are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multi-scale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport can also be extracted from the analysis.
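POD of a collection of maps reduces to an SVD of the mean-centered snapshot matrix; a sketch with a synthetic stand-in for the temperature maps (two coherent structures plus noise, not actual coronal data):

```python
import numpy as np

rng = np.random.default_rng(2)
# snapshot matrix: 50 "temperature maps", each flattened to 400 pixels
t = np.linspace(0, 1, 50)
space = np.linspace(0, 2 * np.pi, 400)
snapshots = (np.outer(np.sin(2 * np.pi * t), np.sin(space))
             + 0.3 * np.outer(np.cos(4 * np.pi * t), np.cos(3 * space))
             + 0.01 * rng.normal(size=(50, 400)))

# POD: SVD of the mean-centered snapshots; singular values give modal energy
U, s, Vt = np.linalg.svd(snapshots - snapshots.mean(axis=0), full_matrices=False)
energy = s**2 / (s**2).sum()
n_modes = int(np.sum(np.cumsum(energy) < 0.99)) + 1    # modes for 99% energy
print(n_modes)
```

The rows of `Vt` are the spatial POD modes and the columns of `U` their time coefficients; the number of modes needed to capture most of the energy is one way to quantify the "level of activity" of a region.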
Spatial Analysis Along Networks Statistical and Computational Methods
Okabe, Atsuyuki
2012-01-01
In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process
Feature statistic analysis of ultrasound images of liver cancer
Huang, Shuqin; Ding, Mingyue; Zhang, Songgeng
2007-12-01
In this paper, a specific feature analysis of liver ultrasound images, including normal liver, liver cancer (especially hepatocellular carcinoma, HCC) and other hepatopathies, is discussed. According to the classification of hepatocellular carcinoma, primary carcinoma is divided into four types. Fifteen features from first-order gray-level statistics, the gray-level co-occurrence matrix (GLCM), and the gray-level run-length matrix (GLRLM) are extracted. Experiments for the discrimination of each type of HCC, normal liver, fatty liver, angioma and hepatic abscess have been conducted, and corresponding features that can potentially discriminate them are identified.
STATISTIC ANALYSIS OF INTERNATIONAL TOURISM ON ROMANIAN SEASIDE
Directory of Open Access Journals (Sweden)
MIRELA SECARĂ
2010-01-01
In order to meet European and international competition standards in tourism, the modernization, re-establishment and development of Romanian tourism are necessary, as is the creation of modern tourism products that are competitive on this market. The use of modern methods of statistical analysis in the field of tourism facilitates the creation of information systems that serve as instruments for: evaluation of tourism demand and supply, follow-up of tourism services for each form of tourism, follow-up of transportation services, leisure activities and hotel accommodation, tourism market studies, and a complex, flexible system of management and accountancy.
Statistics for proteomics: experimental design and 2-DE differential analysis.
Chich, Jean-François; David, Olivier; Villers, Fanny; Schaeffer, Brigitte; Lutomski, Didier; Huet, Sylvie
2007-04-15
Proteomics relies on the separation of complex protein mixtures using bidimensional electrophoresis. This approach is largely used to detect the expression variations of proteins prepared from two or more samples. Recently, attention was drawn on the reliability of the results published in literature. Among the critical points identified were experimental design, differential analysis and the problem of missing data, all problems where statistics can be of help. Using examples and terms understandable by biologists, we describe how a collaboration between biologists and statisticians can improve reliability of results and confidence in conclusions.
A Probabilistic Rain Diagnostic Model Based on Cyclone Statistical Analysis
Iordanidou, V.; A. G. Koutroulis; I. K. Tsanis
2014-01-01
Data from a dense network of 69 daily precipitation gauges over the island of Crete and a cyclone climatological analysis over the middle-eastern Mediterranean are combined in a statistical approach to develop a rain diagnostic model. Regarding the dataset, the 0.5° × 0.5°, 33-year (1979–2011) European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) is used. The cyclone tracks and their characteristics are identified with the aid of the Melbourne University algorithm (MS scheme). T...
Research and Development on Food Nutrition Statistical Analysis Software System
Directory of Open Access Journals (Sweden)
Du Li
2013-12-01
Designing and developing food nutrition statistical analysis software can automate nutrition calculation, improve nutrition professionals' working efficiency and support the informatization of nutrition education and outreach. In the software development process, software engineering methods and database technology are used to calculate daily human nutritional intake, and an intelligent system is used to evaluate the user's health condition. Experiments show that the system can correctly evaluate a user's health condition and offer reasonable suggestions, thus offering a new way to solve complex nutrition computation problems with information engineering.
Regular Polytopes
Coxeter, H S M
1973-01-01
Polytopes are geometrical figures bounded by portions of lines, planes, or hyperplanes. In plane (two dimensional) geometry, they are known as polygons and comprise such figures as triangles, squares, pentagons, etc. In solid (three dimensional) geometry they are known as polyhedra and include such figures as tetrahedra (a type of pyramid), cubes, icosahedra, and many more; the possibilities, in fact, are infinite! H. S. M. Coxeter's book is the foremost book available on regular polyhedra, incorporating not only the ancient Greek work on the subject, but also the vast amount of information
A theoretical analysis to current exponent variation regularity and electromigration-induced failure
Wang, Yuexing; Yao, Yao
2017-02-01
The electric current exponent, typically appearing in the form j^(-n), is a key parameter for predicting electromigration-induced failure lifetime. It has been experimentally observed that the current exponent depends on the damage mechanism. In the current research, the physical mechanisms of void initiation, void growth, and Joule heating are all taken into account to investigate how the current exponent varies. Furthermore, a physically based model to predict the mean time to failure is developed, and the traditional Black's equation is improved with clear physical meaning. It is found that the solutions of the void initiation and void growth equations yield current exponents of 2 and 1, respectively. On the other hand, Joule heating plays an important role in failure time prediction and induces a current exponent n > 2 in the traditional semi-empirical model. The predictions are in agreement with the experimental results.
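The classical Black's equation referenced here is MTTF = A · j^(-n) · exp(Ea / kT); a sketch with placeholder prefactor and activation energy (illustrative values, not fitted parameters from the paper):

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def mttf_black(j, T, n, A=1.0e3, Ea=0.8):
    """Black's equation MTTF = A * j**(-n) * exp(Ea / (k*T)).
    A and Ea here are illustrative placeholders."""
    return A * j ** (-n) * math.exp(Ea / (K_B * T))

# with n = 2 (void nucleation), doubling current density quarters the lifetime;
# with n = 1 (void growth), it only halves it
t1 = mttf_black(j=1.0e6, T=373.0, n=2)
t2 = mttf_black(j=2.0e6, T=373.0, n=2)
print(round(t1 / t2, 3))
```

The exponential temperature term is why Joule self-heating (which raises T with increasing j) shows up empirically as an apparent current exponent larger than 2.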
Fabrice, Delbary
2016-01-01
Compartmental models based on tracer mass balance are extensively used in clinical and pre-clinical nuclear medicine in order to obtain quantitative information on tracer metabolism in biological tissue. This paper is the second in a series of two dealing with the problem of tracer coefficient estimation via compartmental modelling in an inverse problem framework. While the previous work was devoted to identifiability issues for 2-, 3- and n-dimensional compartmental systems, here we discuss the problem of numerically determining the tracer coefficients by means of a general regularized multivariate Gauss-Newton scheme. Applications concerning cerebral, hepatic and renal function are considered, involving experimental measurements of FDG-PET data on different sets of murine models.
A probabilistic analysis of wind gusts using extreme value statistics
Energy Technology Data Exchange (ETDEWEB)
Friederichs, Petra; Bentzien, Sabrina; Lenz, Anne; Krampitz, Rebekka [Meteorological Inst., Univ. of Bonn (Germany); Goeber, Martin [Deutscher Wetterdienst, Offenbach (Germany)
2009-12-15
The spatial variability of wind gusts is probably as large as that of precipitation, but the observational weather station network is much less dense. The lack of an area-wide observational analysis hampers the forecast verification of wind gust warnings. This article develops and compares several approaches to derive a probabilistic analysis of wind gusts for Germany. Such an analysis provides a probability that a wind gust exceeds a certain warning level. To that end we have 5 years of observations of hourly wind maxima at about 140 weather stations of the German weather service at our disposal. The approaches are based on linear statistical modeling using generalized linear models, extreme value theory and quantile regression. Warning level exceedance probabilities are estimated in response to predictor variables such as the observed mean wind or the operational analysis of the wind velocity at a height of 10 m above ground provided by the European Centre for Medium Range Weather Forecasts (ECMWF). The study shows that approaches that apply to the differences between the recorded wind gust and the mean wind perform better in terms of the Brier skill score (which measures the quality of a probability forecast) than those using the gust factor or the wind gusts only. The study points to the benefit from using extreme value theory as the most appropriate and theoretically consistent statistical model. The most informative predictors are the observed mean wind, but also the observed gust velocities recorded at the neighboring stations. Out of the predictors used from the ECMWF analysis, the wind velocity at 10 m above ground is the most informative predictor, whereas the wind shear and the vertical velocity provide no additional skill. For illustration the results for January 2007 and during the winter storm Kyrill are shown. (orig.)
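Fitting an extreme-value distribution and converting it into a warning-level exceedance probability can be sketched with SciPy on synthetic Gumbel-shaped gust maxima (illustrative data, not the DWD station observations):

```python
from scipy import stats

# synthetic station gust maxima (m/s): GEV with shape 0 (Gumbel), toy parameters
gusts = stats.genextreme.rvs(c=0.0, loc=18.0, scale=3.0, size=5000,
                             random_state=3)

# fit a GEV by maximum likelihood, then estimate the probability that a
# gust exceeds a hypothetical 30 m/s warning level
c, loc, scale = stats.genextreme.fit(gusts)
p_exceed = float(stats.genextreme.sf(30.0, c, loc=loc, scale=scale))
print(round(p_exceed, 4))
```

In the study's setting the GEV parameters are made functions of predictors (observed mean wind, neighboring-station gusts, ECMWF 10 m wind) rather than constants, but the exceedance-probability step is the same.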
Statistical analysis of personal radiofrequency electromagnetic field measurements with nondetects.
Röösli, Martin; Frei, Patrizia; Mohler, Evelyn; Braun-Fahrländer, Charlotte; Bürgi, Alfred; Fröhlich, Jürg; Neubauer, Georg; Theis, Gaston; Egger, Matthias
2008-09-01
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution. Copyright 2008 Wiley-Liss, Inc.
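A simplified regression-on-order-statistics imputation for a single detection limit can be sketched as below (full robust ROS, e.g. Helsel's formulation, also handles multiple detection limits); the data and detection limit are synthetic:

```python
import numpy as np
from scipy import stats

def simple_ros(detects, n_nondetect):
    """Simplified robust ROS for one detection limit: fit a lognormal to the
    detected values' order statistics, then impute the censored observations
    from the fitted line. Assumes nondetects occupy the lowest ranks."""
    n = len(detects) + n_nondetect
    x = np.sort(np.log(detects))
    ranks = np.arange(n_nondetect + 1, n + 1)       # ranks of the detects
    pp = (ranks - 0.375) / (n + 0.25)               # Blom plotting positions
    slope, intercept = np.polyfit(stats.norm.ppf(pp), x, 1)
    pp_nd = (np.arange(1, n_nondetect + 1) - 0.375) / (n + 0.25)
    imputed = np.exp(intercept + slope * stats.norm.ppf(pp_nd))
    return np.concatenate([imputed, np.sort(detects)])

rng = np.random.default_rng(4)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=200)  # "true" exposures
dl = 0.5                                            # detection limit
filled = simple_ros(raw[raw >= dl], int((raw < dl).sum()))
print(round(float(filled.mean()), 2))
```

Unlike naive substitution at the detection limit, the imputed values spread below it according to the fitted distribution, which is why ROS summary statistics preserve the low end of the exposure range.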
Zou, Yong; Donner, Reik V; Thiel, Marco; Kurths, Jürgen
2016-02-01
Recurrence in the phase space of complex systems is a well-studied phenomenon, which has provided deep insights into the nonlinear dynamics of such systems. For dissipative systems, characteristics based on recurrence plots have recently attracted much interest for discriminating qualitatively different types of dynamics in terms of measures of complexity, dynamical invariants, or even structural characteristics of the underlying attractor's geometry in phase space. Here, we demonstrate that the latter approach also provides a corresponding distinction between different co-existing dynamical regimes of the standard map, a paradigmatic example of a low-dimensional conservative system. Specifically, we show that the recently developed approach of recurrence network analysis provides potentially useful geometric characteristics distinguishing between regular and chaotic orbits. We find that chaotic orbits in an intermittent laminar phase (commonly referred to as sticky orbits) have a distinct geometric structure possibly differing in a subtle way from those of regular orbits, which is highlighted by different recurrence network properties obtained from relatively short time series. Thus, this approach can help discriminating regular orbits from laminar phases of chaotic ones, which presents a persistent challenge to many existing chaos detection techniques.
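A minimal sketch of recurrence network analysis on standard-map orbits, using network transitivity as the geometric discriminator; the parameters, initial conditions, and fixed 5% recurrence rate are illustrative, and periodic wrap-around in the distance computation is ignored for brevity:

```python
import numpy as np

def standard_map_orbit(theta0, p0, K, n):
    """Iterate the Chirikov standard map; theta in [0, 2*pi), p in [-pi, pi)."""
    pts = np.empty((n, 2))
    th, p = theta0, p0
    for i in range(n):
        p = (p + K * np.sin(th) + np.pi) % (2 * np.pi) - np.pi
        th = (th + p) % (2 * np.pi)
        pts[i] = th, p
    return pts

def rn_transitivity(pts, rr=0.05):
    """Recurrence network transitivity at a fixed 5% recurrence rate."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    eps = np.quantile(d[np.triu_indices(len(pts), 1)], rr)
    A = (d < eps).astype(float)
    np.fill_diagonal(A, 0.0)
    deg = A.sum(axis=1)
    return np.trace(A @ A @ A) / (deg * (deg - 1)).sum()

K = 1.5
regular = standard_map_orbit(np.pi + 0.3, 0.0, K, 800)  # orbit inside an island
chaotic = standard_map_orbit(0.1, 0.1, K, 800)          # orbit in the chaotic sea
t_reg, t_cha = rn_transitivity(regular), rn_transitivity(chaotic)
print(round(t_reg, 3), round(t_cha, 3))
```

A regular orbit traces a one-dimensional invariant curve, whose recurrence network is highly clustered, while a chaotic orbit fills a two-dimensional region and yields lower transitivity; this is the kind of geometric contrast the abstract exploits.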
Zhao, Yan-qing; Teng, Jing; Yang, Hong-jun
2015-05-01
To analyze the prescription and medication regularities of traditional Chinese medicines in the treatment of melancholia, based on records in the Chinese journal full-text database (CNKI), the Wanfang Data knowledge service platform, VIP, and the Chinese biomedical literature database (CBM), using the traditional Chinese medicine inheritance support platform software, in order to provide a reference for further mining traditional Chinese medicines for the treatment of melancholia and for new drug development. The traditional Chinese medicine inheritance support platform software V2.0 was used to establish a database of prescriptions for treating melancholia. The software's integrated data mining methods were adopted to analyze the four qi, five flavors, meridian distribution, frequency statistics, syndrome distribution, composition regularity and new prescriptions. In total, 358 prescriptions for treating melancholia were analyzed to determine the frequency of prescription drugs and commonly used drug pairs and combinations, and to develop 22 new prescriptions. According to this study, prescriptions for treating depression collected in modern literature databases mainly act by soothing the liver and resolving melancholia, strengthening the spleen and eliminating phlegm, activating and replenishing blood, regulating liver qi, tonifying spleen qi, clearing and purging heat, soothing the mind, and nourishing yin and tonifying the kidney, with neutral drug properties and sweet or bitter flavors, and follow the melancholia treatment principle of "regulating qi and opening the mind, regulating qi and empathy".
How little data is enough? Phase-diagram analysis of sparsity-regularized X-ray computed tomography.
Jørgensen, J S; Sidky, E Y
2015-06-13
We introduce phase-diagram analysis, a standard tool in compressed sensing (CS), to the X-ray computed tomography (CT) community as a systematic method for determining how few projections suffice for accurate sparsity-regularized reconstruction. In CS, a phase diagram is a convenient way to study and express certain theoretical relations between sparsity and sufficient sampling. We adapt phase-diagram analysis for empirical use in X-ray CT for which the same theoretical results do not hold. We demonstrate in three case studies the potential of phase-diagram analysis for providing quantitative answers to questions of undersampling. First, we demonstrate that there are cases where X-ray CT empirically performs comparably with a near-optimal CS strategy, namely taking measurements with Gaussian sensing matrices. Second, we show that, in contrast to what might have been anticipated, taking randomized CT measurements does not lead to improved performance compared with standard structured sampling patterns. Finally, we show preliminary results of how well phase-diagram analysis can predict the sufficient number of projections for accurately reconstructing a large-scale image of a given sparsity by means of total-variation regularization.
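One point of such a phase diagram can be probed by pairing a Gaussian sensing matrix with an l1 solver; a FISTA sketch on synthetic data (a small lasso penalty approximates basis pursuit on this noiseless problem; the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, k = 200, 90, 10          # one phase-diagram point: 90 Gaussian
                               # measurements of a 10-sparse length-200 signal
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
b = A @ x0

# FISTA for the lasso: min 0.5*||Ax - b||^2 + lam*||x||_1
lam = 1e-3
L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
x, y, t = np.zeros(n), np.zeros(n), 1.0
for _ in range(2000):
    z = y - A.T @ (A @ y - b) / L
    x_new = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
    y = x_new + (t - 1) / t_new * (x_new - x)
    x, t = x_new, t_new

rel_err = float(np.linalg.norm(x - x0) / np.linalg.norm(x0))
print(rel_err < 0.05)                      # successful recovery at this point
```

Sweeping the undersampling ratio m/n and the relative sparsity k/m, and recording the empirical success rate at each point, produces the phase diagram; the paper does this with CT projection matrices in place of the Gaussian `A`.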
STATISTICAL ANALYSIS OF THE TM- MODEL VIA BAYESIAN APPROACH
Directory of Open Access Journals (Sweden)
Muhammad Aslam
2012-11-01
The method of paired comparisons calls for the comparison of treatments presented in pairs to judges who prefer the better one based on their sensory evaluations. Thurstone (1927) and Mosteller (1951) employ the method of maximum likelihood to estimate the parameters of the Thurstone-Mosteller model for paired comparisons. A Bayesian analysis of the said model using the non-informative reference (Jeffreys) prior is presented in this study. The posterior estimates (means and joint modes) of the parameters and the posterior probabilities comparing the two parameters are obtained for the analysis. The predictive probabilities that one treatment (Ti) is preferred to any other treatment (Tj) in a future single comparison are also computed. In addition, the graphs of the marginal posterior distributions of the individual parameters are drawn. The appropriateness of the model is also tested using the chi-square test statistic.
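The Thurstone-Mosteller model sets P(i preferred to j) = Φ(μi − μj). The paper's analysis is Bayesian; as a simpler hedged sketch, here is the maximum likelihood estimation (as in Thurstone and Mosteller) on toy preference counts:

```python
import numpy as np
from scipy import optimize, stats

# wins[i, j] = number of times treatment i was preferred over j (toy data)
wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]])

def neg_log_lik(mu_free):
    mu = np.append(mu_free, -mu_free.sum())   # identifiability: scores sum to 0
    ll = 0.0
    for i in range(3):
        for j in range(3):
            if i != j:
                # P(i preferred to j) = Phi(mu_i - mu_j)
                ll += wins[i, j] * stats.norm.logcdf(mu[i] - mu[j])
    return -ll

res = optimize.minimize(neg_log_lik, np.zeros(2), method="BFGS")
mu = np.append(res.x, -res.x.sum())
ranking = np.argsort(mu)[::-1]                # best treatment first
print(ranking)
```

The Bayesian version of the paper replaces this point estimate with a posterior over the μ's under the Jeffreys prior, from which the predictive preference probabilities follow.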
On Understanding Statistical Data Analysis in Higher Education
Montalbano, Vera
2012-01-01
Data analysis is a powerful tool in all experimental sciences. Statistical methods such as sampling theory, the computer technologies necessary for handling large amounts of data, and skill in analysing the information contained in different types of graphs are all competences necessary for achieving an in-depth data analysis. In higher education, these topics are usually fragmented across different courses, interdisciplinary integration can be lacking, and some necessary caution in the use of these methods can be missing or misunderstood. Students are often obliged to acquire these skills by themselves during the preparation of the final experimental thesis. A proposal for a learning path on nuclear phenomena is presented in order to develop these scientific competences in physics courses. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources can be monitored for a long time in a physics laboratory. The data are collected and analyzed in a computer lab i...
Statistical analysis of cascading failures in power grids
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Laboratory; Pfitzner, Rene [Los Alamos National Laboratory; Turitsyn, Konstantin [Los Alamos National Laboratory
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for the automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete-time and sequential resolution of individual failures. The model, in its simplest realization based on the Direct Current (DC) description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between the average numbers of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
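The DC power-flow step at the heart of such a model can be sketched on a toy 3-bus network; the bus layout, susceptances, injections and capacity threshold below are all invented (real studies use the IEEE test systems mentioned above):

```python
# Buses: 0 is the slack; 1 injects 0.5 p.u. of generation; 2 draws 0.5 p.u. of load.
# Lines (i, j, susceptance): values chosen arbitrarily for the sketch.
lines = [(0, 1, 10.0), (0, 2, 10.0), (1, 2, 10.0)]
P = {1: 0.5, 2: -0.5}  # net injections at the non-slack buses

# Reduced susceptance matrix for buses 1 and 2 (slack angle fixed at 0),
# solved here by hand via Cramer's rule for the 2x2 system B * theta = P.
B11 = 10.0 + 10.0   # lines 0-1 and 1-2
B22 = 10.0 + 10.0   # lines 0-2 and 1-2
B12 = -10.0         # shared line 1-2
det = B11 * B22 - B12 * B12
theta = {0: 0.0,
         1: (B22 * P[1] - B12 * P[2]) / det,
         2: (B11 * P[2] - B12 * P[1]) / det}

# Line flows f_ij = b_ij * (theta_i - theta_j); flag overloads against a cap.
cap = 0.3
flows = {(i, j): b * (theta[i] - theta[j]) for i, j, b in lines}
overloaded = [ij for ij, f in flows.items() if abs(f) > cap]
```

In a cascading-failure simulation, the overloaded lines would be removed and the flow recomputed, iterating until no overloads remain or the grid islands.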
Topics in statistical data analysis for high-energy physics
Cowan, G
2013-01-01
These lectures concern two topics that are becoming increasingly important in the analysis of High Energy Physics (HEP) data: Bayesian statistics and multivariate methods. In the Bayesian approach we extend the interpretation of probability to cover not only the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in HEP in recent years: the boosted decision tree (BDT).
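The degree-of-belief update underlying the Bayesian approach described above can be illustrated with a toy calculation (all probabilities are invented):

```python
# Toy illustration: prior degree of belief in a signal hypothesis H,
# and the likelihood of the observed data under H and under the
# background-only alternative.
prior_H = 0.1
like_H = 0.8        # P(data | H)
like_bg = 0.2       # P(data | background only)

# Bayes' theorem: posterior degree of belief in H given the data.
evidence = prior_H * like_H + (1 - prior_H) * like_bg
posterior_H = prior_H * like_H / evidence
```

Here the data favour H (likelihood ratio 4), so the posterior belief rises above the prior, directly answering "how plausible is H now?" in a way frequentist methods do not.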
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
Energy Technology Data Exchange (ETDEWEB)
Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-29
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
Processes and subdivisions in diogenites, a multivariate statistical analysis
Harriott, T. A.; Hewins, R. H.
1984-01-01
Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.
Statistical learning analysis in neuroscience: aiming for transparency
Directory of Open Access Journals (Sweden)
Michael Hanke
2010-05-01
Full Text Available Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires ``neuroscience-aware'' technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here we review its features and applicability to various neural data modalities.
Multivariate Statistical Analysis Applied in Wine Quality Evaluation
Directory of Open Access Journals (Sweden)
Jieling Zou
2015-08-01
Full Text Available This study applies multivariate statistical approaches to wine quality evaluation. With 27 red wine samples, four factors were identified out of 12 parameters by principal component analysis, explaining 89.06% of the total variance of the data. As iterative weights calculated by a BP neural network revealed little difference from weights determined by the information entropy method, the latter was chosen to measure the importance of the indicators. Weighted cluster analysis performed well in classifying the sample group further into two sub-clusters. The second cluster of red wine samples, compared with the first, was lighter in color, tasted thinner, and had a fainter bouquet. The weighted TOPSIS method was used to evaluate the quality of the wine in each sub-cluster. With the scores obtained, each sub-cluster was divided into three grades. On the whole, the quality of the lighter red wine was slightly better than that of the darker category. This study shows the necessity and usefulness of multivariate statistical techniques in both wine quality evaluation and parameter selection.
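The information entropy weighting mentioned above can be sketched as follows; this is one common formulation (details vary by author), and the indicator matrix is invented:

```python
import math

def entropy_weights(X):
    """Information-entropy weights for a samples-by-indicators matrix of
    positive values: indicators whose values vary more across samples
    carry more information and receive larger weights."""
    n = len(X)       # number of samples
    m = len(X[0])    # number of indicators
    k = 1.0 / math.log(n)
    weights = []
    for j in range(m):
        col_sum = sum(row[j] for row in X)
        p = [row[j] / col_sum for row in X]          # column proportions
        e = -k * sum(pi * math.log(pi) for pi in p if pi > 0)
        weights.append(1.0 - e)                      # degree of diversification
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical indicator matrix: 4 wine samples x 3 parameters.
w = entropy_weights([[4.2, 1.0, 3.1],
                     [4.0, 5.0, 3.0],
                     [4.1, 9.0, 3.2],
                     [4.3, 2.0, 2.9]])
```

The middle indicator varies most across the samples, so it ends up with the largest weight.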
Statistical analysis of magnetically soft particles in magnetorheological elastomers
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2–15 wt% (0.27–2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
Visualization methods for statistical analysis of microarray clusters
Directory of Open Access Journals (Sweden)
Li Kai
2005-05-01
Full Text Available Abstract Background The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gold-standard. Appropriate data visualization tools can aid this analysis process, but existing visualization methods do not specifically address this issue. Results We present several visualization techniques that incorporate meaningful statistics that are noise-robust for the purpose of analyzing the results of clustering algorithms on microarray data. This includes a rank-based visualization method that is more robust to noise, a difference display method to aid assessments of cluster quality and detection of outliers, and a projection of high dimensional data into a three-dimensional space in order to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices. This methodology is implemented in GeneVAnD (Genomic Visual ANalysis of Datasets) and is available at http://function.princeton.edu/GeneVAnD. Conclusion Incorporating relevant statistical information into data visualizations is key for analysis of large biological datasets, particularly because of high levels of noise and the lack of a gold-standard for comparisons. We developed several new visualization techniques and demonstrated their effectiveness for evaluating cluster quality and relationships between clusters.
Statistical analysis of bound companions in the Coma cluster
Mendelin, Martin; Binggeli, Bruno
2017-08-01
Aims: The rich and nearby Coma cluster of galaxies is known to have substructure. We aim to create a more detailed picture of this substructure by searching directly for bound companions around individual giant members. Methods: We have used two catalogs of Coma galaxies, one covering the cluster core for a detailed morphological analysis, another covering the outskirts. The separation limit between possible companions (secondaries) and giants (primaries) is chosen as MB = -19 and MR = -20, respectively for the two catalogs. We have created pseudo-clusters by shuffling positions or velocities of the primaries and search for significant over-densities of possible companions around giants by comparison with the data. This method was developed and applied first to the Virgo cluster. In a second approach we introduced a modified nearest neighbor analysis using several interaction parameters for all galaxies. Results: We find evidence for some excesses due to possible companions for both catalogs. Satellites are typically found among the faintest dwarfs (MB type giants (spirals) in the outskirts, which is expected in an infall scenario of cluster evolution. A rough estimate for an upper limit of bound galaxies within Coma is 2-4%, to be compared with 7% for Virgo. Conclusions: The results agree well with the expected low frequency of bound companions in a regular cluster such as Coma. To exploit the data more fully and reach more detailed insights into the physics of cluster evolution we suggest applying the method also to model clusters created by N-body simulations for comparison.
Directory of Open Access Journals (Sweden)
Kok VC
2015-03-01
Full Text Available Victor C Kok,1,2 Jorng-Tzong Horng,2,3 Hsu-Kai Huang,3 Tsung-Ming Chao,4 Ya-Fang Hong5 1Division of Medical Oncology, Department of Internal Medicine, Kuang Tien General Hospital, Taichung, Taiwan; 2Department of Biomedical Informatics, Asia University Taiwan, Taichung, Taiwan; 3Department of Computer Science and Information Engineering, National Central University, Jhongli, Taiwan; 4Statistics Unit, Department of Applied Geomatics, Chien Hsin University, Jhongli, Taiwan; 5Institute of Molecular Biology, Academia Sinica, Nankang, Taipei, Taiwan Background: Recent studies have shown that inhaled corticosteroids (ICS) can exert anti-inflammatory effects for chronic airway diseases, and several observational studies suggest that they play a role as cancer chemopreventive agents, particularly against lung cancer. We aimed to examine whether regular ICS use was associated with a reduced risk for future malignancy in patients with newly diagnosed adult-onset asthma. Methods: We used a population-based cohort study between 2001 and 2008 with appropriate person-time analysis. Participants were followed up until the first incident cancer, death, or the end of 2008. The Cox model was used to derive an adjusted hazard ratio (aHR) for cancer development. Kaplan–Meier cancer-free survival curves of the two groups were compared. Results: An exposed group of 2,117 regular ICS users and a nonexposed group of 17,732 non-ICS users were assembled. After 7,365 (mean, 3.5 years; standard deviation, 2.1) and 73,789 (mean, 4.1 years; standard deviation, 2.4) person-years of follow-up for the ICS users and the comparator group of non-ICS users, respectively, the aHR for overall cancer was nonsignificantly elevated at 1.33 (95% confidence interval [CI], 1.00–1.76; P=0.0501). The Kaplan–Meier curves for overall cancer-free proportions of both groups were not significantly different (log-rank, P=0.065). Synergistic interaction of concurrent presence of regular ICS use was
Statistical Models and Methods for Network Meta-Analysis.
Madden, L V; Piepho, H-P; Paul, P A
2016-08-01
Meta-analysis, the methodology for analyzing the results from multiple independent studies, has grown tremendously in popularity over the last four decades. Although most meta-analyses involve a single effect size (summary result, such as a treatment difference) from each study, there are often multiple treatments of interest across the network of studies in the analysis. Multi-treatment (or network) meta-analysis can be used for simultaneously analyzing the results from all the treatments. However, the methodology is considerably more complicated than for the analysis of a single effect size, and there have not been adequate explanations of the approach for agricultural investigations. We review the methods and models for conducting a network meta-analysis based on frequentist statistical principles, and demonstrate the procedures using a published multi-treatment plant pathology data set. A major advantage of network meta-analysis is that correlations of estimated treatment effects are automatically taken into account when an appropriate model is used. Moreover, treatment comparisons may be possible in a network meta-analysis that are not possible in a single study because all treatments of interest may not be included in any given study. We review several models that consider the study effect as either fixed or random, and show how to interpret model-fitting output. We further show how to model the effect of moderator variables (study-level characteristics) on treatment effects, and present one approach to test for the consistency of treatment effects across the network. Online supplemental files give explanations on fitting the network meta-analytical models using SAS.
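A minimal sketch of the inverse-variance pooling that underlies fixed-effect meta-analysis of a single treatment contrast; the effect estimates and variances below are invented, and a full network meta-analysis would extend this with a design matrix over all treatment comparisons:

```python
# Inverse-variance fixed-effect pooling of one treatment contrast:
# y_i is the effect estimate from study i, v_i its sampling variance.
effects = [(0.30, 0.04), (0.45, 0.09), (0.25, 0.02)]  # (y_i, v_i), illustrative

weights = [1.0 / v for _, v in effects]               # precision weights
pooled = sum(w * y for (y, _), w in zip(effects, weights)) / sum(weights)
pooled_var = 1.0 / sum(weights)                       # variance of pooled estimate
```

The most precise study (smallest variance) dominates the pooled estimate, and the pooled variance is smaller than any single study's variance, reflecting the gain from combining evidence.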
Statistical analysis in ancient China (中国古代的统计分析)
Institute of Scientific and Technical Information of China (English)
莫曰达
2003-01-01
Analyzing social and economic problems through statistics is an important aspect of statistical thought in ancient China. This paper describes some instances of statistical analysis in ancient China.
The system for statistical analysis of logistic information
Directory of Open Access Journals (Sweden)
Khayrullin Rustam Zinnatullovich
2015-05-01
Full Text Available A current problem for managers in logistics and trading companies is improving operational business performance and developing the logistics support of sales. The development of logistics sales involves a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed and the timing of its implementation. Logistics engineering of a warehouse complex includes such tasks as: determining the number and types of technological zones, calculating the required number of loading-unloading places, developing storage structures, developing pre-sales preparation zones, developing specifications of storage types, selecting loading-unloading equipment, detailed planning of the warehouse logistics system, creating architectural-planning decisions, selecting information-processing equipment, etc. The currently used ERP and WMS systems do not solve the full list of logistics engineering problems. In this regard, the development of specialized software products that take into account the specifics of warehouse logistics, and the subsequent integration of this software with ERP and WMS systems, is a pressing task. In this paper we suggest a system for the statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on methods of statistical data processing, methods of assessment and prediction of logistics performance, and methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as methods for planning the reconstruction and development
Statistical analysis of the breaking processes of Ni nanowires
Energy Technology Data Exchange (ETDEWEB)
Garcia-Mochales, P [Departamento de Fisica de la Materia Condensada, Facultad de Ciencias, Universidad Autonoma de Madrid, c/ Francisco Tomas y Valiente 7, Campus de Cantoblanco, E-28049-Madrid (Spain); Paredes, R [Centro de Fisica, Instituto Venezolano de Investigaciones CientIficas, Apartado 20632, Caracas 1020A (Venezuela); Pelaez, S; Serena, P A [Instituto de Ciencia de Materiales de Madrid, Consejo Superior de Investigaciones CientIficas, c/ Sor Juana Ines de la Cruz 3, Campus de Cantoblanco, E-28049-Madrid (Spain)], E-mail: pedro.garciamochales@uam.es
2008-06-04
We have performed a massive statistical analysis of the breaking behaviour of Ni nanowires using molecular dynamics simulations. Three stretching directions, five initial nanowire sizes and two temperatures have been studied. We have constructed minimum cross-section histograms and analysed for the first time the role played by monomers and dimers. The shape of such histograms and the absolute number of monomers and dimers strongly depend on the stretching direction and the initial size of the nanowire. In particular, the statistical behaviour of the final breakage stages of narrow nanowires strongly differs from the behaviour obtained for large nanowires. We have analysed the structure around monomers and dimers. Their most probable local configurations differ from those usually appearing in static electron transport calculations. Their non-local environments show disordered regions along the nanowire if the stretching direction is [100] or [110]. Additionally, we have found that, at room temperature, [100] and [110] stretching directions favour the appearance of non-crystalline staggered pentagonal structures. These pentagonal Ni nanowires are reported in this work for the first time. This set of results suggests that experimental Ni conducting histograms could show a strong dependence on the orientation and temperature.
Statistical Scalability Analysis of Communication Operations in Distributed Applications
Energy Technology Data Exchange (ETDEWEB)
Vetter, J S; McCracken, M O
2001-02-27
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for individual communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
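The non-parametric correlation measure described above can be sketched with a Spearman rank correlation; the task counts and communication-time ratios below are invented scaling data:

```python
def ranks(xs):
    """Average ranks (1-based), handling ties by averaging."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical scaling experiment: number of tasks vs. fraction of time
# spent in one communication operation -- a monotone rise signals poor scaling.
tasks = [16, 32, 64, 128, 256]
comm_ratio = [0.05, 0.09, 0.16, 0.30, 0.52]
rho = spearman(tasks, comm_ratio)
```

A rho near 1 flags an operation whose share of communication time grows steadily with the task count, i.e. one that scales poorly.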
Statistical models of video structure for content analysis and characterization.
Vasconcelos, N; Lippman, A
2000-01-01
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
Frequency of PSV inspection optimization using statistical data analysis
Directory of Open Access Journals (Sweden)
Alexandre Guimarães Botelho
2015-12-01
Full Text Available The present paper shows how qualitative analytical methodologies can be enhanced by statistical failure data analysis of process equipment in order to select an appropriate and cost-effective maintenance policy and reduce equipment life cycle cost. A case study was carried out with failure and maintenance data from a sample of pressure safety valves (PSV) of a PETROBRAS oil and gas production unit. Data was classified according to a failure mode and effect analysis (FMEA) and fitted with a Weibull distribution. The results show the possibility of reducing maintenance frequency, representing a 29% reduction in events, without increasing risk, as well as highlighting the potential failures which must be blocked by the inspection plan.
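The Weibull fitting step can be sketched with median-rank regression, one common first-pass method in reliability work (this is an assumed implementation choice, not the paper's stated procedure); the failure times below are synthetic, placed on an exact Weibull quantile grid so the fit should recover the parameters:

```python
import math

def weibull_fit(times):
    """Two-parameter Weibull fit by median-rank regression (Benard's
    approximation): linearize ln(-ln(1-F)) = beta*ln(t) - beta*ln(eta)
    and solve by least squares."""
    times = sorted(times)
    n = len(times)
    pts = []
    for i, t in enumerate(times, start=1):
        F = (i - 0.3) / (n + 0.4)  # median rank of the i-th failure
        pts.append((math.log(t), math.log(-math.log(1.0 - F))))
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    mx, my = sum(xs) / n, sum(ys) / n
    beta = sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x in xs)
    eta = math.exp(mx - my / beta)  # from intercept = -beta * ln(eta)
    return beta, eta  # shape, scale

# Synthetic times placed on an exact Weibull(beta=2, eta=1000) quantile grid.
n = 10
grid = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
times = [1000.0 * (-math.log(1.0 - F)) ** 0.5 for F in grid]
beta, eta = weibull_fit(times)
```

A shape parameter above 1, as here, indicates wear-out failures, which is the regime where periodic inspection intervals can be optimized.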
Supermarket Analysis Based On Product Discount and Statistics
Directory of Open Access Journals (Sweden)
Komal Kumawat
2014-03-01
Full Text Available E-commerce has been growing rapidly, and it is a significant domain for data mining, since it provides all the right ingredients for successful data mining. E-commerce refers to the buying and selling of products or services over electronic systems such as the Internet. Various e-commerce systems give discounts on products and allow users to buy products online. The basic idea used here is to predict product sales based on the discount applied to the product. Our analysis concentrates on how customers behave when a discount is offered to them. We have developed a model which finds customer behaviour when a discount is applied to a product. This paper elaborates upon how techniques such as sessions and click streams are used to collect user data online based on the discount applied to the product, and how statistics is applied to the data set to see the variation in the data.
Higher order statistical moment application for solar PV potential analysis
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy could serve as an alternative to fossil fuels, which are depleting and pose a global warming problem. However, this renewable energy source is too variable and intermittent to be relied on. Knowledge of the energy potential is therefore very important for any site before building a solar photovoltaic power generation system. Here, a higher order statistical moment model is analyzed using data collected from a 5 MW grid-connected photovoltaic system. Due to the dynamic changes of skewness and kurtosis of the AC power and solar irradiance distributions of the solar farm, the Pearson system, in which the probability distribution is selected by matching theoretical moments with the empirical moments of a distribution, is suitable for this purpose. Taking advantage of the Pearson system in MATLAB, a software program has been developed to help in data processing, distribution fitting and potential analysis for future projection of the amount of AC power and solar irradiance availability.
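The higher order moments (skewness and kurtosis) that drive the Pearson-system selection can be computed directly from the data; the power readings below are invented:

```python
def moments(xs):
    """Sample mean, variance, skewness and (non-excess) kurtosis using
    the simple moment estimators (no small-sample bias correction)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return mean, m2, skew, kurt

# Hypothetical hourly AC power readings (kW) from a PV array over one day.
power = [0.0, 0.4, 1.1, 2.3, 3.8, 4.6, 4.9, 4.4, 3.1, 1.8, 0.7, 0.1]
mean, var, skew, kurt = moments(power)
```

In the Pearson system, the (skewness², kurtosis) pair selects the distribution family, which is why tracking these moments over time matters for the analysis above.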
Statistical analysis of $k$-nearest neighbor collaborative recommendation
Biau, Gérard; Rouvière, Laurent; 10.1214/09-AOS759
2010-01-01
Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of user...
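A minimal sketch of the cosine-type nearest neighbor method analyzed above; the ratings matrix is invented, and treating zeros as "not rated" is a simplifying convention of this sketch:

```python
import math

def cosine(u, v):
    """Cosine similarity between two users' rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings
    for the target item (users who have not rated the item are skipped)."""
    sims = sorted(((cosine(ratings[user], r), r[item])
                   for i, r in enumerate(ratings)
                   if i != user and r[item] > 0), reverse=True)[:k]
    num = sum(s * r for s, r in sims)
    den = sum(s for s, _ in sims)
    return num / den if den else 0.0

# Toy ratings matrix (rows: users, columns: items, 0 = unrated).
R = [[5, 4, 0, 1],
     [4, 5, 4, 1],
     [1, 1, 5, 5],
     [2, 1, 4, 5]]
pred = predict(R, user=0, item=2)  # estimate user 0's rating of item 2
```

User 0's two most cosine-similar neighbors who rated item 2 both gave it a 4, so the weighted average returns 4.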
Statistical uncertainty analysis of radon transport in nonisothermal, unsaturated soils
Energy Technology Data Exchange (ETDEWEB)
Holford, D.J.; Owczarski, P.C.; Gee, G.W.; Freeman, H.D.
1990-10-01
To accurately predict radon fluxes from soils to the atmosphere, we must know more than the radium content of the soil. Radon flux from soil is affected not only by soil properties, but also by meteorological factors such as air pressure and temperature changes at the soil surface, as well as the infiltration of rainwater. Natural variations in meteorological factors and soil properties contribute to uncertainty in subsurface model predictions of radon flux, which, when coupled with a building transport model, will also add uncertainty to predictions of radon concentrations in homes. A statistical uncertainty analysis using our Rn3D finite-element numerical model was conducted to assess the relative importance of the meteorological factors and soil properties affecting radon transport. 10 refs., 10 figs., 3 tabs.
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
This paper discusses inference for I(2) variables in a VAR model. The estimation procedure suggested consists of two reduced rank regressions. The asymptotic distribution of the proposed estimators of the cointegrating coefficients is mixed Gaussian, which implies that asymptotic inference can be conducted using the χ² distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate.
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature, the textual meta-data and the numerical spectral data, to arrive at a final qualitative assessment. Results associated with the spectral data stored in the Signature Database1 (SigDB) are presented. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to uncover local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security2 (text encryption/decryption), biomedical3, and marketing4 applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
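The spectral angle mapper (SAM) comparison mentioned above can be sketched as follows; the spectra are invented:

```python
import math

def spectral_angle(a, b):
    """Spectral Angle Mapper: the angle (radians) between two spectra
    viewed as vectors; 0 means identical shape, larger means less alike.
    SAM is insensitive to overall brightness (vector scaling)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # Clamp for floating-point safety before acos.
    c = max(-1.0, min(1.0, dot / (na * nb)))
    return math.acos(c)

# Hypothetical reflectance spectra sampled at 5 wavelengths: a population
# mean spectrum and a candidate test spectrum to be quality-ranked.
mean_spectrum = [0.12, 0.25, 0.40, 0.38, 0.20]
test_spectrum = [0.10, 0.24, 0.42, 0.36, 0.22]
angle = spectral_angle(mean_spectrum, test_spectrum)
```

A small angle against the population mean suggests a high-quality signature; a threshold on the angle then separates acceptable from suspect spectra.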
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. Its varieties are preferred by consumers for criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of the special growth conditions it requires, for instance a specific climate and soil. Presently, aromatic rice quality is identified from its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis, and is quite costly. This paper presents the application of an in-house-developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify aromatic rice varieties based on the odour of the samples, which were taken from the different rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to recognize and classify the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into distinct clusters. The results of LDA and KNN, with low misclassification error, support these findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
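The PCA-plus-LOO workflow can be sketched with synthetic sensor data. The sensor readings below are invented, the classifier is a 1-nearest-neighbour stand-in for KNN, and the LDA step is omitted for brevity; this is a minimal illustration of the pipeline, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical e-nose readings: 3 rice varieties x 10 samples x 6 sensors.
centers = np.array([[1.0, 0.2, 0.5, 0.1, 0.8, 0.3],
                    [0.3, 0.9, 0.4, 0.7, 0.2, 0.6],
                    [0.6, 0.5, 1.0, 0.4, 0.5, 0.9]])
X = np.vstack([c + 0.05 * rng.standard_normal((10, 6)) for c in centers])
y = np.repeat([0, 1, 2], 10)

# PCA via SVD of the mean-centred data (keep 2 components, as for a 2-D score plot).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T

# Leave-one-out 1-NN classification in the PCA score space.
errors = 0
for i in range(len(X)):
    d = np.linalg.norm(scores - scores[i], axis=1)
    d[i] = np.inf                      # exclude the held-out sample itself
    errors += y[d.argmin()] != y[i]
misclassification_rate = errors / len(X)
```

Leaving each sample out in turn gives an almost-unbiased error estimate from a small sample set, which is why LOO is attractive when only a few odour samples per variety are available.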
Classification of Malaysia aromatic rice using multivariate statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)
2015-05-15
Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. Its varieties are preferred by consumers for criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of the special growth conditions it requires, for instance a specific climate and soil. Presently, aromatic rice quality is identified from its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis, and is quite costly. This paper presents the application of an in-house-developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify aromatic rice varieties based on the odour of the samples, which were taken from the different rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to recognize and classify the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into distinct clusters. The results of LDA and KNN, with low misclassification error, support these findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
To be certain about the uncertainty: Bayesian statistics for 13C metabolic flux analysis.
Theorell, Axel; Leweke, Samuel; Wiechert, Wolfgang; Nöh, Katharina
2017-07-11
13C Metabolic Flux Analysis (13C MFA) remains the most powerful approach for determining intracellular metabolic reaction rates. Decisions on strain engineering and experimentation rely heavily upon the certainty with which these fluxes are estimated. For uncertainty quantification, the vast majority of 13C MFA studies rely on confidence intervals from the paradigm of Frequentist statistics. However, it is well known that the confidence intervals for a given experimental outcome are not uniquely defined. As a result, confidence intervals produced by different methods can differ, yet be equally valid. This is highly relevant to 13C MFA, since practitioners regularly use three different approximate approaches for calculating confidence intervals. By means of a computational study with a realistic model of the central carbon metabolism of E. coli, we provide strong evidence that the confidence intervals used in the field depend strongly on the technique with which they were calculated and, thus, their use leads to misinterpretation of the flux uncertainty. As a better alternative to confidence intervals in 13C MFA, we demonstrate that credible intervals from the paradigm of Bayesian statistics give more reliable flux uncertainty quantifications, which can be readily computed with high accuracy using Markov chain Monte Carlo. In addition, the widely applied chi-square test, as a means of testing whether the model reproduces the data, is examined more closely. © 2017 Wiley Periodicals, Inc.
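The credible-interval idea can be illustrated on a deliberately tiny stand-in problem. The one-flux linear model, noise level and flat prior below are invented for illustration; a real 13C MFA posterior involves a full isotope-labelling model and many fluxes, but the quantile-based credible interval is computed the same way.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy measurement model (stand-in for a labelling model): y = 2*v with
# Gaussian noise; the "true" flux is v = 1.5.
v_true, sigma = 1.5, 0.1
y_obs = 2.0 * v_true + sigma * rng.standard_normal(20)

def log_posterior(v):
    # Flat prior on v >= 0 plus Gaussian likelihood.
    if v < 0:
        return -np.inf
    return -0.5 * np.sum((y_obs - 2.0 * v) ** 2) / sigma ** 2

# Random-walk Metropolis sampling of the posterior over the flux v.
samples, v = [], 1.0
logp = log_posterior(v)
for _ in range(20000):
    prop = v + 0.05 * rng.standard_normal()
    logp_prop = log_posterior(prop)
    if np.log(rng.random()) < logp_prop - logp:
        v, logp = prop, logp_prop
    samples.append(v)
samples = np.array(samples[5000:])      # discard burn-in

# 95% credible interval directly from the posterior sample quantiles.
ci_low, ci_high = np.quantile(samples, [0.025, 0.975])
```

Unlike the three approximate confidence-interval constructions criticized in the abstract, the credible interval has a single definition: the quantiles of the posterior, which Markov chain Monte Carlo approximates to arbitrary accuracy.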
Igarashi, Yutaka; Nogami, Yoshie
2017-01-01
Background: No meta-analysis has examined the effect of regular aquatic exercise on blood pressure. The purpose of this study was to perform a meta-analysis to evaluate the effects of regular aquatic exercise on blood pressure. Design: A meta-analysis of randomized controlled trials. Methods: Databases were searched for literature published up to April 2017. The randomized controlled trials analysed involved healthy adults, an intervention group that only performed aquatic exercise and a control group that did not exercise, no other intervention, and reported mean systolic or diastolic blood pressure. The net change in blood pressure was calculated from each trial, the changes in blood pressure were pooled by a random effects model, and the risk of heterogeneity was evaluated. Subgroup analyses of subjects with hypertension, subjects who performed endurance exercise (or not), and subjects who only swam (or not) were performed, and the net changes in blood pressure were pooled. Results: The meta-analysis examined 14 trials involving 452 subjects. Pooled net changes in blood pressure improved significantly (systolic blood pressure -8.4 mmHg; diastolic blood pressure -3.3 mmHg), and the changes in systolic blood pressure contained significant heterogeneity. When subjects were limited to those with hypertension, those who performed endurance exercise and those who did not swim, pooled net changes in systolic and diastolic blood pressure decreased significantly, but the heterogeneity of systolic blood pressure did not improve. Conclusion: Like exercise on land, aquatic exercise should have a beneficial effect by lowering blood pressure. In addition, aquatic exercise should lower the blood pressure of subjects with hypertension, and forms of aquatic exercise other than swimming should also lower blood pressure.
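Random-effects pooling of per-trial net changes can be sketched as follows. The effect sizes and standard errors below are made up, not the 14 trials in the review, and the DerSimonian-Laird estimator is one common choice of random-effects model, not necessarily the one the authors used.

```python
import numpy as np

# Hypothetical per-trial net changes in systolic blood pressure (mmHg)
# and their standard errors.
effects = np.array([-6.0, -10.0, -7.5, -9.0, -12.0])
se = np.array([2.0, 2.5, 1.8, 3.0, 2.2])

# DerSimonian-Laird random-effects pooling.
w = 1.0 / se**2                               # fixed-effect (inverse-variance) weights
fixed = np.sum(w * effects) / np.sum(w)
Q = np.sum(w * (effects - fixed) ** 2)        # Cochran's Q heterogeneity statistic
df = len(effects) - 1
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (Q - df) / c)                 # between-trial variance estimate
w_star = 1.0 / (se**2 + tau2)                 # random-effects weights
pooled = np.sum(w_star * effects) / np.sum(w_star)
pooled_se = np.sqrt(1.0 / np.sum(w_star))
```

The between-trial variance tau2 widens the weights when Q exceeds its degrees of freedom, which is how the random-effects model absorbs the kind of heterogeneity the abstract reports for systolic blood pressure.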
From random sphere packings to regular pillar arrays: analysis of transverse dispersion.
Daneyko, Anton; Hlushkou, Dzmitry; Khirevich, Siarhei; Tallarek, Ulrich
2012-09-28
We study the impact of microscopic order on transverse dispersion in the interstitial void space of bulk (unconfined) chromatographic beds by numerical simulations of incompressible fluid flow and mass transport of a passive tracer. Our study includes polydisperse random sphere packings (computer-generated with particle size distributions of modern core-shell and sub-2 μm particles), the macropore space morphology of a physically reconstructed silica monolith, and computer-generated regular pillar arrays. These bed morphologies are analyzed by their velocity probability density distributions, transient dispersion behavior, and the dependence of asymptotic transverse dispersion coefficients on the mobile phase velocity. In our work, the spherical particles, the monolith skeleton, and the cylindrical pillars are all treated as impermeable solid phase (nonporous) and the tracer is unretained, to focus on the impact of microscopic order on flow and (particularly transverse) hydrodynamic dispersion in the interstitial void space. The microscopic order of the pillar arrays causes their velocity probability density distributions to start and end abruptly, their transient dispersion coefficients to oscillate, and the asymptotic transverse dispersion coefficients to plateau out of initial power law behavior. The microscopically disordered beds, by contrast, follow power law behavior over the whole investigated velocity range, for which we present refined equations (i.e., Eq. (13) and the data in Table 2 for the polydisperse sphere packings; Eq. (17) for the silica monolith). The bulk bed morphologies and their intrinsic differences addressed in this work determine how efficiently a bed can relax the transverse concentration gradients caused by wall effects, which exist in all confined separation media used in chromatographic practice. Whereas the effect of diffusion on transverse dispersion decreases and ultimately disappears at increasing velocity with the microscopically
Microcomputers: Statistical Analysis Software. Evaluation Guide Number 5.
Gray, Peter J.
This guide discusses six sets of features to examine when purchasing a microcomputer-based statistics program: hardware requirements; data management; data processing; statistical procedures; printing; and documentation. While the current statistical packages have several negative features, they are cost saving and convenient for small to moderate…
Pethe, Shirish A.; Kaul, Ashwani; Dhere, Neelkanth G.
2008-08-01
Current accelerated tests of photovoltaic (PV) modules mostly prevent infant mortality but can neither duplicate the changes occurring in the field nor predict useful lifetime. Therefore, monitoring of field-deployed PV modules was undertaken at FSEC with the goals of assessing their performance in a hot and humid climate under high-voltage operation and of correlating PV performance with meteorological parameters. This paper presents a performance analysis of thin-film a-Si:H PV modules manufactured by a U.S. company, encapsulated using flexible front sheets and framed for outdoor testing. Statistical analysis of the continuously monitored PV and meteorological parameters is carried out on a regular basis with PVUSA-type regression analysis. Current-voltage (I-V) characteristics of the module arrays, obtained periodically, complement the continuous data monitoring. Moreover, the effect of high-voltage bias and ambient parameters on leakage current in individual PV modules is studied. Any degradation occurring during the initial 18 months could not be assessed due to data acquisition and hurricane problems. No significant degradation was observed in the performance of the PV modules during the subsequent 30 months. It is planned to continue this study for a prolonged period so that it may serve as a basis for long-term warranties.
Application of Integration of Spatial Statistical Analysis with GIS to Regional Economic Analysis
Institute of Scientific and Technical Information of China (English)
CHEN Fei; DU Daosheng
2004-01-01
This paper summarizes several spatial statistical analysis methods for measuring spatial autocorrelation and spatial association, and discusses criteria for the identification of spatial association through the use of the global Moran Coefficient, Local Moran and Local Geary statistics. Furthermore, a user-friendly statistical module combining spatial statistical analysis methods with GIS visualization techniques is developed in ArcView using Avenue. An example is also given to show the usefulness of this module in identifying and quantifying the underlying spatial association patterns between economic units.
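The global Moran Coefficient mentioned above can be computed in a few lines. The attribute values and the rook-contiguity weight matrix below are hypothetical stand-ins for regional economic data.

```python
import numpy as np

# Hypothetical attribute values (e.g. per-capita output) for 4 regions on a line,
# with a binary rook-contiguity spatial weight matrix (neighbours share a border).
x = np.array([1.0, 2.0, 6.0, 7.0])
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Global Moran's I: spatial cross-product of deviations, scaled by the variance.
n = len(x)
z = x - x.mean()
I = (n / W.sum()) * (z @ W @ z) / (z @ z)
```

Values of I well above the expectation -1/(n-1) indicate positive spatial autocorrelation (similar values cluster), which here reflects the low-low and high-high pairs at the two ends of the line.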
Regularization and Migration Policy in Europe
Directory of Open Access Journals (Sweden)
Philippe de Bruycker
2001-05-01
The following pages present, in a general way, the contents of Regularization of illegal immigrants in the European Union, which includes a comparative synthesis and statistical information for each of the eight countries involved; a description of actions since the beginning of the year 2000; and a systematic analysis of the different categories of foreigners, the types of regularization carried out, and the rules that have governed these actions. In relation to regularization, the author considers the political coherence of the actions taken by the member states, as well as how they relate to two ever more crucial aspects of immigration policy: the integration of legal resident immigrants and the fight against illegal immigration in the context of the control of migratory flows.
Statistical Analysis of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2013-01-31
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
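For analytes with all measurements above their MDCs, a Student-t UCL95 is the textbook computation. The concentrations below are hypothetical, not the report's values, and the report may use a different UCL formula (per the EPA guidance) for skewed or censored data.

```python
import math

# Hypothetical triplicate measurements of one analyte in each of three
# composite samples (nine values total, all above the MDC).
measurements = [12.1, 11.8, 12.5, 11.9, 12.3, 12.0, 12.4, 11.7, 12.2]

n = len(measurements)
mean = sum(measurements) / n
var = sum((m - mean) ** 2 for m in measurements) / (n - 1)
sd = math.sqrt(var)

# One-sided Student-t UCL95 on the mean: mean + t(0.95, n-1) * sd / sqrt(n).
t_95_df8 = 1.860                              # tabulated t critical value for df = 8
ucl95 = mean + t_95_df8 * sd / math.sqrt(n)
```

The one-sided bound is appropriate here because the regulatory question is whether the true mean concentration could exceed a limit, not whether it differs in either direction.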
STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Shine, E.
2012-03-14
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their
Statistical Analysis Of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2012-08-01
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
RFI detection by automated feature extraction and statistical analysis
Winkel, B.; Kerp, J.; Stanko, S.
2007-01-01
In this paper we present an interference detection toolbox consisting of a high dynamic range Digital Fast-Fourier-Transform spectrometer (DFFT, based on FPGA technology) and data analysis software for automated radio frequency interference (RFI) detection. The DFFT spectrometer allows high-speed storage of spectra on time scales of less than a second. The high dynamic range of the device assures constant calibration even during extremely powerful RFI events. The software uses an algorithm which performs a two-dimensional baseline fit in the time-frequency domain, searching automatically for RFI signals superposed on the spectral data. We demonstrate that the software operates successfully on computer-generated RFI data as well as on real DFFT data recorded at the Effelsberg 100-m telescope. At 21-cm wavelength, RFI signals can be identified down to the 4σ_rms level. A statistical analysis of all RFI events detected in our observational data revealed that: (1) mean signal strength is comparable to the astronomical line emission of the Milky Way, (2) interferences are polarised, (3) electronic devices in the neighbourhood of the telescope contribute significantly to the RFI radiation. We also show that the radiometer equation is no longer fulfilled in the presence of RFI signals.
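A one-dimensional version of the detection idea, baseline plus noise estimation followed by a 4σ threshold, can be sketched as follows. The simulated flat spectrum, spike positions and sigma-clipping loop are illustrative; the paper's algorithm fits a two-dimensional baseline in the time-frequency plane.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated spectrum: flat baseline with Gaussian noise plus two narrow RFI spikes.
n_chan = 1024
spectrum = 10.0 + 0.5 * rng.standard_normal(n_chan)
spectrum[100] += 5.0
spectrum[600] += 8.0

# Iterative sigma-clipping to estimate the baseline and noise rms
# without letting the spikes bias the estimates.
mask = np.ones(n_chan, dtype=bool)
for _ in range(5):
    base, rms = spectrum[mask].mean(), spectrum[mask].std()
    mask = np.abs(spectrum - base) < 3.0 * rms

# Flag channels exceeding the baseline by 4 sigma as RFI candidates.
rfi_channels = np.flatnonzero(spectrum - base > 4.0 * rms)
```

Clipping before estimating the rms matters: a single strong RFI event would otherwise inflate the noise estimate and hide weaker interference below the threshold.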
RFI detection by automated feature extraction and statistical analysis
Winkel, Benjamin; Kerp, Juergen; Stanko, Stephan
2006-01-01
In this paper we present an interference detection toolbox consisting of a high dynamic range Digital Fast-Fourier-Transform spectrometer (DFFT, based on FPGA technology) and data analysis software for automated radio frequency interference (RFI) detection. The DFFT spectrometer allows high-speed storage of spectra on time scales of less than a second. The high dynamic range of the device assures constant calibration even during extremely powerful RFI events. The software uses an algorithm which performs a two-dimensional baseline fit in the time-frequency domain, searching automatically for RFI signals superposed on the spectral data. We demonstrate that the software operates successfully on computer-generated RFI data as well as on real DFFT data recorded at the Effelsberg 100-m telescope. At 21-cm wavelength, RFI signals can be identified down to the 4-sigma level. A statistical analysis of all RFI events detected in our observational data revealed that: (1) mean signal strength is comparable to the a...
Statistical analysis of plasma thermograms measured by differential scanning calorimetry.
Fish, Daniel J; Brewood, Greg P; Kim, Jong Sung; Garbett, Nichola C; Chaires, Jonathan B; Benight, Albert S
2010-11-01
Melting curves of human plasma measured by differential scanning calorimetry (DSC), known as thermograms, have the potential to markedly impact diagnosis of human diseases. A general statistical methodology is developed to analyze and classify DSC thermograms. Analysis of an acquired thermogram involves comparison with a database of empirical reference thermograms from clinically characterized diseases. Two parameters, a distance metric, P, and a correlation coefficient, r, are combined to produce a 'similarity metric,' ρ, which can be used to classify unknown thermograms into pre-characterized categories. Simulated thermograms known to lie within or fall outside of the 90% quantile range around a median reference are also analyzed. Results verify the utility of the methods and establish the apparent dynamic range of the metric ρ. Methods are then applied to data obtained from a collection of plasma samples from patients clinically diagnosed with SLE (lupus). High correspondence is found between curve shapes and values of the metric ρ. In a final application, an elementary classification rule is implemented to successfully analyze and classify unlabeled thermograms. These methods constitute a set of powerful yet easy-to-implement tools for quantitative classification, analysis and interpretation of DSC plasma melting curves.
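The two-ingredient similarity idea can be sketched as below. The curves are invented, and the specific combination rule rho = r * (1 - P) is an illustrative choice: the paper's actual definitions of P, r and their combination into ρ are not reproduced here.

```python
import numpy as np

def similarity(test, reference):
    """Illustrative similarity score in the spirit of the paper: a distance
    metric P and a Pearson correlation r combined into a single score rho.
    The combination rule below is a placeholder, not the authors' formula."""
    test = np.asarray(test, float)
    reference = np.asarray(reference, float)
    # Normalized absolute-difference distance (0 = identical curves).
    P = np.abs(test - reference).sum() / np.abs(reference).sum()
    r = np.corrcoef(test, reference)[0, 1]
    rho = r * (1.0 - min(P, 1.0))      # high rho = similar shape and similar scale
    return P, r, rho

# Hypothetical median reference thermogram vs. a close and a dissimilar test curve.
ref = np.array([0.1, 0.4, 1.0, 0.7, 0.2])
close = ref + 0.02                     # small uniform offset: nearly identical
far = ref[::-1]                        # reversed curve: same values, wrong shape
P1, r1, rho1 = similarity(close, ref)
P2, r2, rho2 = similarity(far, ref)
```

Using both ingredients guards against two failure modes: correlation alone ignores amplitude differences, while distance alone can penalize a correctly shaped curve with a small baseline shift.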
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remain elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels, from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from the two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between the two plant populations or species serving as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utility of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve-fitting and distribution-fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained by these variables. We have also used goodness-of-fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics showed to be the critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
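The multiple-regression step can be sketched on synthetic data. The monthly counts, workload and novelty scores, and the linear error model below are invented stand-ins for the JPL mission data, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-month data: files radiated, workload score, novelty score,
# and observed command file errors (synthetic, generated from a known model).
n = 24
files = rng.integers(50, 200, n).astype(float)
workload = rng.uniform(1, 5, n)
novelty = rng.uniform(0, 1, n)
errors = 0.01 * files + 0.5 * workload + 2.0 * novelty + 0.3 * rng.standard_normal(n)

# Multiple regression: errors ~ intercept + files + workload + novelty.
X = np.column_stack([np.ones(n), files, workload, novelty])
coef, _, _, _ = np.linalg.lstsq(X, errors, rcond=None)
predicted = X @ coef
r2 = 1 - np.sum((errors - predicted) ** 2) / np.sum((errors - errors.mean()) ** 2)
```

The R-squared value is the "how much of the variability can be explained" quantity from the abstract; comparing observed monthly rates against the fitted prediction is the basic form of the management check described at the end.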
A Morphological and Statistical Analysis of Ansae in Barred Galaxies
Martinez-Valpuesta, I; Buta, R
2007-01-01
Many barred galaxies show a set of symmetric enhancements at the ends of the stellar bar, called ansae, or the "handles" of the bar. Ansa bars have appeared in the literature for some decades, but their origin has not yet been specifically addressed, although they could be related to the growth process of bars. Even though ansae have been known for a long time, no statistical analysis of their relative frequency of occurrence has been performed. Similarly, there has been no study of the varieties in ansa morphology, even though significant morphological variations are known to characterise these features. In this paper, we make a quantitative analysis of the occurrence of ansae in barred galaxies, making use of The de Vaucouleurs Atlas of Galaxies by Buta and coworkers. We find that ~40% of SB0's show ansae in their bars, thus confirming that ansae are common features in barred lenticulars. The ansa frequency decreases dramatically with later types, and hardly any ansae are fou...
Statistical analysis of the operating parameters which affect cupola emissions
Energy Technology Data Exchange (ETDEWEB)
Davis, J.W.; Draper, A.B.
1977-12-01
A sampling program was undertaken to determine the operating parameters which affect air pollution emissions from gray iron foundry cupolas. The experimental design utilized the analysis of variance routine. Four independent variables were selected for examination on the basis of previous work reported in the literature: (1) blast rate; (2) iron-coke ratio; (3) blast temperature; and (4) cupola size. The last variable was chosen since it most directly affects melt rate. The cupola emissions of concern are particulate matter and carbon monoxide. The dependent variables were, therefore, particle loading, particle size distribution, and carbon monoxide concentration. Seven production foundries were visited and samples taken under conditions prescribed by the experimental plan. The data obtained from these tests were analyzed using analysis of variance and other statistical techniques where applicable. The results indicated that blast rate, blast temperature, and cupola size affected particle emissions, and the latter two also affected the particle size distribution. The particle size information was also unique in that it showed a consistent particle size distribution at all seven foundries, with a sizable fraction of the particles less than 1.0 micrometers in diameter.
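The analysis-of-variance routine at the core of the study can be sketched for one factor. The particulate loadings and the three blast-rate levels below are hypothetical; the actual study analyzed four factors across seven foundries.

```python
import math

# Hypothetical particulate loadings at three blast-rate levels
# (four observations per level; illustrative values only).
groups = [
    [1.2, 1.4, 1.3, 1.5],    # low blast rate
    [1.8, 2.0, 1.9, 2.1],    # medium blast rate
    [2.6, 2.4, 2.7, 2.5],    # high blast rate
]

# One-way analysis of variance: F = between-group MS / within-group MS.
all_vals = [v for g in groups for v in g]
grand = sum(all_vals) / len(all_vals)
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
df_between = len(groups) - 1
df_within = len(all_vals) - len(groups)
F = (ss_between / df_between) / (ss_within / df_within)
```

A large F ratio, compared against the F distribution with (df_between, df_within) degrees of freedom, is the evidence that the factor (here blast rate) affects the emission level, which is exactly the form of conclusion the study reports.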
Criminal victimization in Ukraine: analysis of statistical data
Directory of Open Access Journals (Sweden)
Serhiy Nezhurbida
2007-12-01
The article is based on the analysis of statistical data provided by law-enforcement, judicial and other bodies of Ukraine. The given analysis allows us to give an accurate quantitative picture of the current status of criminal victimization in Ukraine and to characterize its basic features (level, rate, structure, dynamics, etc.).
Higher order statistical frequency domain decomposition for operational modal analysis
Nita, G. M.; Mahgoub, M. A.; Sharyatpanahi, S. G.; Cretu, N. C.; El-Fouly, T. M.
2017-02-01
Experimental methods based on modal analysis under ambient vibrational excitation are often employed to detect structural damage in mechanical systems. Many such frequency domain methods, such as Basic Frequency Domain (BFD), Frequency Domain Decomposition (FDD), or Enhanced Frequency Domain Decomposition (EFDD), use as a first step a Fast Fourier Transform (FFT) estimate of the power spectral density (PSD) associated with the response of the system. In this study it is shown that higher order statistical estimators such as Spectral Kurtosis (SK) and Sample to Model Ratio (SMR) may be successfully employed not only to more reliably discriminate the response of the system against the ambient noise fluctuations, but also to better identify and separate contributions from closely spaced individual modes. It is shown that an SMR-based Maximum Likelihood curve fitting algorithm may improve the accuracy of the spectral shape and location of the individual modes and, when combined with the SK analysis, provides efficient means to categorize such individual spectral components according to their temporal dynamics as coherent or incoherent system responses to unknown ambient excitations.
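The Spectral Kurtosis estimator mentioned above can be sketched on pure Gaussian noise, for which its expected value is 1; deviations from 1 flag coherent (non-Gaussian) spectral components. The accumulation counts below are arbitrary, and the SMR estimator is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(11)

# M successive power spectral density estimates of a Gaussian ambient response
# (rows = FFT accumulations, columns = frequency bins; DC bin discarded).
M, n_bins = 512, 64
x = rng.standard_normal((M, 256))
psd = np.abs(np.fft.rfft(x, axis=1)[:, 1:n_bins + 1]) ** 2

# Spectral kurtosis estimator in its commonly used unbiased form:
# SK = (M+1)/(M-1) * (M*S2/S1^2 - 1), where S1 and S2 accumulate the
# power and squared power per bin. Expected value is 1 for Gaussian noise.
S1 = psd.sum(axis=0)
S2 = (psd ** 2).sum(axis=0)
sk = (M + 1) / (M - 1) * (M * S2 / S1 ** 2 - 1)
```

In an operational modal analysis setting, bins whose SK departs significantly from 1 would be categorized as coherent responses rather than incoherent ambient noise, which is the discrimination role the abstract assigns to SK.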
Statistical analysis of emotions and opinions at Digg website
Pohorecki, Piotr; Mitrovic, Marija; Paltoglou, Georgios; Holyst, Janusz A
2012-01-01
We performed statistical analysis on data from the Digg.com website, which enables its users to express their opinion on news stories by taking part in forum-like discussions as well as directly evaluate previous posts and stories by assigning so-called "diggs". Owing to the fact that the content of each post has been annotated with its emotional value, apart from the strictly structural properties, the study also includes an analysis of the average emotional response of the posts commenting on the main story. While analysing correlations at the story level, an interesting relationship between the number of diggs and the number of comments received by a story was found. The correlation between the two quantities is high for data where small threads dominate and consistently decreases for longer threads. However, while the correlation of the number of diggs and the average emotional response tends to grow for longer threads, correlations between the number of comments and the average emotional response are almost zero. ...
Statistical Power Flow Analysis of an Imperfect Ribbed Cylinder
Blakemore, M.; Woodhouse, J.; Hardie, D. J. W.
1999-05-01
Prediction of the noise transmitted from machinery and flow sources on a submarine to the sonar arrays poses a complex problem. Vibrations in the pressure hull provide the main transmission mechanism. The pressure hull is characterised by a very large number of modes over the frequency range of interest (at least 100,000) and by high modal overlap, both of which place its analysis beyond the scope of finite element or boundary element methods. A method for calculating the transmission is presented, which is broadly based on Statistical Energy Analysis, but extended in two important ways: (1) a novel subsystem breakdown which exploits the particular geometry of a submarine pressure hull; (2) explicit modelling of energy density variation within a subsystem due to damping. The method takes account of fluid-structure interaction, the underlying pass/stop band characteristics resulting from the near-periodicity of the pressure hull construction, the effect of vibration isolators such as bulkheads, and the cumulative effect of irregularities (e.g., attachments and penetrations).
Statistical analysis of cone penetration resistance of railway ballast
Saussine, Gilles; Dhemaied, Amine; Delforge, Quentin; Benfeddoul, Selim
2017-06-01
Dynamic penetrometer tests are widely used in geotechnical studies for soil characterization, but their implementation tends to be difficult. The light penetrometer test gives information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for coarse granular materials. In order to characterize directly the railway ballast on track and the sublayers of ballast, a huge test campaign has been carried out over more than 5 years, building up a database of 19,000 penetration tests, including endoscopic video records, on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures, identified using the endoscopic records. In the first zone, the top 30 cm, (qd) increases linearly with a slope of around 1 MPa/cm for both fresh and fouled ballast. In the second zone, below 30 cm deep, (qd) increases more slowly, with a slope of around 0.3 MPa/cm, and decreases below 50 cm. These results show that there is no clear difference between fresh and fouled ballast. The variability of (qd) is nevertheless substantial and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30 cm layer of ballast, statistical treatment of the data shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; for fouled ballast they increase the average value of (qd) by 50% and increase the layer thickness as well. Below the upper 30 cm layer, train load and speed have a clear impact on the (qd) distribution.
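The near-linear rise of (qd) with depth in the upper zone can be illustrated with a simple least-squares fit; the depth grid, noise level, and 1 MPa/cm slope below are synthetic stand-ins, not values drawn from the campaign database.

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic penetration log for the upper 30 cm zone: the abstract reports
# (qd) rising roughly linearly at ~1 MPa/cm here.
depth_cm = np.linspace(0, 30, 31)
qd_mpa = 1.0 * depth_cm + rng.normal(0, 3, depth_cm.size)  # noisy cone resistance

# A least-squares line recovers the slope in MPa/cm.
slope, intercept = np.polyfit(depth_cm, qd_mpa, 1)
```

Fitting the two depth zones separately, as the abstract does, would simply repeat this fit on the 0-30 cm and 30-50 cm segments.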
RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database
Directory of Open Access Journals (Sweden)
Andronescu Mirela
2008-08-01
Full Text Available Abstract Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to what extent and with what probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.
Analysis of Statistical Distributions of Energization Overvoltages of EHV Cables
DEFF Research Database (Denmark)
Ohno, Teruo; Ametani, Akihiro; Bak, Claus Leth
Insulation levels of EHV systems have been determined based on the statistical distribution of switching overvoltages since the 1970s, when the statistical distribution was found for overhead lines. Responding to an increase in the planned and installed EHV cables, the authors have derived the statistical distribution of energization overvoltages for EHV cables and have clarified their characteristics compared with those of overhead lines. This paper identifies the causes and physical meanings of these characteristics so that it becomes possible to use the obtained statistical distribution for the determination of insulation levels of cable systems.
Le, Cui; Wanxi, Peng; Zhengjun, Sun; Lili, Shang; Guoning, Chen
2014-07-01
Bamboo is a composite material with a radial gradient in structure, but the vascular bundles in the inner layer are evenly distributed. The objective is to determine the regular size pattern of, and to carry out a Weibull statistical analysis of, the vascular-bundle tensile strength in the inner layer of Moso bamboo. The vascular bundles in the inner layer are similar in size and shape, with an average area of about 0.1550 mm2. A statistical evaluation of the tensile strength of the vascular bundles was conducted by means of Weibull statistics; the results show that the Weibull modulus m is 6.1121, and an accurate reliability assessment of the vascular bundles is obtained.
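A Weibull evaluation of tensile-strength data of this kind can be sketched as follows; the sample below is simulated (the shape of 6.1 and the 600 MPa characteristic strength are illustrative assumptions, not the bamboo measurements).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated tensile strengths (MPa); shape ~6 mimics the reported Weibull modulus.
strengths = stats.weibull_min.rvs(6.1, scale=600, size=200, random_state=rng)

# Two-parameter Weibull fit (location fixed at zero, as is usual for strength data).
shape, loc, scale = stats.weibull_min.fit(strengths, floc=0)

# Reliability (survival probability) at a hypothetical service stress of 400 MPa.
reliability_400 = stats.weibull_min.sf(400, shape, loc=0, scale=scale)
```

The fitted shape parameter is the Weibull modulus m reported in the abstract; the survival function then gives the reliability at any chosen stress level.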
Generating Feature Spaces for Linear Algorithms with Regularized Sparse Kernel Slow Feature Analysis
Böhmer, W.; Grünewälder, S.; Nickisch, H.; Obermayer, K.
2013-01-01
Without non-linear basis functions, many problems cannot be solved by linear algorithms. This article proposes a method to automatically construct such basis functions with slow feature analysis (SFA). Non-linear optimization of this unsupervised learning method generates an orthogonal basis on the
Norberg, Peder; Gaztanaga, Enrique; Croton, Darren J
2008-01-01
We present a test of different error estimators for 2-point clustering statistics, appropriate for present and future large galaxy redshift surveys. Using an ensemble of very large dark matter LambdaCDM N-body simulations, we compare internal error estimators (jackknife and bootstrap) to external ones (Monte-Carlo realizations). For 3-dimensional clustering statistics, we find that none of the internal error methods investigated reproduce, either accurately or robustly, the errors of external estimators on 1 to 25 Mpc/h scales. The standard bootstrap overestimates the variance of xi(s) by ~40% on all scales probed, but recovers, in a robust fashion, the principal eigenvectors of the underlying covariance matrix. The jackknife returns the correct variance on large scales, but significantly overestimates it on smaller scales. This scale dependence in the jackknife affects the recovered eigenvectors, which tend to disagree on small scales with the external estimates. Our results have important implic...
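The internal error estimators compared above can be sketched on toy data; here the clustering statistic is replaced by a simple sample mean so that the true variance is known exactly (the data and sizes are illustrative, not the N-body measurements).

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=500)  # toy stand-in for a statistic measured in n subvolumes
n = x.size

# Jackknife: delete-one estimates; variance = (n-1)/n * sum of squared deviations.
jack = np.array([np.delete(x, i).mean() for i in range(n)])
var_jack = (n - 1) / n * np.sum((jack - jack.mean()) ** 2)

# Bootstrap: resample the subvolumes with replacement.
boot = np.array([rng.choice(x, size=n, replace=True).mean() for _ in range(2000)])
var_boot = boot.var(ddof=1)

var_true = 1.0 / n  # exact variance of the mean for unit-variance data
```

For the sample mean both estimators agree with the truth; the scale-dependent biases reported in the abstract arise only for correlated statistics such as xi(s).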
Statistical Analysis of Data with Non-Detectable Values
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a "limit of detection". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine whether the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two-parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or "missed") value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described, and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known, but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical
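The maximum likelihood approach for randomly left-censored lognormal data can be sketched as follows; the sample, detection limit, and starting values are simulated assumptions, not the occupational data the report analyses. Detects contribute the lognormal density to the likelihood; non-detects contribute only P(X < LOD).

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
lod = 0.5                                  # hypothetical limit of detection
x = rng.lognormal(mean=0.0, sigma=1.0, size=400)
detected = x >= lod                        # in real data, censored values are unknown

def neg_loglik(params):
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    # Lognormal log-density for detects, written on the log scale.
    ll_det = stats.norm.logpdf(np.log(x[detected]), mu, sigma) - np.log(x[detected])
    # Each non-detect contributes log P(X < LOD).
    ll_cens = stats.norm.logcdf((np.log(lod) - mu) / sigma)
    return -(ll_det.sum() + (~detected).sum() * ll_cens)

res = optimize.minimize(neg_loglik, x0=[0.0, 1.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x
mean_hat = np.exp(mu_hat + sigma_hat**2 / 2)   # lognormal mean from the MLEs
```

The recovered mu and sigma feed directly into the mean and upper-percentile estimates the report discusses.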
Institute of Scientific and Technical Information of China (English)
WANG; Wei
2001-01-01
[1] Nagaev, A. V., Integral limit theorems for large deviations when Cramer's condition is not fulfilled I, II, Theory Prob. Appl., 1969, 14: 51-64, 193-208.
[2] Nagaev, A. V., Limit theorems for large deviations where Cramer's conditions are violated (in Russian), Izv. Akad. Nauk USSR Ser., Fiz-Mat Nauk., 1969, 7: 17.
[3] Heyde, C. C., A contribution to the theory of large deviations for sums of independent random variables, Z. Wahrscheinlichkeitsth, 1967, 7: 303.
[4] Heyde, C. C., On large deviation probabilities for sums of random variables which are not attracted to the normal law, Ann. Math. Statist., 1967, 38: 1575.
[5] Heyde, C. C., On large deviation probabilities in the case of attraction to a nonnormal stable law, Sankhyā, 1968, 30: 253.
[6] Nagaev, S. V., Large deviations for sums of independent random variables, in Sixth Prague Conf. on Information Theory, Random Processes and Statistical Decision Functions, Prague: Academic, 1973, 657-674.
[7] Nagaev, S. V., Large deviations of sums of independent random variables, Ann. Prob., 1979, 7: 745.
[8] Embrechts, P., Klüppelberg, C., Mikosch, T., Modelling Extremal Events for Insurance and Finance, Berlin-Heidelberg: Springer-Verlag, 1997.
[9] Cline, D. B. H., Hsing, T., Large deviation probabilities for sums and maxima of random variables with heavy or subexponential tails, Preprint, Texas A&M University, 1991.
[10] Klüppelberg, C., Mikosch, T., Large deviations of heavy-tailed random sums with applications to insurance and finance, J. Appl. Prob., 1997, 34: 293.
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
Full Text Available The technique of the statistical analysis of the investment appeal of a region for direct foreign investment is given in this scientific article. A definition of the technique of statistical analysis is given, the stages of the analysis are described, and the mathematical and statistical tools are considered.
Analysis of Statistical Methods Currently used in Toxicology Journals
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-01-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and in...
Analysis and Evaluation of Statistical Models for Integrated Circuits Design
Directory of Open Access Journals (Sweden)
Sáenz-Noval J.J.
2011-10-01
Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. At present, Pelgrom's is the statistical model most widely accepted in industry; however, it was derived for micrometre-scale technologies, which does not guarantee its reliability for nanometric manufacturing processes. This work considers three of the most relevant statistical models in industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion for making a choice. Moreover, it shows how several statistical models can be used for each of the design stages and purposes.
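For reference, Pelgrom's area law states that the standard deviation of threshold-voltage mismatch between matched devices scales as sigma(dVT) = A_VT / sqrt(W*L). A minimal sketch, with a hypothetical A_VT coefficient (the 3.5 mV*um value is only a typical order of magnitude for older CMOS nodes, not from the article):

```python
import math

def pelgrom_sigma_dvt(a_vt_mv_um, w_um, l_um):
    """Pelgrom's area law: sigma(dVT) = A_VT / sqrt(W * L), result in mV."""
    return a_vt_mv_um / math.sqrt(w_um * l_um)

# With a hypothetical A_VT = 3.5 mV*um: quadrupling each dimension
# (16x the gate area) cuts the mismatch sigma by a factor of 4.
sigma_small = pelgrom_sigma_dvt(3.5, 0.5, 0.5)   # small device
sigma_large = pelgrom_sigma_dvt(3.5, 2.0, 2.0)   # 16x the area
```

The inverse-square-root area dependence is exactly what becomes questionable at nanometric nodes, motivating the model comparison in the article.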
Significance analysis and statistical mechanics: an application to clustering.
Łuksza, Marta; Lässig, Michael; Berg, Johannes
2010-11-26
This Letter addresses the statistical significance of structures in random data: given a set of vectors and a measure of mutual similarity, how likely is it that a subset of these vectors forms a cluster with enhanced similarity among its elements? The computation of this cluster p value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder and multiple-testing statistics in clustering and related problems. In an application to gene expression data, we find a remarkable link between the statistical significance of a cluster and the functional relationships between its genes.
A Statistical Aggregation Engine for Climatology and Trend Analysis
Chapman, D. R.; Simon, T. A.; Halem, M.
2014-12-01
Fundamental climate data records (FCDRs) from satellite instruments often span tens to hundreds of terabytes or even petabytes in scale. These large volumes make it difficult to aggregate or summarize their climatology and climate trends. It is especially cumbersome to supply the full derivation (provenance) of these aggregate calculations. We present a lightweight and resilient software platform, Gridderama that simplifies the calculation of climatology by exploiting the "Data-Cube" topology often present in earth observing satellite records. By using the large array storage (LAS) paradigm, Gridderama allows the analyst to more easily produce a series of aggregate climate data products at progressively coarser spatial and temporal resolutions. Furthermore, provenance tracking and extensive visualization capabilities allow the analyst to track down and correct for data problems such as missing data and outliers that may impact the scientific results. We have developed and applied Gridderama to calculate a trend analysis of 55 Terabytes of AIRS Level 1b infrared radiances, and show statistically significant trending in the greenhouse gas absorption bands as observed by AIRS over the 2003-2012 decade. We will extend this calculation to show regional changes in CO2 concentration from AIRS over the 2003-2012 decade by using a neural network retrieval algorithm.
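The "Data-Cube" aggregation idea can be sketched with block means on a toy grid; the cube dimensions and the pentad/2-degree coarsening steps below are illustrative assumptions, not the Gridderama implementation.

```python
import numpy as np

# Toy "Data-Cube": one year of daily values on a 1-degree grid (time, lat, lon).
cube = np.random.default_rng(9).normal(280.0, 10.0, size=(365, 180, 360))

# Progressive aggregation to coarser resolutions via block means:
pentads = cube.reshape(73, 5, 180, 360).mean(axis=1)            # 5-day means
coarse = pentads.reshape(73, 90, 2, 180, 2).mean(axis=(2, 4))   # 2-degree grid

# Equal-size block means preserve the overall mean exactly (up to float error),
# which is what makes progressively coarser climatology products consistent.
```

Provenance tracking, as described above, would amount to recording each reshape/mean step alongside the derived product.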
Statistical Analysis of Loss of Offsite Power Events
Directory of Open Access Journals (Sweden)
Andrija Volkanovski
2016-01-01
Full Text Available This paper presents the results of the statistical analysis of the loss of offsite power (LOOP events registered in four reviewed databases. The reviewed databases include the IRSN (Institut de Radioprotection et de Sûreté Nucléaire SAPIDE database and the GRS (Gesellschaft für Anlagen- und Reaktorsicherheit mbH VERA database, reviewed over the period from 1992 to 2011. The US NRC (Nuclear Regulatory Commission Licensee Event Reports (LERs database and the IAEA International Reporting System (IRS database were screened for relevant events registered over the period from 1990 to 2013. The number of LOOP events in each year of the analysed period, and the mode of operation, were assessed during the screening. The LOOP frequencies obtained for the French and German nuclear power plants (NPPs during critical operation are of the same order of magnitude, with plant-related events as the dominant contributor. A frequency of one LOOP event per shutdown year is obtained for German NPPs in shutdown mode of operation. For the US NPPs, the obtained LOOP frequency for critical and shutdown modes is comparable to the one assessed in NUREG/CR-6890. A decreasing trend is obtained for the LOOP events registered in three databases (IRSN, GRS, and NRC.
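A LOOP frequency of this kind is a Poisson rate (events per reactor-year); a minimal sketch with hypothetical counts, using the standard chi-square confidence interval for a Poisson mean (the 12 events and 400 reactor-years are illustrative, not figures from the paper):

```python
from scipy import stats

# Hypothetical screening result: 12 LOOP events over 400 reactor-critical-years.
events, reactor_years = 12, 400.0
freq = events / reactor_years            # events per critical year

# Exact two-sided 95% confidence interval for a Poisson rate (chi-square method).
lo = stats.chi2.ppf(0.025, 2 * events) / (2 * reactor_years)
hi = stats.chi2.ppf(0.975, 2 * (events + 1)) / (2 * reactor_years)
```

Comparing such intervals across databases is one way to judge whether the frequencies from IRSN, GRS, NRC, and IRS are statistically consistent.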
Statistical Analysis of Resistivity Anomalies Caused by Underground Caves
Frid, V.; Averbach, A.; Frid, M.; Dudkinski, D.; Liskevich, G.
2017-03-01
Geophysical prospecting of underground caves on a construction site is often a challenging procedure. Estimation of the likelihood level of an anomaly found is frequently a mandatory requirement of a project principal, due to the necessity of risk/safety assessment. However, a methodology for such estimation has not hitherto been developed. Aiming to put forward such a methodology, the present study (performed as part of an underground cave mapping prior to land development on the site area) consisted of the application of electrical resistivity tomography (ERT) together with statistical analysis for the likelihood assessment of the underground anomalies located. The methodology was first verified via a synthetic modelling technique, applied to the ERT data collected in situ, and then cross-referenced with intrusive investigations (excavation and drilling) for data verification. The drilling/excavation results showed that underground caves can be properly discovered if the anomaly probability level is not lower than 90%. Such a probability value was shown to be consistent with the modelling results. More than 30 underground cavities were discovered on the site using this methodology.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintaining grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature contributed only approximately 0.12% to the NRMSE of the power output, as opposed to 7.44% from the forecasted solar irradiance.
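The NRMSE metric used above can be sketched as follows; normalization conventions vary between studies (plant capacity, mean, or observed range), and the range-based choice below is an assumption, not necessarily the convention used in the study.

```python
import numpy as np

def nrmse(forecast, observed):
    """RMSE normalized by the observed range (one of several common conventions)."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmse = np.sqrt(np.mean((forecast - observed) ** 2))
    return rmse / (observed.max() - observed.min())

# Example: constant 1-unit errors over a 10-unit observed range give NRMSE = 0.1.
value = nrmse([1.0, 9.0], [0.0, 10.0])
```

Because NRMSE is dimensionless, it allows the irradiance, temperature, and power-output errors above to be compared on one scale.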
Utility green pricing programs: A statistical analysis of program effectiveness
Energy Technology Data Exchange (ETDEWEB)
Wiser, Ryan; Olson, Scott; Bird, Lori; Swezey, Blair
2004-02-01
Utility green pricing programs allow customers to support the development of renewable energy. Such programs have grown in number in recent years. The design features and effectiveness of these programs vary considerably, however, leading a variety of stakeholders to suggest specific marketing and program design features that might improve customer response and renewable energy sales. This report analyzes actual utility green pricing program data to provide further insight into which program features might help maximize both customer participation in green pricing programs and the amount of renewable energy purchased by customers in those programs. Statistical analysis is performed on both the residential and non-residential customer segments. Data come from information gathered through a questionnaire completed for 66 utility green pricing programs in early 2003. The questionnaire specifically gathered data on residential and non-residential participation, amount of renewable energy sold, program length, the type of renewable supply used, program price/cost premiums, types of consumer research and program evaluation performed, different sign-up options available, program marketing efforts, and ancillary benefits offered to participants.
Statistical analysis of CSP plants by simulating extensive meteorological series
Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana
2017-06-01
The feasibility analysis of any power plant project needs an estimate of the amount of energy it will be able to deliver to the grid during its lifetime. To achieve this, its feasibility study requires a precise knowledge of the solar resource over a long-term period. In Concentrating Solar Power (CSP) projects, financing institutions typically require several statistical probability-of-exceedance scenarios of the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by simulating the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need for standardized procedures for the elaboration of DNI time series representative of a given probability of exceedance of annual DNI.
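Probability-of-exceedance values such as P50 and P90 can be read directly from the empirical distribution of annual DNI; the 34-year series below is simulated, with a hypothetical mean and spread, purely to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical 34-year annual DNI series (kWh/m2/yr); mean and spread are assumptions.
annual_dni = rng.normal(2100.0, 120.0, size=34)

# P90 is the value exceeded in 90% of years, i.e. the 10th percentile;
# P50 is the median annual DNI.
p50 = np.percentile(annual_dni, 50)
p90 = np.percentile(annual_dni, 10)
```

The study's point is that mapping these DNI exceedance levels one-to-one onto energy-yield exceedance levels can fail when the intra-annual DNI distribution varies.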
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arcsecond resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss, with each leg monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so that an optical power margin can be tracked. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Statistical analysis of the ambiguities in the asteroid period determinations
Butkiewicz-Bąk, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.; Marciniak, A.
2017-09-01
Among asteroids there exist ambiguities in their rotation period determinations. They are due to incomplete coverage of the rotation, noise and/or aliases resulting from gaps between separate lightcurves. To help remove such uncertainties, basic characteristics of the lightcurves resulting from constraints imposed by the asteroid shapes and geometries of observations should be identified. We simulated light variations of asteroids whose shapes were modelled as Gaussian random spheres, with random orientations of spin vectors and phase angles changed every 5° from 0° to 65°. This produced 1.4 million lightcurves. For each simulated lightcurve, a Fourier analysis was performed and the harmonic of highest amplitude was recorded. From the statistical point of view, lightcurves with amplitudes A > 0.2 mag are bimodal over the phase angles studied. The second most frequently dominant harmonic is the first, followed closely by the third. For 1 per cent of lightcurves with amplitudes A < 0.1 mag and phase angles α < 40°, the fourth harmonic dominates.
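The dominant-harmonic bookkeeping described above can be sketched with a least-squares Fourier decomposition of a toy lightcurve; the amplitudes, noise level, and sampling below are illustrative, not the Gaussian-random-sphere simulations.

```python
import numpy as np

rng = np.random.default_rng(3)
phase = rng.uniform(0.0, 1.0, 300)   # rotational phase in [0, 1)

# Toy bimodal lightcurve: the 2nd harmonic dominates, as for most
# large-amplitude asteroids; the amplitudes here are assumptions.
mag = (0.25 * np.cos(2 * 2 * np.pi * phase)
       + 0.05 * np.cos(2 * np.pi * phase)
       + rng.normal(0.0, 0.01, phase.size))

def harmonic_amplitudes(phase, mag, n_harm=4):
    """Fourier amplitudes sqrt(a_k^2 + b_k^2) for harmonics k = 1..n_harm."""
    amps = []
    for k in range(1, n_harm + 1):
        a = 2.0 * np.mean(mag * np.cos(2 * np.pi * k * phase))
        b = 2.0 * np.mean(mag * np.sin(2 * np.pi * k * phase))
        amps.append(np.hypot(a, b))
    return np.array(amps)

amps = harmonic_amplitudes(phase, mag)
dominant = int(np.argmax(amps)) + 1   # harmonic number with the highest amplitude
```

Recording `dominant` for each of the 1.4 million simulated lightcurves is what yields the harmonic statistics quoted in the abstract.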
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-10-02
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
Shin, Dongwoo; Kim, Mijong; Song, Hyunjoon
2015-08-01
Although numerous morphologies of MnO nanostructures have been reported, an exact structural analysis and mechanistic study have been lacking. In the present study, the formation of regular MnO octapods was demonstrated in a simple procedure comprising the thermal decomposition of manganese oleate. Because of their structural uniformity, an ideal three-dimensional model was successfully constructed. The eight arms protruded from the cubic center with tip angles of 38° and surface facets of {311} and {533} with rounded edges. The concentrations of oleate and chloride ions were the determining factors for the octapod formation. Selective coordination of the oleate ions to the {100} faces led to edge growth along the direction, which was then limited by the chloride ions bound to the high-index surface facets. These structural and mechanistic analyses should be helpful for understanding complex nanostructures and for tuning their structure-related properties.
Fluorescence correlation spectroscopy: Statistical analysis and biological applications
Saffarian, Saveez
2002-01-01
The experimental design and realization of an apparatus which can be used both for single molecule fluorescence detection and for fluorescence correlation and cross-correlation spectroscopy is presented. A thorough statistical analysis of the fluorescence correlation functions, including the analysis of bias and errors based on analytical derivations, has been carried out. Using the methods developed here, the mechanism of binding and cleavage site recognition of matrix metalloproteinases (MMP) for their substrates has been studied. We demonstrate that two of the MMP family members, Collagenase (MMP-1) and Gelatinase A (MMP-2), exhibit diffusion along their substrates; the importance of this diffusion process and its biological implications are discussed. We show through truncation mutants that the hemopexin domain of MMP-2 plays an important role in the substrate diffusion of this enzyme. Single molecule diffusion of the collagenase MMP-1 has been observed on collagen fibrils and shown to be biased. The discovered biased diffusion would make the MMP-1 molecule an active motor, thus making it the first active motor that is not coupled to ATP hydrolysis. The possible sources of energy for this enzyme and their implications are discussed. We propose that a possible source of energy for the enzyme lies in the rearrangement of the structure of collagen fibrils. In a separate application, using the methods developed here, we observed an intermediate in the intestinal fatty acid binding protein (IFABP) folding process through changes in its hydrodynamic radius; the fluctuations in the structure of IFABP in solution were also measured using FCS.
Statistical analysis and optimization of igbt manufacturing flow
Directory of Open Access Journals (Sweden)
Baranov V. V.
2015-02-01
Full Text Available The use of computer simulation for the design and optimization of the technological processes that form power electronic devices can significantly reduce development time, improve the accuracy of calculations, and allow the best implementation options to be chosen on the basis of rigorous mathematical analysis. One of the most common power electronic devices is the insulated-gate bipolar transistor (IGBT), which combines the advantages of the MOSFET and the bipolar transistor. The high requirements placed on these devices can only be met by optimizing device design and manufacturing process parameters. An important and necessary step in the modern cycle of IC design and manufacturing is therefore statistical analysis. A procedure for optimizing the IGBT threshold voltage was realized. Through screening experiments based on the Plackett-Burman design, the most important input parameters (factors), those with the greatest impact on the output characteristic, were detected. The coefficients of an approximation polynomial adequately describing the relationship between the input parameters and the investigated output characteristic were determined. Using the calculated approximation polynomial, a series of Monte Carlo calculations was carried out to determine the spread of threshold voltage values over selected ranges of input parameter deviation. Combinations of input process parameter values were drawn randomly from a normal distribution within a given range of variation. The optimization of the IGBT process parameters amounts to the mathematical problem of determining the range of values of the significant structural and technological input parameters that keeps the IGBT threshold voltage within a given interval. The presented results demonstrate the effectiveness of the proposed optimization techniques.
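The Monte Carlo step described above can be sketched as follows. The polynomial coefficients, factor names, and distributions below are invented for illustration; they are not the paper's fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical response-surface polynomial for the threshold voltage,
# standing in for the one fitted from the Plackett-Burman screening:
# Vth = b0 + b1*x1 + b2*x2 + b11*x1^2 + b12*x1*x2 (coded factor units)
def vth(x1, x2):
    return 4.0 + 0.30 * x1 - 0.15 * x2 + 0.05 * x1**2 + 0.02 * x1 * x2

# Monte Carlo: draw the input process parameters from normal
# distributions within their allowed deviation ranges.
n = 100_000
x1 = rng.normal(0.0, 0.2, n)   # e.g. gate-oxide thickness, coded units
x2 = rng.normal(0.0, 0.3, n)   # e.g. channel doping dose, coded units

v = vth(x1, x2)
print(f"Vth spread: mean={v.mean():.3f} V, std={v.std():.3f} V")
print(f"fraction inside 3.8-4.2 V window: {((v > 3.8) & (v < 4.2)).mean():.2%}")
```

Because the polynomial is cheap to evaluate, the spread for many candidate deviation ranges can be computed in seconds, which is the point of replacing the full device simulation with the approximation polynomial.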
Allen, Kirk
The Statistics Concept Inventory (SCI) is a multiple choice test designed to assess students' conceptual understanding of topics typically encountered in an introductory statistics course. This dissertation documents the development of the SCI from Fall 2002 up to Spring 2006. The first phase of the project essentially sought to answer the question: "Can you write a test to assess topics typically encountered in introductory statistics?" Book One presents the results utilized in answering this question in the affirmative. The bulk of the results present the development and evolution of the items, primarily relying on objective metrics to gauge effectiveness but also incorporating student feedback. The second phase boils down to: "Now that you have the test, what else can you do with it?" This includes an exploration of Cronbach's alpha, the most commonly-used measure of test reliability in the literature. An online version of the SCI was designed, and its equivalency to the paper version is assessed. Adding an extra wrinkle to the online SCI, subjects rated their answer confidence. These results show a general positive trend between confidence and correct responses. However, some items buck this trend, revealing potential sources of misunderstandings, with comparisons offered to the extant statistics and probability educational research. The third phase is a re-assessment of the SCI: "Are you sure?" A factor analytic study favored a uni-dimensional structure for the SCI, although maintaining the likelihood of a deeper structure if more items can be written to tap similar topics. A shortened version of the instrument is proposed, demonstrated to be able to maintain a reliability nearly identical to that of the full instrument. Incorporating student feedback and a faculty topics survey, improvements to the items and recommendations for further research are proposed. The state of the concept inventory movement is assessed, to offer a comparison to the work presented
Statistics and Analysis of CIAE’s Meteorological Observed Data
Institute of Scientific and Technical Information of China (English)
ZHANG; Liang; CHENG; Wei-ya
2015-01-01
The work analyzes recent years' meteorological observation data from the CIAE site. A suitable statistical method is selected for evaluation of the environmental conditions. 1 Statistical method The data types are stability, wind direction, wind frequency, wind speed, temperature, and
Analysis of room transfer function and reverberant signal statistics
DEFF Research Database (Denmark)
Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn
2008-01-01
…smoothing (e.g., as in complex smoothing) with respect to the original RTF statistics. More specifically, the RTF statistics, derived after the complex smoothing calculation, are compared to the original statistics across space inside typical rooms, by varying the source, the receiver position… and the corresponding ratio of the direct and reverberant signal. In addition, this work examines the statistical quantities for speech and audio signals prior to their reproduction within rooms and when recorded in rooms. Histograms and other statistical distributions are used to compare RTF minima of typical… “anechoic” and “reverberant” audio speech signals, in order to model the alterations due to room acoustics. The above results are obtained from both in-situ room response measurements and controlled acoustical response simulations…
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
Combined statistical analysis of landslide release and propagation
Mergili, Martin; Rohmaneo, Mohammad; Chu, Hone-Jay
2016-04-01
Statistical methods - often coupled with stochastic concepts - are commonly employed to relate areas affected by landslides with environmental layers, and to estimate spatial landslide probabilities by applying these relationships. However, such methods only concern the release of landslides, disregarding their motion. Conceptual models for mass flow routing are used for estimating landslide travel distances and possible impact areas. Automated approaches combining release and impact probabilities are rare. The present work attempts to fill this gap by a fully automated procedure combining statistical and stochastic elements, building on the open source GRASS GIS software: (1) The landslide inventory is split into release and deposition zones. (2) We employ a traditional statistical approach to estimate the spatial release probability of landslides. (3) We back-calculate the probability distribution of the angle of reach of the observed landslides, employing the software tool r.randomwalk. One set of random walks is routed downslope from each pixel defined as release area. Each random walk stops when leaving the observed impact area of the landslide. (4) The cumulative distribution function (cdf) derived in (3) is used as input to route a set of random walks downslope from each pixel in the study area through the DEM, assigning the probability gained from the cdf to each pixel along the path (impact probability). The impact probability of a pixel is defined as the average impact probability of all sets of random walks impacting a pixel. Further, the average release probabilities of the release pixels of all sets of random walks impacting a given pixel are stored along with the area of the possible release zone. (5) We compute the zonal release probability by increasing the release probability according to the size of the release zone - the larger the zone, the larger the probability that a landslide will originate from at least one pixel within this zone. We
Mascaró, Maite; Sacristán, Ana Isabel; Rufino, Marta M.
2016-01-01
For the past 4 years, we have been involved in a project that aims to enhance the teaching and learning of experimental analysis and statistics, of environmental and biological sciences students, through computational programming activities (using R code). In this project, through an iterative design, we have developed sequences of R-code-based…
A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.
Tang, Liansheng Larry; Balakrishnan, N
2011-01-01
The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.
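The random-sum generalization proposed in the paper is not part of any standard library, but the fixed-sample-size special case it includes, the regular tie-corrected Wilcoxon rank-sum (Mann-Whitney) statistic, can be computed with SciPy. The marker values below are made up for illustration.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Illustrative diagnostic-marker values for two groups (invented),
# with ties present, as in the paper's setting.
diseased = np.array([3.1, 2.8, 3.1, 4.0, 3.6, 2.8])
healthy  = np.array([2.2, 2.8, 1.9, 2.5, 3.1, 2.0])

# The Mann-Whitney U statistic is the Wilcoxon rank-sum statistic up to
# a constant shift; SciPy's asymptotic p-value applies a tie correction.
u, p = mannwhitneyu(diseased, healthy, alternative="two-sided")
print(f"U = {u:.1f}, p = {p:.4f}")
```

Here U counts the pairs where the diseased value exceeds the healthy one, with ties counted as one half each.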
Directory of Open Access Journals (Sweden)
Kathy Ahern
2002-09-01
Full Text Available This study investigates triangulation of the findings of a qualitative analysis by applying an exploratory factor analysis to themes identified in a phenomenological study. A questionnaire was developed from a phenomenological analysis of parents' experiences of parenting a child with Developmental Coordination Disorder (DCD). The questionnaire was administered to 114 parents of DCD children and data were analyzed using an exploratory factor analysis. The extracted factors provided support for the validity of the original qualitative analysis, and a commentary on the validity of the process is provided. The emerging description is of the compromises that were necessary to translate qualitative themes into statistical factors, and of the ways in which the statistical analysis suggests further qualitative study.
Probability and Statistics Questions and Tests : a critical analysis
Directory of Open Access Journals (Sweden)
Fabrizio Maturo
2015-06-01
Full Text Available In probability and statistics courses, a popular method for evaluating students is to assess them using multiple-choice tests. These tests make it possible to evaluate certain types of skills, such as fast response, short-term memory, mental clarity and the ability to compete. In our opinion, verification through testing can certainly be useful for the analysis of certain aspects and for speeding up the assessment process, but we should be aware of the limitations of such a standardized procedure and therefore rule out reducing the assessment of pupils, classes and schools to the processing of test results. To support this thesis, this article discusses the main limitations of tests in detail, presents some recent models that have been proposed in the literature and suggests some alternative assessment methods. Keywords: item response theory, assessment, tests, probability
Statistical analysis of simple repeats in the human genome
Piazza, F.; Liò, P.
2005-03-01
The human genome contains repetitive DNA at different levels of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich in families of interspersed, mobile elements hundreds of base pairs (bp) long, among which are the Alu families. A link between homo- and di-polymeric tracts and mobile elements has recently been highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn affect in a subtle way the combination and diversification of gene expression and the fixation of multigene families.
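The first step of such an analysis, collecting the length distribution of homopolymeric tracts, can be sketched on a toy sequence; the fragment below is invented, whereas the paper's analysis runs genome-wide.

```python
import re
from collections import Counter

# Toy DNA fragment (made up) containing several poly(A) tracts.
seq = "TTTAAAACGTAAAAAAGGCATTTTTCGAAAACG"

# Collect the length of every poly(A) tract: maximal runs of 'A'
# found by the greedy regex A+.
lengths = Counter(len(m.group()) for m in re.finditer(r"A+", seq))
print(dict(lengths))   # maps tract length -> number of tracts
```

The same pattern generalizes to any poly(X) or poly(XY) tract by changing the regex, e.g. `r"(AT)+"` for the dinucleotide case.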
Emerging Trends and Statistical Analysis in Computational Modeling in Agriculture
Directory of Open Access Journals (Sweden)
Sunil Kumar
2015-03-01
Full Text Available In this paper the authors describe emerging trends in the computational modelling used in agriculture. Agricultural computational modelling, which uses intelligence techniques to compute agricultural output from minimal input data, is gaining momentum because it saves time by cutting down multi-locational field trials and reduces labour and other inputs. Development of locally suitable integrated farming systems (IFS) is the utmost need of the day, particularly in India, where about 95% of farms are small and marginal holdings. Optimizing the size and number of the various enterprises in the desired IFS model for a particular agro-climate is an essential component of research to sustain agricultural productivity, not only to feed the burgeoning population of the country but also to enhance nutritional security, farm returns and quality of life. Literature on emerging trends in computational modelling applied to agriculture is reviewed and described below, with the purpose of understanding its mechanisms, behavior and applications. Computational modelling is increasingly effective for the design and analysis of systems; it is an important tool for analysing the effect of different scenarios of climate and management options on farming systems and the interactions among them. The authors also highlight applications of computational modelling to integrated farming systems, crops, weather, soil, climate, horticulture and statistical methods used in agriculture, which can show the path for agricultural researchers and the rural farming community to replace some traditional techniques.
A statistical framework for differential network analysis from microarray data
Directory of Open Access Journals (Sweden)
Datta Somnath
2010-02-01
Full Text Available Abstract Background It has long been well known that genes do not act alone; rather, groups of genes act in consort during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed, which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of the following queries: (i) whether the overall modular structures of the two networks are different, (ii) whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii) whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score are provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the
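The per-gene connectivity-change query (iii) can be sketched with simulated data, taking the absolute Pearson correlation as the connectivity score; this is one simple admissible choice, and the formal test, score options, and tuning parameters of the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated expression matrices (samples x genes) for two conditions.
n, g = 80, 6
a = rng.normal(size=(n, g))
a[:, 1] += 1.5 * a[:, 0]      # genes 0 and 1 co-expressed in condition A
b = rng.normal(size=(n, g))   # no association in condition B

# Connectivity score: absolute Pearson correlation between gene pairs.
conn_a = np.abs(np.corrcoef(a, rowvar=False))
conn_b = np.abs(np.corrcoef(b, rowvar=False))
np.fill_diagonal(conn_a, 0.0)
np.fill_diagonal(conn_b, 0.0)

# Per-gene connectivity change: mean absolute difference of its scores.
delta = np.abs(conn_a - conn_b).mean(axis=1)
print("gene with largest connectivity change:", int(delta.argmax()))
```

A permutation test over sample labels would then turn `delta` into a formal p-value per gene, in the spirit of the tests the paper proposes.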
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW) characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised, and correlations between them and with phenolic compound content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compound concentration. A model based on OMW characterisations covering six countries was developed, along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that the Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compound concentration. Tests to evaluate the significance of the regressions were carried out, based on multivariable ANOVA analysis and on visual standardised residual distributions and their means for confidence levels of 95 and 99%, clearly validating these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.
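The second-degree polynomial relationship described above can be sketched with a least-squares fit; the paired measurements below are invented for illustration and are not the paper's data.

```python
import numpy as np

# Hypothetical paired measurements: phenolic compound concentration
# (g/L) and Chemical Oxygen Demand (g O2/L).
phenolics = np.array([0.5, 1.0, 2.0, 3.5, 5.0, 7.0, 9.0])
cod       = np.array([18., 30., 52., 80., 105., 135., 160.])

# Second-degree polynomial fit, the functional form the paper's
# statistical analysis supports: COD = a*P^2 + b*P + c
a, b, c = np.polyfit(phenolics, cod, deg=2)
pred = a * 4.0**2 + b * 4.0 + c
print(f"COD ~= {a:.2f}*P^2 + {b:.2f}*P + {c:.2f}")
print("predicted COD at P = 4 g/L:", round(pred, 1))
```

With real data, the ANOVA-based significance tests mentioned in the abstract would decide whether the quadratic term is warranted over a simple linear fit.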
Statistical Design, Models and Analysis for the Job Change Framework.
Gleser, Leon Jay
1990-01-01
Proposes statistical methodology for testing Loughead and Black's "job change thermostat." Discusses choice of target population; relationship between job satisfaction and values, perceptions, and opportunities; and determinants of job change. (SK)
Detailed statistical analysis plan for the pulmonary protection trial
DEFF Research Database (Denmark)
Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H
2014-01-01
BACKGROUND: Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan...
STATISTICAL ANALYSIS OF SOME EXPERIMENTAL FATIGUE TEST RESULTS
Adrian Stere PARIS; Gheorghe AMZA; Claudiu BABIŞ; Dan Niţoi
2012-01-01
The paper details the results of processing fatigue experiment data to find the regression function. Application software for statistical processing, such as ANOVA and regression calculations, is properly utilized, with emphasis on popular packages like MS Excel and CurveExpert.
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by these studies are used consistently and on sound statistical grounds. The purpose of this paper is to describe the statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Sciences and described the methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median and mode being 6 and 3 & 6, respectively. The mean (105/113, 93%) was the dominant measure of central tendency, and the standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies provided justification for why those methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being the most popular (52/93, 56%), yet few studies conducted either a normality or an equal-variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
Algebraic Monte Carlo procedure reduces statistical analysis time and cost factors
Africano, R. C.; Logsdon, T. S.
1967-01-01
Algebraic Monte Carlo procedure statistically analyzes performance parameters in large, complex systems. The individual effects of input variables can be isolated and individual input statistics can be changed without having to repeat the entire analysis.
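The idea of isolating the effect of one input without repeating the whole analysis can be sketched as follows; the algebraic performance model and its parameters are invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical algebraic performance model: a margin depending on
# two uncertain inputs. Because it is algebraic, re-evaluating it is
# cheap, so input statistics can be changed without a full re-analysis.
def margin(mass, isp):
    return 0.01 * isp - 0.002 * mass

n = 50_000
mass = rng.normal(1000.0, 20.0, n)   # kg
isp = rng.normal(300.0, 5.0, n)      # s

# Full analysis: both inputs vary.
full = margin(mass, isp)
# Isolated effect of mass: freeze isp at its nominal value.
mass_only = margin(mass, 300.0)

print(f"full std = {full.std():.4f}, mass-only std = {mass_only.std():.4f}")
```

Comparing the two standard deviations apportions the output spread between the inputs, which is the "individual effects" capability the abstract refers to.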
Van Wynsberge, Simon; Gilbert, Antoine; Guillemot, Nicolas; Heintz, Tom; Tremblay-Boyer, Laura
2017-07-01
Extensive biological field surveys are costly and time consuming. To optimize sampling and ensure regular monitoring on the long term, identifying informative indicators of anthropogenic disturbances is a priority. In this study, we used 1800 candidate indicators by combining metrics measured from coral, fish, and macro-invertebrate assemblages surveyed from 2006 to 2012 in the vicinity of an ongoing mining project in the Voh-Koné-Pouembout lagoon, New Caledonia. We performed a power analysis to identify a subset of indicators which would best discriminate temporal changes due to a simulated chronic anthropogenic impact. Only 4% of tested indicators were likely to detect a 10% annual decrease of values with sufficient power (>0.80). Corals generally exerted higher statistical power than macro-invertebrates and fishes because of lower natural variability and higher occurrence. For the same reasons, higher taxonomic ranks provided higher power than lower taxonomic ranks. Nevertheless, a number of families of common sedentary or sessile macro-invertebrates and fishes also performed well in detecting changes: Echinometridae, Isognomidae, Muricidae, Tridacninae, Arcidae, and Turbinidae for macro-invertebrates and Pomacentridae, Labridae, and Chaetodontidae for fishes. Interestingly, these families did not provide high power in all geomorphological strata, suggesting that the ability of indicators in detecting anthropogenic impacts was closely linked to reef geomorphology. This study provides a first operational step toward identifying statistically relevant indicators of anthropogenic disturbances in New Caledonia's coral reefs, which can be useful in similar tropical reef ecosystems where little information is available regarding the responses of ecological indicators to anthropogenic disturbances.
Energy Technology Data Exchange (ETDEWEB)
Savoldi Richard, L., E-mail: laura.savoldi@polito.it [Dipartimento Energia, Politecnico di Torino, 10129 Torino (Italy); Bonifetto, R. [Dipartimento Energia, Politecnico di Torino, 10129 Torino (Italy); Zanino, R., E-mail: roberto.zanino@polito.it [Dipartimento Energia, Politecnico di Torino, 10129 Torino (Italy); Corpino, S.; Obiols-Rabasa, G. [Dipartimento di Ingegneria Meccanica e Aerospaziale, Politecnico di Torino, 10129 Torino (Italy); Izquierdo, J. [F4E, Barcelona (Spain); Le Barbier, R.; Utin, Y. [ITER IO, Cadarache (France)
2013-12-15
The 3D steady-state Computational Fluid Dynamics (CFD) analysis of the ITER vacuum vessel (VV) regular sector no. 5 is presented, starting from the CATIA models and using a suite of tools from the commercial software ANSYS FLUENT®. The peculiarity of the problem is linked to the wide range of spatial scales involved in the analysis, from the millimeter-size gaps between in-wall shielding (IWS) plates to the more than 10 m height of the VV itself. After performing several simplifications in the geometrical details, a computational mesh with ∼50 million cells is generated and used to compute the steady-state pressure and flow fields from a Reynolds-Averaged Navier–Stokes model with SST k-ω turbulence closure. The coolant mass flow rate turns out to be distributed 10% through the inboard and the remaining 90% through the outboard. The toroidal and poloidal ribs present in the VV structure constitute significant barriers for the flow, giving rise to large recirculation regions. The pressure drop is mainly localized in the inlet and outlet piping.
Institute of Scientific and Technical Information of China (English)
FAN JianXing; YANG HuaZhong; WANG Hui; YAN XiaoLang; HOU ChaoHuan
2007-01-01
Phase noise analysis of an oscillator is implemented with its periodic time-varying small-signal state equations, obtained by perturbing the autonomous large-signal state equations of the oscillator. In this paper, the time-domain steady solutions of oscillators are perturbed with the traditional regular method; the periodic time-varying Jacobian modulus matrices are decomposed with the Sylvester theorem, and on the resulting space spanned by periodic vectors, the conditions under which the oscillator holds periodic steady states under any perturbation are analyzed. Stochastic calculus is applied to disclose the generation process of phase noise and to calculate the phase jitter of the oscillator by injecting into the oscillator a pseudo-sinusoidal signal in the frequency domain, representing white noise, and a δ-correlated signal in the time domain. Applying the principle of frequency modulation, we show how the power-law and Lorentzian spectra are formed; their relations and the Lorentzian spectra of harmonics are also worked out. Based on the periodic Jacobian modulus matrix, simple algorithms for the Floquet exponents and phase noise are constructed, and a simple case is demonstrated. The remaining difficulties in the phase noise analysis of oscillators, and future directions, are also pointed out at the end.
Regularizing portfolio optimization
Still, Susanne; Kondor, Imre
2010-07-01
The optimization of large portfolios displays an inherent instability due to estimation error. This poses a fundamental problem, because solutions that are not stable under sample fluctuations may look optimal for a given sample, but are, in effect, very far from optimal with respect to the average risk. In this paper, we approach the problem from the point of view of statistical learning theory. The occurrence of the instability is intimately related to over-fitting, which can be avoided using known regularization methods. We show how regularized portfolio optimization with the expected shortfall as a risk measure is related to support vector regression. The budget constraint dictates a modification. We present the resulting optimization problem and discuss the solution. The L2 norm of the weight vector is used as a regularizer, which corresponds to a diversification 'pressure'. This means that diversification, besides counteracting downward fluctuations in some assets by upward fluctuations in others, is also crucial because it improves the stability of the solution. The approach we provide here allows for the simultaneous treatment of optimization and diversification in one framework that enables the investor to trade off between the two, depending on the size of the available dataset.
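A minimal sketch of the regularized problem described above: empirical expected shortfall plus an L2 penalty, minimized under the budget constraint. The return sample, penalty weight, and use of a generic off-the-shelf optimizer (rather than the support-vector-regression formulation derived in the paper) are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Simulated return sample: 5 assets, 100 observations (a small sample,
# the regime where estimation error destabilizes the optimum).
r = rng.normal(0.001, 0.02, size=(100, 5))

def objective(w, lam, alpha=0.95):
    losses = -r @ w
    var = np.quantile(losses, alpha)
    # Empirical expected shortfall plus an L2 'diversification pressure'.
    es = losses[losses >= var].mean()
    return es + lam * w @ w

cons = {"type": "eq", "fun": lambda w: w.sum() - 1.0}  # budget constraint
w0 = np.full(5, 0.2)
sol = minimize(objective, w0, args=(0.5,), constraints=cons)
print("weights:", np.round(sol.x, 3), "sum:", round(sol.x.sum(), 3))
```

Increasing `lam` pulls the weights toward the equal-weight portfolio, which is the stabilizing "diversification pressure" the abstract describes; the empirical ES term is non-smooth, so a linear-programming reformulation would be preferable for production use.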
On regular rotating black holes
Torres, R.; Fayos, F.
2017-01-01
Different proposals for regular rotating black hole spacetimes have appeared recently in the literature. However, a rigorous analysis and proof of the regularity of this kind of spacetimes is still lacking. In this note we analyze rotating Kerr-like black hole spacetimes and find the necessary and sufficient conditions for the regularity of all their second order scalar invariants polynomial in the Riemann tensor. We also show that the regularity is linked to a violation of the weak energy conditions around the core of the rotating black hole.
Statistics and data analysis for financial engineering with R examples
Ruppert, David
2015-01-01
The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...
Statistical mechanics analysis of thresholding 1-bit compressed sensing
Xu, Yingying
2016-01-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal by only using the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the L2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter to the quantization process was advanced. In this paper, we analyze the typical behavior of the thresholding 1-bit compressed sensing utilizing the replica method of statistical mechanics, so as to gain an insight for properly setting the threshold value. Our result shows that, fixing the threshold at a constant value yields better performance than varying it randomly when the constant is optimally tuned, statistically. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. In order to handle this inconvenience, we ...
Sensitivity Analysis and Statistical Convergence of a Saltating Particle Model
Maldonado, S
2016-01-01
Saltation models provide considerable insight into near-bed sediment transport. This paper outlines a simple, efficient numerical model of stochastic saltation, which is validated against previously published experimental data on saltation in a channel of nearly horizontal bed. Convergence tests are systematically applied to ensure the model is free from statistical errors emanating from the number of particle hops considered. Two criteria for statistical convergence are derived; according to the first criterion, at least $10^3$ hops appear to be necessary for convergent results, whereas $10^4$ saltations seem to be the minimum required in order to achieve statistical convergence in accordance with the second criterion. Two empirical formulae for lift force are considered: one dependent on the slip (relative) velocity of the particle multiplied by the vertical gradient of the horizontal flow velocity component; the other dependent on the difference between the squares of the slip velocity components at the to...
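The convergence checks described above can be illustrated with a toy Monte Carlo experiment: estimate a mean hop length from increasing numbers of simulated hops and watch the sampling error shrink. This is a sketch only — the lognormal hop-length distribution is an assumption for illustration, not the paper's saltation model:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_hop_length(n_hops):
    """Sample mean of n_hops simulated hop lengths (toy lognormal model)."""
    return rng.lognormal(mean=0.0, sigma=0.5, size=n_hops).mean()

# Analytical mean of the assumed lognormal, for reference.
true_mean = np.exp(0.5 ** 2 / 2)

# Relative error of the estimated mean for increasing numbers of hops:
# the error shrinks roughly as 1/sqrt(n), which is why convergence
# criteria demand thousands of hops rather than hundreds.
errors = {n: abs(mean_hop_length(n) - true_mean) / true_mean
          for n in (10**2, 10**3, 10**4, 10**5)}
```

For this toy model the relative error at 10^5 hops is typically well below 1%; a real saltation model would apply the same check to each physical descriptor of interest.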
Analysis of Alignment Influence on 3-D Anthropometric Statistics
Institute of Scientific and Technical Information of China (English)
CAI Xiuwen; LI Zhizhong; CHANG Chien-Chi; DEMPSEY Patrick
2005-01-01
Three-dimensional (3-D) surface anthropometry can provide much more useful information for many applications, such as ergonomic product design, than traditional individual body dimension measurements. However, the traditional definition of the percentile calculation is designed only for 1-D anthropometric data estimates; the same approach cannot be applied directly to 3-D anthropometric statistics, as it could lead to misinterpretations. In this paper, the influence of alignment references on 3-D anthropometric statistics is analyzed mathematically. The analysis shows that different alignment reference points (for example, landmarks) for translation alignment can result in different object shapes when 3-D anthropometric data are processed for percentile values based on coordinates, and that dimension percentile calculations based on coordinate statistics are incompatible with those traditionally based on individual dimensions.
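The incompatibility noted above is easy to demonstrate numerically: a "shape" assembled from coordinate-wise 95th percentiles is matched in all coordinates simultaneously by far fewer than 95% of subjects. The data below are synthetic stand-ins for landmark coordinates, not real anthropometric measurements:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "3-D anthropometric" data: 500 subjects x 3 landmark coordinates
# (hypothetical values, for illustration only).
subjects = rng.normal(loc=[170.0, 45.0, 30.0], scale=[8.0, 4.0, 2.0],
                      size=(500, 3))

# A "shape" built from the coordinate-wise 95th percentiles.
p95_shape = np.percentile(subjects, 95, axis=0)

# Fraction of subjects at or below the 95th percentile in ALL three
# coordinates simultaneously -- noticeably fewer than 95%.
joint_frac = np.mean(np.all(subjects <= p95_shape, axis=1))
```

For independent coordinates the joint fraction is close to 0.95³ ≈ 0.86, which is one concrete sense in which per-coordinate percentiles do not describe any actual 95th-percentile body.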
Statistical analysis of natural disasters and related losses
Pisarenko, VF
2014-01-01
The study of disaster statistics and disaster occurrence is a complicated interdisciplinary field involving the interplay of new theoretical findings from several scientific fields like mathematics, physics, and computer science. Statistical studies on the mode of occurrence of natural disasters largely rely on fundamental findings in the statistics of rare events, which were derived in the 20th century. With regard to natural disasters, it is not so much the fact that the importance of this problem for mankind was recognized during the last third of the 20th century - the myths one encounters in ancient civilizations show that the problem of disasters has always been recognized - rather, it is the fact that mankind now possesses the necessary theoretical and practical tools to effectively study natural disasters, which in turn supports effective, major practical measures to minimize their impact. All the above factors have resulted in considerable progress in natural disaster research. Substantial accrued ma...
Analysis of linear weighted order statistics CFAR algorithm
Institute of Scientific and Technical Information of China (English)
孟祥伟; 关键; 何友
2004-01-01
CFAR techniques are widely used in radar target detection. The traditional algorithm is cell averaging (CA), which gives good detection performance in a relatively ideal environment. Recently, censoring techniques have been adopted to make the detector perform robustly, and ordered statistic (OS) and trimmed mean (TM) methods have been proposed. TM methods weight equally all reference samples that participate in the clutter power estimate, but this processing does not yield an effective estimate of the clutter power. Therefore, in this paper a quasi best weighted (QBW) order statistics algorithm is presented. In special cases, QBW reduces to CA and to the censored mean level detector (CMLD).
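For reference, textbook cell-averaging and ordered-statistic estimators can be sketched as below. This is a generic illustration — the paper's QBW weighting is not reproduced, and the window sizes and scale factors are arbitrary assumptions:

```python
import numpy as np

def ca_cfar(x, i, n_ref, n_guard, scale):
    """Cell-averaging CFAR: threshold cell i against the mean of n_ref
    reference cells on each side, skipping n_guard guard cells."""
    left = x[i - n_guard - n_ref : i - n_guard]
    right = x[i + n_guard + 1 : i + n_guard + 1 + n_ref]
    clutter_est = np.mean(np.concatenate([left, right]))
    return x[i] > scale * clutter_est

def os_cfar(x, i, n_ref, n_guard, k, scale):
    """Ordered-statistic CFAR: use the (k+1)-th smallest reference sample
    as the clutter-power estimate (robust to interfering targets)."""
    left = x[i - n_guard - n_ref : i - n_guard]
    right = x[i + n_guard + 1 : i + n_guard + 1 + n_ref]
    ref = np.sort(np.concatenate([left, right]))
    return x[i] > scale * ref[k]

rng = np.random.default_rng(2)
noise = rng.exponential(1.0, size=200)   # square-law-detected clutter
noise[100] = 50.0                        # strong target in cell 100
detected = ca_cfar(noise, 100, n_ref=8, n_guard=2, scale=5.0)
```

Both detectors flag the strong cell here; the difference between them shows up when interfering targets sit inside the reference window, where the order statistic is far less biased than the mean.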
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Statistical Analysis of Q-matrix Based Diagnostic Classification Models
Chen, Yunxiao; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang
2014-01-01
Diagnostic classification models have recently gained prominence in educational assessment, psychiatric evaluation, and many other disciplines. Central to the model specification is the so-called Q-matrix that provides a qualitative specification of the item-attribute relationship. In this paper, we develop theories on the identifiability for the Q-matrix under the DINA and the DINO models. We further propose an estimation procedure for the Q-matrix through the regularized maximum likelihood. The applicability of this procedure is not limited to the DINA or the DINO model and it can be applied to essentially all Q-matrix based diagnostic classification models. Simulation studies are conducted to illustrate its performance. Furthermore, two case studies are presented. The first case is a data set on fraction subtraction (educational application) and the second case is a subsample of the National Epidemiological Survey on Alcohol and Related Conditions concerning the social anxiety disorder (psychiatric application). PMID:26294801
A Practical Application of Statistical Gap Analysis in National Park Management in Costa Rica
Directory of Open Access Journals (Sweden)
Aguirre González, Juan Antonio
2009-04-01
Full Text Available If the predicted tourism growth materializes, Costa Rica's protected areas will see major increases in visitation. A study conducted in Volcan Poas National Park and Volcan Turrialba National Park, two of Costa Rica's leading volcanic crater parks, was undertaken to give national park and protected area managers a procedure that can be used to measure the satisfaction of visitors to Costa Rica's national parks, via an adapted form of the expectations-disconfirmation theory, and to evaluate whether the results can be used to establish which areas of park infrastructure, services and recreational options need improvement, and which management decisions would enhance visitor satisfaction. The sample included 1414 surveys. The findings indicate that the procedure adapted from the expectations-disconfirmation model proved helpful in: a) focusing management decisions in the short and medium term and supporting the Tourist Management Plans being developed at the two sites; b) guiding park managers in the resource allocation process, under the conditions of scarcity that are so common in developing countries; c) facilitating regular monitoring of conditions, with a simple and quick methodology that can be used both for day-to-day decisions and for more sophisticated statistical analysis; d) identifying the areas of protected-area management that need further analysis, thereby contributing to the development of long-term socio-economic research programs in national parks; e) showing the real importance of information and education activities in national parks, a combination of activities that seems critical to enhancing visitor satisfaction everywhere, particularly as a means of understanding whether visitors' needs and expectations are met and whether they receive what they should, and as a context for
Bayesian statistical analysis of censored data in geotechnical engineering
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans
2000-01-01
The geotechnical engineer is often faced with the problem of how to assess the statistical properties of a soil parameter on the basis of a sample measured in-situ or in the laboratory with the defect that some values have been replaced by interval bounds because the corresponding soil parameter values...
Statistical analysis of lightning electric field measured under Malaysian condition
Salimi, Behnam; Mehranzamir, Kamyar; Abdul-Malek, Zulkurnain
2014-02-01
Lightning is an electrical discharge during thunderstorms that occurs either within clouds (inter-cloud) or between cloud and ground (cloud-to-ground). Lightning characteristics and their statistical information are the foundation for the design of lightning protection systems as well as for the calculation of lightning radiated fields. Nowadays, there are various techniques to detect lightning signals and to determine the parameters produced by a lightning flash, each with its own claimed performance. In this paper, the characteristics of captured broadband electric fields generated by cloud-to-ground lightning discharges in the south of Malaysia are analyzed. A total of 130 cloud-to-ground lightning flashes from 3 separate thunderstorm events (each lasting about 4-5 hours) were examined. Statistical analyses of the following signal parameters are presented: preliminary breakdown pulse train duration, time interval between preliminary breakdown and return stroke, stroke multiplicity, and the percentage of single-stroke flashes. The BIL model is also introduced to characterize lightning signature patterns. The statistical analyses show that about 79% of the lightning signals fit well with the BIL model. The maximum and minimum preliminary breakdown durations of the observed lightning signals are 84 ms and 560 µs, respectively. The statistical results also show that 7.6% of the flashes were single-stroke flashes, and the maximum number of strokes recorded was 14 per flash. A preliminary breakdown signature can be identified in more than 95% of the flashes.
Statistical Lineament Analysis in South Greenland Based on Landsat Imagery
DEFF Research Database (Denmark)
Conradsen, Knut; Nilsson, Gert; Thyrsted, Tage
1986-01-01
Linear features, mapped visually from MSS channel-7 photoprints (1: 1 000 000) of Landsat images from South Greenland, were digitized and analyzed statistically. A sinusoidal curve was fitted to the frequency distribution which was then divided into ten significant classes of azimuthal trends. Maps...
Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis
Magoti, Edwin
2016-01-01
Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…
STATISTICAL ANALYSIS OF SOME EXPERIMENTAL FATIGUE TESTS RESULTS
Directory of Open Access Journals (Sweden)
Adrian Stere PARIS
2012-05-01
Full Text Available The paper details the results of processing experimental fatigue test data to find the regression function. Statistical processing software for ANOVA and regression calculations is applied, with emphasis on popular tools such as MS Excel and CurveExpert
Statistical simulation and counterfactual analysis in social sciences
Directory of Open Access Journals (Sweden)
François Gélineau
2012-06-01
Full Text Available In this paper, we present statistical simulation techniques of interest in substantial interpretation of regression results. Taking stock of recent literature on causality, we argue that such techniques can operate within a counterfactual framework. To illustrate, we report findings using post-electoral data on voter turnout.
Statistical Analysis of Human Reliability of Armored Equipment
Institute of Scientific and Technical Information of China (English)
LIU Wei-ping; CAO Wei-guo; REN Jing
2007-01-01
Human errors occurring during field tests of seven types of armored equipment are statistically analyzed. The ratio of human errors to armored-equipment failures is obtained. The causes of human errors are analyzed, and the distribution law of human errors is derived. The human error ratio and the human reliability index are also calculated.
Modular Regularization Algorithms
DEFF Research Database (Denmark)
Jacobsen, Michael
2004-01-01
The class of linear ill-posed problems is introduced along with a range of standard numerical tools and basic concepts from linear algebra, statistics and optimization. Known algorithms for solving linear inverse ill-posed problems are analyzed to determine how they can be decomposed into independent modules. These modules are then combined to form new regularization algorithms with other properties than those we started out with. Several variations are tested using the Matlab toolbox MOORe Tools created in connection with this thesis. Object oriented programming techniques are explained and used to set up the ill-posed problems in the toolbox. Hereby, we are able to write regularization algorithms that automatically exploit structure in the ill-posed problem without being rewritten explicitly. We explain how to implement a stopping criterion for a parameter choice method based upon...
Statistical Model Analysis of (n, α) Cross Sections for 4.0-6.5 MeV Neutrons
Directory of Open Access Journals (Sweden)
Khuukhenkhuu G.
2016-01-01
Full Text Available The statistical model based on the Weisskopf-Ewing theory and the constant nuclear temperature approximation is used for systematic analysis of the 4.0-6.5 MeV neutron induced (n, α) reaction cross sections. The α-clusterization effect was considered in the (n, α) cross sections. A certain dependence of the (n, α) cross sections on the relative neutron excess parameter of the target nuclei was observed. The systematic regularity of the (n, α) cross section behaviour is useful to estimate the same reaction cross sections for unstable isotopes. The results of our analysis can be used for nuclear astrophysical calculations such as helium burning and possible branching in the s-process.
Abadjieva, Emilia; Abadjiev, Valentin
2017-06-01
The science that studies the transformation of motion according to a preliminarily defined law between (in the general case) non-coplanar axes of rotation, or between an axis of rotation and a direction of rectilinear translation, by three-link mechanisms equipped with higher kinematic joints can be treated as an independent branch of Applied Mechanics. It deals with the mechanical behaviour of these multibody systems in relation to the kinematic and geometric characteristics of the elements of the higher kinematic joints that form them. The object of study here is the process of regular transformation of rotation into translation. The developed mathematical model is applied to the task of studying the sliding velocity vector function at the contact point of the surface elements of arbitrary higher kinematic joints. The main kinematic characteristics of the studied type of motion transformation (kinematic cylinders, kinematic relative helices (helical conoids) and kinematic pitch configurations) are defined on the basis of the realized analysis. These features expand the theoretical knowledge that is the objective of gearing theory. They also complement the system of kinematic and geometric primitives that form the mathematical model for the synthesis of spatial rack mechanisms.
Interfaces between statistical analysis packages and the ESRI geographic information system
Masuoka, E.
1980-01-01
Interfaces between ESRI's geographic information system (GIS) data files and real valued data files written to facilitate statistical analysis and display of spatially referenced multivariable data are described. An example of data analysis which utilized the GIS and the statistical analysis system is presented to illustrate the utility of combining the analytic capability of a statistical package with the data management and display features of the GIS.
Statistical traffic modeling of MPEG frame size: Experiments and Analysis
Directory of Open Access Journals (Sweden)
Haniph A. Latchman
2009-12-01
Full Text Available For guaranteed quality of service (QoS) and sufficient bandwidth in a communication network which provides an integrated multimedia service, it is important to obtain an analytical and tractable model of the compressed MPEG data. This paper presents a statistical approach to a group-of-pictures (GOP) MPEG frame-size model to increase network traffic performance in a communication network. We extract MPEG frame data from commercial DVD movies and build probability histograms to analyze the statistical characteristics of the MPEG frame data. Six candidate probability distributions are considered here and their parameters are obtained from the empirical data using maximum likelihood estimation (MLE). This paper shows that the lognormal distribution is the best-fitting model of MPEG-2 total frame data.
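The MLE step for the lognormal candidate is particularly simple: the lognormal MLE reduces to fitting a normal distribution to the log of the data. A sketch with synthetic stand-in frame sizes (the paper's DVD-derived data are not available here):

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for observed MPEG frame sizes (bytes); synthetic, for
# illustration only.
frame_sizes = rng.lognormal(mean=9.0, sigma=0.4, size=5000)

# MLE for a lognormal: fit a normal to the log of the data.
log_x = np.log(frame_sizes)
mu_hat = log_x.mean()
sigma_hat = log_x.std()   # MLE uses the biased (1/n) estimator
```

The fitted (mu_hat, sigma_hat) pair fully specifies the lognormal model; goodness of fit against the other candidate distributions would then be compared on the empirical histograms.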
PERFORMANCE ANALYSIS OF SECOND-ORDER STATISTICS FOR CYCLOSTATIONARY SIGNALS
Institute of Scientific and Technical Information of China (English)
姜鸣; 陈进
2002-01-01
The second-order statistics for cyclostationary signals are introduced and their performance is discussed, with particular attention to the time-lag characteristic of the cyclic autocorrelation function and the spectral correlation characteristic of the spectral correlation density function. It is pointed out that these functions can be used to extract the time-varying information of this kind of non-stationary signal. Using the time lag-cyclic frequency and frequency-cyclic frequency relations independently, vibration signals of a rolling element bearing measured on a test bed were analyzed. The results indicate that second-order cyclostationary statistics may provide a powerful tool for feature extraction and fault diagnosis of rolling element bearings.
A Statistical Framework for the Functional Analysis of Metagenomes
Energy Technology Data Exchange (ETDEWEB)
Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.
2008-10-01
Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
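The Lander-Waterman theory mentioned above gives closed-form expectations for shotgun-sequencing coverage. As a minimal illustration, here is the standard coverage formula (the general result the framework builds on, not the paper's gene-family frequency estimator itself):

```python
import math

def lw_expected_coverage(n_reads, read_len, genome_len):
    """Lander-Waterman: with per-base coverage c = N*L/G, the expected
    fraction of the genome covered by at least one read is 1 - e^(-c)."""
    c = n_reads * read_len / genome_len
    return 1.0 - math.exp(-c)

# Example: 1e6 reads of 100 bp over a 10 Mb metagenome -> 10x coverage,
# so essentially the whole sequence is expected to be covered.
frac = lw_expected_coverage(1_000_000, 100, 10_000_000)
```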
STATISTICAL ANALYSIS OF THE SCFE OF A BRAZILIAN MINERAL COAL
Directory of Open Access Journals (Sweden)
DARIVA Cláudio
1997-01-01
Full Text Available The influence of some process variables on the productivity of the fractions (liquid yield times fraction percent) obtained from SCFE of a Brazilian mineral coal using isopropanol and ethanol as primary solvents is analyzed using statistical techniques. A full factorial 2³ experimental design was adopted to investigate the effects of the process variables (temperature, pressure and cosolvent concentration) on the extraction products. The extracts were analyzed by the Preparative Liquid Chromatography-8 fractions method (PLC-8), a reliable, non-destructive solvent fractionation method especially developed for coal-derived liquids. Empirical statistical modeling was carried out in order to reproduce the experimental data; correlations obtained were always greater than 0.98. Four specific process criteria were used to allow process optimization. The results show that it is not possible to maximize both extract productivity and purity (through minimization of the heavy fraction content) simultaneously by manipulating the mentioned process variables.
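A full factorial 2³ design evaluates every combination of three two-level factors in eight runs; the main effect of each factor is the mean response at the high level minus the mean at the low level. A sketch with hypothetical responses (illustrative numbers only, not the paper's data):

```python
import numpy as np

# Coded design matrix for a full factorial 2^3 design:
# temperature, pressure, cosolvent concentration at low (-1) / high (+1).
levels = np.array([[t, p, c] for t in (-1, 1)
                             for p in (-1, 1)
                             for c in (-1, 1)])

# Hypothetical productivity responses for the 8 runs.
y = np.array([12.0, 15.0, 11.0, 14.5, 13.0, 16.5, 12.5, 16.0])

# Main effect of each factor: mean response at +1 minus mean at -1.
effects = {name: y[levels[:, j] == 1].mean() - y[levels[:, j] == -1].mean()
           for j, name in enumerate(["temperature", "pressure", "cosolvent"])}
```

With these made-up numbers the cosolvent concentration has the largest main effect; an ANOVA on the same design matrix would then test which effects are statistically significant.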
Statistical approaches for the analysis of DNA methylation microarray data.
Siegmund, Kimberly D
2011-06-01
Following the rapid development and adoption of DNA methylation microarray assays, we are now experiencing a growth in the number of statistical tools to analyze the resulting large-scale data sets. As is the case for other microarray applications, biases caused by technical issues are of concern. Some of these issues are old (e.g., two-color dye bias and probe- and array-specific effects), while others are new (e.g., fragment length bias and bisulfite conversion efficiency). Here, I highlight characteristics of DNA methylation that suggest standard statistical tools developed for other data types may not be directly suitable. I then describe the microarray technologies most commonly in use, along with the methods used for preprocessing and obtaining a summary measure. I finish with a section describing downstream analyses of the data, focusing on methods that model percentage DNA methylation as the outcome, and methods for integrating DNA methylation with gene expression or genotype data.
Ambiguity and nonidentifiability in the statistical analysis of neural codes
Amarasingham, Asohan; Geman, Stuart; Harrison, Matthew T.
2015-01-01
Many experimental studies of neural coding rely on a statistical interpretation of the theoretical notion of the rate at which a neuron fires spikes. For example, neuroscientists often ask, “Does a population of neurons exhibit more synchronous spiking than one would expect from the covariability of their instantaneous firing rates?” For another example, “How much of a neuron’s observed spiking variability is caused by the variability of its instantaneous firing rate, and how much is caused by spike timing variability?” However, a neuron’s theoretical firing rate is not necessarily well-defined. Consequently, neuroscientific questions involving the theoretical firing rate do not have a meaning in isolation but can only be interpreted in light of additional statistical modeling choices. Ignoring this ambiguity can lead to inconsistent reasoning or wayward conclusions. We illustrate these issues with examples drawn from the neural-coding literature. PMID:25934918
Statistical analysis of motion contrast in optical coherence tomography angiography
Cheng, Yuxuan; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng
2015-01-01
Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific researches. In this work, based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential (AD) and complex differential (CD) Angio-OCT signals are derived. The theories are validated through the flow phantom and live animal experiments. Using the model developed in this work, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications in the improvement of motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.
Common misconceptions about data analysis and statistics
Motulsky, Harvey J
2015-01-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
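The P-hacking mistake described in point (1) is easy to demonstrate by simulation: testing pure noise repeatedly as data accumulate, and stopping as soon as p < 0.05, inflates the false-positive rate well above the nominal 5%. A sketch (using a simple z-test approximation; the sample sizes and peeking schedule are illustrative assumptions):

```python
import math
import numpy as np

rng = np.random.default_rng(4)

def pvalue_z(x):
    """Two-sided z-test p-value for mean 0 (normal approximation)."""
    z = x.mean() / (x.std(ddof=1) / math.sqrt(len(x)))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

def peek_until_significant(max_n=200, step=10, alpha=0.05):
    # Pure-noise data: any "significant" result is a false positive.
    # Test after every `step` new samples; stop at the first p < alpha.
    x = rng.normal(size=max_n)
    return any(pvalue_z(x[:n]) < alpha for n in range(step, max_n + 1, step))

# False-positive rate under optional stopping, vs. the nominal 5%.
fp_rate = np.mean([peek_until_significant() for _ in range(1000)])
```

The same data analyzed once at the final sample size would reject at roughly the nominal rate; it is the repeated peeking, not the test itself, that inflates the error.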
Statistics Analysis: FDI Absorbing in Jan-Jun
Institute of Scientific and Technical Information of China (English)
Jovi Shu
2008-01-01
According to China Customs statistics, from January to May 2008 the import and export volume of foreign-invested enterprises totaled US$563.612 billion, an increase of 21.28 percent over the same period of last year, 5.02 percentage points lower than the growth rate of the country (26.30 percent) in the same period, and accounting for 55.69 percent of the country's total imports and exports. (See Chart 1)
Statistical Mechanical Analysis of Compressed Sensing Utilizing Correlated Compression Matrix
Takeda, Koujin
2010-01-01
We investigate a reconstruction limit of compressed sensing for a reconstruction scheme based on the L1-norm minimization utilizing a correlated compression matrix with a statistical mechanics method. We focus on the compression matrix modeled as the Kronecker-type random matrix studied in research on multi-input multi-output wireless communication systems. We found that strong one-dimensional correlations between expansion bases of original information slightly degrade reconstruction performance.
Statistical Mechanics Analysis of LDPC Coding in MIMO Gaussian Channels
Alamino, Roberto C.; Saad, David
2007-01-01
Using analytical methods of statistical mechanics, we analyse the typical behaviour of a multiple-input multiple-output (MIMO) Gaussian channel with binary inputs under LDPC network coding and joint decoding. The saddle point equations for the replica symmetric solution are found in particular realizations of this channel, including a small and large number of transmitters and receivers. In particular, we examine the cases of a single transmitter, a single receiver and the symmetric and asymm...
Learning to Translate: A Statistical and Computational Analysis
Directory of Open Access Journals (Sweden)
Marco Turchi
2012-01-01
Full Text Available We present an extensive experimental study of Phrase-based Statistical Machine Translation, from the point of view of its learning capabilities. Very accurate Learning Curves are obtained, using high-performance computing, and extrapolations of the projected performance of the system under different conditions are provided. Our experiments confirm existing and mostly unpublished beliefs about the learning capabilities of statistical machine translation systems. We also provide insight into the way statistical machine translation learns from data, including the respective influence of translation and language models, the impact of phrase length on performance, and various unlearning and perturbation analyses. Our results support and illustrate the fact that performance improves by a constant amount for each doubling of the data, across different language pairs, and different systems. This fundamental limitation seems to be a direct consequence of Zipf law governing textual data. Although the rate of improvement may depend on both the data and the estimation method, it is unlikely that the general shape of the learning curve will change without major changes in the modeling and inference phases. Possible research directions that address this issue include the integration of linguistic rules or the development of active learning procedures.
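The "constant improvement per doubling" behaviour described above corresponds to a learning curve that is linear in log2 of the data size. A minimal fitting sketch (the scores below are hypothetical stand-ins, not the paper's measurements):

```python
import numpy as np

# Hypothetical BLEU-like scores at successive doublings of training data
# (illustrative numbers only).
n_sentences = np.array([10_000, 20_000, 40_000, 80_000, 160_000])
scores = np.array([18.1, 20.3, 22.2, 24.4, 26.3])

# "Constant gain per doubling" means score ~ a + b * log2(n);
# b is then the gain per doubling of the data.
b, a = np.polyfit(np.log2(n_sentences), scores, 1)

# Extrapolated score at the next doubling (320k sentences).
predicted_next = a + b * np.log2(320_000)
```

Here b is about 2 score points per doubling, so each further gain demands twice the data of the last one — the diminishing-returns pattern the abstract attributes to Zipf's law.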
The Digital Divide in Romania – A Statistical Analysis
Directory of Open Access Journals (Sweden)
Daniela BORISOV
2012-06-01
Full Text Available The digital divide is a subject of major importance in the current economic circumstances, in which Information and Communication Technologies (ICT) are seen as a significant determinant of domestic competitiveness and a contributor to better quality of life. The latest international reports regarding various aspects of ICT usage in modern society reveal a decrease of overall digital disparity towards the average trends of the worldwide ICT sector; this relates to recent advances in mobile and computer penetration rates, both for personal use and for households and businesses. In Romania, the low starting point in the development of the economy and society in the ICT direction was, to some extent, compensated by the rapid annual growth of the last decade. Even with these dynamic developments, the statistical data still indicate a poor position in the European Union hierarchy; in this respect, the prospect of a rapid recovery of the low performance of Romanian ICT endowment and usage continues to be regarded as a challenge for progress in economic and societal terms. The paper presents several methods for assessing the current state of ICT-related aspects of Internet usage, based on the latest data provided by international databases. The current position of the Romanian economy is judged using statistical methods based on variability measurements: descriptive statistics indicators, static measures of disparities and distance metrics.
Statistical mechanics analysis of thresholding 1-bit compressed sensing
Xu, Yingying; Kabashima, Yoshiyuki
2016-08-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal by only using the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the l2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter to the quantization process was advanced. In this paper, we analyze the typical behavior of thresholding 1-bit compressed sensing utilizing the replica method of statistical mechanics, so as to gain an insight for properly setting the threshold value. Our result shows that fixing the threshold at a constant value yields better performance than varying it randomly when the constant is optimally tuned, statistically. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. In order to handle this inconvenience, we develop a heuristic that adaptively tunes the threshold parameter based on the frequency of positive (or negative) values in the binary outputs. Numerical experiments show that the heuristic exhibits satisfactory performance while incurring low computational cost.
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
Using multivariate statistical analysis to assess changes in water ...
African Journals Online (AJOL)
analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of ... International studies with regard to impacts on aquatic systems .... frequently used to assess for the impact of acidic deposition on.
Mazumdar, Madhu; Banerjee, Samprit; Van Epps, Heather L
2010-01-01
A majority of original articles published in biomedical journals include some form of statistical analysis. Unfortunately, many of the articles contain errors in statistical design and/or analysis. These errors are worrisome, as the misuse of statistics jeopardizes the process of scientific discovery and the accumulation of scientific knowledge. To help avoid these errors and improve statistical reporting, four approaches are suggested: (1) development of guidelines for statistical reporting that could be adopted by all journals, (2) improvement in statistics curricula in biomedical research programs with an emphasis on hands-on teaching by biostatisticians, (3) expansion and enhancement of biomedical science curricula in statistics programs, and (4) increased participation of biostatisticians in the peer review process along with the adoption of more rigorous journal editorial policies regarding statistics. In this chapter, we provide an overview of these issues with emphasis on the field of molecular biology and highlight the need for continuing efforts on all fronts.
Kotula, Paul G; Keenan, Michael R
2006-12-01
Multivariate statistical analysis methods have been applied to scanning transmission electron microscopy (STEM) energy-dispersive X-ray spectral images. The particular application of the multivariate curve resolution (MCR) technique provides a high spectral contrast view of the raw spectral image. The power of this approach is demonstrated with a microelectronics failure analysis. Specifically, an unexpected component describing a chemical contaminant was found, as well as a component consistent with a foil thickness change associated with the focused ion beam specimen preparation process. The MCR solution is compared with a conventional analysis of the same spectral image data set.
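The MCR technique used above resolves a spectral image into component spectra and per-pixel abundances. As a minimal stand-in, assuming a plain non-negative matrix factorization in place of the authors' MCR implementation and a synthetic data matrix rather than real STEM X-ray spectra, the decomposition might look like:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "spectral image": 100 pixels x 64 energy channels,
# mixed from 3 non-negative component spectra (illustrative data).
pixels, channels, k = 100, 64, 3
S_true = rng.random((k, channels))      # component spectra
C_true = rng.random((pixels, k))        # per-pixel abundances
D = C_true @ S_true                     # observed data matrix

# Multiplicative-update NMF: factor D ~ C @ S with C, S >= 0.
C = rng.random((pixels, k)) + 1e-3
S = rng.random((k, channels)) + 1e-3
for _ in range(500):
    S *= (C.T @ D) / (C.T @ C @ S + 1e-9)
    C *= (D @ S.T) / (C @ S @ S.T + 1e-9)

# Relative residual should be small for exactly low-rank data.
residual = float(np.linalg.norm(D - C @ S) / np.linalg.norm(D))
print(residual < 0.2)  # True
```

In a real analysis the rows of `S` would be inspected as candidate component spectra (e.g. the unexpected contaminant), and the columns of `C` reshaped into spatial abundance maps.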
A COMPARATIVE STATISTICAL ANALYSIS OF RICE CULTIVARS DATA
Directory of Open Access Journals (Sweden)
Mugemangango Cyprien
2012-12-01
Full Text Available In this paper, rice cultivars data have been analysed by three different statistical techniques, viz. split-plot analysis in RBD, two-factor factorial analysis in RBD, and analysis of two-way classified data with several observations per cell. The powers of the tests under the different methods of analysis have been calculated. The method of two-way classified data with several observations per cell is found to be better, followed by the two-factor factorial technique in RBD and split-plot analysis, for analyzing the given data.
Research on the integrative strategy of spatial statistical analysis of GIS
Xie, Zhong; Han, Qi Juan; Wu, Liang
2008-12-01
Presently, spatial social and natural phenomena are studied with both GIS techniques and statistical methods. However, many complex practical applications restrict these research methods, and the data models and technologies they employ remain highly localized. This paper first summarizes the requirements of spatial statistical analysis. On the basis of these requirements, universal spatial statistical models are transformed into function tools in a statistical GIS system, and a pyramidal structure of three layers is put forward. It thus becomes feasible to combine the GIS techniques of spatial data management, search and visualization with the data-processing methods of statistical analysis, forming an integrative statistical GIS environment for the management, analysis, application and assistant decision-making of spatial statistical information.
Statistical analysis of questionnaires a unified approach based on R and Stata
Bartolucci, Francesco; Gnaldi, Michela
2015-01-01
Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing.The book covers the foundations of classical test theory (CTT), test reliability, va
JAWS data collection, analysis highlights, and microburst statistics
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project, initiated in relation to the microburst phenomenon, are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with the majority of events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Symbolic Data Analysis Conceptual Statistics and Data Mining
Billard, Lynne
2012-01-01
With the advent of computers, very large datasets have become routine. Standard statistical methods do not have the power or flexibility to analyse these efficiently and extract the required knowledge. An alternative approach is to summarize a large dataset in such a way that the resulting summary dataset is of a manageable size and yet retains as much of the knowledge in the original dataset as possible. One consequence of this is that the data may no longer be formatted as single values, but be represented by lists, intervals, distributions, etc. The summarized data have their own internal s
Statistical analysis of the metrological properties of float glass
Yates, Brian W.; Duffy, Alan M.
2008-08-01
The radius of curvature, slope error, surface roughness and associated height distribution and power spectral density of uncoated commercial float glass samples have been measured in our Canadian Light Source Optical Metrology Facility, using our Micromap-570 surface profiler and long trace profilometer. The statistical differences in these parameters have been investigated between the tin and air sides of float glass. The effect of soaking the float glass in sulfuric acid to try to dissolve the tin contamination has also been investigated, and untreated and post-treatment surface roughness measurements compared. We report the results of our studies on these float glass samples.
Introduction to statistical data analysis for the life sciences
Ekstrom, Claus Thorn
2014-01-01
This text provides a computational toolbox that enables students to analyze real datasets and gain the confidence and skills to undertake more sophisticated analyses. Although accessible with any statistical software, the text encourages a reliance on R. For those new to R, an introduction to the software is available in an appendix. The book also includes end-of-chapter exercises as well as an entire chapter of case exercises that help students apply their knowledge to larger datasets and learn more about approaches specific to the life sciences.
An invariant approach to statistical analysis of shapes
Lele, Subhash R
2001-01-01
INTRODUCTION: A Brief History of Morphometrics; Foundations for the Study of Biological Forms; Description of the Data Sets. MORPHOMETRIC DATA: Types of Morphometric Data; Landmark Homology and Correspondence; Collection of Landmark Coordinates; Reliability of Landmark Coordinate Data; Summary. STATISTICAL MODELS FOR LANDMARK COORDINATE DATA: Statistical Models in General; Models for Intra-Group Variability; Effect of Nuisance Parameters; Invariance and Elimination of Nuisance Parameters; A Definition of Form; Coordinate System Free Representation of Form; Est
Fatigue Crack Propagation: Probabilistic Modeling and Statistical Analysis.
1988-03-23
School of Physics "Enrico Fermi" (1986) (eds. D.V. Lindley and C.A. Clarotti) Amsterdam: North Holland (with Morris H. DeGroot ) An accelerated life...Festschrift in Honor of Ingram Olkin 1988, Editors: Jim Press & Leon Jay Gleser (with Morris H. DeGroot and Maria J. Bayarri) New York: Springer-Verlag...389, Department of Statistics, Ohio State University (with Morris H. DeGroot ) In this paper, the concepts of comparison of experiments in the context
Statistical distributions of potential interest in ultrasound speckle analysis
Energy Technology Data Exchange (ETDEWEB)
Nadarajah, Saralees [School of Mathematics, University of Manchester, Manchester M60 1QD (United Kingdom)
2007-05-21
Compound statistical modelling of the uncompressed envelope of the backscattered signal has received much interest recently. In this note, a comprehensive collection of models is derived for the uncompressed envelope of the backscattered signal by compounding the Nakagami distribution with 13 flexible families. The corresponding estimation procedures are derived by the method of moments and the method of maximum likelihood. The sensitivity of the models to their various parameters is examined. It is expected that this work could serve as a useful reference and lead to improved modelling of the uncompressed envelope of the backscattered signal. (note)
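A compound Nakagami model of the kind described can be sketched by drawing the Nakagami scale parameter itself from a mixing distribution. The gamma mixing family and all parameter values below are illustrative assumptions (the paper compounds with 13 different flexible families); the sketch exploits the fact that a Nakagami(m, Ω) variate is the square root of a Gamma(shape=m, scale=Ω/m) variate.

```python
import numpy as np

rng = np.random.default_rng(2)

# Compound Nakagami sketch: the scale Omega is random, here gamma-distributed.
m = 2.0                  # Nakagami shape parameter (illustrative)
n = 50_000
omega = rng.gamma(shape=3.0, scale=1.0, size=n)    # random scale per draw
r = np.sqrt(rng.gamma(shape=m, scale=omega / m))   # envelope R | omega ~ Nakagami

# Method-of-moments check: for Nakagami(m, Omega), E[R^2] = Omega, so for
# the compound model E[R^2] = E[Omega] = 3.0 here.
mean_r2 = float(np.mean(r**2))
print(abs(mean_r2 - 3.0) < 0.1)  # True
```

Method-of-moments estimation, as mentioned in the note, would proceed by matching further empirical moments of `r` to their analytical expressions under the chosen compounding family.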
Statistical analysis of DNT detection using chemically functionalized microcantilever arrays
DEFF Research Database (Denmark)
Bosco, Filippo; Bache, M.; Hwu, E.-T.
2012-01-01
The need for miniaturized and sensitive sensors for explosives detection is increasing in areas such as security and demining. Micrometer sized cantilevers are often used for label-free detection, and have previously been reported to be able to detect explosives. However, only a few measurements...... from 1 to 2 cantilevers have been reported, without any information on repeatability and reliability of the presented data. In explosive detection high reliability is needed and thus a statistical measurement approach needs to be developed and implemented. We have developed a DVD-based read-out system...
Statistical Analysis for the Driving Cycle of Beijing's Bus
Institute of Scientific and Technical Information of China (English)
王震坡; 孙逢春; 王军; 孙立清
2004-01-01
According to the test data of the driving model for Beijing's bus routes, 9 parameters and their actual values for Beijing buses are established to evaluate the driving cycle. Two ways of building a driving cycle model are analyzed, the formula for calculating the driving cycle is obtained, and both a calculated driving cycle model and a statistical driving cycle model for buses in Beijing urban areas are set up. This study provides a scientific basis for selecting the bus type and determining the design parameters and running method in Beijing.
Detecting Hidden Encrypted Volume Files via Statistical Analysis
Directory of Open Access Journals (Sweden)
Mario Piccinelli
2015-05-01
Full Text Available Nowadays various software tools have been developed for the purpose of creating encrypted volume files. Many of those tools are open source and freely available on the internet. Because of that, the probability of finding encrypted files which could contain forensically useful information has dramatically increased. While decoding these files without the key is still a major challenge, the simple fact of being able to recognize their existence is now a top priority for every digital forensics investigation. In this paper we will present a statistical approach to find elements of a seized filesystem which have a reasonable chance of containing encrypted data.
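A statistical test of the kind described, which flags files whose byte distribution is statistically indistinguishable from uniform random data, can be sketched with a chi-square statistic over byte frequencies. The specific statistic and the synthetic test inputs below are illustrative choices, not the paper's exact method.

```python
import os
import numpy as np

def chi_square_uniform(data: bytes) -> float:
    """Chi-square statistic of byte frequencies against a uniform
    distribution over 0..255; encrypted or compressed data scores low,
    structured plaintext scores very high."""
    counts = np.bincount(np.frombuffer(data, dtype=np.uint8), minlength=256)
    expected = len(data) / 256
    return float(((counts - expected) ** 2 / expected).sum())

# Random bytes mimic an encrypted volume; repetitive text does not.
random_like = os.urandom(1 << 16)
text_like = b"the quick brown fox " * 3277

print(chi_square_uniform(random_like) < chi_square_uniform(text_like))  # True
```

In a forensic sweep, files scoring near the chi-square expectation for 255 degrees of freedom (and lacking a known file-format header) would be shortlisted as candidate encrypted volumes for further examination.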
Statistical analysis of multivariate atmospheric variables. [cloud cover
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate data to near-normal; (5) a test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) a test of fit for continuous distributions based upon the generalized minimum chi-square; (7) the effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
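Topic (8), generating random variates from a specified distribution, is classically done by inverse-transform sampling: apply the inverse CDF to a uniform draw. A minimal sketch (the exponential example and tolerance are our illustrative choices):

```python
import math
import random

def sample_exponential(rate: float) -> float:
    """Inverse-transform sampling: if U ~ Uniform(0,1), then
    -ln(U)/rate follows an Exponential(rate) distribution."""
    u = 1.0 - random.random()   # in (0, 1], avoids log(0)
    return -math.log(u) / rate

random.seed(0)
draws = [sample_exponential(2.0) for _ in range(100_000)]

# The sample mean should be close to 1/rate = 0.5.
print(abs(sum(draws) / len(draws) - 0.5) < 0.02)  # True
```

The same recipe works for any distribution with a tractable inverse CDF; otherwise rejection sampling or specialized algorithms are used.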