Gaussian mixture model of heart rate variability.
Directory of Open Access Journals (Sweden)
Tommaso Costa
Heart rate variability (HRV) is an important measure of sympathetic and parasympathetic functions of the autonomic nervous system and a key indicator of cardiovascular condition. This paper proposes a novel method to investigate HRV, namely by modelling it as a linear combination of Gaussians. Results show that three Gaussians are enough to describe the stationary statistics of heart rate variability and to provide a straightforward interpretation of the HRV power spectrum. Comparisons have also been made with synthetic data generated from different physiologically based models, showing the plausibility of the Gaussian mixture parameters.
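As an illustration of fitting a small Gaussian mixture to a one-dimensional series such as inter-beat intervals, the following is a minimal expectation-maximization sketch in plain Python. It is not the authors' implementation: the function name `fit_gmm_1d`, the quantile-based initialisation, and the variance floor `min_var` are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch).

    Returns (weights, means, variances). The initialisation and the variance
    floor are assumptions of this example, not taken from the paper.
    """
    n = len(data)
    ordered = sorted(data)
    # Start the means at evenly spaced quantiles, with a shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    var0 = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(var0, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, min_var)
    return weights, means, variances
```

Calling `fit_gmm_1d(intervals, 3)` would correspond to the three-component description mentioned in the abstract, where `intervals` is a hypothetical list of inter-beat intervals.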
Modeling text with generalizable Gaussian mixtures
DEFF Research Database (Denmark)
Hansen, Lars Kai; Sigurdsson, Sigurdur; Kolenda, Thomas
2000-01-01
We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss ...... the relation between supervised and unsupervised learning in the test data. Finally, we implement a novelty detector based on the density model....
A Gaussian Mixture Model for Nulling Pulsars
Kaplan, D. L.; Swiggum, J. K.; Fichtenbauer, T. D. J.; Vallisneri, M.
2018-03-01
The phenomenon of pulsar nulling, where pulsars occasionally turn off for one or more pulses, provides insight into pulsar-emission mechanisms and the processes by which pulsars turn off when they cross the “death line.” However, while ever more pulsars are found that exhibit nulling behavior, the statistical techniques used to measure nulling are biased, with limited utility and precision. In this paper, we introduce an improved algorithm, based on Gaussian mixture models, for measuring pulsar nulling behavior. We demonstrate this algorithm on a number of pulsars observed as part of a larger sample of nulling pulsars, and show that it performs considerably better than existing techniques, yielding better precision and no bias. We further validate our algorithm on simulated data. Our algorithm is widely applicable to a large number of pulsars even if they do not show obvious nulls. Moreover, it can be used to derive probabilities of nulling for individual pulses, which can be used for in-depth studies.
Sparse Gaussian graphical mixture model | Lotsi | Afrika Statistika
African Journals Online (AJOL)
Abstract. This paper considers the problem of networks reconstruction from heterogeneous data using a Gaussian Graphical Mixture Model (GGMM). It is well known that parameter estimation in this context is challenging due to large numbers of variables coupled with the degenerate nature of the likelihood. We propose as ...
Improved Gaussian Mixture Models for Adaptive Foreground Segmentation
DEFF Research Database (Denmark)
Katsarakis, Nikolaos; Pnevmatikakis, Aristodemos; Tan, Zheng-Hua
2016-01-01
Adaptive foreground segmentation is traditionally performed using Stauffer & Grimson’s algorithm that models every pixel of the frame by a mixture of Gaussian distributions with continuously adapted parameters. In this paper we provide an enhancement of the algorithm by adding two important dynamic...
Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs
DEFF Research Database (Denmark)
Jensen, Jesper Højvang; Ellis, Dan P. W.; Christensen, Mads Græsbøll
2007-01-01
In music similarity and in the related task of genre classification, a distance measure between Gaussian mixture models is frequently needed. We present a comparison of the Kullback-Leibler distance, the earth movers distance and the normalized L2 distance for this application. Although...
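Of the distances named above, the L2 distance has a closed form for Gaussian mixtures, because the integral of a product of two Gaussians is itself a Gaussian evaluated at the difference of the means. The sketch below works on one-dimensional mixtures given as lists of `(weight, mean, variance)` tuples. That representation, the function names, and the choice of scaling each density to unit L2 norm before taking the distance are assumptions made for illustration.

```python
import math


def gauss_overlap(m1, v1, m2, v2):
    """Integral over x of N(x; m1, v1) * N(x; m2, v2), i.e. N(m1; m2, v1 + v2)."""
    v = v1 + v2
    return math.exp(-0.5 * (m1 - m2) ** 2 / v) / math.sqrt(2.0 * math.pi * v)


def l2_inner(p, q):
    """L2 inner product of two 1-D mixtures, each a list of (weight, mean, variance)."""
    return sum(wp * wq * gauss_overlap(mp, vp, mq, vq)
               for wp, mp, vp in p
               for wq, mq, vq in q)


def normalized_l2_distance(p, q):
    """L2 distance between the two mixtures after scaling each to unit L2 norm."""
    cosine = l2_inner(p, q) / math.sqrt(l2_inner(p, p) * l2_inner(q, q))
    # Clamp at zero so rounding error cannot produce a negative radicand.
    return math.sqrt(max(0.0, 2.0 - 2.0 * cosine))
```

With this normalization the distance lies between 0 (identical shapes) and the square root of 2 (no overlap).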
Gaussian-input Gaussian mixture model for representing density maps and atomic models.
Kawabata, Takeshi
2018-03-06
A new Gaussian mixture model (GMM) has been developed for better representations of both atomic models and electron microscopy 3D density maps. The standard GMM algorithm employs an EM algorithm to determine the parameters. It accepts a set of 3D points with weights, corresponding to voxel or atomic centers. Although the standard algorithm worked reasonably well, it had three problems. First, it ignored the size (voxel width or atomic radius) of the input, and thus it could lead to a GMM with a smaller spread than the input. Second, the algorithm had a singularity problem, as it sometimes stopped the iterative procedure due to a Gaussian function with almost zero variance. Third, a map with a large number of voxels required a long computation time for conversion to a GMM. To solve these problems, we have introduced a Gaussian-input GMM algorithm, which considers the input atoms or voxels as a set of Gaussian functions. The standard EM algorithm of GMM was extended to optimize the new GMM. The new GMM has a radius of gyration identical to that of the input, and does not suddenly stop due to the singularity problem. For fast computation, we have introduced down-sampled Gaussian functions (DSG), obtained by merging neighboring voxels into an anisotropic Gaussian function. This provides a GMM with thousands of Gaussian functions in a short computation time. We have also introduced a DSG-input GMM: the Gaussian-input GMM with the DSG as the input. This new algorithm is much faster than the standard algorithm. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
Automatic image equalization and contrast enhancement using Gaussian mixture modeling.
Celik, Turgay; Tjahjadi, Tardi
2012-01-01
In this paper, we propose an adaptive image equalization algorithm that automatically enhances the contrast in an input image. The algorithm uses the Gaussian mixture model to model the image gray-level distribution, and the intersection points of the Gaussian components in the model are used to partition the dynamic range of the image into input gray-level intervals. The contrast equalized image is generated by transforming the pixels' gray levels in each input interval to the appropriate output gray-level interval according to the dominant Gaussian component and the cumulative distribution function of the input interval. To take account of the hypothesis that homogeneous regions in the image represent homogeneous silences (or set of Gaussian components) in the image histogram, the Gaussian components with small variances are weighted with smaller values than the Gaussian components with larger variances, and the gray-level distribution is also used to weight the components in the mapping of the input interval to the output interval. Experimental results show that the proposed algorithm produces better or comparable enhanced images than several state-of-the-art algorithms. Unlike the other algorithms, the proposed algorithm is free of parameter setting for a given dynamic range of the enhanced image and can be applied to a wide range of image types.
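The partitioning step described above relies on the points where two weighted Gaussian components cross. Setting w1·N(x; m1, s1²) = w2·N(x; m2, s2²) and taking logarithms gives a quadratic in x, which the sketch below solves. The function names and the tolerance used to detect equal variances are assumptions of this example, and it covers only the intersection step, not the full equalization algorithm.

```python
import math


def weighted_pdf(x, w, mean, std):
    """Weighted univariate Gaussian density w * N(x; mean, std^2)."""
    return w * math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def component_intersections(w1, m1, s1, w2, m2, s2):
    """Return the sorted x values where the two weighted components are equal.

    Equating the log-densities yields a*x^2 + b*x + c = 0 with the
    coefficients below; equal variances reduce it to a linear equation.
    """
    a = 0.5 / s2 ** 2 - 0.5 / s1 ** 2
    b = m1 / s1 ** 2 - m2 / s2 ** 2
    c = (0.5 * m2 ** 2 / s2 ** 2 - 0.5 * m1 ** 2 / s1 ** 2
         + math.log((w1 * s2) / (w2 * s1)))
    if abs(a) < 1e-12:  # equal variances: at most one crossing
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:  # one component dominates everywhere
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
```

Crossings that fall inside the image's dynamic range could then serve as interval boundaries; which crossings to keep is a design decision outside this sketch.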
Detecting Clusters in Atom Probe Data with Gaussian Mixture Models.
Zelenty, Jennifer; Dahl, Andrew; Hyde, Jonathan; Smith, George D W; Moody, Michael P
2017-04-01
Accurately identifying and extracting clusters from atom probe tomography (APT) reconstructions is extremely challenging, yet critical to many applications. Currently, the most prevalent approach to detect clusters is the maximum separation method, a heuristic that relies heavily upon parameters manually chosen by the user. In this work, a new clustering algorithm, Gaussian mixture model Expectation Maximization Algorithm (GEMA), was developed. GEMA utilizes a Gaussian mixture model to probabilistically distinguish clusters from random fluctuations in the matrix. This machine learning approach maximizes the data likelihood via expectation maximization: given atomic positions, the algorithm learns the position, size, and width of each cluster. A key advantage of GEMA is that atoms are probabilistically assigned to clusters, thus reflecting scientifically meaningful uncertainty regarding atoms located near precipitate/matrix interfaces. GEMA outperforms the maximum separation method in cluster detection accuracy when applied to several realistically simulated data sets. Lastly, GEMA was successfully applied to real APT data.
Color Texture Segmentation by Decomposition of Gaussian Mixture Model
Czech Academy of Sciences Publication Activity Database
Grim, Jiří; Somol, Petr; Haindl, Michal; Pudil, Pavel
2006-01-01
Vol. 19, No. 4225 (2006), pp. 287-296 ISSN 0302-9743. [Iberoamerican Congress on Pattern Recognition. CIARP 2006 /11./. Cancun, 14.11.2006-17.11.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572; GA MŠk 2C06019 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords : texture segmentation * gaussian mixture model * EM algorithm Subject RIV: IN - Informatics, Computer Science Impact factor: 0.402, year: 2005 http://library.utia.cas.cz/separaty/historie/grim-color texture segmentation by decomposition of gaussian mixture model.pdf
Efficient speaker verification using Gaussian mixture model component clustering.
Energy Technology Data Exchange (ETDEWEB)
De Leon, Phillip L. (New Mexico State University, Las Cruces, NM); McClanahan, Richard D.
2012-04-01
In speaker verification (SV) systems that employ a support vector machine (SVM) classifier to make decisions on a supervector derived from Gaussian mixture model (GMM) component mean vectors, a significant portion of the computational load is involved in the calculation of the a posteriori probability of the feature vectors of the speaker under test with respect to the individual component densities of the universal background model (UBM). Further, the calculation of the sufficient statistics for the weight, mean, and covariance parameters derived from these same feature vectors also contributes a substantial amount of processing load to the SV system. In this paper, we propose a method that utilizes clusters of GMM-UBM mixture component densities in order to reduce the computational load required. In the adaptation step we score the feature vectors against the clusters, calculate the a posteriori probabilities, and update the statistics exclusively for mixture components belonging to appropriate clusters. Each cluster is a grouping of multivariate normal distributions and is modeled by a single multivariate distribution. As such, the set of multivariate normal distributions representing the different clusters also forms a GMM. This GMM is referred to as a hash GMM, which can be considered a lower-resolution representation of the GMM-UBM. The mapping that associates the components of the hash GMM with components of the original GMM-UBM is referred to as a shortlist. This research investigates various methods of clustering the components of the GMM-UBM and forming hash GMMs. Of the five methods presented, one method, Gaussian mixture reduction as proposed by Runnalls, easily outperformed the others. This method of Gaussian reduction iteratively reduces the size of a GMM by successively merging pairs of component densities. Pairs are selected for merger by using a Kullback-Leibler based metric. Using Runnalls' method of reduction, we
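The pairwise merging described above can be sketched for one-dimensional components as follows. Two components are replaced by a single Gaussian that preserves their combined weight, mean, and variance, and the pair with the smallest Kullback-Leibler based cost is merged first. The cost used here is the commonly quoted upper bound 0.5·[(wi+wj)·ln v − wi·ln vi − wj·ln vj] for the scalar case. Treat that formula, the tuple layout `(weight, mean, variance)`, and all function names as assumptions of this illustration, not as the report's code.

```python
import math


def merge_pair(c1, c2):
    """Moment-preserving merge of two components given as (weight, mean, variance)."""
    (w1, m1, v1), (w2, m2, v2) = c1, c2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Within-component variance plus the spread between the two means.
    v = (w1 * v1 + w2 * v2) / w + w1 * w2 * (m1 - m2) ** 2 / w ** 2
    return (w, m, v)


def merge_cost(c1, c2):
    """KL-based dissimilarity of a candidate merge (scalar case, assumed form)."""
    w, _, v = merge_pair(c1, c2)
    return 0.5 * (w * math.log(v) - c1[0] * math.log(c1[2]) - c2[0] * math.log(c2[2]))


def reduce_mixture(components, target_size):
    """Greedily merge the cheapest pair until target_size components remain."""
    comps = list(components)
    while len(comps) > max(target_size, 1):
        pairs = ((i, j) for i in range(len(comps)) for j in range(i + 1, len(comps)))
        i, j = min(pairs, key=lambda ij: merge_cost(comps[ij[0]], comps[ij[1]]))
        merged = merge_pair(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps
```

Recording which original components end up in each merged component would give the kind of mapping the abstract calls a shortlist.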
Gaussian Mixture Model and Rjmcmc Based RS Image Segmentation
Shi, X.; Zhao, Q. H.
2017-09-01
Image segmentation methods based on the Gaussian Mixture Model (GMM) have two problems: 1) the number of components is usually fixed, i.e., the number of classes is fixed, and 2) the GMM is sensitive to image noise. This paper proposes an RS image segmentation method that combines the GMM with reversible jump Markov Chain Monte Carlo (RJMCMC). In the proposed algorithm, a GMM models the distribution of pixel intensity in the RS image. The number of components is assumed to be a random variable, and a prior distribution is built for each parameter. To improve noise resistance, a Gibbs function is used to model the prior distribution of the GMM weight coefficients. The posterior distribution is then built according to Bayes' theorem. RJMCMC is used to simulate the posterior distribution and estimate its parameters. Finally, an optimal segmentation of the RS image is obtained. Experimental results show that the proposed algorithm can converge to the optimal number of classes and produce an ideal segmentation result.
Processing tree point clouds using Gaussian Mixture Models
Directory of Open Access Journals (Sweden)
D. Belton
2013-10-01
While traditionally used for surveying and photogrammetric fields, laser scanning is increasingly being used for a wider range of more general applications. In addition to the issues typically associated with processing point data, such applications raise a number of new complications, such as the complexity of the scenes scanned, along with the sheer volume of data. Consequently, automated procedures are required for processing and analysing such data. This paper introduces a method for modelling multi-modal, geometrically complex objects in terrestrial laser scanning point data; specifically, the modelling of trees. The model method comprises a number of geometric features in conjunction with a multi-modal machine learning technique. The model can then be used for contextually dependent region growing by separating the tree into its component parts at the point level. Subsequently, object analysis can be performed, for example, volumetric analysis of a tree after removing points associated with leaves. The workflow for this process is as follows: isolate individual trees within the scanned scene, train a Gaussian mixture model (GMM), separate clusters within the mixture model according to exemplar points determined by the GMM, grow the structure of the tree, and then perform volumetric analysis on the structure.
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
International Nuclear Information System (INIS)
Yu, Jie; Chen, Kuilin; Mori, Junichi; Rashid, Mudassir M.
2013-01-01
Optimizing wind power generation and controlling the operation of wind turbines to efficiently harness the renewable wind energy is a challenging task due to the intermittency and unpredictable nature of wind speed, which has significant influence on wind power production. A new approach for long-term wind speed forecasting is developed in this study by integrating GMCM (Gaussian mixture copula model) and localized GPR (Gaussian process regression). The time series of wind speed is first classified into multiple non-Gaussian components through the Gaussian mixture copula model and then Bayesian inference strategy is employed to incorporate the various non-Gaussian components using the posterior probabilities. Further, the localized Gaussian process regression models corresponding to different non-Gaussian components are built to characterize the stochastic uncertainty and non-stationary seasonality of the wind speed data. The various localized GPR models are integrated through the posterior probabilities as the weightings so that a global predictive model is developed for the prediction of wind speed. The proposed GMCM–GPR approach is demonstrated using wind speed data from various wind farm locations and compared against the GMCM-based ARIMA (auto-regressive integrated moving average) and SVR (support vector regression) methods. In contrast to GMCM–ARIMA and GMCM–SVR methods, the proposed GMCM–GPR model is able to well characterize the multi-seasonality and uncertainty of wind speed series for accurate long-term prediction. - Highlights: • A novel predictive modeling method is proposed for long-term wind speed forecasting. • Gaussian mixture copula model is estimated to characterize the multi-seasonality. • Localized Gaussian process regression models can deal with the random uncertainty. • Multiple GPR models are integrated through Bayesian inference strategy. • The proposed approach shows higher prediction accuracy and reliability
Background based Gaussian mixture model lesion segmentation in PET
Energy Technology Data Exchange (ETDEWEB)
Soffientini, Chiara Dolores, E-mail: chiaradolores.soffientini@polimi.it; Baselli, Giuseppe [DEIB, Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan 20133 (Italy); De Bernardi, Elisabetta [Department of Medicine and Surgery, Tecnomed Foundation, University of Milano—Bicocca, Monza 20900 (Italy); Zito, Felicia; Castellani, Massimo [Nuclear Medicine Department, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, via Francesco Sforza 35, Milan 20122 (Italy)
2016-05-15
Purpose: Quantitative ¹⁸F-fluorodeoxyglucose positron emission tomography is limited by the uncertainty in lesion delineation due to poor SNR, low resolution, and partial volume effects, subsequently impacting oncological assessment, treatment planning, and follow-up. The present work develops and validates a segmentation algorithm based on statistical clustering. The introduction of constraints based on background features and contiguity priors is expected to improve robustness vs clinical image characteristics such as lesion dimension, noise, and contrast level. Methods: An eight-class Gaussian mixture model (GMM) clustering algorithm was modified by constraining the mean and variance parameters of four background classes according to the previous analysis of a lesion-free background volume of interest (background modeling). Hence, expectation maximization operated only on the four classes dedicated to lesion detection. To favor the segmentation of connected objects, a further variant was introduced by inserting priors relevant to the classification of neighbors. The algorithm was applied to simulated datasets and acquired phantom data. Feasibility and robustness toward initialization were assessed on a clinical dataset manually contoured by two expert clinicians. Comparisons were performed with respect to a standard eight-class GMM algorithm and to four different state-of-the-art methods in terms of volume error (VE), Dice index, classification error (CE), and Hausdorff distance (HD). Results: The proposed GMM segmentation with background modeling outperformed standard GMM and all the other tested methods. Medians of accuracy indexes were VE <3%, Dice >0.88, CE <0.25, and HD <1.2 in simulations; VE <23%, Dice >0.74, CE <0.43, and HD <1.77 in phantom data. Robustness toward image statistic changes (±15%) was shown by the low index changes: <26% for VE, <17% for Dice, and <15% for CE. Finally, robustness toward the user-dependent volume initialization was
UNSUPERVISED CHANGE DETECTION IN SAR IMAGES USING GAUSSIAN MIXTURE MODELS
Directory of Open Access Journals (Sweden)
E. Kiana
2015-12-01
In this paper, we propose a method for unsupervised change detection in Remote Sensing Synthetic Aperture Radar (SAR) images. This method is based on mixture modelling of the histogram of the difference image. In this process, the difference image is classified into three classes: negative change, positive change, and no change. Although SAR images suffer from speckle noise, the proposed method is able to map the changes without speckle filtering. To evaluate the performance of this method, SAR data from two dates, acquired by Uninhabited Aerial Vehicle Synthetic Aperture Radar over an agricultural area, are used. Change detection results show better efficiency when compared to the state-of-the-art methods.
Infrared image segmentation based on region of interest extraction with Gaussian mixture modeling
Yeom, Seokwon
2017-05-01
Infrared (IR) imaging has the capability to detect thermal characteristics of objects under low-light conditions. This paper addresses IR image segmentation with Gaussian mixture modeling. An IR image is segmented with Expectation Maximization (EM) method assuming the image histogram follows the Gaussian mixture distribution. Multi-level segmentation is applied to extract the region of interest (ROI). Each level of the multi-level segmentation is composed of the k-means clustering, the EM algorithm, and a decision process. The foreground objects are individually segmented from the ROI windows. In the experiments, various methods are applied to the IR image capturing several humans at night.
Directory of Open Access Journals (Sweden)
Abdenaceur Boudlal
2010-01-01
This article investigates a new method of motion estimation based on a block matching criterion, through the modeling of image blocks by a mixture of two and three Gaussian distributions. Mixture parameters (weights, mean vectors, and covariance matrices) are estimated by the Expectation Maximization (EM) algorithm, which maximizes the log-likelihood criterion. The similarity between a block in the current image and the most similar one in a search window on the reference image is measured by minimizing the Extended Mahalanobis distance between the clusters of the mixture. Experiments performed on sequences of real images have given good results, and PSNR reached 3 dB.
Sholihat, Seli Siti; Murfi, Hendri
2016-01-01
Banks must be able to manage all banking risks, one of which is operational risk. Banks manage operational risk by calculating an estimate of it, known as the economic capital (EC). The Loss Distribution Approach (LDA) is a popular method to estimate economic capital (EC). This paper proposes a Gaussian Mixture Model (GMM) for severity distribution estimation in the loss distribution approach (LDA). The result of this research is that the value of EC of the LDA method using GMM is smaller 2 % -...
Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z
2017-06-01
Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates and obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs and Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under British Society of Hypertension testing protocol and is superior to the conventional approach based on maximum amplitude algorithm (MAA) that uses fixed CR ratios. The proposed approach also yields a lower mean error (ME) and the standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratio. The results exhibit that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.
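The nonparametric bootstrap step mentioned above can be illustrated with a percentile confidence interval: the observed sample is resampled with replacement many times, the statistic is recomputed on each resample, and the empirical quantiles of those values form the interval. This is a generic sketch, not the paper's pipeline. The function name `bootstrap_ci`, its default arguments, and the use of the percentile method are assumptions.

```python
import random


def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample).

    `stat` is any function mapping a list of numbers to a number.
    A fixed seed keeps the illustration reproducible.
    """
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    replicates = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = replicates[int((alpha / 2.0) * n_boot)]
    upper = replicates[min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))]
    return lower, upper
```

For example, `bootstrap_ci(ratios, lambda xs: sum(xs) / len(xs))` would give an interval for the mean of a hypothetical list `ratios` of characteristic ratios from a small number of measurements.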
Automatic segmentation of corpus callosum using Gaussian mixture modeling and Fuzzy C means methods.
İçer, Semra
2013-10-01
This paper presents a comparative study of the success and performance of the Gaussian mixture modeling and Fuzzy C means methods in determining the volume and cross-sectional areas of the corpus callosum (CC) using simulated and real MR brain images. The Gaussian mixture model (GMM) uses a weighted sum of Gaussian distributions and applies statistical decision procedures to define image classes. In Fuzzy C means (FCM), the image classes are represented by a membership function based on fuzziness information expressing the distance from the cluster centers. In this study, automatic segmentation of the midsagittal section of the CC was achieved from simulated and real brain images. The volume of the CC was obtained using the areas of sagittal sections. To compare the success of the methods, segmentation accuracy, Jaccard similarity, and segmentation time were calculated. The results show that the GMM method gave, by a small margin, more accurate segmentation (midsagittal section segmentation accuracy of 98.3% and 97.01% for GMM and FCM, respectively); however, the FCM method was faster than GMM. With this study, an accurate and automatic segmentation system was developed that gives doctors the opportunity for quantitative comparison in treatment planning and in the diagnosis of diseases affecting the size of the CC. This study can be adapted to perform segmentation on other regions of the brain, and thus it can be put to practical use in the clinic. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Roch, Marie A; Soldevilla, Melissa S; Burtenshaw, Jessica C; Henderson, E Elizabeth; Hildebrand, John A
2007-03-01
A method for the automatic classification of free-ranging delphinid vocalizations is presented. The vocalizations of short-beaked and long-beaked common (Delphinus delphis and Delphinus capensis), Pacific white-sided (Lagenorhynchus obliquidens), and bottlenose (Tursiops truncatus) dolphins were recorded in a pelagic environment of the Southern California Bight and the Gulf of California over a period of 4 years. Cepstral feature vectors are extracted from call data which contain simultaneous overlapping whistles, burst-pulses, and clicks from a single species. These features are grouped into multisecond segments. A portion of the data is used to train Gaussian mixture models of varying orders for each species. The remaining call data are used to test the performance of the models. Species are predicted based upon probabilistic measures of model similarity with test segment groups having durations between 1 and 25 s. For this data set, 256 mixture Gaussian mixture models and segments of at least 10 s of call data resulted in the best classification results. The classifier predicts the species of groups with 67%-75% accuracy depending upon the partitioning of the training and test data.
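The decision rule described above, one mixture model per species and a score accumulated over a multisecond segment, can be sketched as a maximum log-likelihood classifier. For brevity the example uses scalar features and hand-written models, whereas the study uses cepstral feature vectors and 256-component mixtures. The labels, the tuple layout `(weight, mean, variance)`, and the function names are hypothetical.

```python
import math


def log_gmm_pdf(x, gmm):
    """Log-density of a 1-D mixture given as a list of (weight, mean, variance).

    Uses the log-sum-exp trick so that very small densities do not underflow.
    """
    terms = [math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - 0.5 * (x - m) ** 2 / v
             for w, m, v in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_segment(segment, models):
    """Return (best_label, scores) for a segment of feature values.

    `models` maps a label to a mixture; each score is the sum of per-frame
    log-likelihoods, which treats frames as independent given the model.
    """
    scores = {label: sum(log_gmm_pdf(x, gmm) for x in segment)
              for label, gmm in models.items()}
    return max(scores, key=scores.get), scores
```

Longer segments add more frames to each sum, which is one plausible reading of why segments of at least 10 s classified best in the reported experiments.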
Fedorov, A K; Anufriev, M N; Zhirnov, A A; Stepanov, K V; Nesterov, E T; Namiot, D E; Karasik, V E; Pnev, A B
2016-03-01
We propose a novel approach to the recognition of particular classes of non-conventional events in signals from phase-sensitive optical time-domain-reflectometry-based sensors. Our algorithmic solution has two main features: filtering aimed at the denoising of signals and a Gaussian mixture model to cluster them. We test the proposed algorithm using experimentally measured signals. The results show that two classes of events can be distinguished with a best-case recognition probability close to 0.9, given sufficient numbers of training samples.
Directory of Open Access Journals (Sweden)
Yunjie Chen
2016-01-01
We propose a novel segmentation method based on regional and nonlocal information to overcome the impact of image intensity inhomogeneities and noise in human brain magnetic resonance images. By taking into account the spatial distribution of different tissues in brain images, our method does not need preestimation or precorrection procedures for intensity inhomogeneities and noise. A nonlocal information based Gaussian mixture model (NGMM) is proposed to reduce the effect of noise. To reduce the effect of intensity inhomogeneity, the multigrid nonlocal Gaussian mixture model (MNGMM) is proposed to segment brain MR images in each nonoverlapping multigrid generated by using a new multigrid generation method. Therefore the proposed model can simultaneously overcome the impact of noise and intensity inhomogeneity and automatically classify 2D and 3D MR data into white matter, gray matter, and cerebral spinal fluid. To maintain the statistical reliability and spatial continuity of the segmentation, a fusion strategy is adopted to integrate the clustering results from different grids. Experiments on synthetic and clinical brain MR images demonstrate the superior performance of the proposed model compared with several state-of-the-art algorithms.
Assessing clustering strategies for Gaussian mixture filtering a subsurface contaminant model
Liu, Bo
2016-02-03
An ensemble-based Gaussian mixture (GM) filtering framework is studied in this paper in terms of its dependence on the choice of the clustering method used to construct the GM. In this approach, a number of particles sampled from the posterior distribution are first integrated forward with the dynamical model for forecasting. A GM representation of the forecast distribution is then constructed from the forecast particles. Once an observation becomes available, the forecast GM is updated according to Bayes’ rule. This leads to (i) a Kalman filter-like update of the particles, and (ii) a particle filter-like update of their weights, generalizing the ensemble Kalman filter update to non-Gaussian distributions. We focus on investigating the impact of the clustering strategy on the behavior of the filter. Three different clustering methods for constructing the prior GM are considered: (i) a standard kernel density estimation, (ii) clustering with a specified mixture component size, and (iii) adaptive clustering (with a variable GM size). Numerical experiments are performed using a two-dimensional reactive contaminant transport model in which the contaminant concentration and the heterogeneous hydraulic conductivity fields are estimated within a confined aquifer using solute concentration data. The experimental results suggest that the performance of the GM filter is sensitive to the choice of the GM model. In particular, increasing the size of the GM does not necessarily result in improved performance. In this respect, the best results are obtained with the proposed adaptive clustering scheme.
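The two-part Bayesian update described above can be illustrated at the level of mixture components for a scalar state and a linear observation y = h·x + noise. Each component's mean and variance receive a Kalman update, and each weight is rescaled by how well that component predicted the observation. This is a textbook Gaussian-sum update written for illustration. It operates on components, not on ensemble particles as the paper does, and the names `gm_update`, `obs_var`, and `h` are assumptions.

```python
import math


def gm_update(components, y, obs_var, h=1.0):
    """Bayes update of a 1-D Gaussian mixture prior given observation y.

    `components` is a list of (weight, mean, variance); the observation
    model is y = h * x + e with e ~ N(0, obs_var).
    """
    updated = []
    for w, m, v in components:
        innov_var = h * h * v + obs_var  # variance of the predicted observation
        gain = v * h / innov_var         # scalar Kalman gain
        innov = y - h * m                # innovation (observation minus prediction)
        # Likelihood of y under this component's predictive distribution.
        lik = math.exp(-0.5 * innov ** 2 / innov_var) / math.sqrt(2.0 * math.pi * innov_var)
        updated.append((w * lik, m + gain * innov, (1.0 - gain * h) * v))
    total = sum(w for w, _, _ in updated)
    return [(w / total, m, v) for w, m, v in updated]
```

With a single component this reduces to the ordinary Kalman update, which matches the abstract's description of the GM filter as a generalization of the Kalman-type update.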
Gaussian mixture models and semantic gating improve reconstructions from human brain activity
Directory of Open Access Journals (Sweden)
Sanne Schoenmakers
2015-01-01
Better acquisition protocols and analysis techniques are making it possible to use fMRI to obtain highly detailed visualizations of brain processes. In particular we focus on the reconstruction of natural images from BOLD responses in visual cortex. We expand our linear Gaussian framework for percept decoding with Gaussian mixture models to better represent the prior distribution of natural images. Reconstruction of such images then boils down to probabilistic inference in a hybrid Bayesian network. In our set-up, different mixture components correspond to different character categories. Our framework can automatically infer higher-order semantic categories from lower-level brain areas. Furthermore the framework can gate semantic information from higher-order brain areas to enforce the correct category during reconstruction. When categorical information is not available, we show that automatically learned clusters in the data give a similar improvement in reconstruction. The hybrid Bayesian network leads to highly accurate reconstructions in both supervised and unsupervised settings.
Mixed Platoon Flow Dispersion Model Based on Speed-Truncated Gaussian Mixture Distribution
Directory of Open Access Journals (Sweden)
Weitiao Wu
2013-01-01
Urban arterials in China exhibit mixed traffic flow because of the large number of buses. Based on field data, a macroscopic mixed platoon flow dispersion model (MPFDM) was proposed to simulate, from the flow perspective, the platoon dispersion process along the road section between two adjacent intersections. To match field observations more closely, a truncated Gaussian mixture distribution was adopted as the speed density distribution for the mixed platoon. The expectation-maximization (EM) algorithm was used for parameter estimation. The relationship between the arriving flow distribution at the downstream intersection and the departing flow distribution at the upstream intersection was investigated using the proposed model. A comparative analysis using virtual flow data was performed between the Robertson model and the MPFDM. The results confirmed the validity of the proposed model.
An enhanced method for real-time modelling of cardiac related biosignals using Gaussian mixtures.
Alqudah, Ali Mohammad
2017-11-01
Modelling of cardiac related biosignals is very important for the detection, classification, compression and transmission of such health-related signals. This paper introduces a new, fast and accurate method for modelling cardiac related biosignals (ECG and PPG) based on a mixture of Gaussian waves. For any signal, the start and end of the ECG beat or PPG pulse are detected first; the baseline is then detected and subtracted from the original signal. After that, the signal is divided into two signals, positive and negative, each of which is modelled separately and then combined to form the modelled signal. The proposed method is applied to the MIMIC and MIT-BIH Arrhythmia databases available online at PhysioNet.
Clustering gene expression time series data using an infinite Gaussian process mixture model.
McDowell, Ian C; Manandhar, Dinesh; Vockley, Christopher M; Schmid, Amy K; Reddy, Timothy E; Engelhardt, Barbara E
2018-01-01
Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step to analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.
Directory of Open Access Journals (Sweden)
Natalia A. Tomashenko
2016-11-01
Subject of Research. We study speaker adaptation of deep neural network (DNN) acoustic models in automatic speech recognition systems. The aim of speaker adaptation techniques is to improve the accuracy of the speech recognition system for a particular speaker. Method. A novel method for training and adaptation of deep neural network acoustic models has been developed. It is based on using an auxiliary Gaussian mixture model (GMM) and GMM-derived (GMMD) features. The principal advantage of the proposed GMMD features is the possibility of performing the adaptation of a DNN through the adaptation of the auxiliary GMM. In the proposed approach any method for adapting the auxiliary GMM can be used; hence, it provides a universal way of transferring adaptation algorithms developed for GMMs to DNN adaptation. Main Results. The effectiveness of the proposed approach was shown by means of one of the most common adaptation algorithms for GMM models, maximum a posteriori (MAP) adaptation. Different ways of integrating the proposed approach into state-of-the-art DNN architecture have been proposed and explored. An analysis of the choice of the type of auxiliary GMM is given. Experimental results on the TED-LIUM corpus demonstrate that, in an unsupervised adaptation mode, the proposed adaptation technique can provide approximately an 11–18% relative word error rate (WER) reduction on different adaptation sets, compared to the speaker-independent DNN system built on conventional features, and a 3–6% relative WER reduction compared to the SAT-DNN trained on fMLLR adapted features.
Directory of Open Access Journals (Sweden)
Hesam Farsaie Alaie
2012-01-01
We make use of information inside an infant’s cry signal in order to identify the infant’s psychological condition. Gaussian mixture models (GMMs) are applied to distinguish between healthy full-term and premature infants, and those with specific medical problems available in our cry database. The cry pattern for each pathological condition is created by using an adapted boosting mixture learning (BML) method to estimate mixture model parameters. In the first experiment, test results demonstrate that the introduced adapted BML method for learning GMMs performs better than the conventional EM-based reestimation algorithm, used as a reference system, in a multipathological classification task. This newborn cry-based diagnostic system (NCDS) extracted Mel-frequency cepstral coefficients (MFCCs) as a feature vector for cry patterns of newborn infants. In the binary classification experiment, the system assigned a test infant’s cry signal to one of two groups, namely, healthy and pathological, based on MFCCs. The binary classifier achieved a true positive rate of 80.77% and a true negative rate of 86.96%, which show the ability of the system to correctly identify healthy and diseased infants, respectively.
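The two rates quoted in the abstract above follow from a binary confusion matrix. The sketch below shows that arithmetic only; the function name, the label encoding, and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def true_positive_and_negative_rate(y_true, y_pred, positive=1):
    """Return (TPR, TNR) for two equal-length sequences of binary labels.

    `positive` marks the class counted as a positive; which class the
    paper treats as positive is not assumed here.
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1          # positive case correctly identified
            else:
                fn += 1          # positive case missed
        else:
            if guess == positive:
                fp += 1          # negative case wrongly flagged
            else:
                tn += 1          # negative case correctly rejected
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Toy labels (hypothetical, not data from the study).
truth = [1, 1, 1, 0, 0]
guess = [1, 1, 0, 0, 1]
tpr, tnr = true_positive_and_negative_rate(truth, guess)
```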
ADAPTIVE BACKGROUND WITH THE GAUSSIAN MIXTURE MODELS METHOD FOR REAL-TIME TRACKING
Directory of Open Access Journals (Sweden)
Silvia Rostianingsih
2008-01-01
Nowadays, motion tracking applications are widely used for many purposes, such as detecting traffic jams and counting how many people enter a supermarket or a mall. Motion tracking requires a method to separate the background from the tracked object. Developing such an application is not hard when tracking is performed against a static background, but it is difficult when the tracked object is in a place with a non-static background, because the changing part of the background can be recognized as a tracking area. To handle this problem, an application can be built that separates the background in a way that adapts to the changes that occur. This application produces an adaptive background using Gaussian Mixture Models (GMM). The GMM method clusters the input pixel data based on pixel color value. After the clusters are formed, the dominant distributions are chosen as background distributions. The application was built using Microsoft Visual C++ 6.0. The results of this research show that the GMM algorithm can produce a satisfactory adaptive background, as demonstrated by tests that succeeded under all the given conditions. The application can be developed further so that the tracking process is integrated into the adaptive background generation process.
Estimation of Seismic Wavelets Based on the Multivariate Scale Mixture of Gaussians Model
Directory of Open Access Journals (Sweden)
Jing-Huai Gao
2009-12-01
This paper proposes a new method for estimating seismic wavelets. Suppose a seismic wavelet can be modeled by a formula with three free parameters (scale, frequency and phase). We can then transform the estimation of the wavelet into determining these three parameters. The phase of the wavelet is estimated by constant-phase rotation of the seismic signal, while the other two parameters are obtained by the higher-order statistics (HOS) (fourth-order cumulant) matching method. In order to derive the estimator of the HOS, the multivariate scale mixture of Gaussians (MSMG) model is applied to formulate the multivariate joint probability density function (PDF) of the seismic signal. In this way, we can represent HOS as a polynomial function of second-order statistics to improve the anti-noise performance and accuracy. In addition, the proposed method works well for short time series.
LEARNING VECTOR QUANTIZATION FOR ADAPTED GAUSSIAN MIXTURE MODELS IN AUTOMATIC SPEAKER IDENTIFICATION
Directory of Open Access Journals (Sweden)
IMEN TRABELSI
2017-05-01
Speaker Identification (SI) aims at automatically identifying an individual by extracting and processing information from his/her voice. Speaker voice is a robust biometric modality that has a strong impact in several application areas. In this study, a new combination learning scheme has been proposed based on the Gaussian mixture model-universal background model (GMM-UBM) and learning vector quantization (LVQ) for automatic text-independent speaker identification. Feature vectors, constituted by the Mel Frequency Cepstral Coefficients (MFCC) extracted from the speech signal, are used for training on the New England subset of the TIMIT database. The best results obtained on test data using 36 MFCC features were 90% for gender-independent speaker identification, 97% for male speakers and 93% for female speakers.
Yu, Wangyang; Chen, Xiangguang; Wu, Lei
2015-04-01
Passive millimeter wave (PMMW) imaging has become one of the most effective means of detecting objects concealed under clothing. Due to the limitations of the available hardware and the inherent physical properties of PMMW imaging systems, images often exhibit poor contrast and low signal-to-noise ratios. Thus, it is difficult to achieve ideal results by using a general segmentation algorithm. In this paper, an advanced Gaussian Mixture Model (GMM) algorithm for the segmentation of concealed objects in PMMW images is presented. Our work builds on the fact that the GMM is a parametric statistical model, which is often used to characterize the statistical behavior of images. Our approach is three-fold: First, we remove the noise from the image using both a notch reject filter and a total variation filter. Next, we use an adaptive parameter initialization GMM algorithm (APIGMM) to model the histogram of the images. The APIGMM provides an initial number of Gaussian components and starts with more appropriate parameters. A Bayesian decision is employed to separate the pixels of concealed objects from other areas. Finally, the confidence interval (CI) method, alongside local gradient information, is used to extract the concealed objects. The proposed hybrid segmentation approach detects the concealed objects more accurately, even compared to two other state-of-the-art segmentation methods.
Directory of Open Access Journals (Sweden)
Hariharan Muthusamy
2015-01-01
Recently, researchers have paid increasing attention to studying the emotional state of an individual from his/her speech signals, as the speech signal is the fastest and the most natural method of communication between individuals. In this work, a new feature enhancement using the Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were utilized to gauge the proposed methods. Extreme learning machine (ELM) and k-nearest neighbor (kNN) classifiers were employed to classify the different types of emotions. Several experiments were conducted and results show that the proposed methods significantly improved the speech emotion recognition performance compared to research works published in the literature.
OPTICAL-TO-SAR IMAGE REGISTRATION BASED ON GAUSSIAN MIXTURE MODEL
Directory of Open Access Journals (Sweden)
H. Wang
2012-07-01
Image registration is fundamental in remote sensing applications such as inter-calibration and image fusion. Compared to other multi-sensor image registration problems such as optical-to-IR, the registration of SAR and optical images has its own particular difficulties. Firstly, the radiometric and geometric characteristics differ between SAR and optical images. Secondly, feature extraction methods suffer heavily from the speckle in SAR images. Thirdly, structural information is more useful than point features such as corners. In this work, we propose a novel Gaussian Mixture Model (GMM) based optical-to-SAR image registration algorithm. The line support region (LSR) feature is used to describe the structural information, and orientation attributes are added to the GMM to keep the Expectation Maximization (EM) algorithm from falling into a local extremum in the feature set matching phase. Experiments show that our algorithm is very robust for the optical-to-SAR image registration problem.
Liu, Sijia; Sa, Ruhan; Maguire, Orla; Minderman, Hans; Chaudhary, Vipin
2015-03-01
Cytogenetic abnormalities are important diagnostic and prognostic criteria for acute myeloid leukemia (AML). A flow cytometry-based imaging approach for FISH in suspension (FISH-IS) was established that enables the automated analysis of a number of cells several log-magnitudes higher than microscopy-based approaches allow. However, rotational positioning can occur, leading to discordance in spot counts. To address counting errors from overlapping spots, this study proposes a Gaussian Mixture Model based classification method. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) of the GMM are used as global image features for this classification method. Using a Random Forest classifier, the results show that the proposed method is able to detect closely overlapping spots which cannot be separated by existing image segmentation based spot detection methods. The experimental results show that the proposed method yields a significant improvement in spot counting accuracy.
Image deblocking using joint Gaussian mixture model and anchored neighborhood regression priors
Fan, Meng; He, Xiaohai; Xiong, Shuhua; Qing, Linbo
2017-07-01
At low bit rates, block-based transform coding uses a large quantization step to quantize transform coefficients, which usually causes compression artifacts in images. Post-processing is a promising solution that can greatly improve the visual quality of degraded images without changing the existing codec. In this paper, we propose an image deblocking method for JPEG compressed images using joint Gaussian mixture model (GMM) and anchored neighborhood regression priors. The proposed method takes advantage of image priors to reduce blocking artifacts and achieve a better image quality simultaneously. First, we utilize the GMM to reduce blocking artifacts. Based on the assumption that similar image patches can be derived from one certain Gaussian probability distribution, we formulate image deblocking as an optimization problem that maximizes an a posteriori function. Solving this problem ultimately boils down to linear Wiener filtering. We then learn mapping functions offline based on the recent adjusted anchored neighborhood regression to enhance image details and edges. Extensive experimental results validate that our proposed method performs better both objectively and subjectively compared to some recently presented methods.
Directory of Open Access Journals (Sweden)
Qunyi Xie
2016-01-01
Content-based image retrieval has recently become an important research topic and has been widely used for managing images from repositories. In this article, we present an efficient technique, called MNGS, which integrates multiview constrained nonnegative matrix factorization (NMF) and Gaussian mixture model (GMM) based spectral clustering for image retrieval. In the proposed methodology, the multiview NMF scheme provides competitive sparse representations of underlying images through decomposition of a similarity-preserving matrix that is formed by fusing multiple features from different visual aspects. In particular, the proposed method merges manifold constraints into the standard NMF objective function to impose an orthogonality constraint on the basis matrix and satisfy the structure preservation requirement of the coefficient matrix. To apply the clustering method to sparse representations, this paper develops a GMM-based spectral clustering method in which the Gaussian components are regrouped in spectral space, which significantly improves the retrieval effectiveness. In this way, image retrieval over the whole database translates to a nearest-neighbour search in the cluster containing the query image. This study also investigates the proof of convergence of the objective function and analyzes the computational complexity. Experimental results on three standard image datasets reveal the advantages that can be achieved with the proposed retrieval scheme.
Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-07-01
What is the "best" model? The answer to this question lies in part in the eyes of the beholder, nevertheless a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K candidate models, $M_k$; $k = (1,\ldots,K)$, and help identify which model is most supported by the observed data, $\tilde{Y} = (\tilde{y}_1,\ldots,\tilde{y}_n)$. Here, we introduce a new and robust estimator of the model evidence, $p(\tilde{Y}|M_k)$, which acts as normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, $p(\tilde{Y}|M_k)$ is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of $p(\tilde{Y}|M_k)$ and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.
Missing Value Imputation Based on Gaussian Mixture Model for the Internet of Things
Directory of Open Access Journals (Sweden)
Xiaobo Yan
2015-01-01
This paper addresses missing value imputation for the Internet of Things (IoT). Nowadays, the IoT is widely used in a variety of domains, such as transportation and logistics and healthcare. However, missing values are very common in the IoT for a variety of reasons, which leaves the experimental data incomplete. As a result, some work that relies on IoT data cannot be carried out normally, and the accuracy and reliability of data analysis results are reduced. Based on the characteristics of the data itself and the features of missing data in the IoT, this paper divides the missing data into three types and defines three corresponding missing value imputation problems. We then propose three new models to solve the corresponding problems: a model of missing value imputation based on context and linear mean (MCL), a model of missing value imputation based on binary search (MBS), and a model of missing value imputation based on the Gaussian mixture model (MGI). Experimental results showed that the three models can greatly and effectively improve the accuracy, reliability, and stability of missing value imputation.
Directory of Open Access Journals (Sweden)
John Christian G Spainhour
Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) coupled with stable isotope standards (SIS) has been used to quantify native peptides. This MALDI-TOF peptide quantification approach has difficulty quantifying samples containing peptides with ion currents in overlapping spectra. In these overlapping spectra the currents sum together, which modifies the peak heights and makes normal SIS estimation problematic. An approach using Gaussian mixtures based on known physical constants to model the isotopic cluster of a known compound is proposed here. The characteristics of this approach are examined for single and overlapping compounds. The approach is compared to two commonly used SIS quantification methods for single compounds, namely the peak intensity method and the Riemann sum area under the curve (AUC) method. To study the characteristics of the Gaussian mixture method, Angiotensin II, Angiotensin-2-10, and Angiotensin-1-9 and their associated SIS peptides were used. The findings suggest that the Gaussian mixture method has characteristics similar to those of the two comparison methods for estimating the quantity of isolated isotopic clusters for single compounds. All three methods were tested using MALDI-TOF mass spectra collected for peptides of the renin-angiotensin system. The Gaussian mixture method accurately estimated the native to labeled ratio of several isolated angiotensin peptides (5.2% error in ratio estimation), with estimation errors similar to those calculated using the peak intensity and Riemann sum AUC methods (5.9% and 7.7%, respectively). For overlapping angiotensin peptides (where the other two methods are not applicable), the estimation error of the Gaussian mixture was 6.8%, which is within the acceptable range. In summary, for single compounds the Gaussian mixture method is equivalent or marginally superior compared to the existing methods of peptide quantification and is capable of quantifying overlapping (convolved) peptides within
CSIR Research Space (South Africa)
Heyns, T
2012-10-01
This paper investigates how Gaussian mixture models (GMMs) may be used to detect and trend fault induced vibration signal irregularities, such as those which might be indicative of the onset of gear damage. The negative log likelihood (NLL...
A Gaussian mixture model for definition of lung tumor volumes in positron emission tomography
International Nuclear Information System (INIS)
Aristophanous, Michalis; Penney, Bill C.; Martel, Mary K.; Pelizzari, Charles A.
2007-01-01
The increased interest in 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) in radiation treatment planning in the past five years necessitated the independent and accurate segmentation of gross tumor volume (GTV) from FDG-PET scans. In some studies the radiation oncologist contours the GTV based on a computed tomography scan, while incorporating pertinent data from the PET images. Alternatively, a simple threshold, typically 40% of the maximum intensity, has been employed to differentiate tumor from normal tissue, while other researchers have developed algorithms to aid the PET based GTV definition. None of these methods, however, results in reliable PET tumor segmentation that can be used for more sophisticated treatment plans. For this reason, we developed a Gaussian mixture model (GMM) based segmentation technique on selected PET tumor regions from non-small cell lung cancer patients. The purpose of this study was to investigate the feasibility of using a GMM-based tumor volume definition in a robust, reliable and reproducible way. A GMM relies on the idea that any distribution, in our case a distribution of image intensities, can be expressed as a mixture of Gaussian densities representing different classes. According to our implementation, each class belongs to one of three regions in the image; the background (B), the uncertain (U) and the target (T), and from these regions we can obtain the tumor volume. User interaction in the implementation is required, but is limited to the initialization of the model parameters and the selection of an "analysis region" to which the modeling is restricted. The segmentation was developed on three and tested on another four clinical cases to ensure robustness against differences observed in the clinic. It also compared favorably with thresholding at 40% of the maximum intensity and a threshold determination function based on tumor to background image intensities proposed in a recent paper. The parts of the
Gaussian Process-Mixture Conditional Heteroscedasticity.
Platanios, Emmanouil A; Chatzis, Sotirios P
2014-05-01
Generalized autoregressive conditional heteroscedasticity (GARCH) models have long been considered as one of the most successful families of approaches for volatility modeling in financial return series. In this paper, we propose an alternative approach based on methodologies widely used in the field of statistical machine learning. Specifically, we propose a novel nonparametric Bayesian mixture of Gaussian process regression models, each component of which models the noise variance process that contaminates the observed data as a separate latent Gaussian process driven by the observed data. This way, we essentially obtain a Gaussian process-mixture conditional heteroscedasticity (GPMCH) model for volatility modeling in financial return series. We impose a nonparametric prior with power-law nature over the distribution of the model mixture components, namely the Pitman-Yor process prior, to allow for better capturing modeled data distributions with heavy tails and skewness. Finally, we provide a copula-based approach for obtaining a predictive posterior for the covariances over the asset returns modeled by means of a postulated GPMCH model. We evaluate the efficacy of our approach in a number of benchmark scenarios, and compare its performance to state-of-the-art methodologies.
Damage Detection of Refractory Based on Principle Component Analysis and Gaussian Mixture Model
Directory of Open Access Journals (Sweden)
Changming Liu
2018-01-01
The acoustic emission (AE) technique is a common approach to identifying damage in refractories; however, the problem is complex because as many as fifteen parameters are involved, which calls for effective data processing and classification algorithms to reduce the level of complexity. In this paper, experiments involving three-point bending tests of refractories were conducted and AE signals were collected. A new data processing method was developed that merges similar parameters in the description of the damage and reduces the dimension. By means of principal component analysis (PCA) for dimension reduction, the fifteen related parameters can be reduced to two parameters. These parameters were linear combinations of the fifteen original parameters and were taken as the indexes for damage classification. Based on the proposed approach, the Gaussian mixture model was integrated with the Bayesian information criterion to group the AE signals into two damage categories, which accounted for 99% of all damage. Electron microscope scanning of the refractories verified the two types of damage.
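The combination of a Gaussian mixture model with the Bayesian information criterion described above can be illustrated with a small sketch. It is an assumption-laden toy, not the authors' pipeline: it fits one-dimensional mixtures with plain EM on synthetic data (standing in for a single PCA score), and the function names, the quantile initialisation, and the variance floor are all choices made here for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-6):
    """Fit a k-component 1-D GMM with EM; return (weights, means, variances, loglik)."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, var_floor)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)) or 1e-300)
        for x in data
    )
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """BIC for a 1-D GMM: k means, k variances and k - 1 free weights."""
    return (3 * k - 1) * math.log(n) - 2.0 * loglik


# Synthetic two-group data (hypothetical stand-in for a PCA score).
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)] + \
       [random.gauss(10.0, 1.0) for _ in range(200)]
scores = {k: bic(fit_gmm_1d(data, k)[3], k, len(data)) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)  # component count with the lowest BIC
```

Lower BIC is better under this sign convention; the penalty term grows with the number of free parameters, so extra components are kept only when they raise the likelihood enough.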
Identification of damage in composite structures using Gaussian mixture model-processed Lamb waves
Wang, Qiang; Ma, Shuxian; Yue, Dong
2018-04-01
Composite materials have comprehensively better properties than traditional materials, and have therefore been more and more widely used, especially because of their higher strength-to-weight ratio. However, damage to composite structures is usually varied and complicated. In order to ensure the security of these structures, it is necessary to monitor and distinguish structural damage in a timely manner. Lamb wave-based structural health monitoring (SHM) has been proved to be effective in online structural damage detection and evaluation; furthermore, the characteristic parameters of the multi-mode Lamb wave vary in response to different types of damage in the composite material. This paper studies a damage identification approach for composite structures using the Lamb wave and the Gaussian mixture model (GMM). The algorithm and principle of the GMM, and the parameter estimation, are introduced. Multi-statistical characteristic parameters of the excited Lamb waves are extracted, and a parameter space with reduced dimensions is obtained by principal component analysis (PCA). The damage identification system using the GMM is then established through training. Experiments on a glass fiber-reinforced epoxy composite laminate plate are conducted to verify the feasibility of the proposed approach in terms of damage classification. The experimental results show that different types of damage can be identified according to the value of the likelihood function of the GMM.
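Identifying a damage type from the value of a GMM likelihood can be sketched as follows: one mixture per damage class, with a new feature vector assigned to the class whose mixture gives it the highest log-likelihood. Everything concrete here is a hypothetical illustration (the class names, the diagonal-covariance form, and the parameter values are assumptions, and training is omitted).

```python
import math


def gmm_log_likelihood(x, components):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM.

    `components` is a list of (weight, means, variances) tuples.
    """
    terms = []
    for weight, means, variances in components:
        log_pdf = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, means, variances)
        )
        terms.append(math.log(weight) + log_pdf)
    top = max(terms)
    # Log-sum-exp keeps the sum stable when the terms are very negative.
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify(x, class_models):
    """Return the class label whose GMM assigns x the highest log-likelihood."""
    return max(class_models, key=lambda label: gmm_log_likelihood(x, class_models[label]))


# Hypothetical, already-trained models in a 2-D reduced feature space.
models = {
    "damage_type_a": [(0.6, (0.0, 0.0), (1.0, 1.0)), (0.4, (1.5, 0.5), (0.5, 0.5))],
    "damage_type_b": [(1.0, (5.0, 5.0), (1.0, 1.0))],
}
label = classify((4.6, 5.2), models)
```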
Vehicle speed detection based on gaussian mixture model using sequential of images
Setiyono, Budi; Ratna Sulistyaningrum, Dwi; Soetrisno; Fajriyah, Farah; Wahyu Wicaksono, Danang
2017-09-01
Intelligent Transportation Systems are an important component in the development of smart cities. Detecting vehicle speed on the highway supports traffic engineering management. The purpose of this study is to detect the speed of moving vehicles using digital image processing. Our approach is as follows. The inputs are a sequence of frames, the frame rate (fps) and a region of interest (ROI). First, we separate foreground and background in each frame using a Gaussian Mixture Model (GMM). Then, in each frame, we calculate the location of the object and its centroid. Next, we determine the speed by computing the movement of the centroid across the sequence of frames. In the speed calculation, we only consider frames in which the centroid is inside the predefined ROI. Finally, we convert the pixel displacement per unit time into a speed in km/hour. The system is validated by comparing the speed calculated manually with the speed obtained by the system. In software testing, the system detected vehicle speed with a highest accuracy of 97.52% and a lowest accuracy of 77.41%. Detection results from tests on real road video footage are included together with the real speed of the vehicle.
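The last steps of this pipeline (centroid displacement inside the ROI, then conversion to km/hour) can be sketched as follows. This is only an illustration: it assumes the centroids have already been extracted by the foreground segmentation, and it assumes a single calibration constant `metres_per_pixel`, which the abstract does not specify. All names are hypothetical.

```python
import math


def estimate_speed_kmh(centroids, fps, metres_per_pixel, roi):
    """Estimate speed from per-frame centroids, using only steps inside the ROI.

    centroids: list of (x, y) pixel positions, one per consecutive frame
    roi: (x_min, y_min, x_max, y_max) in pixels
    metres_per_pixel: assumed constant ground-distance calibration
    """
    x_min, y_min, x_max, y_max = roi

    def inside(point):
        x, y = point
        return x_min <= x <= x_max and y_min <= y <= y_max

    total_pixels = 0.0
    steps = 0
    for previous, current in zip(centroids, centroids[1:]):
        if inside(previous) and inside(current):
            total_pixels += math.hypot(current[0] - previous[0],
                                       current[1] - previous[1])
            steps += 1
    if steps == 0:
        return None  # the object never spent two consecutive frames in the ROI

    metres = total_pixels * metres_per_pixel
    seconds = steps / fps
    return metres / seconds * 3.6  # m/s to km/hour


# Example: 10 pixels per frame at 25 fps with 0.05 m per pixel is 12.5 m/s.
speed = estimate_speed_kmh([(0, 50), (10, 50), (20, 50), (30, 50)],
                           fps=25, metres_per_pixel=0.05,
                           roi=(0, 0, 100, 100))
```

A single scale factor ignores perspective distortion; a real setup would more likely need a per-position calibration.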
GMCM: Unsupervised Clustering and Meta-Analysis Using Gaussian Mixture Copula Models
Directory of Open Access Journals (Sweden)
Anders Ellern Bilgrau
2016-04-01
Methods for clustering in unsupervised learning are an important part of the statistical toolbox in numerous scientific disciplines. Tewari, Giering, and Raghunathan (2011) proposed to use so-called Gaussian mixture copula models (GMCM) for general unsupervised learning based on clustering. Li, Brown, Huang, and Bickel (2011) independently discussed a special case of these GMCMs as a novel approach to meta-analysis in high-dimensional settings. GMCMs have attractive properties which make them highly flexible and therefore interesting alternatives to other well-established methods. However, parameter estimation is hard because of intrinsic identifiability issues and intractable likelihood functions. Both aforementioned papers discuss similar expectation-maximization-like algorithms as their pseudo maximum likelihood estimation procedure. We present and discuss an improved implementation in R of both classes of GMCMs along with various alternative optimization routines to the EM algorithm. The software is freely available in the R package GMCM. The implementation is fast, general, and optimized for very large numbers of observations. We demonstrate the use of package GMCM through different applications.
Genotype copy number variations using Gaussian mixture models: theory and algorithms.
Lin, Chang-Yun; Lo, Yungtai; Ye, Kenny Q
2012-10-12
Copy number variations (CNVs) are important in disease association studies and are usually targeted by the most recent microarray platforms developed for GWAS studies. However, the probes targeting the same CNV regions could vary greatly in performance, with some of the probes carrying little more information than pure noise. In this paper, we investigate how to best combine measurements of multiple probes to estimate copy numbers of individuals under the framework of the Gaussian mixture model (GMM). First we show that, under two regularity conditions and assuming all the parameters except the mixing proportions are known, optimal weights can be obtained so that the univariate GMM based on the weighted average gives exactly the same classification as the multivariate GMM does. We then develop an algorithm that iteratively estimates the parameters and obtains the optimal weights, and uses them for classification. The algorithm performs well on simulation data and two sets of real data, showing a clear advantage over classification based on the equally weighted average.
Hierarchical heuristic search using a Gaussian mixture model for UAV coverage planning.
Lin, Lanny; Goodrich, Michael A
2014-12-01
During unmanned aerial vehicle (UAV) search missions, efficient use of UAV flight time requires flight paths that maximize the probability of finding the desired subject. The probability of detecting the desired subject based on UAV sensor information can vary in different search areas due to environment elements like varying vegetation density or lighting conditions, making it likely that the UAV can only partially detect the subject. This adds another dimension of complexity to the already difficult (NP-Hard) problem of finding an optimal search path. We present a new class of algorithms that account for partial detection in the form of a task difficulty map and produce paths that approximate the payoff of optimal solutions. The algorithms use the mode goodness ratio heuristic that uses a Gaussian mixture model to prioritize search subregions. The algorithms search for effective paths through the parameter space at different levels of resolution. We compare the performance of the new algorithms against two published algorithms (Bourgault's algorithm and LHC-GW-CONV algorithm) in simulated searches with three real search and rescue scenarios, and show that the new algorithms outperform existing algorithms significantly and can yield efficient paths that yield payoffs near the optimal.
Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Directory of Open Access Journals (Sweden)
M. H. Savoji
2014-09-01
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined systems of equations whose solutions lead to the first estimates of speech and noise power spectra. The noise source is also identified and the input SNR estimated in this first step. These first estimates are then refined using approximate but explicit MMSE and MAP estimation formulations. The refined estimates are then used in a Wiener filter to reduce noise and enhance the noisy speech. The proposed schemes show good results. Nevertheless, it is shown that the MAP explicit solution, introduced here for the first time, reduces the computation time to less than one third with a slightly higher improvement in SNR and PESQ score and also less distortion in comparison to the MMSE solution.
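For orientation, the final filtering step can be illustrated with the textbook per-bin Wiener gain H = S / (S + N), where S and N are the estimated speech and noise power spectra. The sketch below covers only that generic step; it does not reproduce the paper's GMM-based MMSE or MAP estimators, and the function names and the small `floor` guard are assumptions.

```python
def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Per-frequency-bin Wiener gain H = S / (S + N).

    speech_psd, noise_psd: estimated power spectra (non-negative floats).
    floor: assumed guard against division by zero in empty bins.
    """
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]


def apply_gain(noisy_spectrum, gain):
    """Scale each (possibly complex) bin of the noisy spectrum by its gain."""
    return [g * y for g, y in zip(gain, noisy_spectrum)]


gain = wiener_gain([1.0, 0.0, 3.0], [1.0, 2.0, 1.0])
enhanced = apply_gain([2 + 2j, 1 + 0j, 4 + 0j], gain)
```

Bins dominated by noise get a gain near zero, while bins dominated by speech pass almost unchanged.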
Flexible Mixture-Amount Models for Business and Industry Using Gaussian Processes
A. Ruseckaite (Aiste); D. Fok (Dennis); P.P. Goos (Peter)
2016-01-01
Many products and services can be described as mixtures of ingredients whose proportions sum to one. Specialized models have been developed for linking the mixture proportions to outcome variables, such as preference, quality and liking. In many scenarios, only the mixture
Liu, Jie; Zhuang, Xiahai; Wu, Lianming; An, Dongaolei; Xu, Jianrong; Peters, Terry; Gu, Lixu
2017-11-01
Objective: In this paper, we propose a fully automatic framework for myocardium segmentation of delayed-enhancement (DE) MRI images without relying on prior patient-specific information. Methods: We employ a multicomponent Gaussian mixture model to deal with the intensity heterogeneity of myocardium caused by the infarcts. To differentiate the myocardium from other tissues with similar intensities, while at the same time maintaining spatial continuity, we introduce a coupled level set (CLS) to regularize the posterior probability. The CLS, as a spatial regularization, can be adapted to the image characteristics dynamically. We also introduce an image intensity gradient based term into the CLS, adding an extra force to the posterior probability based framework, to improve the accuracy of myocardium boundary delineation. The prebuilt atlases are propagated to the target image to initialize the framework. Results: The proposed method was tested on datasets of 22 clinical cases, and achieved Dice similarity coefficients of 87.43 ± 5.62% (endocardium), 90.53 ± 3.20% (epicardium) and 73.58 ± 5.58% (myocardium), outperforming three variants of the classic segmentation methods. Conclusion: The results can provide a benchmark for the myocardial segmentation in the literature. Significance: DE MRI provides an important tool to assess the viability of myocardium. The accurate segmentation of myocardium, which is a prerequisite for further quantitative analysis of myocardial infarction (MI) region, can provide important support for the diagnosis and treatment management for MI patients.
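The Dice similarity coefficient reported in the results is defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions, not details from the paper.

```python
def dice_coefficient(predicted, reference):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    size_sum = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / size_sum


# One overlapping pixel, two foreground pixels in each mask.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```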
Precision Measurements of the Cluster Red Sequence using an Error Corrected Gaussian Mixture Model
Energy Technology Data Exchange (ETDEWEB)
Hao, Jiangang; /Fermilab /Michigan U.; Koester, Benjamin P.; /Chicago U.; Mckay, Timothy A.; /Michigan U.; Rykoff, Eli S.; /UC, Santa Barbara; Rozo, Eduardo; /Ohio State U.; Evrard, August; /Michigan U.; Annis, James; /Fermilab; Becker, Matthew; /Chicago U.; Busha, Michael; /KIPAC, Menlo Park /SLAC; Gerdes, David; /Michigan U.; Johnston, David E.; /Northwestern U. /Brookhaven
2009-07-01
The red sequence is an important feature of galaxy clusters and plays a crucial role in optical cluster detection. Measurement of the slope and scatter of the red sequence are affected both by selection of red sequence galaxies and measurement errors. In this paper, we describe a new error corrected Gaussian Mixture Model for red sequence galaxy identification. Using this technique, we can remove the effects of measurement error and extract unbiased information about the intrinsic properties of the red sequence. We use this method to select red sequence galaxies in each of the 13,823 clusters in the maxBCG catalog, and measure the red sequence ridgeline location and scatter of each. These measurements provide precise constraints on the variation of the average red galaxy populations in the observed frame with redshift. We find that the scatter of the red sequence ridgeline increases mildly with redshift, and that the slope decreases with redshift. We also observe that the slope does not strongly depend on cluster richness. Using similar methods, we show that this behavior is mirrored in a spectroscopic sample of field galaxies, further emphasizing that ridgeline properties are independent of environment. These precise measurements serve as an important observational check on simulations and mock galaxy catalogs. The observed trends in the slope and scatter of the red sequence ridgeline with redshift are clues to possible intrinsic evolution of the cluster red-sequence itself. Most importantly, the methods presented in this work lay the groundwork for further improvements in optically-based cluster cosmology.
Using Mixture of Gaussians to Compare Approaches to Signal Separation
DEFF Research Database (Denmark)
Petersen, Kaare Brandt
2004-01-01
is an example of how such different approaches to separation can be compared using Mixtures of Gaussians as a prior distribution. This not only illuminates some interesting properties of Maximum Likelihood and Energy Based Models, but is also an example of how Mixtures of Gaussians can serve as a both flexible...
Avendaño-Valencia, Luis David; Fassois, Spilios D.
2017-12-01
The problem of vibration-based damage diagnosis in structures characterized by time-dependent dynamics under significant environmental and/or operational uncertainty is considered. A stochastic framework consisting of a Gaussian Mixture Random Coefficient model of the uncertain time-dependent dynamics under each structural health state, proper estimation methods, and Bayesian or minimum distance type decision making, is postulated. The Random Coefficient (RC) time-dependent stochastic model with coefficients following a multivariate Gaussian Mixture Model (GMM) allows for significant flexibility in uncertainty representation. Certain of the model parameters are estimated via a simple procedure which is founded on the related Multiple Model (MM) concept, while the GMM weights are explicitly estimated for optimizing damage diagnostic performance. The postulated framework is demonstrated via damage detection in a simple simulated model of a quarter-car active suspension with time-dependent dynamics and considerable uncertainty on the payload. Comparisons with a simpler Gaussian RC model based method are also presented, with the postulated framework shown to be capable of offering considerable improvement in diagnostic performance.
Directory of Open Access Journals (Sweden)
ZHAO Quanhua
2015-12-01
Full waveform LiDAR data record the signal of the backscattered laser pulse. The elevation and the energy information of ground targets can be effectively obtained by decomposition of the full waveform LiDAR data. Therefore, waveform decomposition is the key to full waveform LiDAR data processing. However, in waveform decomposition, determining the number of components is a central and difficult problem. To this end, this paper presents a method which can determine this number automatically. First, the given full waveform LiDAR data are modeled on the assumption that the energy recorded at elevation points follows a Gaussian mixture distribution. A constraint function is defined to steer the model towards fitting the waveform. Correspondingly, a probability distribution based on the function is constructed by Gibbs. The Bayesian paradigm is followed to build the waveform decomposition model. Then an RJMCMC (reversible jump Markov chain Monte Carlo) scheme is used to simulate the decomposition model, which determines the number of components and decomposes the waveform into a group of Gaussian distributions. In the RJMCMC algorithm, the move types are designed to include updating the parameter vector, splitting or merging Gaussian components, and birth or death of a Gaussian component. The results obtained from the ICESat-GLAS waveform data of different areas show that the proposed algorithm is efficient and promising.
Almeida, Javier; Velasco, Nelson; Alvarez, Charlens; Romero, Eduardo
2017-11-01
Autism Spectrum Disorder (ASD) is a complex neurological condition characterized by a triad of signs: stereotyped behaviors, and verbal and non-verbal communication problems. The scientific community has been interested in quantifying the anatomical brain alterations of this disorder. Several studies have focused on measuring brain cortical and sub-cortical volumes. This article presents a fully automatic method which finds differences between patients diagnosed with autism and control patients. After the usual pre-processing, a template (MNI152) is registered to an evaluated brain, which then becomes a set of regions. Each of these regions is then represented by the normalized histogram of intensities, which is approximated by a mixture of Gaussians (GMM). The gray and white matter are separated to calculate the mean and standard deviation of each Gaussian. These features are then used to train, region per region, a binary SVM classifier. The method was evaluated in an adult population aged from 18 to 35 years, from the public database Autism Brain Imaging Data Exchange (ABIDE). The highest discrimination values were found for the Right Middle Temporal Gyrus, with an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.72.
Holoien, Thomas W.-S.; Marshall, Philip J.; Wechsler, Risa H.
2017-06-01
We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.
Energy Technology Data Exchange (ETDEWEB)
Holoien, Thomas W.-S.; /Ohio State U., Dept. Astron. /Ohio State U., CCAPP /KIPAC, Menlo Park /SLAC; Marshall, Philip J.; Wechsler, Risa H.; /KIPAC, Menlo Park /SLAC
2017-05-11
We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.
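The idea of conditioning a Gaussian mixture on known values can be illustrated with the standard Gaussian conditioning identities, applied component by component. The sketch below is a hand-rolled two-dimensional case and does not use or reproduce the XDGMM interface; every name in it is hypothetical. Each component's conditional mean is mean_x + cov_xy / var_y * (y - mean_y), its conditional variance is var_x - cov_xy² / var_y, and its weight is rescaled by how well it explains the known value.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def condition_gmm_2d(components, y_known):
    """Condition a 2-D Gaussian mixture p(x, y) on y = y_known.

    components: list of (weight, (mean_x, mean_y), (var_x, cov_xy, var_y)).
    Returns the 1-D mixture over x as a list of (weight, mean, variance).
    """
    conditioned = []
    for weight, (mean_x, mean_y), (var_x, cov_xy, var_y) in components:
        mean = mean_x + cov_xy / var_y * (y_known - mean_y)
        var = var_x - cov_xy ** 2 / var_y
        # Reweight by the marginal likelihood of the known value.
        new_weight = weight * normal_pdf(y_known, mean_y, var_y)
        conditioned.append((new_weight, mean, var))
    total = sum(w for w, _, _ in conditioned)
    return [(w / total, m, v) for w, m, v in conditioned]


mixture = [
    (0.5, (0.0, 0.0), (1.0, 0.5, 1.0)),
    (0.5, (4.0, 4.0), (1.0, 0.0, 1.0)),
]
conditional = condition_gmm_2d(mixture, y_known=0.0)
```

With the known value sitting on the first component's mean, almost all of the conditional weight moves to that component. If the known value is so far from every component that all densities underflow to zero, the normalisation would fail, so a practical version would work with log-weights.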
Directory of Open Access Journals (Sweden)
Y. Yang
2017-09-01
To address the problems that existing change detection methods for fully polarimetric SAR do not take full advantage of the polarimetric information and have a high false alarm rate, this paper proposes a method based on a test statistic and a Gaussian mixture model. Taking the 2016 flood disaster in Wuhan city as a case study, the difference image is obtained from a likelihood-ratio parameter, which is built from the covariance matrix C3 or coherency matrix T3 of fully polarimetric SAR based on the test statistic. The change information is then extracted automatically from the parameters of a Gaussian mixture model (GMM) of the difference image, estimated with the expectation maximization (EM) iterative algorithm. The experimental results show that, compared with the traditional constant false alarm rate (CFAR) method, this method improves the overall accuracy of change detection and reduces the false alarm rate. The validity and feasibility of the method are thus demonstrated.
Yang, Y.; Liu, W.
2017-09-01
To address the problems that existing change detection methods for fully polarimetric SAR do not take full advantage of the polarimetric information and have a high false alarm rate, this paper proposes a method based on a test statistic and a Gaussian mixture model. Taking the 2016 flood disaster in Wuhan city as a case study, the difference image is obtained from a likelihood-ratio parameter, which is built from the covariance matrix C3 or coherency matrix T3 of fully polarimetric SAR based on the test statistic. The change information is then extracted automatically from the parameters of a Gaussian mixture model (GMM) of the difference image, estimated with the expectation maximization (EM) iterative algorithm. The experimental results show that, compared with the traditional constant false alarm rate (CFAR) method, this method improves the overall accuracy of change detection and reduces the false alarm rate. The validity and feasibility of the method are thus demonstrated.
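To make the EM step concrete, here is a minimal sketch that fits a two-component univariate Gaussian mixture to difference-image values and labels each value by its more probable component. It is a generic EM illustration under stated assumptions: two components, the higher-mean component taken to mean "changed", min/max initialisation, and a variance floor. None of these choices is stated in the abstract, and all names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, iterations=100, var_floor=1e-6):
    """EM for a two-component 1-D Gaussian mixture."""
    low, high = min(values), max(values)
    weights = [0.5, 0.5]
    means = [low, high]                       # assumed initialisation
    spread = max((high - low) ** 2 / 4.0, var_floor)
    variances = [spread, spread]
    for _ in range(iterations):
        # E-step: posterior responsibility of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                resp.append((0.5, 0.5))
            else:
                resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            if n_k < 1e-12:
                continue
            weights[k] = n_k / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / n_k
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / n_k
            variances[k] = max(var, var_floor)
    return weights, means, variances


def label_changed(values, weights, means, variances):
    """Return 1 where the higher-mean component is the more probable one."""
    changed = 1 if means[1] > means[0] else 0
    labels = []
    for x in values:
        p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
        labels.append(1 if p[changed] > p[1 - changed] else 0)
    return labels


# Toy difference values: seven unchanged pixels near 0.1, three changed near 5.
pixels = [0.0, 0.1, 0.2, 0.1, 0.0, 0.2, 0.1, 5.0, 5.1, 4.9]
w, m, v = fit_two_gaussians(pixels)
labels = label_changed(pixels, w, m, v)
```

Thresholding by posterior probability adapts to the data, in contrast to a fixed CFAR threshold.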
Iterative Mixture Component Pruning Algorithm for Gaussian Mixture PHD Filter
Directory of Open Access Journals (Sweden)
Xiaoxi Yan
2014-01-01
To address the increasing number of mixture components in the Gaussian mixture PHD filter, an iterative mixture component pruning algorithm is proposed. The pruning algorithm is based on maximizing the posterior probability density of the mixture weights. The entropy distribution of the mixture weights is adopted as the prior distribution of mixture component parameters. The iterative update formulations of the mixture weights are derived using the Lagrange multiplier and the Lambert W function. Mixture components whose weights become negative during the iterative procedure are pruned by setting the corresponding mixture weights to zero. In addition, multiple mixture components with similar parameters describing the same PHD peak can be merged into one mixture component in the algorithm. Simulation results show that the proposed iterative mixture component pruning algorithm is superior to the typical pruning algorithm based on thresholds.
Directory of Open Access Journals (Sweden)
Li Yao
2014-01-01
Modeling background and segmenting moving objects are significant techniques for computer vision applications. The Mixture-of-Gaussians (MoG) background model is commonly used in foreground extraction from video streams. However, when objects enter the scene and stay for a while, the foreground extraction fails as the objects stay still and gradually merge into the background. In this paper, we adopt a blob tracking method to cope with this situation. To construct the MoG model more quickly, we add a frame difference method to the foreground extracted from MoG for very crowded situations. In addition, a new shadow removal method based on RGB color space is proposed.
Godino-Llorente, Juan Ignacio; Gómez-Vilda, Pedro; Blanco-Velasco, Manuel
2006-10-01
Voice diseases have been increasing dramatically in recent times due mainly to unhealthy social habits and voice abuse. These diseases must be diagnosed and treated at an early stage, especially in the case of larynx cancer. It is widely recognized that vocal and voice diseases do not necessarily cause changes in voice quality as perceived by a listener. Acoustic analysis could be a useful tool to diagnose this type of disease. Preliminary research has shown that the detection of voice alterations can be carried out by means of Gaussian mixture models and short-term mel cepstral parameters complemented by frame energy together with first and second derivatives. This paper, using the F-Ratio and Fisher's discriminant ratio, will demonstrate that the detection of voice impairments can be performed using both mel cepstral vectors and their first derivative, ignoring the second derivative.
Li, Zheng; Jiang, Yi-han; Duan, Lian; Zhu, Chao-zhe
2017-08-01
Objective. Functional near infra-red spectroscopy (fNIRS) is a promising brain imaging technology for brain-computer interfaces (BCI). Future clinical uses of fNIRS will likely require operation over long time spans, during which neural activation patterns may change. However, current decoders for fNIRS signals are not designed to handle changing activation patterns. The objective of this study is to test via simulations a new adaptive decoder for fNIRS signals, the Gaussian mixture model adaptive classifier (GMMAC). Approach. GMMAC can simultaneously classify and track activation pattern changes without the need for ground-truth labels. This adaptive classifier uses computationally efficient variational Bayesian inference to label new data points and update mixture model parameters, using the previous model parameters as priors. We test GMMAC in simulations in which neural activation patterns change over time and compare to static decoders and unsupervised adaptive linear discriminant analysis classifiers. Main results. Our simulation experiments show GMMAC can accurately decode under time-varying activation patterns: shifts of activation region, expansions of activation region, and combined contractions and shifts of activation region. Furthermore, the experiments show the proposed method can track the changing shape of the activation region. Compared to prior work, GMMAC performed significantly better than the other unsupervised adaptive classifiers on a difficult activation pattern change simulation: 99% versus brain-computer interfaces, including neurofeedback training systems, where operation over long time spans is required.
Qiu, Lei; Yuan, Shenfang; Mei, Hanfei; Fang, Fang
2016-01-01
Structural Health Monitoring (SHM) technology is considered to be a key technology to reduce the maintenance cost and meanwhile ensure the operational safety of aircraft structures. It has gradually developed from theoretic and fundamental research to real-world engineering applications in recent decades. The problem of reliable damage monitoring under time-varying conditions is a main issue for the aerospace engineering applications of SHM technology. Among the existing SHM methods, Guided Wave (GW) and piezoelectric sensor-based SHM technique is a promising method due to its high damage sensitivity and long monitoring range. Nevertheless the reliability problem should be addressed. Several methods including environmental parameter compensation, baseline signal dependency reduction and data normalization, have been well studied but limitations remain. This paper proposes a damage propagation monitoring method based on an improved Gaussian Mixture Model (GMM). It can be used on-line without any structural mechanical model and a priori knowledge of damage and time-varying conditions. With this method, a baseline GMM is constructed first based on the GW features obtained under time-varying conditions when the structure under monitoring is in the healthy state. When a new GW feature is obtained during the on-line damage monitoring process, the GMM can be updated by an adaptive migration mechanism including dynamic learning and Gaussian components split-merge. The mixture probability distribution structure of the GMM and the number of Gaussian components can be optimized adaptively. Then an on-line GMM can be obtained. Finally, a best match based Kullback-Leibler (KL) divergence is studied to measure the migration degree between the baseline GMM and the on-line GMM to reveal the weak cumulative changes of the damage propagation mixed in the time-varying influence. A wing spar of an aircraft is used to validate the proposed method. The results indicate that the crack
Qiu, Lei; Yuan, Shenfang; Mei, Hanfei; Fang, Fang
2016-02-26
Structural Health Monitoring (SHM) technology is considered to be a key technology to reduce the maintenance cost and meanwhile ensure the operational safety of aircraft structures. It has gradually developed from theoretic and fundamental research to real-world engineering applications in recent decades. The problem of reliable damage monitoring under time-varying conditions is a main issue for the aerospace engineering applications of SHM technology. Among the existing SHM methods, Guided Wave (GW) and piezoelectric sensor-based SHM technique is a promising method due to its high damage sensitivity and long monitoring range. Nevertheless the reliability problem should be addressed. Several methods including environmental parameter compensation, baseline signal dependency reduction and data normalization, have been well studied but limitations remain. This paper proposes a damage propagation monitoring method based on an improved Gaussian Mixture Model (GMM). It can be used on-line without any structural mechanical model and a priori knowledge of damage and time-varying conditions. With this method, a baseline GMM is constructed first based on the GW features obtained under time-varying conditions when the structure under monitoring is in the healthy state. When a new GW feature is obtained during the on-line damage monitoring process, the GMM can be updated by an adaptive migration mechanism including dynamic learning and Gaussian components split-merge. The mixture probability distribution structure of the GMM and the number of Gaussian components can be optimized adaptively. Then an on-line GMM can be obtained. Finally, a best match based Kullback-Leibler (KL) divergence is studied to measure the migration degree between the baseline GMM and the on-line GMM to reveal the weak cumulative changes of the damage propagation mixed in the time-varying influence. A wing spar of an aircraft is used to validate the proposed method. The results indicate that the crack
Finite Gaussian Mixture Approximations to Analytically Intractable Density Kernels
DEFF Research Database (Denmark)
Khorunzhina, Natalia; Richard, Jean-Francois
The objective of the paper is that of constructing finite Gaussian mixture approximations to analytically intractable density kernels. The proposed method is adaptive in that terms are added one at a time and the mixture is fully re-optimized at each step using a distance measure that approximates the corresponding importance sampling variance. All functions of interest are evaluated under Gaussian quadrature rules. Examples include a sequential (filtering) evaluation of the likelihood function of a stochastic volatility model where all relevant densities (filtering, predictive...
Modification of Gaussian mixture models for data classification in high energy physics
Štěpánek, Michal; Franc, Jiří; Kůs, Václav
2015-01-01
In high energy physics, we deal with the demanding task of separating signal from background. The Model Based Clustering method involves the estimation of distribution mixture parameters via the Expectation-Maximization algorithm in the training phase and application of Bayes' rule in the testing phase. Modifications of the algorithm such as weighting, missing data processing, and overtraining avoidance will be discussed. Due to the strong dependence of the algorithm on initialization, genetic optimization techniques such as mutation, elitism, parasitism, and the rank selection of individuals will be mentioned. Data pre-processing plays a significant role for the subsequent combination of final discriminants in order to improve signal separation efficiency. Moreover, the results of the top quark separation from the Tevatron collider will be compared with those of standard multivariate techniques in high energy physics. Results from this study have been used in the measurement of the inclusive top pair production cross section employing the full DØ Tevatron Run II data (9.7 fb⁻¹).
Directory of Open Access Journals (Sweden)
Bernard Mazoyer
Hemispheric lateralization for language production and its relationships with manual preference and manual preference strength were studied in a sample of 297 subjects, including 153 left-handers (LH). A hemispheric functional lateralization index (HFLI) for language was derived from fMRI acquired during a covert sentence generation task as compared with a covert word list recitation. The multimodal HFLI distribution was optimally modeled using a mixture of 3 and 4 Gaussian functions in right-handers (RH) and LH, respectively. Gaussian function parameters helped to define 3 types of language hemispheric lateralization, namely "Typical" (left hemisphere dominance with clear positive HFLI values, 88% of RH, 78% of LH), "Ambilateral" (no dominant hemisphere with HFLI values close to 0, 12% of RH, 15% of LH) and "Strongly-atypical" (right-hemisphere dominance with clear negative HFLI values, 7% of LH). Concordance between dominant hemispheres for hand and for language did not exceed chance level, and most of the association between handedness and language lateralization was explained by the fact that all Strongly-atypical individuals were left-handed. Similarly, most of the relationship between language lateralization and manual preference strength was explained by the fact that Strongly-atypical individuals exhibited a strong preference for their left hand. These results indicate that concordance of hemispheric dominance for hand and for language occurs barely above the chance level, except in a group of rare individuals (less than 1% in the general population) who exhibit strong right hemisphere dominance for both language and their preferred hand. They call for a revisit of models hypothesizing common determinants for handedness and for language dominance.
Zhou, Ya-Tong; Fan, Yu; Chen, Zi-Yi; Sun, Jian-Cheng
2017-05-01
The contribution of this work is twofold: (1) a multimodality prediction method of chaotic time series with the Gaussian process mixture (GPM) model is proposed, which employs a divide and conquer strategy. It automatically divides the chaotic time series into multiple modalities with different extrinsic patterns and intrinsic characteristics, and thus can more precisely fit the chaotic time series. (2) An effective sparse hard-cut expectation maximization (SHC-EM) learning algorithm for the GPM model is proposed to improve the prediction performance. SHC-EM replaces a large learning sample set with fewer pseudo inputs, accelerating model learning based on these pseudo inputs. Experiments on Lorenz and Chua time series demonstrate that the proposed method yields not only accurate multimodality prediction, but also the prediction confidence interval. SHC-EM outperforms the traditional variational learning in terms of both prediction accuracy and speed. In addition, SHC-EM is more robust and insusceptible to noise than variational learning. Supported by the National Natural Science Foundation of China under Grant No 60972106, the China Postdoctoral Science Foundation under Grant No 2014M561053, the Humanity and Social Science Foundation of Ministry of Education of China under Grant No 15YJA630108, and the Hebei Province Natural Science Foundation under Grant No E2016202341.
Şahingil, Mehmet C.; Aslan, Murat Ş.
2013-05-01
Reticle systems, which are considered the classical approach for determining the angular position of radiating targets in the infrared band, are widely used in early-generation surface-to-air and air-to-air infrared guided missile seekers. One of the cost-effective ways of protecting aircraft against these missiles is to dispense flare decoys from the countermeasure dispensing system (CMDS) integrated into the aircraft platform. Although this countermeasure technique seems very simple, if not optimized carefully, it may not be effective for protecting the aircraft. Flares should be dispensed in accordance with a specific dispensing program which determines the number of flares to be dispensed from each dispenser of the CMDS and the timing sequence of dispensing. Optimizing the parameters of the dispensing program is not trivial. It requires a good understanding of the operating principle of the threat seeker, the operational capabilities of the own platform, and the engagement scenario between them. In the present paper, we propose a complete simulation-based procedure to form an effectiveness boundary of flare dispensing programs against spin-scan and conical-scan reticle seekers. The region of effectiveness is determined via Gaussian mixture models. The raw data are collected via an extensive number of simulations using a MATLAB-coded simulator which models the reticle-based seeker, aircraft radiation, aircraft motion, aircraft CMDS, flare motion and flare radiation.
Chattopadhyay, Souradeep; Maitra, Ranjan
2017-08-01
Clustering methods are an important tool to enumerate and describe the different coherent kinds of gamma-ray bursts (GRBs). But their performance can be affected by a number of factors such as the choice of clustering algorithm and inherent associated assumptions, the inclusion of variables in clustering, the nature of the initialization methods used for the iterative algorithm, or the criterion used to judge the optimal number of groups supported by the data. We analysed GRBs from the Burst and Transient Source Experiment (BATSE) 4Br Catalog using k-means and Gaussian-mixture-models-based clustering methods and found that after accounting for all the above factors, all six variables - different subsets of which have been used in the literature - that are, namely, the flux duration variables (T50, T90), the peak flux (P256) measured in 256 ms bins, the total fluence (Ft) and the spectral hardness ratios (H32 and H321) contain information on clustering. Further, our analysis found evidence of five different kinds of GRBs and that these groups have different kinds of dispersions in terms of shape, size and orientation. In terms of duration, fluence and spectrum, the five types of GRBs were characterized as intermediate/faint/intermediate, long/intermediate/soft, intermediate/intermediate/intermediate, short/faint/hard and long/bright/intermediate.
Directory of Open Access Journals (Sweden)
Li Juan
2016-01-01
XinTianYou, a folk song style from Shaanxi province in China, is considered to be a precious traditional cultural heritage. Research on XinTianYou is important to overall Chinese folk music theory and is potentially quite useful for cultural preservation and applications. In this paper, we analyze the general characteristics of XinTianYou by using pitch features, rhythm features and the combination of the two. First, we use the Gaussian Mixture Model (GMM) to cluster the XinTianYou audio based on pitch and rhythm respectively, and analyze the general characteristics of XinTianYou based on the clustering result. Second, we propose an improved Features Relative Contribution Algorithm (CFRCA) to compare the contributions of pitch and rhythm. Third, the probability of a song being XinTianYou can be estimated based on the GMM and the cosine similarity distance. The experimental results show that XinTianYou has a large pitch span and a large proportion of high pitch values (about 22%). Regarding the rhythm, we find that moderato is dominant, while lento-moderato keeps a ratio similar to that of moderato-allegro. The similarity between pitch features of all XinTianYou songs is more significant than that between rhythm features. Additionally, the average accuracy of XinTianYou recognition reaches 92.4% based on our method.
Sworn testimony of the model evidence : Gaussian Mixture Importance (GAME) sampling
Volpi, Elena; Schoups, G.H.W.; Firmani, Giovanni; Vrugt, Jasper A.
2017-01-01
What is the “best” model? The answer to this question lies in part in the eyes of the beholder, nevertheless a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K
Statistical imitation system using relational interest points and Gaussian mixture models
CSIR Research Space (South Africa)
Claassens, J
2009-11-01
function is set up so that unlikely points are expensive. There are a number of advantages to this approach. Firstly, the algorithm can be modified to perform collision avoidance much like elastic path planning algorithms [8]. Collision avoidance... allows other influences such as collision avoidance to be incorporated into the planner. The statistical model can also be used in behaviour recognition. A simple drawing experiment is used to demonstrate the proposed system and its performance. 1...
Jin, S.; Tamura, M.; Susaki, J.
2014-09-01
Leaf area index (LAI) is one of the most important structural parameters in forestry studies; it characterizes the ability of green vegetation to interact with solar illumination. The classic understanding of LAI considers the green canopy as an integration of horizontal leaf layers. Since multi-angle remote sensing techniques were developed, LAI has had to be considered according to the observation geometry. Effective LAI can formulate the leaf-light interaction virtually and precisely. Retrieving the LAI/effective LAI from remotely sensed data has therefore been a challenge during the past decades. Laser scanning can provide accurate surface echo coordinates at densely scanned intervals. Using density-based statistical algorithms to analyze the voluminous 3-D point data is one of the subjects of laser scanning applications. Computational geometry also provides some mature applications for point cloud data (PCD) processing and analysis. In this paper, the authors investigated the feasibility of a new application for retrieving the effective LAI of an isolated broad-leaf tree. A simplified curvature was calculated for each point in order to remove non-photosynthetic tissues. Then the PCD were discretized into voxels and clustered using a Gaussian mixture model. Subsequently the area of each cluster was calculated by employing the computational geometry applications. In order to validate our application, we chose an indoor plant to estimate the leaf area; the correlation coefficient between calculation and measurement was 98.28 %. We finally calculated the effective LAI of the tree with 6 × 6 assumed observation directions.
Jahromi, Mahdi Kazemian; Kafieh, Raheleh; Rabbani, Hossein; Dehnavi, Alireza Mehri; Peyman, Alireza; Hajizadeh, Fedra; Ommani, Mohammadreza
2014-07-01
Diagnosis of corneal diseases is possible by measuring and evaluating corneal thickness in different layers. Thus, precise segmentation of corneal layer boundaries is needed. Manual segmentation is time-consuming and imprecise. In this paper, the Gaussian mixture model (GMM) is used for automatic segmentation of three clinically important corneal boundaries on optical coherence tomography (OCT) images. For this purpose, we apply the GMM method in two consecutive steps. In the first step, the GMM is applied on the original image to localize the first and the last boundaries. In the next step, the gradient response of a contrast-enhanced version of the image is fed into another GMM algorithm to obtain a clearer result around the second boundary. Finally, the first boundary is traced downward to localize the exact location of the second boundary. We tested the performance of the algorithm on images taken from a Heidelberg OCT imaging system. To evaluate our approach, the automatic boundary results are compared with the boundaries that have been segmented manually by two corneal specialists. The quantitative results show that the proposed method segments the desired boundaries with high accuracy. Unsigned mean errors between the results of the proposed method and the manual segmentation are 0.332, 0.421, and 0.795 for detection of epithelium, Bowman, and endothelium boundaries, respectively. The inter-observer unsigned mean errors between the two corneal specialists are comparable, at 0.330, 0.398, and 0.534, respectively.
Lee, Hyunna; Shim, Hackjoon; Chang, Hyuk-Jae
2015-01-01
This study aimed to propose an intensity-vesselness Gaussian mixture model (IVGMM) tracking method for 2D + t segmentation of coronary arteries in X-ray angiography (XA) image sequences. We compose a two-dimensional (2D) feature vector of intensity and vesselness to characterize the Gaussian mixture models. In our IVGMM tracking, vessel segmentation is performed for each image frame based on the vessel and background IVGMMs, and then the segmentation results of the current image frame are used to update these IVGMMs. The 2D + t segmentation of coronary arteries over the 2D XA image sequence is solved by iterating two processes, i.e., segmentation of coronary arteries and update of the IVGMMs. The performance of the proposed IVGMM tracking was evaluated using clinical 2D XA datasets. We evaluated the segmentation accuracy of the IVGMM tracking by comparing it with two previous 2D vessel segmentation methods and seven background subtraction (BGS) methods. Of the ten segmentation methods, IVGMM tracking shows the highest similarity to the manual segmentation in terms of precision, recall, Jaccard index (JI), F1 score, and peak signal-to-noise ratio (PSNR). It is concluded that the IVGMM tracking could obtain reasonable segmentation accuracy, outperforming conventional vessel enhancement methods and object tracking methods.
Fujitani, Youhei
2017-11-01
Suppose a spherical colloidal particle surrounded by a near-critical binary fluid mixture in the homogeneous phase. The particle surface usually preferentially attracts one component of the mixture, and the resultant concentration gradient, which causes the osmotic pressure, becomes significant in the ambient near-criticality. The concentration profile is deformed by the particle motion, and can generate a nonzero force exerted on the moving particle. This link was previously shown to slightly suppress the positional equal-time correlation of a particle trapped by a harmonic potential. This previous study presupposed a small fluctuation amplitude of a particle much larger than the correlation length, a weak preferential attraction, and the Gaussian model for the free-energy functional of the mixture. In the present study, we calculate the equal-time correlation without assuming the weak preferential attraction and show that the suppression becomes much more distinct in some range of the trap stiffness because of the increased induced mass. This suggests the possible experimental usage of a trapped particle as a probe for local environments of a near-critical binary fluid mixture.
Efficient Kernel-Based Ensemble Gaussian Mixture Filtering
Liu, Bo
2015-11-11
We consider the Bayesian filtering problem for data assimilation following the kernel-based ensemble Gaussian-mixture filtering (EnGMF) approach introduced by Anderson and Anderson (1999). In this approach, the posterior distribution of the system state is propagated with the model using the ensemble Monte Carlo method, providing a forecast ensemble that is then used to construct a prior Gaussian-mixture (GM) based on the kernel density estimator. This results in two update steps: a Kalman filter (KF)-like update of the ensemble members and a particle filter (PF)-like update of the weights, followed by a resampling step to start a new forecast cycle. After formulating EnGMF for any observational operator, we analyze the influence of the bandwidth parameter of the kernel function on the covariance of the posterior distribution. We then focus on two aspects: i) the efficient implementation of EnGMF with (relatively) small ensembles, where we propose a new deterministic resampling strategy preserving the first two moments of the posterior GM to limit the sampling error; and ii) the analysis of the effect of the bandwidth parameter on contributions of KF and PF updates and on the weights variance. Numerical results using the Lorenz-96 model are presented to assess the behavior of EnGMF with deterministic resampling, study its sensitivity to different parameters and settings, and evaluate its performance against ensemble KFs. The proposed EnGMF approach with deterministic resampling suggests improved estimates in all tested scenarios, and is shown to require less localization and to be less sensitive to the choice of filtering parameters.
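The two update steps named in this abstract can be illustrated for a scalar state. The sketch below is a generic Gaussian-sum update under stated assumptions, not the authors' scheme: the function name is hypothetical, the observation operator is a scalar multiplier `h`, the kernel variance is taken as the squared bandwidth times the ensemble variance, and the resampling step is left out.

```python
import math


def engmf_update(ensemble, weights, y, h, obs_var, bandwidth):
    """One analysis step for a scalar state observed as y = h * x + noise."""
    # Kernel variance shared by all mixture components (assumed form)
    mean = sum(w * x for w, x in zip(weights, ensemble))
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, ensemble))
    kernel_var = bandwidth ** 2 * var

    # KF-like update: every member is moved toward the observation
    innov_var = h * h * kernel_var + obs_var
    gain = kernel_var * h / innov_var
    updated = [x + gain * (y - h * x) for x in ensemble]

    # PF-like update: reweight members by the likelihood of their forecast
    lik = [w * math.exp(-0.5 * (y - h * x) ** 2 / innov_var)
           for w, x in zip(weights, ensemble)]
    total = sum(lik)
    new_weights = [value / total for value in lik]

    # Variance of each posterior mixture component
    post_kernel_var = (1.0 - gain * h) * kernel_var
    return updated, new_weights, post_kernel_var
```

In this form a small bandwidth gives a small gain, so the member shift shrinks and the weight update carries most of the information, while a large bandwidth shifts the balance toward the KF-like step. This matches the abstract's interest in how the bandwidth splits the work between the two updates.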
Estimation of Fuzzy Measures Using Covariance Matrices in Gaussian Mixtures
Directory of Open Access Journals (Sweden)
Nishchal K. Verma
2012-01-01
This paper presents a novel computational approach for estimating fuzzy measures directly from a Gaussian mixture model (GMM). The mixture components of the GMM provide the membership functions for the input-output fuzzy sets. By treating the consequent part as a function of fuzzy measures, we derived its coefficients from the covariance matrices found directly from the GMM and the defuzzified output constructed from both the premise and consequent parts of the nonadditive fuzzy rules, which takes the form of a Choquet integral. The computational burden involved in the solution of the λ-measure is minimized using the Q-measure. The fuzzy model whose fuzzy measures were computed using covariance matrices found in the GMM has been successfully applied to two benchmark problems and one real-time electric load data set of an Indian utility. The performance of the resulting model in many experimental studies, including the above-mentioned application, is found to be better than or comparable to recent available fuzzy models. The main contribution of this paper is the efficient estimation of fuzzy measures directly from covariance matrices found in the GMM, greatly reducing the computational burden of learning them iteratively and of solving polynomial equations whose order is the number of input-output variables.
Skakun, Sergii; Franch, Belen; Vermote, Eric; Roger, Jean-Claude; Becker-Reshef, Inbal; Justice, Christopher; Kussul, Nataliia
2017-01-01
Knowledge on geographical location and distribution of crops at global, national and regional scales is an extremely valuable source of information for many applications. Traditional approaches to crop mapping using remote sensing data rely heavily on reference or ground truth data in order to train/calibrate classification models. As a rule, such models are only applicable to a single vegetation season and should be recalibrated to be applicable for other seasons. This paper addresses the problem of early season large-area winter crop mapping using Moderate Resolution Imaging Spectroradiometer (MODIS) derived Normalized Difference Vegetation Index (NDVI) time-series and growing degree days (GDD) information derived from the Modern-Era Retrospective analysis for Research and Applications (MERRA-2) product. The model is based on the assumption that winter crops have developed biomass during early spring while other crops (spring and summer) have no biomass. As winter crop development is temporally and spatially non-uniform due to the presence of different agro-climatic zones, we use GDD to account for such discrepancies. A Gaussian mixture model (GMM) is applied to discriminate winter crops from other crops (spring and summer). The proposed method has the following advantages: low input data requirements, robustness, applicability at global scale, and the ability to provide winter crop maps 1.5-2 months before harvest. The model is applied to two study regions, the State of Kansas in the US and Ukraine, and for multiple seasons (2001-2014). Validation using the US Department of Agriculture (USDA) Crop Data Layer (CDL) for Kansas and ground measurements for Ukraine shows that accuracies of greater than 90% can be achieved in mapping winter crops 1.5-2 months before harvest. Results also show good correspondence to official statistics, with average coefficients of determination R² greater than 0.85.
Energy Technology Data Exchange (ETDEWEB)
Wang, Li; Gac, Nicolas; Mohammad-Djafari, Ali [Laboratoire des Signaux et Systèmes 3, Rue Joliot-Curie 91192 Gif sur Yvette (France)
2015-01-13
In order to improve the quality of 3D X-ray tomography reconstruction for Non Destructive Testing (NDT), we investigate in this paper hierarchical Bayesian methods. In NDT, useful prior information on the volume, like the limited number of materials or the presence of homogeneous areas, can be included in the iterative reconstruction algorithms. In hierarchical Bayesian methods, not only the volume is estimated thanks to the prior model of the volume but also the hyperparameters of this prior. This additional complexity in the reconstruction methods when applied to large volumes (from 512³ to 8192³ voxels) results in an increasing computational cost. To reduce it, the hierarchical Bayesian methods investigated in this paper lead to an algorithm acceleration by Variational Bayesian Approximation (VBA) [1] and hardware acceleration thanks to projection and back-projection operators parallelized on many-core processors like GPUs [2]. In this paper, we will consider a Student-t prior on the gradient of the image implemented in a hierarchical way [3, 4, 1]. Operators H (forward or projection) and Hᵗ (adjoint or back-projection) implemented on multi-GPU [2] have been used in this study. Different methods will be evaluated on the synthetic volume 'Shepp and Logan' in terms of quality and time of reconstruction. We used several simple regularizations of order 1 and order 2. Other prior models also exist [5]. Sometimes for a discrete image, we can do the segmentation and reconstruction at the same time; then the reconstruction can be done with fewer projections.
Bak, N; Ebdrup, B H; Oranje, B; Fagerlund, B; Jensen, M H; Düring, S W; Nielsen, M Ø; Glenthøj, B Y; Hansen, L K
2017-04-11
Deficits in information processing and cognition are among the most robust findings in schizophrenia patients. Previous efforts to translate group-level deficits into clinically relevant and individualized information have, however, been unsuccessful, which is possibly explained by biologically different disease subgroups. We applied machine learning algorithms on measures of electrophysiology and cognition to identify potential subgroups of schizophrenia. Next, we explored subgroup differences regarding treatment response. Sixty-six antipsychotic-naive first-episode schizophrenia patients and sixty-five healthy controls underwent extensive electrophysiological and neurocognitive test batteries. Patients were assessed on the Positive and Negative Syndrome Scale (PANSS) before and after 6 weeks of monotherapy with the relatively selective D2 receptor antagonist, amisulpride (280.3±159 mg per day). A reduced principal component space based on 19 electrophysiological variables and 26 cognitive variables was used as input for a Gaussian mixture model to identify subgroups of patients. With support vector machines, we explored the relation between PANSS subscores and the identified subgroups. We identified two statistically distinct subgroups of patients. We found no significant baseline psychopathological differences between these subgroups, but the effect of treatment in the groups was predicted with an accuracy of 74.3% (P=0.003). In conclusion, electrophysiology and cognition data may be used to classify subgroups of schizophrenia patients. The two distinct subgroups, which we identified, were psychopathologically inseparable before treatment, yet their response to dopaminergic blockade was predicted with significant accuracy. This proof of principle encourages further endeavors to apply data-driven, multivariate and multimodal models to facilitate progress from symptom-based psychiatry toward individualized treatment regimens.
Bridging asymptotic independence and dependence in spatial extremes using Gaussian scale mixtures
Huser, Raphaël
2017-06-23
Gaussian scale mixtures are constructed as Gaussian processes with a random variance. They have non-Gaussian marginals and can exhibit asymptotic dependence unlike Gaussian processes, which are asymptotically independent except in the case of perfect dependence. In this paper, we study the extremal dependence properties of Gaussian scale mixtures and we unify and extend general results on their joint tail decay rates in both asymptotic dependence and independence cases. Motivated by the analysis of spatial extremes, we propose flexible yet parsimonious parametric copula models that smoothly interpolate from asymptotic dependence to independence and include the Gaussian dependence as a special case. We show how these new models can be fitted to high threshold exceedances using a censored likelihood approach, and we demonstrate that they provide valuable information about tail characteristics. In particular, by borrowing strength across locations, our parametric model-based approach can also be used to provide evidence for or against either asymptotic dependence class, hence complementing information given at an exploratory stage by the widely used nonparametric or parametric estimates of the χ and χ̄ coefficients. We demonstrate the capacity of our methodology by adequately capturing the extremal properties of wind speed data collected in the Pacific Northwest, US.
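The contrast drawn here between a Gaussian pair and a Gaussian scale mixture can be explored numerically. The following sketch multiplies a correlated Gaussian pair by a shared random scale and estimates the coefficient χ(u) empirically. It assumes a Pareto-distributed scale with a hypothetical `tail_index` parameter purely for illustration; the copula models proposed in the paper are not reproduced.

```python
import math
import random


def sample_pairs(n, rho, tail_index=None, seed=0):
    """Draw n pairs (X1, X2) = R * (W1, W2) with corr(W1, W2) = rho.

    tail_index=None gives R = 1 (a plain Gaussian pair); otherwise R is
    Pareto with the given tail index (an illustrative assumption).
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        w1 = z1
        w2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if tail_index is None:
            r = 1.0
        else:
            r = (1.0 - rng.random()) ** (-1.0 / tail_index)
        x1.append(r * w1)
        x2.append(r * w2)
    return x1, x2


def empirical_chi(x1, x2, u):
    """Estimate chi(u) = P(X2 above its u-quantile | X1 above its u-quantile)."""
    n = len(x1)
    t1 = sorted(x1)[int(u * n)]
    t2 = sorted(x2)[int(u * n)]
    joint = sum(1 for a, b in zip(x1, x2) if a > t1 and b > t2)
    marginal = sum(1 for a in x1 if a > t1)
    return joint / marginal
```

Comparing `empirical_chi(*sample_pairs(20000, 0.5, tail_index=1.0), 0.95)` with the value for `tail_index=None` shows the heavier joint tail of the scale mixture at the same underlying correlation.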
Energy Technology Data Exchange (ETDEWEB)
Liu, T [Department of Radiation Oncology and Winship Cancer Institute, Emory Univ, Atlanta, GA (United States); Yu, D; Beitler, J; Curran, W; Yang, X [Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA (United States); Tridandapani, S [Department of Radiology and Imaging Sciences and Winship Cancer Institute, Emory University, Atlanta, GA (United States); Bruner, D [School of Nursing and Winship Cancer Institute, Emory University, Atlanta, GA (United States)
2014-06-15
Purpose: Xerostomia (dry mouth), secondary to parotid-gland injury, is a distressing side-effect in head-and-neck radiotherapy (RT). This study's purpose is to develop a novel ultrasound technique to quantitatively evaluate post-RT parotid-gland injury. Methods: Recent ultrasound studies have shown that healthy parotid glands exhibit homogeneous echotexture, whereas post-RT parotid glands are often heterogeneous, with multiple hypoechoic (inflammation) or hyperechoic (fibrosis) regions. We propose to use a Gaussian mixture model to analyze the ultrasonic echo-histogram of the parotid glands. An IRB-approved clinical study was conducted: (1) control group: 13 healthy volunteers, who served as the control; (2) acute-toxicity group: 20 patients (mean age: 62.5 ± 8.9 years, follow-up: 2.0±0.8 months); and (3) late-toxicity group: 18 patients (mean age: 60.7 ± 7.3 years, follow-up: 20.1±10.4 months). All patients experienced RTOG grade 1 or 2 salivary-gland toxicity. Each participant underwent an ultrasound scan (10 MHz) of the bilateral parotid glands. An echo-intensity histogram was derived for each parotid and a Gaussian mixture model was used to fit the histogram using the expectation maximization (EM) algorithm. The quality of the fitting was evaluated with the R-squared value. Results: (1) Control group: all parotid glands fitted well with one Gaussian component, with a mean intensity of 79.8±4.9 (R-squared>0.96). (2) Acute-toxicity group: 37 of the 40 post-RT parotid glands fitted well with two Gaussian components, with mean intensities of 42.9±7.4 and 73.3±12.2 (R-squared>0.95). (3) Late-toxicity group: 32 of the 36 post-RT parotid glands fitted well with 3 Gaussian components, with mean intensities of 49.7±7.6, 77.2±8.7, and 118.6±11.8 (R-squared>0.98). Conclusion: RT-associated parotid-gland injury is common in head-and-neck RT, but challenging to assess. This work has demonstrated that the Gaussian mixture model of the echo-histogram could quantify acute and
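The goodness-of-fit check mentioned in the Methods (R-squared between an echo-intensity histogram and a fitted Gaussian mixture) can be sketched as below. The function names are hypothetical and all component values in the usage note are made up for illustration; how the authors binned or normalized the histogram is not stated, so a normalized density over integer intensity bins is assumed.

```python
import math


def gaussian_mixture_curve(x, components):
    """Mixture density at x; components is a list of (weight, mean, std)."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in components
    )


def r_squared(observed, fitted):
    """Coefficient of determination between histogram values and fitted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Given a normalized histogram `hist` over bins `0..255`, one would compare `r_squared(hist, [gaussian_mixture_curve(b, one_comp) for b in range(256)])` against the same quantity for two or three components, and keep the smallest model whose R-squared is acceptable.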
Tien Bui, Dieu; Hoang, Nhat-Duc
2017-09-01
In this study, a probabilistic model, named as BayGmmKda, is proposed for flood susceptibility assessment in a study area in central Vietnam. The new model is a Bayesian framework constructed by a combination of a Gaussian mixture model (GMM), radial-basis-function Fisher discriminant analysis (RBFDA), and a geographic information system (GIS) database. In the Bayesian framework, GMM is used for modeling the data distribution of flood-influencing factors in the GIS database, whereas RBFDA is utilized to construct a latent variable that aims at enhancing the model performance. As a result, the posterior probabilistic output of the BayGmmKda model is used as flood susceptibility index. Experiment results showed that the proposed hybrid framework is superior to other benchmark models, including the adaptive neuro-fuzzy inference system and the support vector machine. To facilitate the model implementation, a software program of BayGmmKda has been developed in MATLAB. The BayGmmKda program can accurately establish a flood susceptibility map for the study region. Accordingly, local authorities can overlay this susceptibility map onto various land-use maps for the purpose of land-use planning or management.
Gaussian Mixture Reduction for Tracking Multiple Maneuvering Targets in Clutter
2003-03-01
the Poisson exponential term e−λV and the factorial of the number of measurements Nm!: c ′ = cNm! e−λV The volume of the combined validation...nonswapped tracks, often other local optima exist for track swap possibilities. The approach of centering a Gaussian optimally (in the MMSE sense) between
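The excerpt above is too fragmentary to recover the report's algorithm, but the basic building block of most Gaussian mixture reduction schemes is a moment-preserving merge of two components. The sketch below shows that standard step for scalar components; it is offered as background under that assumption, with a hypothetical function name, and is not claimed to be the method of this report.

```python
def merge_components(w1, m1, v1, w2, m2, v2):
    """Merge two scalar Gaussian components (weight, mean, variance) into one.

    The merged component keeps the total weight, the mean and the variance
    of the two-component sub-mixture it replaces.
    """
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Each variance is inflated by the squared shift of its mean
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return w, m, v
```

For two equally weighted unit-variance components at -1 and +1, the merged component sits at 0 with variance 2: the spread between the means is absorbed into the variance.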
Model selection for Gaussian kernel PCA denoising
DEFF Research Database (Denmark)
Jørgensen, Kasper Winther; Hansen, Lars Kai
2012-01-01
We propose kernel Parallel Analysis (kPA) for automatic kernel scale and model order selection in Gaussian kernel PCA. Parallel Analysis [1] is based on a permutation test for covariance and has previously been applied for model order selection in linear PCA; here we augment the procedure to also...... tune the Gaussian kernel scale of radial basis function based kernel PCA. We evaluate kPA for denoising of simulated data and the US Postal data set of handwritten digits. We find that kPA outperforms other heuristics to choose the model order and kernel scale in terms of signal-to-noise ratio (SNR...
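For orientation, the linear-PCA form of Parallel Analysis that this abstract builds on can be sketched with NumPy as follows. This covers only the permutation test for model order in ordinary PCA; the kernel extension and the kernel-scale tuning of kPA are not implemented, and the function name, the number of permutations and the 95% quantile are assumptions.

```python
import numpy as np


def parallel_analysis(X, n_perm=50, quantile=0.95, seed=0):
    """Select the PCA model order by comparing eigenvalues with a permutation null."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Eigenvalues of the sample covariance, largest first
    eig = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]

    null = np.empty((n_perm, X.shape[1]))
    for b in range(n_perm):
        # Permuting each column independently destroys cross-covariance
        # while keeping every marginal distribution intact
        Xp = np.column_stack([rng.permutation(Xc[:, j]) for j in range(X.shape[1])])
        null[b] = np.linalg.eigvalsh(np.cov(Xp, rowvar=False))[::-1]

    threshold = np.quantile(null, quantile, axis=0)
    above = eig > threshold
    # Count the leading eigenvalues that exceed their null threshold
    n_components = len(eig) if above.all() else int(np.argmin(above))
    return n_components, eig, threshold
```

Components are retained only while their eigenvalue exceeds what permuted, decorrelated data would produce at the same rank.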
Directory of Open Access Journals (Sweden)
César Soto-Valero
2017-07-01
The generation and availability of football data have increased considerably in recent decades, mostly due to the sport's popularity and also because of technological advances. Gaussian mixture clustering models represent a novel approach to exploring and analyzing performance data in sports. In this paper, we use principal components analysis in conjunction with a model-based Gaussian clustering method with the purpose of characterizing professional football players. Our modeling approach is tested using 40 attributes from EA Sports' FIFA video game series system, corresponding to 7705 European players. Clustering results reveal a clear distinction among different performance indicators, representing four different roles in the team. Players were labeled according to these roles and a gradient tree boosting model was used for ranking attributes according to their importance. We found that the dribbling skill is the most discriminating variable among the different clustered players' profiles. Resumen En las últimas décadas se ha visto un incremento considerable en la generación y disponibilidad de datos de fútbol, esto se debe fundamentalmente a la popularidad de este deporte así como a los avances tecnológicos realizados en este campo. Los modelos de agrupamiento basados en mixturas Gaussianas representan un enfoque novedoso para explorar y analizar datos de desempeño deportivo. En el presente trabajo, se lleva a cabo una caracterización de jugadores profesionales de fútbol utilizando técnicas de análisis de componentes principales y agrupamiento basados en mixturas Gaussianas. El modelo presentado es comprobado utilizando datos del sistema de videojuegos FIFA de EA Sports, dichos datos representan 40 atributos correspondientes a 7705 futbolistas europeos. Los resultados del agrupamiento revelan una clara distinción entre algunos indicadores de desempeño, los cuales corresponden a cuatro roles diferentes en el equipo. Consecuentemente, los
Extended Linear Models with Gaussian Priors
DEFF Research Database (Denmark)
Quinonero, Joaquin
2002-01-01
In extended linear models the input space is projected onto a feature space by means of an arbitrary non-linear transformation. A linear model is then applied to the feature space to construct the model output. The dimension of the feature space can be very large, or even infinite, giving the model...... very great flexibility. Support Vector Machines (SVMs) and Gaussian processes are two examples of such models. In this technical report I present a model in which the dimension of the feature space remains finite, and where a Bayesian approach is used to train the model with Gaussian priors...... on the parameters. The Relevance Vector Machine, introduced by Tipping, is a particular case of such a model. I give the detailed derivations of the expectation-maximisation (EM) algorithm used in the training. These derivations are not found in the literature, and might be helpful for newcomers.
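A finite-dimensional extended linear model with a Gaussian prior on the weights has a closed-form posterior, which the sketch below computes with NumPy. It assumes radial basis features, a single shared prior precision `alpha`, and a known noise variance; these names and choices are illustrative, and the EM training of the hyperparameters derived in the report is not reproduced.

```python
import numpy as np


def rbf_features(x, centers, width):
    """Project 1-D inputs onto a finite set of Gaussian basis functions."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)


def weight_posterior(Phi, y, alpha, noise_var):
    """Posterior mean and covariance of the weights under a N(0, I / alpha) prior."""
    precision = alpha * np.eye(Phi.shape[1]) + Phi.T @ Phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    return mean, cov


def predict(Phi_new, mean, cov, noise_var):
    """Predictive mean and variance at new feature vectors."""
    mu = Phi_new @ mean
    var = noise_var + np.sum((Phi_new @ cov) * Phi_new, axis=1)
    return mu, var
```

Giving each weight its own prior precision, instead of the shared `alpha` assumed here, is the step that leads toward the Relevance Vector Machine mentioned in the abstract.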
Probabilistic mixture-based image modelling
Czech Academy of Sciences Publication Activity Database
Haindl, Michal; Havlíček, Vojtěch; Grim, Jiří
2011-01-01
Vol. 47, No. 3 (2011), pp. 482-500 ISSN 0023-5954 R&D Projects: GA MŠk 1M0572; GA ČR GA102/08/0593 Grant - others:CESNET(CZ) 387/2010; GA MŠk(CZ) 2C06019; GA ČR(CZ) GA103/11/0335 Institutional research plan: CEZ:AV0Z10750506 Keywords : BTF texture modelling * discrete distribution mixtures * Bernoulli mixture * Gaussian mixture * multi-spectral texture modelling Subject RIV: BD - Theory of Information Impact factor: 0.454, year: 2011 http://library.utia.cas.cz/separaty/2011/RO/haindl-0360244.pdf
Directory of Open Access Journals (Sweden)
Carlen Peter L
2011-04-01
Full Text Available Abstract Background Epilepsy is a common neurological disorder characterized by recurrent electrophysiological activities, known as seizures. Without the appropriate detection strategies, these seizure episodes can dramatically affect the quality of life for those afflicted. The rationale of this study is to develop an unsupervised algorithm for the detection of seizure states so that it may be implemented along with potential intervention strategies. Methods Hidden Markov model (HMM) was developed to interpret the state transitions of the in vitro rat hippocampal slice local field potentials (LFPs) during seizure episodes. It can be used to estimate the probability of state transitions and the corresponding characteristics of each state. Wavelet features were clustered and used to differentiate the electrophysiological characteristics at each corresponding HMM states. Using unsupervised training method, the HMM and the clustering parameters were obtained simultaneously. The HMM states were then assigned to the electrophysiological data using expert guided technique. Minimum redundancy maximum relevance (mRMR) analysis and Akaike Information Criterion (AICc) were applied to reduce the effect of over-fitting. The sensitivity, specificity and optimality index of chronic seizure detection were compared for various HMM topologies. The ability of distinguishing early and late tonic firing patterns prior to chronic seizures were also evaluated. Results Significant improvement in state detection performance was achieved when additional wavelet coefficient rates of change information were used as features. The final HMM topology obtained using mRMR and AICc was able to detect non-ictal (interictal), early and late tonic firing, chronic seizures and postictal activities. A mean sensitivity of 95.7%, mean specificity of 98.9% and optimality index of 0.995 in the detection of chronic seizures was achieved. The detection of early and late tonic firing was
Chiu, Alan Wl; Derchansky, Miron; Cotic, Marija; Carlen, Peter L; Turner, Steuart O; Bardakjian, Berj L
2011-04-19
Epilepsy is a common neurological disorder characterized by recurrent electrophysiological activities, known as seizures. Without the appropriate detection strategies, these seizure episodes can dramatically affect the quality of life for those afflicted. The rationale of this study is to develop an unsupervised algorithm for the detection of seizure states so that it may be implemented along with potential intervention strategies. Hidden Markov model (HMM) was developed to interpret the state transitions of the in vitro rat hippocampal slice local field potentials (LFPs) during seizure episodes. It can be used to estimate the probability of state transitions and the corresponding characteristics of each state. Wavelet features were clustered and used to differentiate the electrophysiological characteristics at each corresponding HMM states. Using unsupervised training method, the HMM and the clustering parameters were obtained simultaneously. The HMM states were then assigned to the electrophysiological data using expert guided technique. Minimum redundancy maximum relevance (mRMR) analysis and Akaike Information Criterion (AICc) were applied to reduce the effect of over-fitting. The sensitivity, specificity and optimality index of chronic seizure detection were compared for various HMM topologies. The ability of distinguishing early and late tonic firing patterns prior to chronic seizures were also evaluated. Significant improvement in state detection performance was achieved when additional wavelet coefficient rates of change information were used as features. The final HMM topology obtained using mRMR and AICc was able to detect non-ictal (interictal), early and late tonic firing, chronic seizures and postictal activities. A mean sensitivity of 95.7%, mean specificity of 98.9% and optimality index of 0.995 in the detection of chronic seizures was achieved. The detection of early and late tonic firing was validated with experimental intracellular electrical
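The abstract above selects an HMM topology with the small-sample corrected Akaike criterion (AICc). A minimal sketch of that selection step follows, using the standard AICc formula; the candidate topologies, log-likelihoods, parameter counts and sample size are hypothetical placeholders, not values from the study.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample corrected AIC for a model with k free parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical fits: (label, maximised log-likelihood, number of free parameters)
candidates = [
    ("3-state", -1250.0, 14),
    ("5-state", -1190.0, 34),
    ("7-state", -1185.0, 62),
]
n_obs = 400  # hypothetical number of observations

# The preferred model is the one with the lowest AICc.
best = min(candidates, key=lambda c: aicc(c[1], c[2], n_obs))
print(best[0])
```

With these made-up numbers the 7-state model fits slightly better than the 5-state one, but its extra parameters are penalised, so the 5-state topology is selected.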
Iterative Diffusion-Based Distributed Cubature Gaussian Mixture Filter for Multisensor Estimation
Directory of Open Access Journals (Sweden)
Bin Jia
2016-10-01
Full Text Available In this paper, a distributed cubature Gaussian mixture filter (DCGMF) based on an iterative diffusion strategy (DCGMF-ID) is proposed for multisensor estimation and information fusion. The uncertainties are represented as Gaussian mixtures at each sensor node. A high-degree cubature Kalman filter provides accurate estimation of each Gaussian mixture component. An iterative diffusion scheme is utilized to fuse the mean and covariance of each Gaussian component obtained from each sensor node. The DCGMF-ID extends the conventional diffusion-based fusion strategy by using multiple iterative information exchanges among neighboring sensor nodes. The convergence property of the iterative diffusion is analyzed. In addition, it is shown that the convergence of the iterative diffusion can be interpreted from the information-theoretic perspective as minimization of the Kullback–Leibler divergence. The performance of the DCGMF-ID is compared with the DCGMF based on the average consensus (DCGMF-AC) and the DCGMF based on the iterative covariance intersection (DCGMF-ICI) via a maneuvering target-tracking problem using multiple sensors. The simulation results show that the DCGMF-ID has better performance than the DCGMF based on noniterative diffusion, which validates the benefit of iterative information exchanges. In addition, the DCGMF-ID outperforms the DCGMF-ICI and DCGMF-AC when the number of iterations is limited.
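The abstract above interprets convergence of the diffusion step as minimization of the Kullback-Leibler divergence. As an illustration of the quantity involved only (not the paper's fusion rule), the sketch below evaluates the closed-form KL divergence between two univariate Gaussians; the function name and the restriction to one dimension are assumptions made here.

```python
import math

def kl_gauss(mu_p: float, var_p: float, mu_q: float, var_q: float) -> float:
    """KL(p || q) for univariate Gaussians p = N(mu_p, var_p) and q = N(mu_q, var_q)."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

# Identical densities have zero divergence; a unit mean shift at unit variance costs 0.5 nats.
print(kl_gauss(0.0, 1.0, 0.0, 1.0))
print(kl_gauss(0.0, 1.0, 1.0, 1.0))
```

The divergence is not symmetric in its arguments, which is why the direction of the minimization matters when node estimates are fused.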
Performance of BICM-T transceivers over Gaussian mixture noise channels
Malik, Muhammad Talha
2014-04-01
Experimental measurements have shown that the noise in many communication channels is non-Gaussian. Bit interleaved coded modulation (BICM) is very popular for spectrally efficient transmission. Recent results have shown that the performance of BICM using convolutional codes in non-fading channels can be significantly improved if the coded bits are not interleaved at all. This particular BICM design is called BICM trivial (BICM-T). In this paper, we analyze the performance of a generalized BICM-T design for communication over Gaussian mixture noise (GMN) channels. The results disclose that for an optimal bit error rate (BER) performance, the use of an interleaver in BICM for GMN channels depends upon the strength of the impulsive noise components in the Gaussian mixture. The results presented for 16-QAM show that the BICM-T can result in gains up to 1.5 dB for a target BER of 10^-6 if the impulsive noise in the Gaussian mixture is below a certain threshold level. The simulation results verify the tightness of the developed union bound (UB) on BER performance.
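To make the notion of Gaussian mixture noise concrete, the sketch below evaluates the density and total variance of a zero-mean mixture with a dominant background term and a rare, strong impulsive term. The two-component form, the weights and the standard deviations are illustrative assumptions, not parameters taken from the paper.

```python
import math

def gmn_pdf(x, weights, sigmas):
    """Density of zero-mean Gaussian mixture noise at x."""
    return sum(w * math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
               for w, s in zip(weights, sigmas))

def gmn_variance(weights, sigmas):
    """Total noise power: the weighted sum of the component variances."""
    return sum(w * s * s for w, s in zip(weights, sigmas))

weights = [0.9, 0.1]   # hypothetical: 90% background noise, 10% impulsive noise
sigmas = [1.0, 10.0]   # hypothetical component standard deviations

total_var = gmn_variance(weights, sigmas)
# Compare the tail with a single Gaussian of the same total power.
same_power_gaussian = gmn_pdf(15.0, [1.0], [math.sqrt(total_var)])
print(total_var, gmn_pdf(15.0, weights, sigmas), same_power_gaussian)
```

Far from the origin the mixture density exceeds that of a Gaussian with equal power, which is the heavy-tail behaviour that makes impulsive noise harmful to decoders designed for Gaussian channels.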
Environmental Modeling Framework using Stacked Gaussian Processes
Abdelfatah, Kareem; Bao, Junshu; Terejanu, Gabriel
2016-01-01
A network of independently trained Gaussian processes (StackedGP) is introduced to obtain predictions of quantities of interest with quantified uncertainties. The main applications of the StackedGP framework are to integrate different datasets through model composition, enhance predictions of quantities of interest through a cascade of intermediate predictions, and to propagate uncertainties through emulated dynamical systems driven by uncertain forcing variables. By using analytical first an...
On the structure of Gaussian pricing models and Gaussian Markov functional models
C.D.D. Neumann
2002-01-01
This article investigates the structure of Gaussian pricing models (that is, models in which future returns are normally distributed). Although much is already known about such models, this article differs in that it is based on a formulation of the theory of derivative pricing in which
Lorentzen, Rolf J.; Stordal, Andreas S.; Hewitt, Neal
2017-05-01
Flowrate allocation in production wells is a complicated task, especially for multiphase flow combined with several reservoir zones and/or branches. The result depends heavily on the available production data and their accuracy. In the application we show here, downhole pressure and temperature data are available, in addition to the total flowrates at the wellhead. The developed methodology inverts these observations to the fluid flowrates (oil, water and gas) that enter two production branches in a real full-scale producer. A major challenge is accurate estimation of flowrates during rapid variations in the well, e.g. due to choke adjustments. The Auxiliary Sequential Importance Resampling (ASIR) filter was developed to handle such challenges by introducing an auxiliary step, where the particle weights are recomputed (second weighting step) based on how well the particles reproduce the observations. However, the ASIR filter suffers from large computational time when the number of unknown parameters increases. The Gaussian Mixture (GM) filter combines a linear update with the particle filter's ability to capture non-Gaussian behavior. This makes it possible to achieve good performance with fewer model evaluations. In this work we present a new filter which combines the ASIR filter and the Gaussian Mixture filter (denoted ASGM), and demonstrate improved estimation (compared to ASIR and GM filters) in cases with rapid parameter variations, while maintaining reasonable computational cost.
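The filters named above all build on importance weighting followed by resampling. The sketch below shows only that basic building block with a Gaussian observation likelihood and systematic resampling; it is not the ASIR, GM or ASGM filter, and the particle values, observation and noise level are hypothetical.

```python
import math

def gaussian_likelihood(predicted, observed, sigma):
    """Unnormalised likelihood of an observation given a particle's prediction."""
    return math.exp(-0.5 * ((observed - predicted) / sigma) ** 2)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def systematic_resample(weights, u):
    """Indices of resampled particles; u in [0, 1) is the single random offset."""
    n = len(weights)
    indices, cumulative, j = [], weights[0], 0
    for i in range(n):
        position = (u + i) / n
        # Advance to the particle whose cumulative weight covers this position.
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        indices.append(j)
    return indices

particles = [1.0, 2.0, 3.0, 4.0]   # hypothetical predicted flowrates
observation, sigma = 3.2, 0.5       # hypothetical measurement and noise level
weights = normalise([gaussian_likelihood(p, observation, sigma) for p in particles])
# In practice u is drawn uniformly from [0, 1); it is fixed here for reproducibility.
resampled = systematic_resample(weights, u=0.5)
print(resampled)
```

Particles that reproduce the observation well are duplicated and poor ones are dropped, which is the step the auxiliary weighting in ASIR is designed to refine.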
Gu, Wenjun; Zhang, Weizhi; Wang, Jin; Amini Kashani, M. R.; Kavehrad, Mohsen
2015-01-01
Over the past decade, location based services (LBS) have found wide application in indoor environments, such as large shopping malls, hospitals, warehouses, airports, etc. Current technologies provide a wide choice of available solutions, which include Radio-frequency identification (RFID), Ultra wideband (UWB), wireless local area network (WLAN) and Bluetooth. With the rapid development of light-emitting-diode (LED) technology, visible light communications (VLC) also bring a practical approach to LBS. As visible light has better immunity against multipath effects than radio waves, higher positioning accuracy is achieved. LEDs are utilized both for illumination and positioning purposes to realize relatively lower infrastructure cost. In this paper, an indoor positioning system using VLC is proposed, with LEDs as transmitters and photo diodes as receivers. The algorithm for estimation is based on received-signal-strength (RSS) information collected from photo diodes and a trilateration technique. By appropriately making use of the characteristics of receiver movements and the property of trilateration, estimation of three-dimensional (3-D) coordinates is attained. A filtering technique is applied to enable tracking capability of the algorithm, and a higher accuracy is reached compared to raw estimates. A Gaussian mixture Sigma-point particle filter (GM-SPPF) is proposed for this 3-D system, which introduces the notion of the Gaussian Mixture Model (GMM). The number of particles in the filter is reduced by approximating the probability distribution with Gaussian components.
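The trilateration step mentioned above can be sketched in a simplified setting. The example assumes a 2-D plane, three transmitters at known positions, and distances that have already been estimated from RSS; the paper itself works in 3-D, and the conversion from RSS to distance is not shown. Subtracting the first range equation from the other two gives a linear system in the receiver coordinates.

```python
import math

def trilaterate_2d(anchors, distances):
    """Receiver position from three anchor positions and ranges (2-D, noise-free sketch)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Linear system obtained by subtracting the first circle equation from the others.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]   # hypothetical LED positions
true_position = (1.0, 2.0)
distances = [math.dist(true_position, a) for a in anchors]
print(trilaterate_2d(anchors, distances))
```

With noisy ranges the same linear system would typically be solved in a least-squares sense using more than three transmitters, and the raw estimates would then be passed to the tracking filter.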
Multimodal Similarity Gaussian Process Latent Variable Model.
Song, Guoli; Wang, Shuhui; Huang, Qingming; Tian, Qi
2017-09-01
Data from real applications involve multiple modalities representing content with the same semantics from complementary aspects. However, relations among heterogeneous modalities are simply treated as observation-to-fit by existing work, and the parameterized modality specific mapping functions lack flexibility in directly adapting to the content divergence and semantic complicacy in multimodal data. In this paper, we build our work based on the Gaussian process latent variable model (GPLVM) to learn the non-parametric mapping functions and transform heterogeneous modalities into a shared latent space. We propose multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between the intra-modal similarities and latent representation. We further propose multimodal distance-preserved similarity GPLVM (m-DSimGP) to preserve the intra-modal global similarity structure, and multimodal regularized similarity GPLVM (m-RSimGP) by encouraging similar/dissimilar points to be similar/dissimilar in the latent space. We propose m-DRSimGP, which combines the distance preservation in m-DSimGP and semantic preservation in m-RSimGP to learn the latent representation. The overall objective functions of the four models are solved by simple and scalable gradient descent techniques. They can be applied to various tasks to discover the nonlinear correlations and to obtain the comparable low-dimensional representation for heterogeneous modalities. On five widely used real-world data sets, our approaches outperform existing models on cross-modal content retrieval and multimodal classification.
Directory of Open Access Journals (Sweden)
Chin-Teng Lin
2018-01-01
Full Text Available Electroencephalogram (EEG) signals are usually contaminated with various artifacts, such as signal associated with muscle activity, eye movement, and body motion, which have a noncerebral origin. The amplitude of such artifacts is larger than that of the electrical activity of the brain, so they mask the cortical signals of interest, resulting in biased analysis and interpretation. Several blind source separation methods have been developed to remove artifacts from the EEG recordings. However, the iterative process for measuring separation within multichannel recordings is computationally intractable. Moreover, manually excluding the artifact components requires a time-consuming offline process. This work proposes a real-time artifact removal algorithm that is based on canonical correlation analysis (CCA), feature extraction, and the Gaussian mixture model (GMM) to improve the quality of EEG signals. The CCA was used to decompose EEG signals into components followed by feature extraction to extract representative features and GMM to cluster these features into groups to recognize and remove artifacts. The feasibility of the proposed algorithm was demonstrated by effectively removing artifacts caused by blinks, head/body movement, and chewing from EEG recordings while preserving the temporal and spectral characteristics of the signals that are important to cognitive research.
Unbiased free energy estimates in fast nonequilibrium transformations using Gaussian mixtures
International Nuclear Information System (INIS)
Procacci, Piero
2015-01-01
In this paper, we present an improved method for obtaining unbiased estimates of the free energy difference between two thermodynamic states using the work distribution measured in nonequilibrium driven experiments connecting these states. The method is based on the assumption that any observed work distribution is given by a mixture of Gaussian distributions, whose normal components are identical in either direction of the nonequilibrium process, with weights regulated by the Crooks theorem. Using the prototypical example for the driven unfolding/folding of deca-alanine, we show that the predicted behavior of the forward and reverse work distributions, assuming a combination of only two Gaussian components with Crooks derived weights, explains surprisingly well the striking asymmetry in the observed distributions at fast pulling speeds. The proposed methodology opens the way for a perfectly parallel implementation of Jarzynski-based free energy calculations in complex systems
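If the work distribution is modelled as a Gaussian mixture, the exponential average in the Jarzynski equality, exp(-beta*dF) = <exp(-beta*W)>, has a closed form, since each Gaussian component contributes exp(-beta*mu_i + beta^2*var_i/2). The sketch below evaluates that expression; it does not reproduce the paper's Crooks-based weighting of forward and reverse components, and the mixture parameters shown are hypothetical.

```python
import math

def free_energy_from_mixture(weights, means, variances, beta):
    """Delta F from exp(-beta*dF) = sum_i w_i * exp(-beta*mu_i + beta^2 * var_i / 2)."""
    average = sum(w * math.exp(-beta * m + 0.5 * beta ** 2 * v)
                  for w, m, v in zip(weights, means, variances))
    return -math.log(average) / beta

beta = 1.0  # 1/(k_B T) in units matching the work values (assumed)

# A single Gaussian recovers the familiar result dF = mu - beta * var / 2.
print(free_energy_from_mixture([1.0], [5.0], [2.0], beta))

# Hypothetical two-component work distribution.
print(free_energy_from_mixture([0.7, 0.3], [5.0, 9.0], [2.0, 4.0], beta))
```

Because the low-work tail dominates the exponential average, a minor component with a small mean can control the estimate even when its weight is small.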
Stochastic cluster algorithms for discrete Gaussian (SOS) models
International Nuclear Information System (INIS)
Evertz, H.G.; Hamburg Univ.; Hasenbusch, M.; Marcu, M.; Tel Aviv Univ.; Pinn, K.; Muenster Univ.; Solomon, S.
1990-10-01
We present new Monte Carlo cluster algorithms which eliminate critical slowing down in the simulation of solid-on-solid models. In this letter we focus on the two-dimensional discrete Gaussian model. The algorithms are based on reflecting the integer valued spin variables with respect to appropriately chosen reflection planes. The proper choice of the reflection plane turns out to be crucial in order to obtain a small dynamical exponent z. Actually, the successful versions of our algorithm are a mixture of two different procedures for choosing the reflection plane, one of them ergodic but slow, the other one non-ergodic and also slow when combined with a Metropolis algorithm. (orig.)
A Non-Gaussian Spatial Generalized Linear Latent Variable Model
Irincheeva, Irina
2012-08-03
We consider a spatial generalized linear latent variable model with and without normality distributional assumption on the latent variables. When the latent variables are assumed to be multivariate normal, we apply a Laplace approximation. To relax the assumption of marginal normality in favor of a mixture of normals, we construct a multivariate density with Gaussian spatial dependence and given multivariate margins. We use the pairwise likelihood to estimate the corresponding spatial generalized linear latent variable model. The properties of the resulting estimators are explored by simulations. In the analysis of an air pollution data set the proposed methodology uncovers weather conditions to be a more important source of variability than air pollution in explaining all the causes of non-accidental mortality excluding accidents. © 2012 International Biometric Society.
Guadagnini, A.; Riva, M.; Neuman, S. P.
2016-12-01
Environmental quantities such as log hydraulic conductivity (or transmissivity), Y(x) = ln K(x), and their spatial (or temporal) increments, ΔY, are known to be generally non-Gaussian. Documented evidence of such behavior includes symmetry of increment distributions at all separation scales (or lags) between incremental values of Y with sharp peaks and heavy tails that decay asymptotically as lag increases. This statistical scaling occurs in porous as well as fractured media characterized by either one or a hierarchy of spatial correlation scales. In hierarchical media one observes a range of additional statistical ΔY scaling phenomena, all of which are captured comprehensibly by a novel generalized sub-Gaussian (GSG) model. In this model Y forms a mixture Y(x) = U(x) G(x) of single- or multi-scale Gaussian processes G having random variances, U being a non-negative subordinator independent of G. Elsewhere we developed ways to generate unconditional and conditional random realizations of isotropic or anisotropic GSG fields which can be embedded in numerical Monte Carlo flow and transport simulations. Here we present and discuss expressions for probability distribution functions of Y and ΔY as well as their lead statistical moments. We then focus on a simple flow setting of mean uniform steady state flow in an unbounded, two-dimensional domain, exploring ways in which non-Gaussian heterogeneity affects stochastic flow and transport descriptions. Our expressions represent (a) lead order autocovariance and cross-covariance functions of hydraulic head, velocity and advective particle displacement as well as (b) analogues of preasymptotic and asymptotic Fickian dispersion coefficients. We compare them with corresponding expressions developed in the literature for Gaussian Y.
A Robust Non-Gaussian Data Assimilation Method for Highly Non-Linear Models
Directory of Open Access Journals (Sweden)
Elias D. Nino-Ruiz
2018-03-01
Full Text Available In this paper, we propose an efficient EnKF implementation for non-Gaussian data assimilation based on Gaussian Mixture Models and Markov-Chain-Monte-Carlo (MCMC) methods. The proposed method works as follows: based on an ensemble of model realizations, prior errors are estimated via a Gaussian Mixture density whose parameters are approximated by means of an Expectation Maximization method. Then, by using an iterative method, observation operators are linearized about current solutions and posterior modes are estimated via an MCMC implementation. The acceptance/rejection criterion is similar to that of the Metropolis-Hastings rule. Experimental tests are performed on the Lorenz 96 model. The results show that the proposed method can decrease prior errors by several orders of magnitude in a root-mean-square-error sense for nearly sparse or dense observational networks.
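The first stage described above fits a Gaussian mixture to an ensemble by Expectation Maximization. A minimal one-dimensional sketch of the EM iteration follows; the scalar setting, the two-component choice, the toy ensemble and the variance floor are assumptions for illustration and do not reflect the paper's high-dimensional implementation.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    return sum(math.log(sum(w * normal_pdf(x, m, v)
                            for w, m, v in zip(weights, means, variances)))
               for x in data)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibility of each component for each sample.
    resp = []
    for x in data:
        dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: re-estimate weights, means and variances from the responsibilities.
    nk = [sum(r[j] for r in resp) for j in range(k)]
    new_w = [n / len(data) for n in nk]
    new_m = [sum(r[j] * x for r, x in zip(resp, data)) / nk[j] for j in range(k)]
    new_v = [max(sum(r[j] * (x - new_m[j]) ** 2 for r, x in zip(resp, data)) / nk[j], 1e-6)
             for j in range(k)]  # small floor guards against collapsing components
    return new_w, new_m, new_v

ensemble = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]  # hypothetical bimodal ensemble
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])            # initial guess
for _ in range(20):
    params = em_step(ensemble, *params)
print(params)
```

Each iteration cannot decrease the log-likelihood, so the loop can also be stopped once the improvement falls below a tolerance instead of after a fixed number of steps.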
Link Prediction via Sparse Gaussian Graphical Model
Directory of Open Access Journals (Sweden)
Liangliang Zhang
2016-01-01
Full Text Available Link prediction is an important task in complex network analysis. Traditional link prediction methods are limited by network topology and lack of node property information, which makes predicting links challenging. In this study, we address link prediction using a sparse Gaussian graphical model and demonstrate its theoretical and practical effectiveness. In theory, link prediction is executed by estimating the inverse covariance matrix of samples to overcome information limits. The proposed method was evaluated with four small and four large real-world datasets. The experimental results show that the area under the curve (AUC) value obtained by the proposed method improved by an average of 3% and 12.5% compared to 13 mainstream similarity methods, respectively. This method outperforms the baseline method, and the prediction accuracy is superior to mainstream methods when using only 80% of the training set. The method also provides significantly higher AUC values when using only 60% in Dolphin and Taro datasets. Furthermore, the error rate of the proposed method demonstrates superior performance with all datasets compared to mainstream methods.
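In a Gaussian graphical model, a zero entry in the inverse covariance (precision) matrix means the two nodes are conditionally independent, so nonzero off-diagonal entries can be read as candidate links. The sketch below assumes a sparse precision matrix has already been estimated (for instance by a graphical-lasso style procedure, not shown) and converts it to partial correlations; the matrix values and the threshold are hypothetical, and the paper's actual scoring may differ.

```python
import math

def partial_correlations(precision):
    """Partial correlation rho_ij = -P_ij / sqrt(P_ii * P_jj) from a precision matrix P."""
    n = len(precision)
    return [[1.0 if i == j
             else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

def predicted_links(precision, threshold=1e-8):
    """Node pairs whose precision entry is nonzero (conditionally dependent)."""
    n = len(precision)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(precision[i][j]) > threshold]

# Hypothetical sparse precision matrix for a three-node chain 0 - 1 - 2.
precision = [[2.0, -1.0, 0.0],
             [-1.0, 2.0, -1.0],
             [0.0, -1.0, 2.0]]

print(predicted_links(precision))
print(partial_correlations(precision)[0][1])
```

Nodes 0 and 2 are correlated through node 1 but have a zero precision entry, so no direct link is predicted between them.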
Gaussian-Charge Polarizable and Nonpolarizable Models for CO2.
Jiang, Hao; Moultos, Othonas A; Economou, Ioannis G; Panagiotopoulos, Athanassios Z
2016-02-11
A polarizable intermolecular potential model using three classical Drude oscillators on the atomic sites has been developed for CO2. The model is rigid with bond lengths and molecular geometries set to their experimental values. Electrostatic interactions are represented by three Gaussian charges connected to the molecular frame by harmonic springs. Nonelectrostatic interactions are represented by the Buckingham exponential-6 potential, with potential parameters optimized to vapor-liquid equilibria (VLE) data. A nonpolarizable CO2 model that shares the other ingredients of the polarizable model was also developed and optimized to VLE data. Gibbs ensemble Monte Carlo and molecular dynamics simulations were used to evaluate the two models with respect to a variety of thermodynamic and transport properties, including the enthalpy of vaporization, second virial coefficient, density in the one-phase fluid region, isobaric and isochoric heat capacities, radial distribution functions, self-diffusion coefficient, and shear viscosity. Excellent agreement between model predictions and experimental data was found for all properties studied. The polarizable and nonpolarizable models provide a similar representation of CO2 properties, which indicates that the properties of pure CO2 fluid are not strongly affected by polarization. The polarizable model, which has an order of magnitude higher computational cost than the nonpolarizable model, will likely be useful for the study of a mixture of CO2 and polar components for which polarization is important.
Extending Growth Mixture Models Using Continuous Non-Elliptical Distributions
Wei, Yuhong; Tang, Yang; Shireman, Emilie; McNicholas, Paul D.; Steinley, Douglas L.
2017-01-01
Growth mixture models (GMMs) incorporate both conventional random effects growth modeling and latent trajectory classes as in finite mixture modeling; therefore, they offer a way to handle the unobserved heterogeneity between subjects in their development. GMMs with Gaussian random effects dominate the literature. When the data are asymmetric and/or have heavier tails, more than one latent class is required to capture the observed variable distribution. Therefore, a GMM with continuous non-el...
Application Of Shared Gamma And Inverse-Gaussian Frailty Models ...
African Journals Online (AJOL)
Shared Gamma and Inverse-Gaussian Frailty models are used to analyze the survival times of patients who are clustered according to cancer/tumor types under Parametric Proportional Hazard framework. The result of the ... However, no evidence is strong enough for preference of either Gamma or Inverse Gaussian Frailty.
Directory of Open Access Journals (Sweden)
Milad Lankarany
2013-09-01
Full Text Available Time-varying excitatory and inhibitory synaptic inputs govern activity of neurons and process information in the brain. The importance of trial-to-trial fluctuations of synaptic inputs has recently been investigated in neuroscience. Such fluctuations are ignored in the most conventional techniques because they are removed when trials are averaged during linear regression techniques. Here, we propose a novel recursive algorithm based on Gaussian mixture Kalman filtering for estimating time-varying excitatory and inhibitory synaptic inputs from single trials of noisy membrane potential in current clamp recordings. The Kalman filtering is followed by an expectation maximization algorithm to infer the statistical parameters (time-varying mean and variance) of the synaptic inputs in a non-parametric manner. As our proposed algorithm is repeated recursively, the inferred parameters of the mixtures are used to initiate the next iteration. Unlike other recent algorithms, our algorithm does not assume an a priori distribution from which the synaptic inputs are generated. Instead, the algorithm recursively estimates such a distribution by fitting a Gaussian mixture model. The performance of the proposed algorithms is compared to a previously proposed PF-based algorithm (Paninski et al., 2012) with several illustrative examples, assuming that the distribution of synaptic input is unknown. If noise is small, the performance of our algorithms is similar to that of the previous one. However, if noise is large, they can significantly outperform the previous proposal. These promising results suggest that our algorithm is a robust and efficient technique for estimating time varying excitatory and inhibitory synaptic conductances from single trials of membrane potential recordings.
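For orientation, the sketch below shows one predict/update cycle of a scalar Kalman filter with a single Gaussian state. It is only the basic building block; how the paper combines such updates across mixture components and with the EM stage is not reproduced here, and the model coefficients and parameter names are assumptions.

```python
def kalman_step(mean, var, obs, a=1.0, q=0.0, h=1.0, r=1.0):
    """One scalar Kalman cycle for x_t = a*x_{t-1} + noise(q), y_t = h*x_t + noise(r)."""
    # Predict: propagate the state estimate and its uncertainty.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update: blend prediction and observation according to the Kalman gain.
    gain = var_pred * h / (h * h * var_pred + r)
    mean_new = mean_pred + gain * (obs - h * mean_pred)
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new

# Hypothetical noisy membrane-potential samples tracked with a random-walk state model.
estimate = (0.0, 1.0)
for y in [2.0, 1.8, 2.3]:
    estimate = kalman_step(*estimate, obs=y, q=0.01, r=1.0)
print(estimate)
```

The posterior variance shrinks with every observation, and the gain decides how strongly each new sample moves the estimate.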
Infinite von Mises-Fisher Mixture Modeling of Whole Brain fMRI Data
DEFF Research Database (Denmark)
Røge, Rasmus; Madsen, Kristoffer Hougaard; Schmidt, Mikkel Nørgaard
2017-01-01
Cluster analysis of functional magnetic resonance imaging (fMRI) data is often performed using gaussian mixture models, but when the time series are standardized such that the data reside on a hypersphere, this modeling assumption is questionable. The consequences of ignoring the underlying...... spherical manifold are rarely analyzed, in part due to the computational challenges imposed by directional statistics. In this letter, we discuss a Bayesian von Mises-Fisher (vMF) mixture model for data on the unit hypersphere and present an efficient inference procedure based on collapsed Markov chain...... Monte Carlo sampling. Comparing the vMF and gaussian mixture models on synthetic data, we demonstrate that the vMF model has a slight advantage inferring the true underlying clustering when compared to gaussian-based models on data generated from both a mixture of vMFs and a mixture of gaussians...
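The geometric point made above can be checked directly: once a time series is standardized to zero mean and unit variance, its norm is fixed by its length, so every standardized series lies on a hypersphere and only its direction carries information. The sketch below uses the population standard deviation and toy series; both choices are assumptions for illustration.

```python
import math

def standardise(series):
    """Zero-mean, unit-variance version of a series (population standard deviation)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]

def to_unit_sphere(series):
    """Standardise, then rescale to unit norm: a point on the unit hypersphere."""
    z = standardise(series)
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z]

a = [1.0, 2.0, 4.0, 8.0, 3.0]        # hypothetical voxel time series
b = [2.0 * x + 3.0 for x in a]       # same shape, different offset and scale

norm_a = math.sqrt(sum(v * v for v in standardise(a)))
print(norm_a, math.sqrt(len(a)))     # both equal sqrt(n)
print(to_unit_sphere(a))
print(to_unit_sphere(b))             # identical direction on the sphere
```

Series that differ only by offset and positive scale map to the same point, which is why a directional distribution such as the von Mises-Fisher can be a more natural component density than a Gaussian for such data.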
Leong, Siow Hoo; Ong, Seng Huat
2017-01-01
This paper considers three crucial issues in processing scaled-down images: the representation of partial images, the similarity measure, and domain adaptation. Two Gaussian mixture model based algorithms are proposed to effectively preserve image details and avoid image degradation. Multiple partial images are clustered separately through Gaussian mixture model clustering with a scan and select procedure to enhance the inclusion of small image details. The local image features, represented by maximum likelihood estimates of the mixture components, are classified by using the modified Bayes factor (MBF) as a similarity measure. The detection of novel local features from MBF will suggest domain adaptation, which is changing the number of components of the Gaussian mixture model. The performance of the proposed algorithms is evaluated with simulated data and real images, and they are shown to perform much better than existing Gaussian mixture model based algorithms in reproducing images with higher structural similarity index.
Concomitant variables in finite mixture models
Wedel, M
The standard mixture model, the concomitant variable mixture model, the mixture regression model and the concomitant variable mixture regression model all enable simultaneous identification and description of groups of observations. This study reviews the different ways in which dependencies among
Directory of Open Access Journals (Sweden)
Masaru Yokoe
2009-03-01
Full Text Available This paper proposes a method to quantitatively measure and evaluate finger tapping movements for the assessment of motor function using log-linearized Gaussian mixture networks (LLGMNs). First, finger tapping movements are measured using magnetic sensors, and eleven indices are computed for evaluation. After standardizing these indices based on those of normal subjects, they are input to LLGMNs to assess motor function. Then, motor ability is probabilistically discriminated to determine whether it is normal or not using a classifier combined with the output of multiple LLGMNs based on bagging and entropy. This paper reports on evaluation and discrimination experiments performed on finger tapping movements in 33 Parkinson’s disease (PD) patients and 32 normal elderly subjects. The results showed that the patients could be classified correctly in terms of their impairment status with a high degree of accuracy (average rate: 93.1 ± 3.69%) using 12 LLGMNs, which was about 5% higher than the results obtained using a single LLGMN.
Directory of Open Access Journals (Sweden)
GHAREHPETIAN, G. B.
2009-06-01
Full Text Available The analysis of the risk of partial and total blackouts has a crucial role to determine safe limits in power system design, operation and upgrade. Due to huge cost of blackouts, it is very important to improve risk assessment methods. In this paper, Monte Carlo simulation (MCS) was used to analyze the risk and Gaussian Mixture Method (GMM) has been used to estimate the probability density function (PDF) of the load curtailment, in order to improve the power system risk assessment method. In this improved method, PDF and a suggested index have been used to analyze the risk of loss of load. The effect of considering the number of generation units of power plants in the risk analysis has been studied too. The improved risk assessment method has been applied to IEEE 118 bus and the network of Khorasan Regional Electric Company (KREC) and the PDF of the load curtailment has been determined for both systems. The effect of various network loadings, transmission unavailability, transmission capacity and generation unavailability conditions on blackout risk has been investigated too.
Self-similar Gaussian processes for modeling anomalous diffusion
Lim, S. C.; Muniandy, S. V.
2002-08-01
We study some Gaussian models for anomalous diffusion, which include the time-rescaled Brownian motion, two types of fractional Brownian motion, and models associated with fractional Brownian motion based on the generalized Langevin equation. Gaussian processes associated with these models satisfy the anomalous diffusion relation which requires the mean-square displacement to vary with t^α, 0 < α < 2. [...] Brownian motion and time-rescaled Brownian motion all have the same probability distribution function, the Slepian theorem can be used to compare their first passage time distributions, which are different. Finally, in order to model anomalous diffusion with a variable exponent α(t) it is necessary to consider the multifractional extensions of these Gaussian processes.
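The anomalous diffusion relation says the mean-square displacement grows as a power of time, so the exponent α is the slope of the displacement curve on log-log axes. The sketch below recovers α by ordinary least squares on synthetic, noise-free data; the prefactor, the exponent and the time grid are hypothetical.

```python
import math

def fit_exponent(times, msd):
    """Least-squares slope of log(MSD) against log(t), i.e. the diffusion exponent alpha."""
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

times = [1.0, 2.0, 4.0, 8.0, 16.0]
msd = [2.0 * t ** 0.6 for t in times]   # synthetic subdiffusive data with alpha = 0.6
alpha = fit_exponent(times, msd)
print(alpha)
```

A fitted value below 1 indicates subdiffusion and a value above 1 superdiffusion. As the abstract notes, processes with the same exponent can still differ in other properties such as first passage times.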
A note on moving average models for Gaussian random fields
DEFF Research Database (Denmark)
Hansen, Linda Vadgård; Thorarinsdottir, Thordis L.
basis, a general modeling framework which includes several types of non-Gaussian models. We propose a new one-parameter spatial correlation model which arises from a power kernel and show that the associated Hausdorff dimension of the sample paths can take any value between 2 and 3. As a result...
Perturbative corrections for approximate inference in gaussian latent variable models
DEFF Research Database (Denmark)
Opper, Manfred; Paquet, Ulrich; Winther, Ole
2013-01-01
Expectation Propagation (EP) provides a framework for approximate inference. When the model under consideration is over a latent Gaussian field, with the approximation being Gaussian, we show how these approximations can systematically be corrected. A perturbative expansion is made of the exact b...... illustrate on tree-structured Ising model approximations. Furthermore, they provide a polynomial-time assessment of the approximation error. We also provide both theoretical and practical insights on the exactness of the EP solution. © 2013 Manfred Opper, Ulrich Paquet and Ole Winther....
Modelling and control of dynamic systems using gaussian process models
Kocijan, Juš
2016-01-01
This monograph opens up new horizons for engineers and researchers in academia and in industry dealing with or interested in new developments in the field of system identification and control. It emphasizes guidelines for working solutions and practical advice for their implementation rather than the theoretical background of Gaussian process (GP) models. The book demonstrates the potential of this recent development in probabilistic machine-learning methods and gives the reader an intuitive understanding of the topic. The current state of the art is treated along with possible future directions for research. Systems control design relies on mathematical models and these may be developed from measurement data. This process of system identification, when based on GP models, can play an integral part of control design in data-based control and its description as such is an essential aspect of the text. The background of GP regression is introduced first with system identification and incorporation of prior know...
Inverse Gaussian model for small area estimation via Gibbs sampling
African Journals Online (AJOL)
ADMIN
(1994) extended the work by Fries and. Bhattacharyya (1983) to include the maximum likelihood analysis of the two-factor inverse. Gaussian model for the unbalanced and interaction case for the estimation of small area parameters in finite populations. The object of this article is to develop a Bayesian approach for small ...
Inverse Gaussian model for small area estimation via Gibbs sampling
African Journals Online (AJOL)
We present a Bayesian method for estimating small area parameters under an inverse Gaussian model. The method is extended to estimate small area parameters for finite populations. The Gibbs sampler is proposed as a mechanism for implementing the Bayesian paradigm. We illustrate the method by application to ...
Combinatorial bounds on the α-divergence of univariate mixture models
Nielsen, Frank
2017-06-20
We derive lower- and upper-bounds of α-divergence between univariate mixture models with components in the exponential family. Three pairs of bounds are presented in order with increasing quality and increasing computational cost. They are verified empirically through simulated Gaussian mixture models. The presented methodology generalizes to other divergence families relying on Hellinger-type integrals.
A mixture copula Bayesian network model for multimodal genomic data
Directory of Open Access Journals (Sweden)
Qingyang Zhang
2017-04-01
Full Text Available Gaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes the decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making it unsuitable for dealing with recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine expectation–maximization algorithm. A heuristic search algorithm based on Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by the best-scoring network out of multiple predictions from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling data set. We apply the proposed methods to the Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.
An approximate fractional Gaussian noise model with computational cost
Sørbye, Sigrunn H.
2017-09-18
Fractional Gaussian noise (fGn) is a stationary time series model with long memory properties applied in various fields like econometrics, hydrology and climatology. The computational cost in fitting an fGn model of length $n$ using a likelihood-based approach is ${\mathcal O}(n^{2})$, exploiting the Toeplitz structure of the covariance matrix. In most realistic cases, we do not observe the fGn process directly but only through indirect Gaussian observations, so the Toeplitz structure is easily lost and the computational cost increases to ${\mathcal O}(n^{3})$. This paper presents an approximate fGn model of ${\mathcal O}(n)$ computational cost, both with direct or indirect Gaussian observations, with or without conditioning. This is achieved by approximating fGn with a weighted sum of independent first-order autoregressive processes, fitting the parameters of the approximation to match the autocorrelation function of the fGn model. The resulting approximation is stationary despite being Markov and gives a remarkably accurate fit using only four components. The performance of the approximate fGn model is demonstrated in simulations and two real data examples.
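The idea of matching autocorrelation functions can be sketched as follows. The fGn autocorrelation at lag k is 0.5(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}), and a weighted sum of independent unit-variance AR(1) processes has autocorrelation equal to the weighted sum of phi_j^k. As a simplification that is not the paper's procedure, the example fixes four AR(1) coefficients on an assumed grid and fits only the weights by least squares; the Hurst value, the grid and the lag range are all illustrative choices.

```python
def fgn_acf(lag, hurst):
    """Autocorrelation of fractional Gaussian noise at an integer lag."""
    k = abs(lag)
    p = 2.0 * hurst
    return 0.5 * ((k + 1) ** p - 2.0 * k ** p + abs(k - 1) ** p)


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


HURST = 0.8                       # assumed Hurst exponent
PHIS = [0.5, 0.9, 0.99, 0.999]    # assumed, fixed AR(1) coefficients
LAGS = range(1, 1001)             # lags used for the fit

target = [fgn_acf(k, HURST) for k in LAGS]
basis = [[phi ** k for k in LAGS] for phi in PHIS]

# Normal equations of the least-squares problem: (B B^T) w = B target.
gram = [[sum(a * b for a, b in zip(bi, bj)) for bj in basis] for bi in basis]
rhs = [sum(a * t for a, t in zip(bi, target)) for bi in basis]
weights = solve(gram, rhs)


def mixture_acf(lag):
    """Autocorrelation of the weighted AR(1) sum at an integer lag."""
    return sum(w * phi ** abs(lag) for w, phi in zip(weights, PHIS))


sse = sum((mixture_acf(k) - t) ** 2 for k, t in zip(LAGS, target))
baseline = sum(t * t for t in target)
print("weights:", weights)
print("squared error, fitted vs. zero model:", sse, baseline)
```

This unconstrained fit does not force the weights to be non-negative or to sum to one, which a proper variance decomposition would require; it only illustrates why a handful of geometric decays can track a slowly decaying power-law autocorrelation over a finite lag range.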
Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies.
Savitsky, Terrance; Vannucci, Marina; Sha, Naijun
2011-02-01
This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets with predictors that have an a priori unknown form of possibly nonlinear associations to the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models where the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses. We also look into models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process prior and demonstrate the flexibility of the construction. Next, we focus on the important problem of selecting variables from the set of possible predictors and describe a general framework that employs mixture priors. We compare alternative MCMC strategies for posterior inference and achieve a computationally efficient and practical approach. We demonstrate performances on simulated and benchmark data sets.
Directory of Open Access Journals (Sweden)
Douglas Scott C
2007-01-01
Full Text Available We derive new fixed-point algorithms for the blind separation of complex-valued mixtures of independent, noncircularly symmetric, and non-Gaussian source signals. Leveraging recently developed results on the separability of complex-valued signal mixtures, we systematically construct iterative procedures on a kurtosis-based contrast whose evolutionary characteristics are identical to those of the FastICA algorithm of Hyvarinen and Oja in the real-valued mixture case. Thus, our methods inherit the fast convergence properties, computational simplicity, and ease of use of the FastICA algorithm while at the same time extending this class of techniques to complex signal mixtures. For extracting multiple sources, symmetric and asymmetric signal deflation procedures can be employed. Simulations for both noiseless and noisy mixtures indicate that the proposed algorithms have superior finite-sample performance in data-starved scenarios as compared to existing complex ICA methods while performing about as well as the best of these techniques for larger data-record lengths.
Prediction of Geological Subsurfaces Based on Gaussian Random Field Models
Energy Technology Data Exchange (ETDEWEB)
Abrahamsen, Petter
1997-12-31
During the sixties, random functions became practical tools for predicting ore reserves with associated precision measures in the mining industry. This was the start of the geostatistical methods called kriging. These methods are used, for example, in petroleum exploration. This thesis reviews the possibilities for using Gaussian random functions in modelling of geological subsurfaces. It develops methods for including many sources of information and observations for precise prediction of the depth of geological subsurfaces. The simple properties of Gaussian distributions make it possible to calculate optimal predictors in the mean square sense. This is done in a discussion of kriging predictors. These predictors are then extended to deal with several subsurfaces simultaneously. It is shown how additional velocity observations can be used to improve predictions. The use of gradient data and even higher order derivatives are also considered and gradient data are used in an example. 130 refs., 44 figs., 12 tabs.
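To make the kriging predictor concrete, here is a minimal sketch of simple kriging in one dimension: the prediction is the known mean plus a weighted sum of the residuals at the observed locations, with weights obtained by solving the covariance system. The exponential covariance, its parameters, the depth values and the function names are assumptions for this example, not taken from the thesis.

```python
import math


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def covariance(h, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance as a function of separation h."""
    return sill * math.exp(-abs(h) / corr_range)


def simple_kriging(xs, zs, x0, mean):
    """Return the simple-kriging prediction and variance at location x0."""
    cov = [[covariance(a - b) for b in xs] for a in xs]
    c0 = [covariance(a - x0) for a in xs]
    w = solve(cov, c0)                       # kriging weights
    pred = mean + sum(wi * (zi - mean) for wi, zi in zip(w, zs))
    var = covariance(0.0) - sum(wi * ci for wi, ci in zip(w, c0))
    return pred, var


# Hypothetical well positions (km) and observed subsurface depths (m).
xs = [0.0, 1.0, 3.0]
zs = [1500.0, 1520.0, 1490.0]
MEAN_DEPTH = 1500.0

pred_mid, var_mid = simple_kriging(xs, zs, 2.0, MEAN_DEPTH)
pred_obs, var_obs = simple_kriging(xs, zs, 1.0, MEAN_DEPTH)
pred_far, var_far = simple_kriging(xs, zs, 1000.0, MEAN_DEPTH)
print("between wells:", pred_mid, var_mid)
print("at a well:", pred_obs, var_obs)
print("far away:", pred_far, var_far)
```

At an observed location the predictor reproduces the observation with zero variance, and far from all data it falls back to the mean with variance equal to the sill, which is the behaviour expected of the mean-square optimal predictor under a Gaussian model.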
A model of non-Gaussian diffusion in heterogeneous media
Lanoiselée, Yann; Grebenkov, Denis S.
2018-04-01
Recent progress in single-particle tracking has shown evidence of the non-Gaussian distribution of displacements in living cells, both near the cellular membrane and inside the cytoskeleton. Similar behavior has also been observed in granular materials, turbulent flows, gels and colloidal suspensions, suggesting that this is a general feature of diffusion in complex media. A possible interpretation of this phenomenon is that a tracer explores a medium with spatio-temporal fluctuations which result in local changes of diffusivity. We propose and investigate an ergodic, easily interpretable model, which implements the concept of diffusing diffusivity. Depending on the parameters, the distribution of displacements can be either flat or peaked at small displacements with an exponential tail at large displacements. We show that the distribution converges slowly to a Gaussian one. We calculate statistical properties, derive the asymptotic behavior and discuss some implications and extensions.
Supervised Gaussian process latent variable model for dimensionality reduction.
Gao, Xinbo; Wang, Xiumei; Tao, Dacheng; Li, Xuelong
2011-04-01
The Gaussian process latent variable model (GP-LVM) has been identified to be an effective probabilistic approach for dimensionality reduction because it can obtain a low-dimensional manifold of a data set in an unsupervised fashion. Consequently, the GP-LVM is insufficient for supervised learning tasks (e.g., classification and regression) because it ignores the class label information for dimensionality reduction. In this paper, a supervised GP-LVM is developed for supervised learning tasks, and the maximum a posteriori algorithm is introduced to estimate positions of all samples in the latent variable space. We present experimental evidence suggesting that the supervised GP-LVM is able to use the class label information effectively, and thus, it outperforms the GP-LVM and the discriminative extension of the GP-LVM consistently. The comparison with some supervised classification methods, such as Gaussian process classification and support vector machines, is also given to illustrate the advantage of the proposed method.
Modelling of an homogeneous equilibrium mixture model
International Nuclear Information System (INIS)
Bernard-Champmartin, A.; Poujade, O.; Mathiaud, J.; Mathiaud, J.; Ghidaglia, J.M.
2014-01-01
We present here a model for two-phase flows which is simpler than the 6-equation models (with two densities, two velocities, two temperatures) but more accurate than the standard mixture models with 4 equations (with two densities, one velocity and one temperature). We are interested in the case when the two phases have been interacting long enough for the drag force to be small but still not negligible. The so-called Homogeneous Equilibrium Mixture Model (HEM) that we present deals with both mixture and relative quantities, making it possible in particular to follow both a mixture velocity and a relative velocity. This relative velocity is not tracked by a conservation law but by a closure law (drift relation), whose expression is related to the drag force terms of the two-phase flow. After the derivation of the model, a stability analysis and numerical experiments are presented. (authors)
Evaluation of Gaussian approximations for data assimilation in reservoir models
Iglesias, Marco A.
2013-07-14
The Bayesian framework is the standard approach for data assimilation in reservoir modeling. This framework involves characterizing the posterior distribution of geological parameters in terms of a given prior distribution and data from the reservoir dynamics, together with a forward model connecting the space of geological parameters to the data space. Since the posterior distribution quantifies the uncertainty in the geologic parameters of the reservoir, the characterization of the posterior is fundamental for the optimal management of reservoirs. Unfortunately, due to the large-scale highly nonlinear properties of standard reservoir models, characterizing the posterior is computationally prohibitive. Instead, more affordable ad hoc techniques, based on Gaussian approximations, are often used for characterizing the posterior distribution. Evaluating the performance of those Gaussian approximations is typically conducted by assessing their ability to reproduce the truth within the confidence interval provided by the ad hoc technique under consideration. This has the disadvantage of mixing up the approximation properties of the history matching algorithm employed with the information content of the particular observations used, making it hard to evaluate the effect of the ad hoc approximations alone. In this paper, we avoid this disadvantage by comparing the ad hoc techniques with a fully resolved state-of-the-art probing of the Bayesian posterior distribution. The ad hoc techniques whose performance we assess are based on (1) linearization around the maximum a posteriori estimate, (2) randomized maximum likelihood, and (3) ensemble Kalman filter-type methods. In order to fully resolve the posterior distribution, we implement a state-of-the-art Markov chain Monte Carlo (MCMC) method that scales well with respect to the dimension of the parameter space, enabling us to study realistic forward models, in two space dimensions, at a high level of grid refinement. Our
Graphical Gaussian models with edge and vertex symmetries
DEFF Research Database (Denmark)
Højsgaard, Søren; Lauritzen, Steffen L
2008-01-01
We introduce new types of graphical Gaussian models by placing symmetry restrictions on the concentration or correlation matrix. The models can be represented by coloured graphs, where parameters that are associated with edges or vertices of the same colour are restricted to being identical. We...... study the properties of such models and derive the necessary algorithms for calculating maximum likelihood estimates. We identify conditions for restrictions on the concentration and correlation matrices being equivalent. This is for example the case when symmetries are generated by permutation...
Perturbative corrections for approximate inference in gaussian latent variable models
DEFF Research Database (Denmark)
Opper, Manfred; Paquet, Ulrich; Winther, Ole
2013-01-01
orders, corrections of increasing polynomial complexity can be applied to the approximation. The second order provides a correction in quadratic time, which we apply to an array of Gaussian process and Ising models. The corrections generalize to arbitrarily complex approximating families, which we...... illustrate on tree-structured Ising model approximations. Furthermore, they provide a polynomial-time assessment of the approximation error. We also provide both theoretical and practical insights on the exactness of the EP solution. © 2013 Manfred Opper, Ulrich Paquet and Ole Winther....
Case studies in Gaussian process modelling of computer codes
International Nuclear Information System (INIS)
Kennedy, Marc C.; Anderson, Clive W.; Conti, Stefano; O'Hagan, Anthony
2006-01-01
In this paper we present a number of recent applications in which an emulator of a computer code is created using a Gaussian process model. Tools are then applied to the emulator to perform sensitivity analysis and uncertainty analysis. Sensitivity analysis is used both as an aid to model improvement and as a guide to how much the output uncertainty might be reduced by learning about specific inputs. Uncertainty analysis allows us to reflect output uncertainty due to unknown input parameters, when the finished code is used for prediction. The computer codes themselves are currently being developed within the UK Centre for Terrestrial Carbon Dynamics
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Fractional Gaussian noise: Prior specification and model comparison
Sørbye, Sigrunn Holbek
2017-07-07
Fractional Gaussian noise (fGn) is a stationary stochastic process used to model antipersistent or persistent dependency structures in observed time series. Properties of the autocovariance function of fGn are characterised by the Hurst exponent (H), which, in Bayesian contexts, typically has been assigned a uniform prior on the unit interval. This paper argues why a uniform prior is unreasonable and introduces the use of a penalised complexity (PC) prior for H. The PC prior is computed to penalise divergence from the special case of white noise and is invariant to reparameterisations. An immediate advantage is that the exact same prior can be used for the autocorrelation coefficient ϕ of a first-order autoregressive process AR(1), as this model also reflects a flexible version of white noise. Within the general setting of latent Gaussian models, this allows us to compare an fGn model component with AR(1) using Bayes factors, avoiding the confounding effects of prior choices for the two hyperparameters H and ϕ. Among others, this is useful in climate regression models where inference for underlying linear or smooth trends depends heavily on the assumed noise model.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Performance of monitoring networks estimated from a Gaussian plume model
International Nuclear Information System (INIS)
Seebregts, A.J.; Hienen, J.F.A.
1990-10-01
In support of the ECN study on monitoring strategies after nuclear accidents, the present report describes the analysis of the performance of a monitoring network in a square grid. This network is used to estimate the distribution of the deposition pattern after a release of radioactivity into the atmosphere. The analysis is based upon a single release, a constant wind direction and an atmospheric dispersion according to a simplified Gaussian plume model. A technique is introduced to estimate the parameters in this Gaussian model based upon measurements at specific monitoring locations and linear regression, although this model is intrinsically non-linear. With these estimated parameters and the Gaussian model the distribution of the contamination due to deposition can be estimated. To investigate the relation between the network and the accuracy of the estimates for the deposition, deposition data have been generated by the Gaussian model, including a measurement error by a Monte Carlo simulation, and this procedure has been repeated for several grid sizes, dispersion conditions, numbers of measurements per location, and errors per single measurement. The present technique has also been applied for the mesh sizes of two networks in the Netherlands, viz. the Landelijk Meetnet Radioactiviteit (National Measurement Network on Radioactivity, mesh size approx. 35 km) and the proposed Landelijk Meetnet Nucleaire Incidenten (National Measurement Network on Nuclear Incidents, mesh size approx. 15 km). The results show accuracies of 11 and 7 percent, respectively, if monitoring locations are used more than 10 km away from the postulated accident site. These figures are based upon 3 measurements per location and a dispersion during neutral weather with a wind velocity of 4 m/s. For stable weather conditions and low wind velocities, i.e. a small plume, the calculated accuracies are at least a factor 1.5 worse. The present type of analysis makes a cost-benefit approach to the
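One way a non-linear Gaussian model can still be fitted by linear regression is to take logarithms: for a crosswind profile C(y) = A exp(-y^2 / (2 sigma^2)), ln C is linear in y^2. The sketch below uses that linearisation on noise-free synthetic data. The report's exact plume formulation and regression variables are not given in the abstract, so the profile, the parameter values and the function names here are assumptions.

```python
import math


def plume_profile(y, amplitude, sigma):
    """Assumed crosswind deposition profile at a fixed downwind distance."""
    return amplitude * math.exp(-y * y / (2.0 * sigma * sigma))


def fit_log_linear(ys, depositions):
    """Estimate amplitude and sigma by ordinary least squares of ln C on y^2."""
    u = [y * y for y in ys]
    v = [math.log(c) for c in depositions]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    slope = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / sum(
        (a - mean_u) ** 2 for a in u
    )
    intercept = mean_v - slope * mean_u
    # ln C = ln A - y^2 / (2 sigma^2), so slope = -1 / (2 sigma^2).
    amplitude = math.exp(intercept)
    sigma = math.sqrt(-1.0 / (2.0 * slope))
    return amplitude, sigma


# Hypothetical monitoring locations on a 15 km mesh, crosswind offsets in km.
ys = [-30.0, -15.0, 0.0, 15.0, 30.0]
TRUE_AMPLITUDE = 100.0
TRUE_SIGMA = 12.0
depositions = [plume_profile(y, TRUE_AMPLITUDE, TRUE_SIGMA) for y in ys]

est_amplitude, est_sigma = fit_log_linear(ys, depositions)
print("estimated amplitude:", est_amplitude)
print("estimated sigma (km):", est_sigma)
```

With measurement error added to `depositions`, repeating the fit over many random draws would give the kind of Monte Carlo accuracy estimate the report describes; note that the log transform changes the error structure, so noisy low readings far from the plume axis get disproportionate influence.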
Missing data reconstruction using Gaussian mixture models for fingerprint images
Agaian, Sos S.; Yeole, Rushikesh D.; Rao, Shishir P.; Mulawka, Marzena; Troy, Mike; Reinecke, Gary
2016-05-01
Publisher's Note: This paper, originally published on 25 May 2016, was replaced with a revised version on 16 June 2016. One of the most important areas in biometrics is matching partial fingerprints in fingerprint databases. Recently, significant progress has been made in designing fingerprint identification systems for missing fingerprint information. However, a dependable reconstruction of fingerprint images still remains challenging due to the complexity and the ill-posed nature of the problem. In this article, both binary and gray-level images are reconstructed. This paper also presents a new similarity score to evaluate the performance of the reconstructed binary image. The offered fingerprint image identification system can be automated and extended to numerous other security applications such as postmortem fingerprints, forensic science, investigations, artificial intelligence, robotics, all-access control, and financial security, as well as for the verification of firearm purchasers, driver license applicants, etc.
Supervised Gaussian mixture model based remote sensing image ...
African Journals Online (AJOL)
The modules of the aforementioned image processing software are based on conventional multi-class classifiers/algorithms such as the maximum likelihood classifier. These conventional multi-class classifiers/algorithms are usually written in programming languages such as C, C++, and Python. The objective of this research ...
Consistency of the MLE under mixture models
Chen, Jiahua
2016-01-01
The large-sample properties of likelihood-based statistical inference under mixture models have received much attention from statisticians. Although the consistency of the nonparametric MLE is regarded as a standard conclusion, many researchers ignore the precise conditions required on the mixture model. An incorrect claim of consistency can lead to false conclusions even if the mixture model under investigation seems well behaved. Under a finite normal mixture model, for instance, the consis...
Fault Tolerant Control Using Gaussian Processes and Model Predictive Control
Directory of Open Access Journals (Sweden)
Yang Xiaoke
2015-03-01
Full Text Available Essential ingredients for fault-tolerant control are the ability to represent system behaviour following the occurrence of a fault, and the ability to exploit this representation for deciding control actions. Gaussian processes seem to be very promising candidates for the first of these, and model predictive control has a proven capability for the second. We therefore propose to use the two together to obtain fault-tolerant control functionality. Our proposal is illustrated by several reasonably realistic examples drawn from flight control.
Gaussian free turbulence: structures and relaxation in plasma models
International Nuclear Information System (INIS)
Gruzinov, A.V.
1993-01-01
Free-turbulent relaxation in two-dimensional MHD, the degenerate Hasegawa-Mima equation and a two-dimensional microtearing model are studied. The Gibbs distributions of these three systems can be completely analyzed, due to the special structure of their invariants and due to the existence of ultraviolet catastrophe. The free-turbulent field is seen to be a sum of a certain coherent structure (statistical attractor) and Gaussian random noise. Two-dimensional current layers are shown to be statistical attractors in 2D MHD. (author)
Directory of Open Access Journals (Sweden)
Zhang Zhi
2015-12-01
Full Text Available Since low energy consumption and a limited power supply are very important features of wireless sensor networks (WSNs), the problems of distributed state estimation with quantized innovations are investigated in this paper. First, the assumptions made in previous papers about the prior and posterior probability density function (PDF) with quantized innovations are analyzed. After that, an innovative Gaussian mixture estimator is proposed. On this basis, this paper presents a Gaussian mixture state estimation algorithm based on quantized innovations for WSNs. In order to evaluate and compare the performance of this kind of state estimation algorithm for WSNs, the posterior Cramér–Rao lower bound (CRLB) with quantized innovations is put forward. Performance analysis and simulations show that the proposed Gaussian mixture state estimation algorithm is more efficient than the others under the same number of quantization levels, and that the performance of these algorithms can be benchmarked by the theoretical lower bound.
Gaussian random bridges and a geometric model for information equilibrium
Mengütürk, Levent Ali
2018-03-01
The paper introduces a class of conditioned stochastic processes that we call Gaussian random bridges (GRBs) and proves some of their properties. Due to the anticipative representation of any GRB as the sum of a random variable and a Gaussian (T, 0)-bridge, GRBs can model noisy information processes in partially observed systems. In this spirit, we propose an asset pricing model with respect to what we call information equilibrium in a market with multiple sources of information. The idea is to work on a topological manifold endowed with a metric that enables us to systematically determine an equilibrium point of a stochastic system that can be represented by multiple points on that manifold at each fixed time. In doing so, we formulate GRB-based information diversity over a Riemannian manifold and show that it is pinned to zero over the boundary determined by Dirac measures. We then define an influence factor that controls the dominance of an information source in determining the best estimate of a signal in the $L^{2}$-sense. When there are two sources, this allows us to construct information equilibrium as a functional of a geodesic-valued stochastic process, which is driven by an equilibrium convergence rate representing the signal-to-noise ratio. This leads us to derive price dynamics under what can be considered as an equilibrium probability measure. We also provide a semimartingale representation of Markovian GRBs associated with Gaussian martingales and a non-anticipative representation of fractional Brownian random bridges that can incorporate degrees of information coupling in a given system via the Hurst exponent.
Out-of-equilibrium dynamics in a Gaussian trap model
International Nuclear Information System (INIS)
Diezemann, Gregor
2007-01-01
The violations of the fluctuation-dissipation theorem are analysed for a trap model with a Gaussian density of states. In this model, the system reaches thermal equilibrium for long times after a quench to any finite temperature and therefore all ageing effects are of a transient nature. For not too long times after the quench it is found that the so-called fluctuation-dissipation ratio tends to a non-trivial limit, thus indicating the possibility for the definition of a timescale-dependent effective temperature. However, different definitions of the effective temperature yield distinct results. In particular, plots of the integrated response versus the correlation function strongly depend on the way they are constructed. Also the definition of effective temperatures in the frequency domain is not unique for the model considered. This may have some implications for the interpretation of results from computer simulations and experimental determinations of effective temperatures
A Gaussian graphical model approach to climate networks
Energy Technology Data Exchange (ETDEWEB)
Zerenner, Tanja, E-mail: tanjaz@uni-bonn.de [Meteorological Institute, University of Bonn, Auf dem Hügel 20, 53121 Bonn (Germany); Friederichs, Petra; Hense, Andreas [Meteorological Institute, University of Bonn, Auf dem Hügel 20, 53121 Bonn (Germany); Interdisciplinary Center for Complex Systems, University of Bonn, Brühler Straße 7, 53119 Bonn (Germany); Lehnertz, Klaus [Department of Epileptology, University of Bonn, Sigmund-Freud-Straße 25, 53105 Bonn (Germany); Helmholtz Institute for Radiation and Nuclear Physics, University of Bonn, Nussallee 14-16, 53115 Bonn (Germany); Interdisciplinary Center for Complex Systems, University of Bonn, Brühler Straße 7, 53119 Bonn (Germany)
2014-06-15
Distinguishing between direct and indirect connections is essential when interpreting network structures in terms of dynamical interactions and stability. When constructing networks from climate data the nodes are usually defined on a spatial grid. The edges are usually derived from a bivariate dependency measure, such as Pearson correlation coefficients or mutual information. Thus, the edges indistinguishably represent direct and indirect dependencies. Interpreting climate data fields as realizations of Gaussian Random Fields (GRFs), we have constructed networks according to the Gaussian Graphical Model (GGM) approach. In contrast to the widely used method, the edges of GGM networks are based on partial correlations denoting direct dependencies. Furthermore, GRFs can be represented not only on points in space, but also by expansion coefficients of orthogonal basis functions, such as spherical harmonics. This leads to a modified definition of network nodes and edges in spectral space, which is motivated from an atmospheric dynamics perspective. We construct and analyze networks from climate data in grid point space as well as in spectral space, and derive the edges from both Pearson and partial correlations. Network characteristics, such as mean degree, average shortest path length, and clustering coefficient, reveal that the networks possess an ordered and strongly locally interconnected structure rather than small-world properties. Despite this, the network structures differ strongly depending on the construction method. Straightforward approaches to infer networks from climate data while not regarding any physical processes may contain too strong simplifications to describe the dynamics of the climate system appropriately.
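The difference between Pearson and partial correlations can be seen in a three-node chain. For Gaussian variables, the partial correlation between nodes i and j given all others is -P_ij / sqrt(P_ii P_jj), where P is the inverse of the correlation matrix. The correlation matrix below is an assumed toy example (a chain with neighbour correlation 0.8), not climate data, and the helper names are hypothetical.

```python
import math


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def invert(matrix):
    """Invert a small matrix by solving against each unit vector."""
    n = len(matrix)
    columns = [
        solve(matrix, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)
    ]
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def partial_correlations(corr):
    """Partial correlation of each pair given all remaining variables."""
    prec = invert(corr)
    n = len(corr)
    return [
        [
            1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Chain A - B - C: A and C are correlated only through B.
R = 0.8
pearson = [
    [1.0, R, R * R],
    [R, 1.0, R],
    [R * R, R, 1.0],
]
partial = partial_correlations(pearson)
print("Pearson A-C:", pearson[0][2])
print("partial A-C:", partial[0][2])
print("partial A-B:", partial[0][1])
```

A Pearson-based network would draw an edge between A and C (correlation 0.64), whereas the partial correlation is zero, so a GGM-based network keeps only the direct links A-B and B-C.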
A Gaussian graphical model approach to climate networks
International Nuclear Information System (INIS)
Zerenner, Tanja; Friederichs, Petra; Hense, Andreas; Lehnertz, Klaus
2014-01-01
Fast uncertainty reduction strategies relying on Gaussian process models
International Nuclear Information System (INIS)
Chevalier, Clement
2013-01-01
This work deals with sequential and batch-sequential evaluation strategies of real-valued functions under limited evaluation budget, using Gaussian process models. Optimal Stepwise Uncertainty Reduction (SUR) strategies are investigated for two different problems, motivated by real test cases in nuclear safety. First we consider the problem of identifying the excursion set above a given threshold T of a real-valued function f. Then we study the question of finding the set of 'safe controlled configurations', i.e. the set of controlled inputs where the function remains below T, whatever the value of some other non-controlled inputs. New SUR strategies are presented, together with efficient procedures and formulas to compute and use them in real world applications. The use of fast formulas to quickly recalculate the posterior mean or covariance function of a Gaussian process (referred to as the 'kriging update formulas') not only provides substantial computational savings; it is also one of the key tools to derive closed form formulas enabling a practical use of computationally-intensive sampling strategies. A contribution in batch-sequential optimization (with the multi-points Expected Improvement) is also presented. (author)
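As a rough illustration of the ingredients mentioned in this abstract, the following sketch computes a Gaussian process posterior mean and variance and, from them, the pointwise probability that f exceeds a threshold T. It is not the SUR criterion or the kriging update formulas of the thesis; the squared-exponential kernel, its length scale, the toy function, and all function names are assumptions made for this example.

```python
import math
import numpy as np

def sq_exp_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, jitter=1e-8):
    """Posterior mean and pointwise variance of a zero-mean GP with noise-free data."""
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = sq_exp_kernel(x_test, x_test).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

def excursion_probability(mean, var, threshold):
    """Pointwise P(f(x) > threshold) under the Gaussian posterior."""
    std = np.sqrt(np.maximum(var, 1e-12))
    zscore = (mean - threshold) / std
    return 0.5 * (1.0 + np.vectorize(math.erf)(zscore / math.sqrt(2.0)))

# Hypothetical test function evaluated at five points.
x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = np.sin(2 * np.pi * x_train)
x_test = np.linspace(0.0, 1.0, 101)

mean, var = gp_posterior(x_train, y_train, x_test)
p_exceed = excursion_probability(mean, var, threshold=0.5)
```

Points where `p_exceed` is close to 0 or 1 are already classified with confidence; a sequential strategy would, loosely speaking, spend further evaluations where the excursion set is still uncertain.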
Dirichlet Process Parsimonious Mixtures for clustering
Chamroukhi, Faicel; Bartcus, Marius; Glotin, Hervé
2015-01-01
The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixtur...
Characterisation of non-Gaussian fluctuations in multiplicative log-normal models
Kiyono, Ken; Struzik, Zbigniew R.; Yamamoto, Yoshiharu
2007-07-01
Within the general framework of multiplicative log-normal models, we propose methods to characterise non-Gaussian and intermittent fluctuations, and study basic characteristics of non-Gaussian stochastic processes displaying slow convergence to a Gaussian with an increasing coarse-grained level of the time series. Here the multiplicative log-normal model stands for a stochastic process described by the multiplication of Gaussian and log-normally distributed variables. In other words, using two Gaussian variables, $\xi$ and $\omega$, the time series $\{x_i\}$ of this process can be described as $x_i = \xi_i \exp(\omega_i)$. Depending on the variance of $\omega$, $\lambda^2$, the probability density function (PDF) of $x$ exhibits a non-Gaussian shape. As the non-Gaussianity parameter $\lambda^2$ increases, the non-Gaussian tails become fatter. On the other hand, when $\lambda^2 \to 0$, the PDF converges to a Gaussian distribution. For the purpose of estimating the non-Gaussianity parameter $\lambda^2$ from the observed time series, we evaluate a novel method based on analytical expressions of the absolute moments for the multiplicative log-normal models.
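The process described in this abstract can be simulated in a few lines. The sketch below is only illustrative and does not reproduce the authors' estimator: it assumes that the two Gaussian variables are independent, that the first has unit variance, and that the second has zero mean, and it uses the kurtosis as a simple indicator of tail weight. Under those assumptions the kurtosis of x equals 3·exp(4·variance of ω), which reduces to the Gaussian value 3 when that variance goes to zero.

```python
import numpy as np

def simulate_multiplicative_lognormal(lam2, n, rng):
    """Draw n samples of x = xi * exp(omega), xi ~ N(0, 1), omega ~ N(0, lam2)."""
    xi = rng.standard_normal(n)
    omega = np.sqrt(lam2) * rng.standard_normal(n)
    return xi * np.exp(omega)

def kurtosis(x):
    """Sample kurtosis (fourth central moment over squared variance)."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2

rng = np.random.default_rng(1)
results = {}
for lam2 in (0.0, 0.1, 0.2):
    x = simulate_multiplicative_lognormal(lam2, 200_000, rng)
    results[lam2] = kurtosis(x)
    # Compare the sample value with the value implied by the stated assumptions.
    print(lam2, round(results[lam2], 2), round(3 * np.exp(4 * lam2), 2))
```

The sample kurtosis grows with the variance of ω, consistent with the statement that the tails become fatter as the non-Gaussianity parameter increases.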
GaussianCpG: a Gaussian model for detection of CpG island in human genome sequences.
Yu, Ning; Guo, Xuan; Zelikovsky, Alexander; Pan, Yi
2017-05-24
As crucial markers in identifying biological elements and processes in mammalian genomes, CpG islands (CGI) play important roles in DNA methylation, gene regulation, epigenetic inheritance, gene mutation, chromosome inactivation and nucleosome retention. The generally accepted criteria of CGI rely on: (a) %G+C content is ≥ 50%, (b) the ratio of the observed CpG content and the expected CpG content is ≥ 0.6, and (c) the general length of CGI is greater than 200 nucleotides. Most existing computational methods for the prediction of CpG islands are programmed on these rules. However, many experimentally verified CpG islands deviate from these artificial criteria. Experiments indicate that in many cases %G+C is human genome. We analyze the energy distribution over genomic primary structure for each CpG site and adopt the parameters from statistics of the human genome. The evaluation results show that the new model can predict CpG islands efficiently by balancing both sensitivity and specificity over known human CGI data sets. Compared with other models, GaussianCpG can achieve better performance in CGI detection. Our Gaussian model aims to simplify the complex interaction between nucleotides. The model is computed not by the linear statistical method but by the Gaussian energy distribution and accumulation. The parameters of the Gaussian function are not arbitrarily designated but deliberately chosen by optimizing the biological statistics. By using the pseudopotential analysis on CpG islands, the novel model is validated on both the real and artificial data sets.
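The three conventional criteria listed in this abstract (not the GaussianCpG model itself) can be written as a short check. The observed/expected ratio used here, (number of CpG × length) / (number of C × number of G), is the commonly used definition and is an assumption, since the abstract does not spell it out; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides cannot overlap each other
    gc_content = (c + g) / n if n else 0.0
    # Assumed definition of the ratio: CpG count * length / (C count * G count).
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp

def meets_cgi_criteria(seq, min_gc=0.5, min_obs_exp=0.6, min_length=200):
    """Rule-based test: length > 200 nt, %G+C >= 50%, observed/expected CpG >= 0.6."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) > min_length and gc_content >= min_gc and obs_exp >= min_obs_exp

print(meets_cgi_criteria("CG" * 150))  # CpG-rich, 300 nt
print(meets_cgi_criteria("AT" * 150))  # no G or C at all
```

A fixed-threshold rule of this kind is what the abstract calls "artificial criteria"; the proposed model replaces it with a Gaussian energy accumulation whose details are not given here.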
Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis
Directory of Open Access Journals (Sweden)
Zhanyu Ma
2014-06-01
Full Text Available As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.
Overfitting Bayesian Mixture Models with an Unknown Number of Components.
Directory of Open Access Journals (Sweden)
Zoé van Havre
Full Text Available This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via a Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results will reflect uncertainty in the final model and will report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally light-weight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
Overfitting Bayesian Mixture Models with an Unknown Number of Components.
van Havre, Zoé; White, Nicole; Rousseau, Judith; Mengersen, Kerrie
2015-01-01
Model structure selection in convolutive mixtures
DEFF Research Database (Denmark)
Dyrholm, Mads; Makeig, Scott; Hansen, Lars Kai
2006-01-01
The CICAAR algorithm (convolutive independent component analysis with an auto-regressive inverse model) allows separation of white (i.i.d) source signals from convolutive mixtures. We introduce a source color model as a simple extension to the CICAAR which allows for a more parsimonious representation in many practical mixtures. The new filter-CICAAR allows Bayesian model selection and can help answer questions like: 'Are we actually dealing with a convolutive mixture?'. We try to answer this question for EEG data.
Vegetation Monitoring with Gaussian Processes and Latent Force Models
Camps-Valls, Gustau; Svendsen, Daniel; Martino, Luca; Campos, Manuel; Luengo, David
2017-04-01
Monitoring vegetation by biophysical parameter retrieval from Earth observation data is a challenging problem, where machine learning is currently a key player. Neural networks, kernel methods, and Gaussian Process (GP) regression have excelled in parameter retrieval tasks at both local and global scales. GP regression is based on solid Bayesian statistics, yields efficient and accurate parameter estimates, and provides interesting advantages over competing machine learning approaches, such as confidence intervals. However, GP models are hampered by a lack of interpretability, which has prevented their widespread adoption by a larger community. In this presentation we will summarize some of our latest developments to address this issue. We will review the main characteristics of GPs and their advantages in standard vegetation monitoring applications. Then, three advanced GP models will be introduced. First, we will derive sensitivity maps for the GP predictive function that allow us to obtain feature rankings from the model and to assess the influence of examples in the solution. Second, we will introduce a Joint GP (JGP) model that combines in situ measurements and simulated radiative transfer data in a single GP model. The JGP regression provides more sensible confidence intervals for the predictions, respects the physics of the underlying processes, and allows for transferability across time and space. Finally, a latent force model (LFM) for GP modeling that encodes ordinary differential equations to blend data-driven modeling and physical models of the system is presented. The LFM performs multi-output regression, adapts to the signal characteristics, is able to cope with missing data in the time series, and provides explicit latent functions that allow system analysis and evaluation. Empirical evidence of the performance of these models will be presented through illustrative examples.
Mixture Modeling: Applications in Educational Psychology
Harring, Jeffrey R.; Hodis, Flaviu A.
2016-01-01
Model-based clustering methods, commonly referred to as finite mixture modeling, have been applied to a wide variety of cross-sectional and longitudinal data to account for heterogeneity in population characteristics. In this article, we elucidate 2 such approaches: growth mixture modeling and latent profile analysis. Both techniques are…
Gaussian Process Domain Experts for Modeling of Facial Affect.
Eleftheriadis, Stefanos; Rudovic, Ognjen; Deisenroth, Marc Peter; Pantic, Maja
2017-10-01
Most existing models for facial behavior analysis rely on generic classifiers, which fail to generalize well to previously unseen data. This is because of inherent differences in source (training) and target (test) data, mainly caused by variation in subjects' facial morphology, camera views, and so on. All of these account for different contexts in which target and source data are recorded, and thus, may adversely affect the performance of the models learned solely from source data. In this paper, we exploit the notion of domain adaptation and propose a data efficient approach to adapt already learned classifiers to new unseen contexts. Specifically, we build upon the probabilistic framework of Gaussian processes (GPs), and introduce domain-specific GP experts (e.g., for each subject). The model adaptation is facilitated in a probabilistic fashion, by conditioning the target expert on the predictions from multiple source experts. We further exploit the predictive variance of each expert to define an optimal weighting during inference. We evaluate the proposed model on three publicly available data sets for multi-class (MultiPIE) and multi-label (DISFA, FERA2015) facial expression analysis by performing adaptation of two contextual factors: "where" (view) and "who" (subject). In our experiments, the proposed approach consistently outperforms: 1) both source and target classifiers, while using a small number of target examples during the adaptation and 2) related state-of-the-art approaches for supervised domain adaptation.
Sequencing batch-reactor control using Gaussian-process models.
Kocijan, Juš; Hvala, Nadja
2013-06-01
This paper presents a Gaussian-process (GP) model for the design of sequencing batch-reactor (SBR) control for wastewater treatment. The GP model is a probabilistic, nonparametric model with uncertainty predictions. In the case of SBR control, it is used for the on-line optimisation of the batch-phases duration. The control algorithm follows the course of the indirect process variables (pH, redox potential and dissolved oxygen concentration) and recognises the characteristic patterns in their time profile. The control algorithm uses GP-based regression to smooth the signals and GP-based classification for the pattern recognition. When tested on the signals from an SBR laboratory pilot plant, the control algorithm provided a satisfactory agreement between the proposed completion times and the actual termination times of the biodegradation processes. In a set of tested batches the final ammonia and nitrate concentrations were below 1 and 0.5 mg L⁻¹, respectively, while the aeration time was shortened considerably. Copyright © 2013 Elsevier Ltd. All rights reserved.
Gaussian copula as a likelihood function for environmental models
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data, which are typically more uncertain in high flows than in periods with low flows. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in the "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. 1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. Based on the results from a didactical example of predicting rainfall runoff, 2) we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an
Estimator of a non-Gaussian parameter in multiplicative log-normal models
Kiyono, Ken; Struzik, Zbigniew R.; Yamamoto, Yoshiharu
2007-10-01
We study non-Gaussian probability density functions (PDFs) of multiplicative log-normal models in which the multiplication of Gaussian and log-normally distributed random variables is considered. To describe the PDF of the velocity difference between two points in fully developed turbulent flows, the non-Gaussian PDF model was originally introduced by Castaing [Physica D 46, 177 (1990)]. In practical applications, an experimental PDF is approximated with Castaing’s model by tuning a single non-Gaussian parameter, which corresponds to the logarithmic variance of the log-normally distributed variable in the model. In this paper, we propose an estimator of the non-Gaussian parameter based on the qth-order absolute moments. To test the estimator, we introduce two types of stochastic processes within the framework of the multiplicative log-normal model. One is a sequence of independent and identically distributed random variables. The other is a log-normal cascade-type multiplicative process. By analyzing the numerically generated time series, we demonstrate that the estimator can reliably determine the theoretical value of the non-Gaussian parameter. Scale dependence of the non-Gaussian parameter in multiplicative log-normal models is also studied, both analytically and numerically. As an application of the estimator, we demonstrate that non-Gaussian PDFs observed in the S&P500 index fluctuations are well described by the multiplicative log-normal model.
Fast and Scalable Gaussian Process Modeling with Applications to Astronomical Time Series
Foreman-Mackey, Daniel; Agol, Eric; Ambikasaran, Sivaram; Angus, Ruth
2017-12-01
The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators, providing a physical motivation for and interpretation of this choice, but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
Gaussian graphical modeling reveals specific lipid correlations in glioblastoma cells
Mueller, Nikola S.; Krumsiek, Jan; Theis, Fabian J.; Böhm, Christian; Meyer-Bäse, Anke
2011-06-01
Advances in high-throughput measurements of biological specimens necessitate the development of biologically driven computational techniques. To understand the molecular level of many human diseases, such as cancer, lipid quantifications have been shown to offer an excellent opportunity to reveal disease-specific regulations. The data analysis of the cell lipidome, however, remains a challenging task and cannot be accomplished solely based on intuitive reasoning. We have developed a method to identify a lipid correlation network which is entirely disease-specific. A powerful method to correlate experimentally measured lipid levels across the various samples is a Gaussian Graphical Model (GGM), which is based on partial correlation coefficients. In contrast to regular Pearson correlations, partial correlations aim to identify only direct correlations while eliminating indirect associations. Conventional GGM calculations on the entire dataset cannot, however, provide information on whether a correlation is truly disease-specific with respect to the disease samples and not a correlation of control samples. Thus, we implemented a novel differential GGM approach unraveling only the disease-specific correlations, and applied it to the lipidome of immortal Glioblastoma tumor cells. A large set of lipid species were measured by mass spectrometry in order to evaluate lipid remodeling as a result of a combination of perturbations of cells inducing programmed cell death, while the other perturbations served solely as biological controls. With the differential GGM, we were able to reveal Glioblastoma-specific lipid correlations to advance biomedical research on novel gene therapies.
Markov random field and Gaussian mixture for segmented MRI-based partial volume correction in PET
International Nuclear Information System (INIS)
Bousse, Alexandre; Thomas, Benjamin A; Erlandsson, Kjell; Hutton, Brian F; Pedemonte, Stefano; Ourselin, Sébastien; Arridge, Simon
2012-01-01
In this paper we propose a segmented magnetic resonance imaging (MRI) prior-based maximum penalized likelihood deconvolution technique for positron emission tomography (PET) images. The model assumes the existence of activity classes that behave like a hidden Markov random field (MRF) driven by the segmented MRI. We utilize a mean field approximation to compute the likelihood of the MRF. We tested our method on both simulated and clinical data (brain PET) and compared our results with PET images corrected with the re-blurred Van Cittert (VC) algorithm, the simplified Guven (SG) algorithm and the region-based voxel-wise (RBV) technique. We demonstrated our algorithm outperforms the VC algorithm and outperforms SG and RBV corrections when the segmented MRI is inconsistent (e.g. mis-segmentation, lesions, etc) with the PET image. (paper)
Monte Carlo estimation for nonlinear non-Gaussian state space models
Jungbacker, B.M.J.P.; Koopman, S.J.
2007-01-01
We develop a proposal or importance density for state space models with a nonlinear non-Gaussian observation vector y ∼ p(y|θ) and an unobserved linear Gaussian signal vector θ ∼ p(θ). The proposal density is obtained from the Laplace approximation of the smoothing density p(θ|y). We present efficient
Using Gaussian Processes to Construct Flexible Models of Stellar Spectra
Czekala, Ian
2018-01-01
The use of spectra is fundamental to astrophysical fields ranging from exoplanets to stars to galaxies. In spite of this ubiquity, or perhaps because of it, there are a plethora of use cases that do not yet have physics-based forward models that can fit high signal-to-noise data to within the observational noise. These inadequacies result in subtle but systematic residuals not captured by any model, which complicates and biases parameter inference. Fortunately, the now-prevalent collection and archiving of large spectral datasets also provides an opening for empirical, data-driven approaches. We introduce one example of a time-series dataset of high-resolution stellar spectra, as is commonly delivered by planet-search radial velocity instruments like TRES, HIRES, and HARPS. Measurements of radial velocity variations of stars and their companions are essential for stellar and exoplanetary study; these measurements provide access to the fundamental physical properties that dictate all phases of stellar evolution and facilitate the quantitative study of planetary systems. In observations of a (spatially unresolved) spectroscopic binary star, one only ever records the composite sum of the spectra from the primary and secondary stars, complicating photospheric analysis of each individual star. Our technique “disentangles” the composite spectra by treating each underlying stellar spectrum as a Gaussian process, whose posterior predictive distribution is inferred simultaneously with the orbital parameters. To demonstrate the potential of this technique, we deploy it on red-optical time-series spectra of the mid-M-dwarf eclipsing binary LP661-13, which was recently discovered by the MEarth project. We successfully reconstruct the primary and secondary stellar spectra and report orbital parameters with improved precision compared to traditional radial velocity analysis techniques.
A comparison of the Gaussian Plume models of Pasquill and Smith
International Nuclear Information System (INIS)
Barker, C.D.
1978-03-01
The Gaussian Plume models of Pasquill and Smith are compared over the full range of atmospheric stability for both short and continuous releases of material. For low level releases the two models compare well (to within a factor of approximately 2) except for very unstable conditions. The agreement between the two models for high level sources is not so good. It is concluded that the two Gaussian models are cheap and simple to use, but may require experimental verification in specific applications. (author)
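For orientation, the textbook steady-state Gaussian plume expression for a continuous point source with ground reflection can be sketched as below. This is the generic form, not the specific Pasquill or Smith parameterization compared in the report: the dispersion coefficients `sigma_y` and `sigma_z` are left as inputs because their dependence on downwind distance and atmospheric stability is what such schemes specify, and the function and parameter names are assumptions.

```python
import numpy as np

def plume_concentration(y, z, Q, u, H, sigma_y, sigma_z):
    """Generic Gaussian plume concentration with reflection at the ground.

    y, z     : crosswind and vertical receptor coordinates (m)
    Q        : source strength (mass per second)
    u        : mean wind speed (m/s)
    H        : effective release height (m)
    sigma_y, sigma_z : dispersion coefficients (m) at the receptor's downwind distance
    """
    lateral = np.exp(-y ** 2 / (2.0 * sigma_y ** 2))
    vertical = (np.exp(-(z - H) ** 2 / (2.0 * sigma_z ** 2))
                + np.exp(-(z + H) ** 2 / (2.0 * sigma_z ** 2)))  # image source term
    return Q / (2.0 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Hypothetical numbers: ground-level centreline value for a 50 m release.
c = plume_concentration(y=0.0, z=0.0, Q=1.0, u=5.0, H=50.0, sigma_y=80.0, sigma_z=40.0)
print(c)
```

For a ground-level release (H = 0) the centreline ground concentration reduces to Q / (π u σ_y σ_z), so differences between schemes enter entirely through the two dispersion coefficients.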
Model structure selection in convolutive mixtures
DEFF Research Database (Denmark)
Dyrholm, Mads; Makeig, S.; Hansen, Lars Kai
2006-01-01
The CICAAR algorithm (convolutive independent component analysis with an auto-regressive inverse model) allows separation of white (i.i.d) source signals from convolutive mixtures. We introduce a source color model as a simple extension to the CICAAR which allows for a more parsimonious representation in many practical mixtures. The new filter-CICAAR allows Bayesian model selection and can help answer questions like: 'Are we actually dealing with a convolutive mixture?'. We try to answer this question for EEG data.
Modeling for pollution dispersion and air quality 4.: the Gaussian model
International Nuclear Information System (INIS)
Bertagna, Silvia
2005-01-01
The Gaussian Model is the simulation model for atmospheric pollutant dispersion most used in practice, in particular for engineering applications; it was the first model used in the United States to predict the impact of pollutant sources on air quality, and for many years it was the design instrument in environmental and land-use planning; today it is still a very useful instrument, above all when meteorological input data are scarce. In recent years, great efforts have been made to extend the original Gaussian model to different types of sources and to enable it to treat more detailed effects, such as complex terrain, dry deposition, the gravity effect on heavy particulate matter and other microscale effects. In this work, the main characteristics of the Gaussian model and the equations which govern its description of the dispersion of air pollutants are discussed; moreover, the main codes in use that implement Gaussian models, which can easily be obtained commercially or, sometimes, on the internet, are briefly described [it
Referenceless magnetic resonance temperature imaging using Gaussian process modeling.
Yung, Joshua P; Fuentes, David; MacLellan, Christopher J; Maier, Florian; Liapis, Yannis; Hazle, John D; Stafford, R Jason
2017-07-01
During magnetic resonance (MR)-guided thermal therapies, water proton resonance frequency shift (PRFS)-based MR temperature imaging can quantitatively monitor tissue temperature changes. It is widely known that the PRFS technique is easily perturbed by tissue motion, tissue susceptibility changes, magnetic field drift, and modality-dependent applicator-induced artifacts. Here, a referenceless Gaussian process modeling (GPM)-based estimation of the PRFS is investigated as a methodology to mitigate unwanted background field changes. The GPM offers a complementary trade-off between data fitting and smoothing and allows prior information to be used. As a result, the GPM provides a full probabilistic prediction and an estimate of the uncertainty. GPM was employed to estimate the covariance between the spatial position and MR phase measurements. The mean and variance provided by the statistical model extrapolated background phase values from nonheated neighboring voxels used to train the model. MR phase predictions in the heating ROI are computed using the spatial coordinates as the test input. The method is demonstrated in ex vivo rabbit liver tissue during focused ultrasound heating with manually introduced perturbations (n = 6) and in vivo during laser-induced interstitial thermal therapy to treat the human brain (n = 1) and liver (n = 1). Temperature maps estimated using the GPM referenceless method demonstrated a RMS error of <0.8°C with artifact-induced reference-based MR thermometry during ex vivo heating using focused ultrasound. Nonheated surrounding areas were <0.5°C from the artifact-free MR measurements. The GPM referenceless MR temperature values and thermally damaged regions were within the 95% confidence interval during in vivo laser ablations. A new approach to estimation for referenceless PRFS temperature imaging is introduced that allows for an accurate probabilistic extrapolation of the background phase. The technique demonstrated reliable
Directory of Open Access Journals (Sweden)
Silva-Aguilar Martín
2011-01-01
Full Text Available Metals are ubiquitous pollutants present as mixtures. In particular, the arsenic-cadmium-lead mixture is among the leading toxic agents detected in the environment. These metals have carcinogenic and cell-transforming potential. In this study, we used a two-step cell transformation model to determine the role of oxidative stress in transformation induced by a mixture of arsenic-cadmium-lead. Oxidative damage and antioxidant response were determined. Metal mixture treatment increased damage markers and the antioxidant response. Loss of cell viability and increased transforming potential were observed during the promotion phase. This finding correlated significantly with generation of reactive oxygen species. Cotreatment with N-acetyl-cysteine affected the transforming capacity: a reduction was found in the initiation phase, while a total block of the transforming capacity was observed in the promotion phase. Our results suggest that oxidative stress generated by the metal mixture plays an important role only in the promotion phase, where it promotes transforming capacity.
The Gaussian Graphical Model in Cross-Sectional and Time-Series Data.
Epskamp, Sacha; Waldorp, Lourens J; Mõttus, René; Borsboom, Denny
2018-04-16
We discuss the Gaussian graphical model (GGM; an undirected network of partial correlation coefficients) and detail its utility as an exploratory data analysis tool. The GGM shows which variables predict one another, allows for sparse modeling of covariance structures, and may highlight potential causal relationships between observed variables. We describe the utility in three kinds of psychological data sets: data sets in which consecutive cases are assumed independent (e.g., cross-sectional data), temporally ordered data sets (e.g., n = 1 time series), and a mixture of the 2 (e.g., n > 1 time series). In time-series analysis, the GGM can be used to model the residual structure of a vector-autoregression analysis (VAR), also termed graphical VAR. Two network models can then be obtained: a temporal network and a contemporaneous network. When analyzing data from multiple subjects, a GGM can also be formed on the covariance structure of stationary means: the between-subjects network. We discuss the interpretation of these models and propose estimation methods to obtain these networks, which we implement in the R packages graphicalVAR and mlVAR. The methods are showcased in two empirical examples, and simulation studies on these methods are included in the supplementary materials.
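The edge weights of a GGM are partial correlations, which can be read off the inverse covariance (precision) matrix K as -K_ij / sqrt(K_ii K_jj). The following is a minimal sketch in plain Python, assuming a small made-up covariance matrix; the helper names `invert` and `partial_correlations` are hypothetical, and this is not the estimation procedure implemented in graphicalVAR or mlVAR.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] becomes [I | A^-1].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of i and j given all other variables."""
    k = invert(cov)
    n = len(k)
    return [[1.0 if i == j else -k[i][j] / (k[i][i] * k[j][j]) ** 0.5
             for j in range(n)] for i in range(n)]


# Made-up covariance of a chain X1 - X2 - X3: X1 and X3 are marginally
# correlated (0.25), but only through X2.
cov = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]
pcor = partial_correlations(cov)
```

In this toy case the partial correlation between X1 and X3 vanishes, so the corresponding edge would be absent from the network even though the two variables are marginally correlated.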
Improved Expectation Maximization Algorithm for Gaussian Mixed Model Using the Kernel Method
Directory of Open Access Journals (Sweden)
Mohd Izhan Mohd Yusoff
2013-01-01
Full Text Available Fraud activities have contributed to heavy losses suffered by telecommunication companies. In this paper, we attempt to use the Gaussian mixed model, a probabilistic model normally used in speech recognition, to identify fraud calls in the telecommunication industry. We look at several issues encountered when calculating the maximum likelihood estimates of the Gaussian mixed model using an Expectation Maximization algorithm. Firstly, we look at a mechanism for the determination of the initial number of Gaussian components and the choice of the initial values of the algorithm using the kernel method. We show via simulation that the technique improves the performance of the algorithm. Secondly, we developed a procedure for determining the order of the Gaussian mixed model using the log-likelihood function and the Akaike information criterion. Finally, for illustration, we apply the improved algorithm to real telecommunication data. The modified method will pave the way to introduce a comprehensive method for detecting fraud calls in future work.
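The two ingredients named in this abstract, EM estimation and order selection by the Akaike information criterion, can be sketched for one-dimensional data as follows. This is a rough illustration under stated assumptions, not the authors' algorithm: the starting means come from sample quantiles (a stand-in for the kernel-based initialization, which the abstract does not specify), the data are synthetic, and `fit_gmm` and `aic` are hypothetical names.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, m, iterations=200):
    """Fit an m-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log-likelihood).
    """
    ordered = sorted(data)
    n = len(data)
    # Assumption: quantile-based starting means, shared starting variance.
    means = [ordered[int((j + 0.5) * n / m)] for j in range(m)]
    centre = sum(data) / n
    variances = [sum((x - centre) ** 2 for x in data) / n] * m
    weights = [1.0 / m] * m
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, mu, v)
                    for w, mu, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted updates of weights, means and variances.
        for j in range(m):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor guards against collapse
    loglik = sum(
        math.log(sum(w * normal_pdf(x, mu, v)
                     for w, mu, v in zip(weights, means, variances)))
        for x in data)
    return weights, means, variances, loglik


def aic(loglik, m):
    """AIC with 3m - 1 free parameters (m means, m variances, m - 1 weights)."""
    return 2 * (3 * m - 1) - 2 * loglik


# Synthetic data: two well-separated groups (illustrative only).
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(100)]
        + [rng.gauss(10.0, 1.0) for _ in range(100)])
fits = {m: fit_gmm(data, m) for m in (1, 2, 3)}
scores = {m: aic(fits[m][3], m) for m in fits}
best_order = min(scores, key=scores.get)
```

The order with the smallest AIC is retained; the variance floor is a pragmatic guard rather than part of the method described above.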
Probabilistic wind power forecasting with online model selection and warped gaussian process
International Nuclear Information System (INIS)
Kou, Peng; Liang, Deliang; Gao, Feng; Gao, Lin
2014-01-01
Highlights: • A new online ensemble model for the probabilistic wind power forecasting. • Quantifying the non-Gaussian uncertainties in wind power. • Online model selection that tracks the time-varying characteristic of wind generation. • Dynamically altering the input features. • Recursive update of base models. - Abstract: Based on the online model selection and the warped Gaussian process (WGP), this paper presents an ensemble model for the probabilistic wind power forecasting. This model provides the non-Gaussian predictive distributions, which quantify the non-Gaussian uncertainties associated with wind power. In order to follow the time-varying characteristics of wind generation, multiple time dependent base forecasting models and an online model selection strategy are established, thus adaptively selecting the most probable base model for each prediction. WGP is employed as the base model, which handles the non-Gaussian uncertainties in wind power series. Furthermore, a regime switch strategy is designed to modify the input feature set dynamically, thereby enhancing the adaptiveness of the model. In an online learning framework, the base models should also be time adaptive. To achieve this, a recursive algorithm is introduced, thus permitting the online updating of WGP base models. The proposed model has been tested on the actual data collected from both single and aggregated wind farms
Residual-based model diagnosis methods for mixture cure models.
Peng, Yingwei; Taylor, Jeremy M G
2017-06-01
Model diagnosis, an important issue in statistical modeling, has not yet been addressed adequately for cure models. We focus on mixture cure models in this work and propose some residual-based methods to examine the fit of the mixture cure model, particularly the fit of the latency part of the mixture cure model. The new methods extend the classical residual-based methods to the mixture cure model. Numerical work shows that the proposed methods are capable of detecting lack-of-fit of a mixture cure model, particularly in the latency part, such as outliers, improper covariate functional form, or nonproportionality in hazards if the proportional hazards assumption is employed in the latency part. The methods are illustrated with two real data sets that were previously analyzed with mixture cure models. © 2016, The International Biometric Society.
Modeling and analysis of personal exposures to VOC mixtures using copulas
Su, Feng-Chiao; Mukherjee, Bhramar; Batterman, Stuart
2014-01-01
Environmental exposures typically involve mixtures of pollutants, which must be understood to evaluate cumulative risks, that is, the likelihood of adverse health effects arising from two or more chemicals. This study uses several powerful techniques to characterize dependency structures of mixture components in personal exposure measurements of volatile organic compounds (VOCs) with aims of advancing the understanding of environmental mixtures, improving the ability to model mixture components in a statistically valid manner, and demonstrating broadly applicable techniques. We first describe characteristics of mixtures and introduce several terms, including the mixture fraction which represents a mixture component's share of the total concentration of the mixture. Next, using VOC exposure data collected in the Relationship of Indoor Outdoor and Personal Air (RIOPA) study, mixtures are identified using positive matrix factorization (PMF) and by toxicological mode of action. Dependency structures of mixture components are examined using mixture fractions and modeled using copulas, which address dependencies of multiple variables across the entire distribution. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) are evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks are calculated for mixtures, and results from copulas and multivariate lognormal models are compared to risks calculated using the observed data. Results obtained using the RIOPA dataset showed four VOC mixtures, representing gasoline vapor, vehicle exhaust, chlorinated solvents and disinfection by-products, and cleaning products and odorants. Often, a single compound dominated the mixture; however, mixture fractions were generally heterogeneous in that the VOC composition of the mixture changed with concentration. Three mixtures were identified by mode of action, representing VOCs associated with hematopoietic, liver
Exact Fit of Simple Finite Mixture Models
Directory of Open Access Journals (Sweden)
Dirk Tasche
2014-11-01
Full Text Available How to forecast next year’s portfolio-wide credit default rate based on last year’s default observations and the current score distribution? A classical approach to this problem consists of fitting a mixture of the conditional score distributions observed last year to the current score distribution. This is a special (simple) case of a finite mixture model where the mixture components are fixed and only the weights of the components are estimated. The optimum weights provide a forecast of next year’s portfolio-wide default rate. We point out that the maximum-likelihood (ML) approach to fitting the mixture distribution not only gives an optimum but even an exact fit if we allow the mixture components to vary but keep their density ratio fixed. From this observation we can conclude that the standard default rate forecast based on last year’s conditional default rates will always be located between last year’s portfolio-wide default rate and the ML forecast for next year. As an application example, cost quantification is then discussed. We also discuss how the mixture model based estimation methods can be used to forecast total loss. This involves the reinterpretation of an individual classification problem as a collective quantification problem.
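The simple case described here, with fixed components and only the weight estimated, has a one-line EM update. The sketch below assumes three score grades and made-up conditional score distributions for defaulters and non-defaulters; `fit_mixture_weight` is a hypothetical name, and the exact-fit extension discussed in the abstract is not covered.

```python
def fit_mixture_weight(counts, dist_a, dist_b, start=0.5, iterations=500):
    """ML estimate of w in the mixture w * dist_a + (1 - w) * dist_b.

    counts[s] is the number of current observations in score grade s;
    dist_a and dist_b are the fixed component distributions over grades.
    """
    total = sum(counts)
    weight = start
    for _ in range(iterations):
        # E-step: expected number of observations from component a.
        expected_a = 0.0
        for c, fa, fb in zip(counts, dist_a, dist_b):
            mix = weight * fa + (1.0 - weight) * fb
            expected_a += c * weight * fa / mix
        # M-step: the new weight is that expected share.
        weight = expected_a / total
    return weight


# Made-up conditional score distributions from "last year".
defaulters = [0.5, 0.3, 0.2]
non_defaulters = [0.1, 0.3, 0.6]
# Current portfolio counts per grade, constructed here as an exact
# 20% / 80% mixture of the two distributions for 1000 borrowers.
current_counts = [180, 300, 520]
default_rate = fit_mixture_weight(current_counts, defaulters, non_defaulters)
```

Because the illustrative counts were built as an exact mixture, the fitted weight recovers the 20% share used to construct them; with real data the fit is generally not exact, which is the situation the abstract addresses.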
Information Geometric Complexity of a Trivariate Gaussian Statistical Model
Directory of Open Access Journals (Sweden)
Domenico Felice
2014-05-01
Full Text Available We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about systems in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences not only depends on the amount of available pieces of information but also on the manner in which such pieces are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics.
Bayesian mixture models for partially verified data
DEFF Research Database (Denmark)
Kostoulas, Polychronis; Browne, William J.; Nielsen, Søren Saxmose
2013-01-01
Bayesian mixture models can be used to discriminate between the distributions of continuous test responses for different infection stages. These models are particularly useful in case of chronic infections with a long latent period, like Mycobacterium avium subsp. paratuberculosis (MAP) infection...
Spatially adaptive mixture modeling for analysis of FMRI time series.
Vincent, Thomas; Risser, Laurent; Ciuciu, Philippe
2010-04-01
Within-subject analysis in fMRI essentially addresses two problems, the detection of brain regions eliciting evoked activity and the estimation of the underlying dynamics. In Makni et al., 2005 and Makni et al., 2008, a detection-estimation framework has been proposed to tackle these problems jointly, since they are connected to one another. In the Bayesian formalism, detection is achieved by modeling activating and nonactivating voxels through independent mixture models (IMM) within each region while hemodynamic response estimation is performed at a regional scale in a nonparametric way. Instead of IMMs, in this paper we take advantage of spatial mixture models (SMM) for their nonlinear spatial regularizing properties. The proposed method is unsupervised and spatially adaptive in the sense that the amount of spatial correlation is automatically tuned from the data and this setting automatically varies across brain regions. In addition, the level of regularization is specific to each experimental condition since both the signal-to-noise ratio and the activation pattern may vary across stimulus types in a given brain region. These aspects require the precise estimation of multiple partition functions of underlying Ising fields. This is addressed efficiently using first path sampling for a small subset of fields and then using a recently developed fast extrapolation technique for the large remaining set. Simulation results emphasize that detection relying on supervised SMM outperforms its IMM counterpart and that unsupervised spatial mixture models achieve similar results without any hand-tuning of the correlation parameter. On real datasets, the gain is illustrated in a localizer fMRI experiment: brain activations appear more spatially resolved using SMM in comparison with classical general linear model (GLM)-based approaches, while estimating a specific parcel-based HRF shape. Our approach therefore validates the treatment of unsmoothed fMRI data without fixed GLM
American Option Pricing using GARCH models and the Normal Inverse Gaussian distribution
DEFF Research Database (Denmark)
Stentoft, Lars Peter
In this paper we propose a feasible way to price American options in a model with time varying volatility and conditional skewness and leptokurtosis using GARCH processes and the Normal Inverse Gaussian distribution. We show how the risk neutral dynamics can be obtained in this model, we interpret the effect of the risk neutralization, and we derive approximation procedures which allow for a computationally efficient implementation of the model. When the model is estimated on financial returns data the results indicate that compared to the Gaussian case the extension is important. A study of the model ... In particular, improvements are found when considering the smile in implied standard deviations.
Spatio-Temporal Data Analysis at Scale Using Models Based on Gaussian Processes
Energy Technology Data Exchange (ETDEWEB)
Stein, Michael [Univ. of Chicago, IL (United States)
2017-03-13
Gaussian processes are the most commonly used statistical model for spatial and spatio-temporal processes that vary continuously. They are broadly applicable in the physical sciences and engineering and are also frequently used to approximate the output of complex computer models, deterministic or stochastic. We undertook research related to theory, computation, and applications of Gaussian processes as well as some work on estimating extremes of distributions for which a Gaussian process assumption might be inappropriate. Our theoretical contributions include the development of new classes of spatial-temporal covariance functions with desirable properties and new results showing that certain covariance models lead to predictions with undesirable properties. To understand how Gaussian process models behave when applied to deterministic computer models, we derived what we believe to be the first significant results on the large sample properties of estimators of parameters of Gaussian processes when the actual process is a simple deterministic function. Finally, we investigated some theoretical issues related to maxima of observations with varying upper bounds and found that, depending on the circumstances, standard large sample results for maxima may or may not hold. Our computational innovations include methods for analyzing large spatial datasets when observations fall on a partially observed grid and methods for estimating parameters of a Gaussian process model from observations taken by a polar-orbiting satellite. In our application of Gaussian process models to deterministic computer experiments, we carried out some matrix computations that would have been infeasible using even extended precision arithmetic by focusing on special cases in which all elements of the matrices under study are rational and using exact arithmetic. The applications we studied include total column ozone as measured from a polar-orbiting satellite, sea surface temperatures over the
Receiver design for SPAD-based VLC systems under Poisson-Gaussian mixed noise model.
Mao, Tianqi; Wang, Zhaocheng; Wang, Qi
2017-01-23
Single-photon avalanche diode (SPAD) is a promising photosensor because of its high sensitivity to optical signals in weak illuminance environment. Recently, it has drawn much attention from researchers in visible light communications (VLC). However, existing literature only deals with the simplified channel model, which only considers the effects of Poisson noise introduced by SPAD, but neglects other noise sources. Specifically, when an analog SPAD detector is applied, there exists Gaussian thermal noise generated by the transimpedance amplifier (TIA) and the digital-to-analog converter (D/A). Therefore, in this paper, we propose an SPAD-based VLC system with pulse-amplitude-modulation (PAM) under Poisson-Gaussian mixed noise model, where Gaussian-distributed thermal noise at the receiver is also investigated. The closed-form conditional likelihood of received signals is derived using the Laplace transform and the saddle-point approximation method, and the corresponding quasi-maximum-likelihood (quasi-ML) detector is proposed. Furthermore, the Poisson-Gaussian-distributed signals are converted to Gaussian variables with the aid of the generalized Anscombe transform (GAT), leading to an equivalent additive white Gaussian noise (AWGN) channel, and a hard-decision-based detector is invoked. Simulation results demonstrate that the proposed GAT-based detector can reduce the computational complexity with marginal performance loss compared with the proposed quasi-ML detector, and both detectors are capable of accurately demodulating the SPAD-based PAM signals.
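The variance-stabilizing effect of the generalized Anscombe transform can be checked numerically. The sketch below uses the commonly cited form 2/g * sqrt(g*z + 3g^2/8 + sigma^2 - g*mu) for an observation z = g * Poisson + Normal(mu, sigma^2); whether the paper uses exactly this parameterization is an assumption, and the signal levels and noise values are made up.

```python
import math
import random


def generalized_anscombe(z, gain=1.0, sigma=0.0, mu=0.0):
    """GAT for z = gain * Poisson + Normal(mu, sigma^2); negative arguments are clamped."""
    arg = gain * z + 0.375 * gain ** 2 + sigma ** 2 - gain * mu
    return (2.0 / gain) * math.sqrt(max(arg, 0.0))


def poisson_sample(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def simulate(lam, sigma, n=20000, seed=1):
    """Return (raw variance, variance after the GAT) for Poisson-Gaussian samples."""
    rng = random.Random(seed)
    raw = [poisson_sample(rng, lam) + rng.gauss(0.0, sigma) for _ in range(n)]
    stabilized = [generalized_anscombe(z, sigma=sigma) for z in raw]
    return variance(raw), variance(stabilized)


# Two illustrative signal levels with the same thermal-noise level.
raw_low, stab_low = simulate(lam=10.0, sigma=2.0)
raw_high, stab_high = simulate(lam=40.0, sigma=2.0)
```

The raw variance grows with the signal level (roughly lam + sigma^2), whereas the transformed samples have variance close to one at both levels, which is what allows the channel to be treated as approximately AWGN.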
Fitting non-gaussian Models to Financial data: An Empirical Study
Directory of Open Access Journals (Sweden)
Pablo Olivares
2011-04-01
Full Text Available This paper presents some experiences in modeling financial data with three classes of models as alternatives to Gaussian linear models. Dynamic volatility, stable Lévy, and diffusion-with-jumps models are considered. The techniques are illustrated with some examples of financial series on currency, futures and indexes.
An innovation approach to non-Gaussian time series analysis
Ozaki, Tohru; Iino, Mitsunori
2001-01-01
The paper shows that the use of both types of random noise, white noise and Poisson noise, can be justified when using an innovations approach. The historical background for this is sketched, and then several methods of whitening dependent time series are outlined, including a mixture of Gaussian white noise and a compound Poisson process: this appears as a natural extension of the Gaussian white noise model for the prediction errors of a non-Gaussian time series. A stati...
A Skew-Normal Mixture Regression Model
Liu, Min; Lin, Tsung-I
2014-01-01
A challenge associated with traditional mixture regression models (MRMs), which rest on the assumption of normally distributed errors, is determining the number of unobserved groups. Specifically, even slight deviations from normality can lead to the detection of spurious classes. The current work aims to (a) examine how sensitive the commonly…
Mixture model analysis of complex samples
Wedel, M; ter Hofstede, F; Steenkamp, JBEM
1998-01-01
We investigate the effects of a complex sampling design on the estimation of mixture models. An approximate or pseudo likelihood approach is proposed to obtain consistent estimates of class-specific parameters when the sample arises from such a complex design. The effects of ignoring the sample
International Nuclear Information System (INIS)
Ma, Denglong; Zhang, Zaoxiao
2016-01-01
Highlights: • Intelligent network models were built to predict contaminant gas concentrations. • Improved network models coupled with the Gaussian dispersion model are presented. • The new model has high efficiency and accuracy for concentration prediction. • The new model was applied to identify the leakage source with satisfactory results. - Abstract: A gas dispersion model is important for predicting gas concentrations when contaminant gas leakage occurs. Intelligent network models such as radial basis function (RBF), back propagation (BP) neural network and support vector machine (SVM) models can be used for gas dispersion prediction. However, the prediction results from these network models with too many inputs based on original monitoring parameters are not in good agreement with the experimental data. A new series of machine learning algorithm (MLA) models that combine the classic Gaussian model with an MLA is therefore presented. The prediction results from the new models are improved greatly. Among these models, the Gaussian-SVM model performs best and its computation time is close to that of the classic Gaussian dispersion model. Finally, Gaussian-MLA models were applied to identifying the emission source parameters with the particle swarm optimization (PSO) method. The estimation performance of PSO with Gaussian-MLA is better than that with the Gaussian model, the Lagrangian stochastic (LS) dispersion model and network models based on original monitoring parameters. Hence, the new prediction model based on Gaussian-MLA is potentially a good method to predict contaminant gas dispersion as well as a good forward model in the emission source parameter identification problem.
Statistically tuned Gaussian background subtraction technique for ...
Indian Academy of Sciences (India)
The non-parametric background modelling approach proposed by Martin Hofmann et al (2012) involves modelling of foreground by the history of recently ... background subtraction system with mixture of Gaussians, deviation scaling factor and max– min background model for outdoor environment. Selection of detection ...
Energy Technology Data Exchange (ETDEWEB)
Hoejstrup, J. [NEG Micon Project Development A/S, Randers (Denmark); Hansen, K.S. [Denmarks Technical Univ., Dept. of Energy Engineering, Lyngby (Denmark); Pedersen, B.J. [VESTAS Wind Systems A/S, Lem (Denmark); Nielsen, M. [Risoe National Lab., Wind Energy and Atmospheric Physics, Roskilde (Denmark)
1999-03-01
The pdfs of atmospheric turbulence have somewhat wider tails than a Gaussian, especially regarding accelerations, whereas velocities are close to Gaussian. This behaviour is being investigated using data from a large WEB-database in order to quantify the amount of non-Gaussianity. Models for non-Gaussian turbulence have been developed, by which artificial turbulence can be generated with specified distributions, spectra and cross-correlations. The artificial time series will then be used in load models and the resulting loads in the Gaussian and the non-Gaussian cases will be compared.
Thermodynamic modeling of CO2 mixtures
DEFF Research Database (Denmark)
Bjørner, Martin Gamel
Knowledge of the thermodynamic properties and phase equilibria of mixtures containing carbon dioxide (CO2) is important in several industrial processes such as enhanced oil recovery, carbon capture and storage, and supercritical extractions, where CO2 is used as a solvent. Despite this importance, accurate predictions of the thermodynamic properties and phase equilibria of mixtures containing CO2 are challenging with classical models such as the Soave-Redlich-Kwong (SRK) equation of state (EoS). This is believed to be due to the fact that CO2 has a large quadrupole moment which the classical models ... and with or without introducing an additional pure compound parameter. In the absence of quadrupolar compounds qCPA reduces to CPA, which itself reduces to SRK in the absence of association. As the number of adjustable parameters in thermodynamic models increases, the parameter estimation problem becomes increasingly...
Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models
DEFF Research Database (Denmark)
Vehtari, Aki; Mononen, Tommi; Tolvanen, Ville
2016-01-01
The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation. In this article, we consider Gaussian latent variable models where the integration over the latent values is approximated using the Laplace method or expectation propagation (EP). We study the ...
Ability of the Gaussian plume model to predict and describe spore dispersal over a potato crop
Spijkerboer, H.P.; Beniers, J.E.; Jaspers, D.; Schouten, H.J.; Goudriaan, J.; Rabbinge, R.; Werf, van der W.
2002-01-01
The Gaussian plume model (GPM) is considered as a valuable tool in predictions of the atmospheric transport of fungal spores and plant pollen in risk assessments. The validity of the model in this important area of application has not been extensively evaluated. A field experiment was set up to test
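For orientation, the textbook Gaussian plume equation for a continuous point source with ground reflection can be written down directly. The sketch below is illustrative only: the power-law dispersion coefficients (`a_y`, `b_y`, `a_z`, `b_z`) are hypothetical placeholders rather than values from this study, and spore deposition and loss are ignored.

```python
import math


def plume_concentration(x, y, z, rate, wind_speed, height,
                        a_y=0.08, b_y=0.9, a_z=0.06, b_z=0.85):
    """Concentration at (x, y, z) downwind of a continuous point source.

    x is the downwind distance, y the crosswind offset, z the height above ground.
    sigma_y and sigma_z grow with x as a * x**b (placeholder coefficients).
    """
    if x <= 0:
        return 0.0  # no upwind transport in this simple model
    sigma_y = a_y * x ** b_y
    sigma_z = a_z * x ** b_z
    lateral = math.exp(-y ** 2 / (2.0 * sigma_y ** 2))
    # The second term mirrors the source below the ground (total reflection).
    vertical = (math.exp(-(z - height) ** 2 / (2.0 * sigma_z ** 2))
                + math.exp(-(z + height) ** 2 / (2.0 * sigma_z ** 2)))
    return rate / (2.0 * math.pi * wind_speed * sigma_y * sigma_z) * lateral * vertical


# Illustrative numbers: 1000 spores/s released at 0.5 m in a 2 m/s wind.
centerline = plume_concentration(10.0, 0.0, 0.5, 1000.0, 2.0, 0.5)
off_axis = plume_concentration(10.0, 1.0, 0.5, 1000.0, 2.0, 0.5)
```

Concentration falls off as a Gaussian in the crosswind and vertical directions, so the off-axis value is always below the centerline value at the same downwind distance.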
Probabilistic Discrete Mixtures Colour Texture Models
Czech Academy of Sciences Publication Activity Database
Haindl, Michal; Havlíček, Vojtěch; Grim, Jiří
2008-01-01
Vol. 2008, No. 5197 (2008), pp. 675-682 ISSN 0302-9743. [Iberoamerican Congress on Pattern Recognition /13./. Havana, 09.09.2008-12.09.2008] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572; GA ČR GA102/07/1594; GA ČR GA102/08/0593 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Discrete distribution mixtures * EM algorithm * texture modeling Subject RIV: BD - Theory of Information http://library.utia.cas.cz/separaty/2008/RO/haindl-havlicek-grim-probabilistic%20discrete%20mixtures%20colour%20texture%20models.pdf
Krohn, Olivia; Armbruster, Aaron; Gao, Yongsheng; Atlas Collaboration
2017-01-01
Software tools developed for the purpose of modeling CERN LHC pp collision data to aid in its interpretation are presented. Some measurements are not adequately described by a Gaussian distribution; thus an interpretation assuming Gaussian uncertainties will inevitably introduce bias, necessitating analytical tools to recreate and evaluate non-Gaussian features. One example is the measurements of Higgs boson production rates in different decay channels, and the interpretation of these measurements. The ratios of data to Standard Model expectations (μ) for five arbitrary signals were modeled by building five Poisson distributions with mixed signal contributions such that the measured values of μ are correlated. Algorithms were designed to recreate probability distribution functions of μ as multi-variate Gaussians, where the standard deviation (σ) and correlation coefficients (ρ) are parametrized. There was good success with modeling 1-D likelihood contours of μ, and the multi-dimensional distributions were well modeled within 1- σ but the model began to diverge after 2- σ due to unmerited assumptions in developing ρ. Future plans to improve the algorithms and develop a user-friendly analysis package will also be discussed. NSF International Research Experiences for Students
Texture modelling by discrete distribution mixtures
Czech Academy of Sciences Publication Activity Database
Grim, Jiří; Haindl, Michal
2003-01-01
Vol. 41, 3-4 (2003), pp. 603-615 ISSN 0167-9473 R&D Projects: GA ČR GA102/00/0030; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : discrete distribution mixtures * EM algorithm * texture modelling Subject RIV: JC - Computer Hardware ; Software Impact factor: 0.711, year: 2003
Text document classification based on mixture models
Czech Academy of Sciences Publication Activity Database
Novovičová, Jana; Malík, Antonín
2004-01-01
Roč. 40, č. 3 (2004), s. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004
Computational aspects of N-mixture models.
Dennis, Emily B; Morgan, Byron J T; Ridout, Martin S
2015-03-01
The N-mixture model is widely used to estimate the abundance of a population in the presence of unknown detection probability from only a set of counts subject to spatial and temporal replication (Royle, 2004, Biometrics 60, 105-115). We explain and exploit the equivalence of N-mixture and multivariate Poisson and negative-binomial models, which provides powerful new approaches for fitting these models. We show that particularly when detection probability and the number of sampling occasions are small, infinite estimates of abundance can arise. We propose a sample covariance as a diagnostic for this event, and demonstrate its good performance in the Poisson case. Infinite estimates may be missed in practice, due to numerical optimization procedures terminating at arbitrarily large values. It is shown that the use of a bound, K, for an infinite summation in the N-mixture likelihood can result in underestimation of abundance, so that default values of K in computer packages should be avoided. Instead we propose a simple automatic way to choose K. The methods are illustrated by analysis of data on Hermann's tortoise Testudo hermanni. © 2014 The Authors Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
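The role of the bound K can be seen by writing out the Poisson N-mixture likelihood, in which the latent abundance N at each site is summed from the largest observed count up to K. The sketch below uses made-up counts and parameter values, and it does not implement the automatic choice of K proposed by the authors; the function names are hypothetical.

```python
import math


def poisson_pmf(n, lam):
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def site_likelihood(counts, lam, p, upper):
    """Likelihood of one site's repeated counts, summing abundance N up to `upper` (K)."""
    total = 0.0
    for abundance in range(max(counts), upper + 1):
        term = poisson_pmf(abundance, lam)
        for c in counts:
            term *= binomial_pmf(c, abundance, p)
        total += term
    return total


def log_likelihood(data, lam, p, upper):
    """Sum of log site likelihoods; sites are assumed independent."""
    return sum(math.log(site_likelihood(counts, lam, p, upper)) for counts in data)


# Made-up counts: 3 sites, 3 visits each, with low detection probability.
data = [[3, 2, 4], [1, 0, 2], [5, 3, 3]]
ll_small_k = log_likelihood(data, lam=20.0, p=0.15, upper=10)
ll_k100 = log_likelihood(data, lam=20.0, p=0.15, upper=100)
ll_k200 = log_likelihood(data, lam=20.0, p=0.15, upper=200)
```

With a bound that is too small the sum omits abundances that carry substantial probability mass, so the evaluated likelihood is too low for large-abundance parameter values; once K is large enough, increasing it further leaves the value unchanged.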
Joint Bayesian Analysis of Parameters and States in Nonlinear, Non-Gaussian State Space Models
Barra, I.; Hoogerheide, L.F.; Koopman, S.J.; Lucas, A.
2017-01-01
We propose a new methodology for designing flexible proposal densities for the joint posterior density of parameters and states in a nonlinear, non-Gaussian state space model. We show that a highly efficient Bayesian procedure emerges when these proposal densities are used in an independent
Ground states and formal duality relations in the Gaussian core model
Cohn, H.; Kumar, A.; Schürmann, A.
2009-01-01
We study dimensional trends in ground states for soft-matter systems. Specifically, using a high-dimensional version of Parrinello-Rahman dynamics, we investigate the behavior of the Gaussian core model in up to eight dimensions. The results include unexpected geometric structures, with surprising
Numerically Accelerated Importance Sampling for Nonlinear Non-Gaussian State Space Models
Koopman, S.J.; Lucas, A.; Scharth, M.
2015-01-01
We propose a general likelihood evaluation method for nonlinear non-Gaussian state-space models using the simulation-based method of efficient importance sampling. We minimize the simulation effort by replacing some key steps of the likelihood estimation procedure by numerical integration. We refer
Modeling non-Gaussian time-varying vector autoregressive process
National Aeronautics and Space Administration — We present a novel and general methodology for modeling time-varying vector autoregressive processes which are widely used in many areas such as modeling of chemical...
Adaptive Gaussian Predictive Process Models for Large Spatial Datasets
Guhaniyogi, Rajarshi; Finley, Andrew O.; Banerjee, Sudipto; Gelfand, Alan E.
2011-01-01
Large point referenced datasets occur frequently in the environmental and natural sciences. Use of Bayesian hierarchical spatial models for analyzing these datasets is undermined by onerous computational burdens associated with parameter estimation. Low-rank spatial process models attempt to resolve this problem by projecting spatial effects to a lower-dimensional subspace. This subspace is determined by a judicious choice of “knots” or locations that are fixed a priori. One such representation yields a class of predictive process models (e.g., Banerjee et al., 2008) for spatial and spatial-temporal data. Our contribution here expands upon predictive process models with fixed knots to models that accommodate stochastic modeling of the knots. We view the knots as emerging from a point pattern and investigate how such adaptive specifications can yield more flexible hierarchical frameworks that lead to automated knot selection and substantial computational benefits. PMID:22298952
Investigation of a Gamma model for mixture STR samples
DEFF Research Database (Denmark)
Christensen, Susanne; Bøttcher, Susanne Gammelgaard; Lauritzen, Steffen L.
The behaviour of PCR Amplification Kit, when used for mixture STR samples, is investigated. A model based on the Gamma distribution is fitted to the amplifier output for constructed mixtures, and the assumptions of the model are evaluated via residual analysis.
Bayes factor between Student t and Gaussian mixed models within an animal breeding context
Directory of Open Access Journals (Sweden)
García-Cortés Luis
2008-07-01
Full Text Available The implementation of Student t mixed models in animal breeding has been suggested as a useful statistical tool to effectively mute the impact of preferential treatment or other sources of outliers in field data. Nevertheless, these additional sources of variation are undeclared and we do not know whether a Student t mixed model is required or if a standard, and less parameterized, Gaussian mixed model would be sufficient to serve the intended purpose. Within this context, our aim was to develop the Bayes factor between two nested models that only differed in a bounded variable in order to easily compare a Student t and a Gaussian mixed model. It is important to highlight that the Student t density converges to a Gaussian process when degrees of freedom tend to infinity. The two models can then be viewed as nested models that differ in terms of degrees of freedom. The Bayes factor can be easily calculated from the output of a Markov chain Monte Carlo sampling of the complex model (Student t mixed model). The performance of this Bayes factor was tested under simulation and on a real dataset, using the deviance information criterion (DIC) as the standard reference criterion. The two statistical tools showed similar trends along the parameter space, although the Bayes factor appeared to be the more conservative. There was considerable evidence favoring the Student t mixed model for data sets simulated under Student t processes with limited degrees of freedom, and moderate advantages associated with using the Gaussian mixed model when working with datasets simulated with 50 or more degrees of freedom. For the analysis of real data (weight of Pietrain pigs at six months), both the Bayes factor and DIC slightly favored the Student t mixed model, with there being a reduced incidence of outlier individuals in this population.
Gulliver, Eric A.
The objective of this thesis is to identify and develop techniques providing direct comparison between simulated and real packed particle mixture microstructures containing submicron-sized particles. This entailed devising techniques for simulating powder mixtures, producing real mixtures with known powder characteristics, sectioning real mixtures, interrogating mixture cross-sections, evaluating and quantifying the mixture interrogation process and for comparing interrogation results between mixtures. A drop and roll-type particle-packing model was used to generate simulations of random mixtures. The simulated mixtures were then evaluated to establish that they were not segregated and free from gross defects. A powder processing protocol was established to provide real mixtures for direct comparison and for use in evaluating the simulation. The powder processing protocol was designed to minimize differences between measured particle size distributions and the particle size distributions in the mixture. A sectioning technique was developed that was capable of producing distortion-free cross-sections of fine scale particulate mixtures. Tessellation analysis was used to interrogate mixture cross sections and statistical quality control charts were used to evaluate different types of tessellation analysis and to establish the importance of differences between simulated and real mixtures. The particle-packing program generated crescent shaped pores below large particles but realistic looking mixture microstructures otherwise. Focused ion beam milling was the only technique capable of sectioning particle compacts in a manner suitable for stereological analysis. Johnson-Mehl and Voronoi tessellation of the same cross-sections produced tessellation tiles with different tile-area populations. Control chart analysis showed Johnson-Mehl tessellation measurements are superior to Voronoi tessellation measurements for detecting variations in mixture microstructure, such as altered
Statistically tuned Gaussian background subtraction technique for ...
Indian Academy of Sciences (India)
ground, small objects, moving background and multiple objects are considered for evaluation. The technique is statistically compared with frame differencing technique, temporal median method and mixture of Gaussian model and performance evaluation is done to check the effectiveness of the proposed technique after ...
Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model
Directory of Open Access Journals (Sweden)
Lotter Thomas
2005-01-01
Full Text Available This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-distributed real and imaginary parts of the speech DFT coefficients. Also, the statistical model can be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient maximum a posteriori speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
Perfect posterior simulation for mixture and hidden Markov models
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Breyer, Laird A.; Roberts, Gareth O.
2010-01-01
In this paper we present an application of the read-once coupling from the past algorithm to problems in Bayesian inference for latent statistical models. We describe a method for perfect simulation from the posterior distribution of the unknown mixture weights in a mixture model. Our method is extended to a more general mixture problem, where unknown parameters exist for the mixture components, and to a hidden Markov model.
Modelling Inverse Gaussian Data with Censored Response Values: EM versus MCMC
Directory of Open Access Journals (Sweden)
R. S. Sparks
2011-01-01
Full Text Available Low detection limits are common in measuring environmental variables. Building models using data containing low or high detection limits without adjusting for the censoring produces biased models. This paper offers approaches to estimate an inverse Gaussian distribution when some of the data used are censored because of low or high detection limits. Adjustments for the censoring can be made if there is between 2% and 20% censoring using either the EM algorithm or MCMC. This paper compares these approaches.
Durbin, J.; Koopman, S.J.M.
1998-01-01
The analysis of non-Gaussian time series using state space models is considered from both classical and Bayesian perspectives. The treatment in both cases is based on simulation using importance sampling and antithetic variables; Monte Carlo Markov chain methods are not employed. Non-Gaussian
Robust non-rigid point set registration using student's-t mixture model.
Directory of Open Access Journals (Sweden)
Zhiyong Zhou
Full Text Available The Student's-t mixture model, which is heavily tailed and more robust than the Gaussian mixture model, has recently received great attention in image processing. In this paper, we propose a robust non-rigid point set registration algorithm using the Student's-t mixture model. Specifically, first, we consider the alignment of two point sets as a probability density estimation problem and treat one point set as Student's-t mixture model centroids. Then, we fit the Student's-t mixture model centroids to the other point set which is treated as data. Finally, we get the closed-form solutions of registration parameters, leading to a computationally efficient registration algorithm. The proposed algorithm is especially effective for addressing the non-rigid point set registration problem when significant amounts of noise and outliers are present. Moreover, fewer registration parameters have to be set manually for our algorithm compared to the popular coherent point drift (CPD) algorithm. We have compared our algorithm with other state-of-the-art registration algorithms on both 2D and 3D data with noise and outliers, where our non-rigid registration algorithm showed accurate results and outperformed the other algorithms.
Gaussian tunneling model of c-axis twist Josephson junctions
International Nuclear Information System (INIS)
Bille, A.; Klemm, R.A.; Scharnberg, K.
2001-01-01
We calculate the critical current density $J_c^J(\varphi_0)$ for Josephson tunneling between identical high-temperature superconductors twisted an angle $\varphi_0$ about the c axis. Regardless of the shape of the two-dimensional Fermi surface and for very general tunneling matrix elements, an order parameter (OP) with general d-wave symmetry leads to $J_c^J(\pi/4)=0$. This general result is inconsistent with the data of Li et al. [Phys. Rev. Lett. 83, 4160 (1999)] on Bi$_2$Sr$_2$CaCu$_2$O$_{8+\delta}$ (Bi2212), which showed $J_c^J$ to be independent of $\varphi_0$. If the momentum parallel to the barrier is conserved in the tunneling process, $J_c^J$ should vary substantially with the twist angle $\varphi_0$ when the tight-binding Fermi surface appropriate for Bi2212 is taken into account, even if the OP is completely isotropic. We quantify the degree of momentum nonconservation necessary to render $J_c^J(\varphi_0)$ constant within experimental error for a variety of pair states by interpolating between the coherent and incoherent limits using five specific models to describe the momentum dependence of the tunneling matrix element squared. From the data of Li et al., we conclude that the c-axis tunneling in Bi2212 must be very nearly incoherent, and that the OP must have a nonvanishing Fermi-surface average for $T_c$. We further show that the apparent conventional sum-rule violation observed by Basov et al. [Science 283, 49 (1999)] can be consistent with such strongly incoherent c-axis tunneling.
Maximum Correntropy Criterion Kalman Filter for α-Jerk Tracking Model with Non-Gaussian Noise
Directory of Open Access Journals (Sweden)
Bowen Hou
2017-11-01
Full Text Available Target tracking is a critical problem, and the α-jerk model is an effective model for tracking maneuvering targets. Non-Gaussian noises always exist in the tracking process, which usually lead to inconsistency and divergence of the tracking filter. A novel Kalman filter is derived and applied to the α-jerk tracking model to handle non-Gaussian noise. The weighted least square solution is presented and the standard Kalman filter is deduced first. A novel Kalman filter with the weighted least square based on the maximum correntropy criterion is then deduced. The robustness of the maximum correntropy criterion is also analyzed with the influence function and compared with the Huber-based filter; moreover, the kernel size of the Gaussian kernel plays an important role in the filter algorithm. A new adaptive kernel method is proposed in this paper to adjust the parameter in real time. Finally, simulation results indicate the validity and the efficiency of the proposed filter. The comparison study shows that the proposed filter can significantly reduce the noise influence for the α-jerk model.
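The robustness that the maximum correntropy criterion (MCC) brings can be illustrated on a far simpler problem than the filter described above. The following is a minimal sketch, not the authors' Kalman filter: it estimates a scalar location by maximizing the sum of Gaussian kernels of the residuals, which leads to a fixed-point iteration with kernel weights. The function name `mcc_location`, the fixed kernel size `sigma`, and the median starting point are all assumptions made for illustration.

```python
import numpy as np

def mcc_location(x, sigma, iters=50):
    """Scalar location estimate under the maximum correntropy criterion.

    Maximizes sum_i exp(-(x_i - m)^2 / (2 sigma^2)) over m with the
    fixed-point update m <- sum(w_i x_i) / sum(w_i), where w_i is the
    Gaussian kernel evaluated at the current residual.
    """
    x = np.asarray(x, dtype=float)
    m = float(np.median(x))  # assumed starting point (illustrative choice)
    for _ in range(iters):
        w = np.exp(-(x - m) ** 2 / (2.0 * sigma ** 2))  # kernel weights
        m = float(np.sum(w * x) / np.sum(w))            # weighted mean update
    return m

# Five measurements near 1.0 and one gross outlier.
data = [0.9, 1.0, 1.1, 1.05, 0.95, 50.0]
estimate = mcc_location(data, sigma=1.0)
```

Because the kernel weight of the outlier is vanishingly small, it barely affects the estimate, whereas the ordinary mean is pulled far away. A smaller `sigma` rejects outliers more aggressively but also discards more of the ordinary data, which is why the kernel size matters.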
International Nuclear Information System (INIS)
Emad, A.A.; El Shazly, S.M.; Kassem, Kh.O.
2010-01-01
A line source model, developed in the Laboratory of Environmental Physics, Faculty of Science at Qena, Egypt, is proposed to describe the downwind dispersion of pollutants near roadways in different cities in Egypt. The model is based on the Gaussian plume methodology and is used to predict air pollutants' concentrations near roadways. To this end, simple software developed by the authors is presented in this paper; it uses a fully graphical user interface (GUI) and runs on various Windows-based microcomputers. The software interface and code were designed in Microsoft Visual Basic 6.0 based on the Gaussian diffusion equation. This software is developed to predict concentrations of gaseous pollutants (e.g., CO, SO2, NO2 and particulates) at a user-specified receptor grid
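The abstract does not give the model equations, so the following is only a sketch of the textbook Gaussian plume result for an infinite line source with the wind blowing perpendicular to the road, including ground reflection. It is not the authors' software; the function name and all parameter names are assumptions, and the vertical dispersion parameter `sigma_z` is passed in directly rather than derived from downwind distance and stability class.

```python
import math

def line_source_concentration(q, u, sigma_z, z=0.0, h=0.0):
    """Concentration (g/m^3) downwind of an infinite line source.

    q       : emission rate per unit road length, g/(m s)
    u       : wind speed perpendicular to the road, m/s
    sigma_z : vertical dispersion parameter at the receptor distance, m
    z       : receptor height above ground, m
    h       : effective source height, m
    """
    if u <= 0.0 or sigma_z <= 0.0:
        raise ValueError("u and sigma_z must be positive")
    prefactor = q / (math.sqrt(2.0 * math.pi) * sigma_z * u)
    direct = math.exp(-((z - h) ** 2) / (2.0 * sigma_z ** 2))
    reflected = math.exp(-((z + h) ** 2) / (2.0 * sigma_z ** 2))  # ground reflection
    return prefactor * (direct + reflected)

# Ground-level source and receptor: the two exponential terms both equal 1.
c_ground = line_source_concentration(q=1.0, u=2.0, sigma_z=5.0)
```

Evaluating the function over a grid of receptor positions, each with its own `sigma_z`, would give the kind of receptor-grid output the abstract mentions.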
Lam, Lun Tak; Sun, Yi; Davey, Neil; Adams, Rod; Prapopoulou, Maria; Brown, Marc B; Moss, Gary P
2010-06-01
The aim was to employ Gaussian processes to assess mathematically the nature of a skin permeability dataset and to employ these methods, particularly feature selection, to determine the key physicochemical descriptors which exert the most significant influence on percutaneous absorption, and to compare such models with established existing models. Gaussian processes, including automatic relevance detection (GPRARD) methods, were employed to develop models of percutaneous absorption that identified key physicochemical descriptors of percutaneous absorption. Using MatLab software, the statistical performance of these models was compared with single linear networks (SLN) and quantitative structure-permeability relationships (QSPRs). Feature selection methods were used to examine in more detail the physicochemical parameters used in this study. A range of statistical measures to determine model quality were used. The inherently nonlinear nature of the skin data set was confirmed. The Gaussian process regression (GPR) methods yielded predictive models that offered statistically significant improvements over SLN and QSPR models with regard to predictivity (where the rank order was: GPR > SLN > QSPR). Feature selection analysis determined that the best GPR models were those that contained log P, melting point and the number of hydrogen bond donor groups as significant descriptors. Further statistical analysis also found that great synergy existed between certain parameters. It suggested that a number of the descriptors employed were effectively interchangeable, thus questioning the use of models where discrete variables are output, usually in the form of an equation. The use of a nonlinear GPR method produced models with significantly improved predictivity, compared with SLN or QSPR models. Feature selection methods were able to provide important mechanistic information. However, it was also shown that significant synergy existed between certain parameters, and as such it
Extreme-Strike and Small-time Asymptotics for Gaussian Stochastic Volatility Models
Zhang, Xin
2016-01-01
Asymptotic behavior of implied volatility is of our interest in this dissertation. For extreme strike, we consider a stochastic volatility asset price model in which the volatility is the absolute value of a continuous Gaussian process with arbitrary prescribed mean and covariance. By exhibiting a Karhunen-Loève expansion for the integrated variance, and using sharp estimates of the density of a general second-chaos variable, we derive asymptotics for the asset price density for large or smal...
Beam conditions for radiation generated by an electromagnetic Gaussian Schell-model source.
Korotkova, Olga; Salem, Mohamed; Wolf, Emil
2004-06-01
It was shown recently that the basic properties of a fluctuating electromagnetic beam can be derived from knowledge of a 2 x 2 cross-spectral density matrix of the electric field in the source plane. However, not every such matrix represents a source that will generate a beamlike field. We derive conditions that the matrix must satisfy for the source to generate an electromagnetic Gaussian Schell-model beam.
Mixture of Regression Models with Single-Index
Xiang, Sijia; Yao, Weixin
2016-01-01
In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Gaussian Process Noise Modeling with RadVel: a Case Study of HD 3167
Blunt, Sarah; Fulton, Benjamin; Petigura, Erik; Howard, Andrew; Sinukoff, Evan
2018-01-01
Gaussian process regression is a promising technique to account for the presence of correlated noise in radial velocity (RV) time series. We present version 2 of RadVel, an open-source RV fitting toolkit that can model the effects of stellar variability using Gaussian process regression. To illustrate the features of our code and the power of Gaussian process regression, we present a re-analysis of the HD 3167 system (Vanderburg et al. 2016, Christiansen et al. 2017, Gandolfi et al. 2017), using a quasi-periodic kernel to model the stellar activity. We combine RV datasets from HARPS, HARPS-N, FIES, APF, and HIRES in our analysis, yielding a total of 366 RV measurements. Our fit indicates that the RV variation due to stellar activity has an amplitude comparable to those of the planetary signals, confirming that a detailed activity model is needed for this system. We obtain a planet b mass consistent with that of Christiansen et al., but a significantly higher planet c mass and a lower mass for planet d.
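For readers unfamiliar with the quasi-periodic kernel mentioned above, the sketch below builds its covariance matrix with NumPy in one common parameterization: a squared-exponential decay multiplied by a periodic term. The abstract does not state the exact form or hyperparameter names used in RadVel, so the function and the names `amp`, `decay`, `period` and `smooth` are assumptions for illustration only, and this code does not call RadVel.

```python
import numpy as np

def quasi_periodic_kernel(t1, t2, amp, decay, period, smooth):
    """Quasi-periodic covariance between two sets of observation times.

    k(t, t') = amp^2 * exp(-(t - t')^2 / decay^2
                           - sin^2(pi (t - t') / period) / (2 smooth^2))
    """
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    dt = np.subtract.outer(t1, t2)  # matrix of pairwise time differences
    periodic = np.sin(np.pi * dt / period) ** 2 / (2.0 * smooth ** 2)
    return amp ** 2 * np.exp(-dt ** 2 / decay ** 2 - periodic)

# Covariance matrix for four irregularly spaced epochs (days).
times = np.array([0.0, 1.0, 2.5, 7.0])
K = quasi_periodic_kernel(times, times, amp=2.0, decay=10.0, period=3.0, smooth=0.5)
```

In this form `period` plays the role of the stellar rotation period, `decay` sets how quickly the periodic pattern loses coherence, and `smooth` controls how much structure is allowed within one period.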
Nguyen, Ngoc Minh; Corff, Sylvain Le; Moulines, Éric
2017-12-01
This paper focuses on sequential Monte Carlo approximations of smoothing distributions in conditionally linear and Gaussian state spaces. To reduce Monte Carlo variance of smoothers, it is typical in these models to use Rao-Blackwellization: particle approximation is used to sample sequences of hidden regimes while the Gaussian states are explicitly integrated conditional on the sequence of regimes and observations, using variants of the Kalman filter/smoother. The first successful attempt to use Rao-Blackwellization for smoothing extends the Bryson-Frazier smoother for Gaussian linear state space models using the generalized two-filter formula together with Kalman filters/smoothers. More recently, a forward-backward decomposition of smoothing distributions mimicking the Rauch-Tung-Striebel smoother for the regimes combined with backward Kalman updates has been introduced. This paper investigates the benefit of introducing additional rejuvenation steps in all these algorithms to sample at each time instant new regimes conditional on the forward and backward particles. This defines particle-based approximations of the smoothing distributions whose support is not restricted to the set of particles sampled in the forward or backward filter. These procedures are applied to commodity markets which are described using a two-factor model based on the spot price and a convenience yield for crude oil data.
Performance modeling and analysis of parallel Gaussian elimination on multi-core computers
Directory of Open Access Journals (Sweden)
Fadi N. Sibai
2014-01-01
Full Text Available Gaussian elimination is used in many applications and in particular in the solution of systems of linear equations. This paper presents mathematical performance models and analysis of four parallel Gaussian Elimination methods (precisely the Original method and the new Meet in the Middle –MiM– algorithms and their variants with SIMD vectorization) on multi-core systems. Analytical performance models of the four methods are formulated and presented followed by evaluations of these models with modern multi-core systems’ operation latencies. Our results reveal that the four methods generally exhibit good performance scaling with increasing matrix size and number of cores. SIMD vectorization only makes a large difference in performance for a low number of cores. For a large matrix size (n ⩾ 16 K), the performance difference between the MiM and Original methods falls from 16× with four cores to 4× with 16 K cores. The efficiencies of all four methods are low with 1 K cores or more, stressing a major problem of multi-core systems where the network-on-chip and memory latencies are too high in relation to basic arithmetic operations. Thus Gaussian Elimination can greatly benefit from the resources of multi-core systems, but higher performance gains can be achieved if multi-core systems can be designed with lower memory operation, synchronization, and interconnect communication latencies, requirements of utmost importance and challenge in the exascale computing age.
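As a point of reference for the methods compared above, here is a minimal serial sketch of textbook Gaussian elimination with partial pivoting. It is not the MiM algorithm or any of the parallel variants from the paper; the function name is an assumption. The row updates inside the loop are the part that parallel and SIMD implementations distribute across cores and vector lanes.

```python
import numpy as np

def gaussian_elimination(a, b):
    """Solve a @ x = b by forward elimination with partial pivoting
    followed by back substitution. Inputs are copied, not modified."""
    a = np.array(a, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    for k in range(n - 1):
        # Partial pivoting: move the largest entry of column k onto the diagonal.
        p = k + int(np.argmax(np.abs(a[k:, k])))
        if a[p, k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[[k, p]] = a[[p, k]]
            b[[k, p]] = b[[p, k]]
        # Eliminate column k from every row below the pivot row.
        factors = a[k + 1:, k] / a[k, k]
        a[k + 1:, k:] -= np.outer(factors, a[k, k:])
        b[k + 1:] -= factors * b[k]
    if a[n - 1, n - 1] == 0.0:
        raise ValueError("matrix is singular")
    # Back substitution on the resulting upper-triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - a[i, i + 1:] @ x[i + 1:]) / a[i, i]
    return x

A = [[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]]
rhs = [8.0, -11.0, -3.0]
solution = gaussian_elimination(A, rhs)
```

Each elimination step depends on the previous pivot row, so the outer loop is inherently sequential while the work inside it is data-parallel.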
Modeling and forecasting foreign exchange daily closing prices with normal inverse Gaussian
Teneng, Dean
2013-09-01
We fit the normal inverse Gaussian (NIG) distribution to foreign exchange closing prices using the open software package R and select best models by the strategy proposed by Käärik and Umbleja (2011). We observe that daily closing prices (12/04/2008 - 07/08/2012) of CHF/JPY, AUD/JPY, GBP/JPY, NZD/USD, QAR/CHF, QAR/EUR, SAR/CHF, SAR/EUR, TND/CHF and TND/EUR are excellent fits while EGP/EUR and EUR/GBP are good fits with Kolmogorov-Smirnov test p-values of 0.062 and 0.08 respectively. It was impossible to estimate normal inverse Gaussian parameters (by maximum likelihood; computational problem) for JPY/CHF but CHF/JPY was an excellent fit. Thus, while the stochastic properties of an exchange rate can be completely modeled with a probability distribution in one direction, it may be impossible the other way around. We also demonstrate that foreign exchange closing prices can be forecasted with the normal inverse Gaussian (NIG) Lévy process, both in cases where the daily closing prices can and cannot be modeled by NIG distribution.
Gaussian and Affine Approximation of Stochastic Diffusion Models for Interest and Mortality Rates
Directory of Open Access Journals (Sweden)
Marcus C. Christiansen
2013-10-01
Full Text Available In the actuarial literature, it has become common practice to model future capital returns and mortality rates stochastically in order to capture market risk and forecasting risk. Although interest rates often should and mortality rates always have to be non-negative, many authors use stochastic diffusion models with an affine drift term and additive noise. As a result, the diffusion process is Gaussian and, thus, analytically tractable, but negative values occur with positive probability. The argument is that the class of Gaussian diffusions would be a good approximation of the real future development. We challenge that reasoning and study the asymptotics of diffusion processes with affine drift and a general noise term with corresponding diffusion processes with an affine drift term and an affine noise term or additive noise. Our study helps to quantify the error that is made by approximating diffusive interest and mortality rate models with Gaussian diffusions and affine diffusions. In particular, we discuss forward interest and forward mortality rates and the error that approximations cause on the valuation of life insurance claims.
Linear-quadratic-Gaussian control for adaptive optics systems using a hybrid model.
Looze, Douglas P
2009-01-01
This paper presents a linear-quadratic-Gaussian (LQG) design based on the equivalent discrete-time model of an adaptive optics (AO) system. The design model incorporates deformable mirror dynamics, an asynchronous wavefront sensor and zero-order hold operation, and a continuous-time model of the incident wavefront. Using the structure of the discrete-time model, the dimensions of the Riccati equations to be solved are reduced. The LQG controller is shown to improve AO system performance under several conditions.
Inflation with multiple sound speeds: A model of multiple DBI type actions and non-Gaussianities
International Nuclear Information System (INIS)
Cai Yifu; Xia Haiying
2009-01-01
In this Letter we study adiabatic and isocurvature perturbations in the frame of inflation with multiple sound speeds involved. We suggest this scenario can be realized by a number of generalized scalar fields with arbitrary kinetic forms. These scalars have their own sound speeds respectively, so the propagations of field fluctuations are individual. Specifically, we study a model constructed by two DBI type actions. We find that the critical length scale for the freezing of perturbations corresponds to the maximum sound horizon. Moreover, if the mass term of one field is much lighter than that of the other, the entropy perturbation could be quite large and so may give rise to a growth outside sound horizon. At cubic order, we find that the non-Gaussianity of local type is possibly large when entropy perturbations are able to convert into curvature perturbations. We also calculate the non-Gaussianity of equilateral type approximately.
Following a trend with an exponential moving average: Analytical results for a Gaussian model
Grebenkov, Denis S.; Serror, Jeremy
2014-01-01
We investigate how price variations of a stock are transformed into profits and losses (P&Ls) of a trend following strategy. In the frame of a Gaussian model, we derive the probability distribution of P&Ls and analyze its moments (mean, variance, skewness and kurtosis) and asymptotic behavior (quantiles). We show that the asymmetry of the distribution (with often small losses and less frequent but significant profits) is reminiscent of trend following strategies and less dependent on peculiarities of price variations. At short times, trend following strategies admit larger losses than one may anticipate from standard Gaussian estimates, while smaller losses are ensured at longer times. Simple explicit formulas characterizing the distribution of P&Ls illustrate the basic mechanisms of momentum trading, while general matrix representations can be applied to arbitrary Gaussian models. We also compute explicitly annualized risk adjusted P&L and strategy turnover to account for transaction costs. We deduce the trend following optimal timescale and its dependence on both auto-correlation level and transaction costs. Theoretical results are illustrated on the Dow Jones index.
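The mechanism by which price variations become P&Ls can be sketched in a few lines. The code below is an illustrative toy, not the paper's model or notation: it assumes the position at each step equals an exponential moving average of past price increments only, and that the incremental P&L is the position times the next increment. The function name and the smoothing parameter `eta` are assumptions.

```python
import numpy as np

def ema_trend_pnl(returns, eta):
    """Toy trend follower driven by an exponential moving average (EMA).

    signal[t] is the EMA of returns strictly before time t, so the
    position never uses the return it is about to earn.
    pnl[t] = signal[t] * returns[t] is the incremental profit or loss.
    """
    returns = np.asarray(returns, dtype=float)
    signal = np.zeros_like(returns)
    ema = 0.0
    for t, r in enumerate(returns):
        signal[t] = ema                    # position built from past returns only
        ema = (1.0 - eta) * ema + eta * r  # update the EMA after trading
    return signal, signal * returns

# A steady upward drift: the position builds up and the P&L grows with it.
signal, pnl = ema_trend_pnl([1.0, 1.0, 1.0], eta=0.5)

# Independent Gaussian increments, as in a Gaussian model without auto-correlation.
rng = np.random.default_rng(0)
_, pnl_gauss = ema_trend_pnl(rng.normal(size=10_000), eta=0.05)
```

The distribution of `pnl_gauss` can then be inspected empirically, for example through its sample skewness, alongside the analytical results described in the abstract.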
GLIMMIX : Software for estimating mixtures and mixtures of generalized linear models
Wedel, M
2001-01-01
GLIMMIX is a commercial WINDOWS-based computer program that implements the EM algorithm (Dempster, Laird and Rubin 1977) for the estimation of finite mixtures and mixtures of generalized linear models. The program allows for the specification of a number of distributions in the exponential family,
Numerical simulation of slurry jets using mixture model
Directory of Open Access Journals (Sweden)
Wen-xin Huai
2013-01-01
Full Text Available Slurry jets in a static uniform environment were simulated with a two-phase mixture model in which flow-particle interactions were considered. A standard k-ε turbulence model was chosen to close the governing equations. The computational results were in agreement with previous laboratory measurements. The characteristics of the two-phase flow field and the influences of hydraulic and geometric parameters on the distribution of the slurry jets were analyzed on the basis of the computational results. The calculated results reveal that if the initial velocity of the slurry jet is high, the jet spreads less in the radial direction. When the slurry jet is less influenced by the ambient fluid (when the Stokes number St is relatively large), the turbulent kinetic energy k and turbulent dissipation rate ε, which are relatively concentrated around the jet axis, decrease more rapidly after the slurry jet passes through the nozzle. For different values of St, the radial distributions of streamwise velocity and particle volume fraction are both self-similar and fit a Gaussian profile after the slurry jet fully develops. The decay rate of the particle velocity is lower than that of water velocity along the jet axis, and the axial distributions of the centerline particle streamwise velocity are self-similar along the jet axis. The pattern of particle dispersion depends on the Stokes number St. When St = 0.39, the particle dispersion along the radial direction is considerable, and the relative velocity is very low due to the low dynamic response time. When St = 3.08, the dispersion of particles along the radial direction is very little, and most of the particles have high relative velocities along the streamwise direction.
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Amalia, Junita; Purhadi; Otok, Bambang Widjanarko
2017-11-01
The Poisson distribution is a discrete distribution for count data with a single parameter that defines both the mean and the variance. Poisson regression assumes that the mean and variance are equal (equidispersion). Nonetheless, some count data violate this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion leads to underestimated standard errors and, in turn, to incorrect decisions in statistical tests. Paired count data are correlated and follow a bivariate Poisson distribution. If there is over-dispersion, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression for modeling over-dispersed paired count data. The BPIGR model produces a global model for all locations. On the other hand, each location has different geographic, social, cultural and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. Parameter estimates of the GWBPIGR model are obtained by the Maximum Likelihood Estimation (MLE) method, while hypothesis testing of the GWBPIGR model uses the Maximum Likelihood Ratio Test (MLRT) method.
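The over-dispersion that motivates the model above is easy to reproduce numerically. The sketch below is a univariate illustration only, with no regression, no second response and no geographic weighting: it draws counts from a Poisson distribution whose rate is multiplied by an inverse Gaussian variable with mean 1 and variance `tau`. The function and parameter names are assumptions; NumPy calls the inverse Gaussian distribution `wald`.

```python
import numpy as np

def sample_poisson_inverse_gaussian(mu, tau, size, rng):
    """Draw counts from a Poisson-inverse Gaussian mixture.

    The mixing variable V is inverse Gaussian with mean 1 and variance tau
    (for NumPy's wald the variance is mean**3 / scale, so scale = 1 / tau).
    Given V, the count is Poisson with rate mu * V, which gives a marginal
    mean of mu and a marginal variance of mu + tau * mu**2.
    """
    v = rng.wald(mean=1.0, scale=1.0 / tau, size=size)
    return rng.poisson(mu * v)

rng = np.random.default_rng(42)
counts = sample_poisson_inverse_gaussian(mu=5.0, tau=0.5, size=200_000, rng=rng)
sample_mean = counts.mean()
sample_var = counts.var()
```

With `mu = 5` and `tau = 0.5` the theoretical variance is 17.5, well above the mean of 5, which is exactly the situation in which a plain Poisson regression would understate the standard errors.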
mixtools: An R Package for Analyzing Mixture Models
Directory of Open Access Journals (Sweden)
Tatiana Benaglia
2009-10-01
Full Text Available The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models. In the latter category, mixtools provides algorithms for estimating parameters in a wide range of different mixture-of-regression contexts, in multinomial mixtures such as those arising from discretizing continuous multivariate data, in nonparametric situations where the multivariate component densities are completely unspecified, and in semiparametric situations such as a univariate location mixture of symmetric but otherwise unspecified densities. Many of the algorithms of the mixtools package are EM algorithms or are based on EM-like ideas, so this article includes an overview of EM algorithms for finite mixture models.
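The EM idea mentioned in this abstract can be sketched for the simplest case, a two-component univariate normal mixture. This is a minimal Python illustration under stated assumptions, not code from mixtools (which is an R package); the function names, starting values and synthetic data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_normals(data, mu, sigma, weight, n_iter=50):
    """EM for a two-component normal mixture.

    mu, sigma: two starting values each; weight: starting weight of component 0.
    """
    mu, sigma = list(mu), list(sigma)
    n = len(data)
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, mu[0], sigma[0])
            p1 = (1.0 - weight) * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
        # M-step: responsibility-weighted means, standard deviations and weight.
        n0 = sum(resp)
        n1 = n - n0
        mu[0] = sum(r * x for r, x in zip(resp, data)) / n0
        mu[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        sigma[0] = math.sqrt(sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0)
        sigma[1] = math.sqrt(sum((1.0 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1)
        weight = n0 / n
    return mu, sigma, weight


# Synthetic data: equal parts N(0, 1) and N(5, 1), fixed seed for repeatability.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(300)]
mu_hat, sigma_hat, w_hat = em_two_normals(data, mu=[-1.0, 6.0], sigma=[1.0, 1.0], weight=0.5)
print(mu_hat, sigma_hat, w_hat)
```

The sketch uses a fixed number of iterations for brevity; a practical implementation would normally stop when the log-likelihood change falls below a tolerance and would guard against a component collapsing onto a single point.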
DEFF Research Database (Denmark)
Jacobsen, Christian Robert Dahl; Møller, Jesper
2017-01-01
We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...... problems in two-dimensional images....
Gaussian model for emission rate measurement of heated plumes using hyperspectral data
Grauer, Samuel J.; Conrad, Bradley M.; Miguel, Rodrigo B.; Daun, Kyle J.
2018-02-01
This paper presents a novel model for measuring the emission rate of a heated gas plume using hyperspectral data from an FTIR imaging spectrometer. The radiative transfer equation (RTE) is used to relate the spectral intensity of a pixel to presumed Gaussian distributions of volume fraction and temperature within the plume, along a line-of-sight that corresponds to the pixel, whereas previous techniques exclusively presume uniform distributions for these parameters. Estimates of volume fraction and temperature are converted to a column density by integrating the local molecular density along each path. Image correlation velocimetry is then employed on raw spectral intensity images to estimate the volume-weighted normal velocity at each pixel. Finally, integrating the product of velocity and column density along a control surface yields an estimate of the instantaneous emission rate. For validation, emission rate estimates were derived from synthetic hyperspectral images of a heated methane plume, generated using data from a large-eddy simulation. Calculating the RTE with Gaussian distributions of volume fraction and temperature, instead of uniform distributions, improved the accuracy of column density measurement by 14%. Moreover, the mean methane emission rate measured using our approach was within 4% of the ground truth. These results support the use of Gaussian distributions of thermodynamic properties in calculation of the RTE for optical gas diagnostics.
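The final step described here, integrating the product of normal velocity and column density along a control surface, can be sketched as a one-dimensional trapezoidal integral over the pixels of a control line. The function name, the sample values and the units (m, m/s, kg/m^2) are assumptions made for this illustration and are not taken from the paper.

```python
def emission_rate(positions, normal_velocity, column_density):
    """Trapezoidal integral of (velocity * column density) along a control line.

    positions: coordinates along the line (m)
    normal_velocity: velocity normal to the line at each position (m/s)
    column_density: path-integrated mass per unit area at each position (kg/m^2)
    Returns an emission rate in kg/s under these assumed units.
    """
    flux = [u * c for u, c in zip(normal_velocity, column_density)]
    total = 0.0
    for i in range(1, len(positions)):
        ds = positions[i] - positions[i - 1]
        total += 0.5 * (flux[i] + flux[i - 1]) * ds
    return total


# Hypothetical values for five pixels across a plume.
positions = [0.0, 0.5, 1.0, 1.5, 2.0]
velocity = [2.0, 2.0, 2.0, 2.0, 2.0]
density = [0.0, 0.01, 0.02, 0.01, 0.0]
rate = emission_rate(positions, velocity, density)
print(rate)
```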
Application of Gaussian cubature to model two-dimensional population balances
Directory of Open Access Journals (Sweden)
Bałdyga Jerzy
2017-09-01
Full Text Available In many systems of engineering interest the moment transformation of population balance is applied. One of the methods to solve the transformed population balance equations is the quadrature method of moments. It is based on the approximation of the density function in the source term by the Gaussian quadrature so that it preserves the moments of the original distribution. In this work we propose another method to be applied to the multivariate population problem in chemical engineering, namely a Gaussian cubature (GC) technique that applies linear programming for the approximation of the multivariate distribution. Examples of the application of the Gaussian cubature (GC) are presented for four processes typical for chemical engineering applications. The first and second ones are devoted to crystallization modeling with direction-dependent two-dimensional and three-dimensional growth rates, the third one represents drop dispersion accompanied by mass transfer in liquid-liquid dispersions and finally the fourth case regards the aggregation and sintering of particle populations.
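The moment-preserving property of Gaussian quadrature can be illustrated in one dimension. The sketch below uses the standard three-point Gauss-Hermite rule for a standard normal density, which reproduces raw moments up to order five; it does not implement the linear-programming cubature proposed in the paper, and the function names are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the standard normal density:
# nodes at 0 and +/- sqrt(3), with normalized weights 2/3 and 1/6.
nodes = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
weights = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]


def quadrature_moment(k):
    """k-th raw moment computed from the weighted nodes."""
    return sum(w * x ** k for w, x in zip(weights, nodes))


def normal_moment(k):
    """Exact k-th raw moment of N(0, 1): 0 for odd k, (k - 1)!! for even k."""
    if k % 2:
        return 0.0
    result = 1.0
    for j in range(k - 1, 0, -2):
        result *= j
    return result


# Orders 0..5 agree; order 6 does not, since an N-point rule is exact
# only for polynomials up to degree 2N - 1.
for k in range(7):
    print(k, quadrature_moment(k), normal_moment(k))
```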
Oscillatory Behavior of Critical Amplitudes of the Gaussian Model on a Hierarchical Structure
Knezevic, Milan; Knezevic, Dragica
1999-01-01
We studied oscillatory behavior of critical amplitudes for the Gaussian model on a hierarchical structure presented by a modified Sierpinski gasket lattice. This model is known to display non-standard critical behavior on the lattice under study. The leading singular behavior of the correlation length $\xi$ near the critical coupling $K=K_c$ is modulated by a function which is periodic in $\ln|\ln(K_c-K)|$. We have also shown that the common finite-size scaling hypothesis, according to which ...
Evaluating Mixture Modeling for Clustering: Recommendations and Cautions
Steinley, Douglas; Brusco, Michael J.
2011-01-01
This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magidson,…
Recursive Gaussian Process Regression Model for Adaptive Quality Monitoring in Batch Processes
Directory of Open Access Journals (Sweden)
Le Zhou
2015-01-01
Full Text Available In chemical batch processes with slow responses and a long duration, it is time-consuming and expensive to obtain sufficient normal data for statistical analysis. With the persistent accumulation of the newly evolving data, the modelling becomes adequate gradually and the subsequent batches will change slightly owing to the slow time-varying behavior. To efficiently make use of the small amount of initial data and the newly evolving data sets, an adaptive monitoring scheme based on the recursive Gaussian process (RGP) model is designed in this paper. Based on the initial data, a Gaussian process model and the corresponding SPE statistic are first constructed. When the new batches of data are included, a strategy based on the RGP model is used to choose the proper data for model updating. The performance of the proposed method is finally demonstrated by a penicillin fermentation batch process and the result indicates that the proposed monitoring scheme is effective for adaptive modelling and online monitoring.
Content-adaptive pentary steganography using the multivariate generalized Gaussian cover model
Sedighi, Vahid; Fridrich, Jessica; Cogranne, Rémi
2015-03-01
The vast majority of steganographic schemes for digital images stored in the raster format limit the amplitude of embedding changes to the smallest possible value. In this paper, we investigate the possibility to further improve the empirical security by allowing the embedding changes in highly textured areas to have a larger amplitude and thus embedding there a larger payload. Our approach is entirely model driven in the sense that the probabilities with which the cover pixels should be changed by a certain amount are derived from the cover model to minimize the power of an optimal statistical test. The embedding consists of two steps. First, the sender estimates the cover model parameters, the pixel variances, when modeling the pixels as a sequence of independent but not identically distributed generalized Gaussian random variables. Then, the embedding change probabilities for changing each pixel by 1 or 2, which can be transformed to costs for practical embedding using syndrome-trellis codes, are computed by solving a pair of non-linear algebraic equations. Using rich models and selection-channel-aware features, we compare the security of our scheme based on the generalized Gaussian model with pentary versions of two popular embedding algorithms: HILL and S-UNIWARD.
A mechanistic model for rational design of optimal cellulase mixtures.
Levine, Seth E; Fox, Jerome M; Clark, Douglas S; Blanch, Harvey W
2011-11-01
A model-based framework is described that permits the optimal composition of cellulase enzyme mixtures to be found for lignocellulose hydrolysis. The rates of hydrolysis are shown to be dependent on the nature of the substrate. For bacterial microcrystalline cellulose (BMCC) hydrolyzed by a ternary cellulase mixture of EG2, CBHI, and CBHII, the optimal predicted mixture was 1:0:1 EG2:CBHI:CBHII at 24 h and 1:1:0 at 72 h, at loadings of 10 mg enzyme per g substrate. The model was validated with measurements of soluble cello-oligosaccharide production from BMCC during both single enzyme and mixed enzyme hydrolysis. Three-dimensional diagrams illustrating cellulose conversion were developed for mixtures of EG2, CBHI, CBHII acting on BMCC and predicted for other substrates with a range of substrate properties. Model predictions agreed well with experimental values of conversion after 24 h for a variety of enzyme mixtures. The predicted mixture performances for substrates with varying properties demonstrated the effects of initial degree of polymerization (DP) and surface area on the performance of cellulase mixtures. For substrates with a higher initial DP, endoglucanase enzymes accounted for a larger fraction of the optimal mixture. Substrates with low surface areas showed significantly reduced hydrolysis rates regardless of mixture composition. These insights, along with the quantitative predictions, demonstrate the utility of this model-based framework for optimizing cellulase mixtures. Copyright © 2011 Wiley Periodicals, Inc.
Gaussian covariance graph models accounting for correlated marker effects in genome-wide prediction.
Martínez, C A; Khare, K; Rahman, S; Elzo, M A
2017-10-01
Several statistical models used in genome-wide prediction assume uncorrelated marker allele substitution effects, but it is known that these effects may be correlated. In statistics, graphical models have been identified as a useful tool for covariance estimation in high-dimensional problems and it is an area that has recently experienced a great expansion. In Gaussian covariance graph models (GCovGM), the joint distribution of a set of random variables is assumed to be Gaussian and the pattern of zeros of the covariance matrix is encoded in terms of an undirected graph G. In this study, methods adapting the theory of GCovGM to genome-wide prediction were developed (Bayes GCov, Bayes GCov-KR and Bayes GCov-H). In simulated data sets, improvements in correlation between phenotypes and predicted breeding values and accuracies of predicted breeding values were found. Our models account for correlation of marker effects and permit to accommodate general structures as opposed to models proposed in previous studies, which consider spatial correlation only. In addition, they allow incorporation of biological information in the prediction process through its use when constructing graph G, and their extension to the multi-allelic loci case is straightforward. © 2017 Blackwell Verlag GmbH.
Discrimination of numerical proportions: A comparison of binomial and Gaussian models.
Raidvee, Aire; Lember, Jüri; Allik, Jüri
2017-01-01
Observers discriminated the numerical proportion of two sets of elements (N = 9, 13, 33, and 65) that differed either by color or orientation. According to the standard Thurstonian approach, the accuracy of proportion discrimination is determined by irreducible noise in the nervous system that stochastically transforms the number of presented visual elements onto a continuum of psychological states representing numerosity. As an alternative to this customary approach, we propose a Thurstonian-binomial model, which assumes discrete perceptual states, each of which is associated with a certain visual element. It is shown that the probability β with which each visual element can be noticed and registered by the perceptual system can explain data of numerical proportion discrimination at least as well as the continuous Thurstonian-Gaussian model, and better, if the greater parsimony of the Thurstonian-binomial model is taken into account using AIC model selection. We conclude that Gaussian and binomial models represent two different fundamental principles (internal noise vs. using only a fraction of available information), which are both plausible descriptions of visual perception.
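AIC model selection, as used in this comparison, penalizes each model's maximized log-likelihood by its number of free parameters. The sketch below shows the standard formula AIC = 2k - 2 ln L; the log-likelihood values and parameter counts are invented for illustration and are not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of free parameters).
fits = {
    "gaussian": (-120.4, 2),
    "binomial": (-120.9, 1),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With these made-up numbers the one-parameter model wins despite a slightly lower likelihood, which is the kind of parsimony trade-off the abstract refers to.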
A non-Gaussian Ornstein-Uhlenbeck model for pricing wind power futures
DEFF Research Database (Denmark)
Benth, Fred Espen; Pircalabu, Anca
2018-01-01
The recent introduction of wind power futures written on the German wind power production index has brought with it new interesting challenges in terms of modeling and pricing. Some particularities of this product are the strong seasonal component embedded in the underlying, the fact that the wind...... index is bounded from both above and below, and also that the futures are settled against a synthetically generated spot index. Here, we consider the non-Gaussian Ornstein-Uhlenbeck type processes proposed by Barndorff-Nielsen and Shephard (2001) in the context of modeling the wind power production...... index. We discuss the properties of the model and estimation of the model parameters. Further, the model allows for an analytical formula for pricing wind power futures. We provide an empirical study, where the model is calibrated to 37 years of German wind power production index that is synthetically...
Lattice Models of Amphiphile and Solvent Mixtures.
Brindle, David
Available from UMI in association with The British Library. Materials based on amphiphilic molecules have a wide range of industrial applications and are of fundamental importance in the structure of many biological systems. Their importance derives from their behaviour as surface-active agents in solubilization applications and because of their ability to form systems with varying degrees of structural order such as micelles, bilayers and liquid crystal phases. The nature of the molecular ordering is of importance both during the processing of these materials and in their final application. A Monte Carlo simulation of a three dimensional lattice model of an amphiphile and solvent mixture has been developed as an extension of earlier work in two dimensions. In the earlier investigation the simulation was carried out with three segment amphiphiles on a two dimensional lattice and cluster size distributions were determined for a range of temperatures, amphiphile concentrations and intermolecular interaction energies. In the current work, a wider range of structures are observed including micelles, bilayers and a vesicle. The structures are studied as a function of temperature, chain length, amphiphile concentration and intermolecular interaction energies. Clusters are characterised according to their shape, size and surface roughness. A detailed temperature-concentration phase diagram is presented for a system with four segment amphiphiles. The phase diagram shows a critical micelle concentration (c.m.c.) at low amphiphile concentrations and a transition from a bicontinuous to lamellar region at amphiphile concentrations around 50%. At high amphiphile concentrations, there is some evidence for the formation of a gel. The results obtained question the validity of current models of the c.m.c. The Monte Carlo simulations require extensive computing power and the simulation was carried out on a transputer array, where the parallel architecture allows high speed. The
Segmenting Continuous Motions with Hidden Semi-markov Models and Gaussian Processes.
Nakamura, Tomoaki; Nagai, Takayuki; Mochihashi, Daichi; Kobayashi, Ichiro; Asoh, Hideki; Kaneko, Masahide
2017-01-01
Humans divide perceived continuous information into segments to facilitate recognition. For example, humans can segment speech waves into recognizable morphemes. Analogously, continuous motions are segmented into recognizable unit actions. People can divide continuous information into segments without using explicit segment points. This capacity for unsupervised segmentation is also useful for robots, because it enables them to flexibly learn languages, gestures, and actions. In this paper, we propose a Gaussian process-hidden semi-Markov model (GP-HSMM) that can divide continuous time series data into segments in an unsupervised manner. Our proposed method consists of a generative model based on the hidden semi-Markov model (HSMM), the emission distributions of which are Gaussian processes (GPs). Continuous time series data is generated by connecting segments generated by the GP. Segmentation can be achieved by using forward filtering-backward sampling to estimate the model's parameters, including the lengths and classes of the segments. In an experiment using the CMU motion capture dataset, we tested GP-HSMM with motion capture data containing simple exercise motions; the results of this experiment showed that the proposed GP-HSMM was comparable with other methods. We also conducted an experiment using karate motion capture data, which is more complex than exercise motion capture data; in this experiment, the segmentation accuracy of GP-HSMM was 0.92, which outperformed other methods.
A novel multitarget model of radiation-induced cell killing based on the Gaussian distribution.
Zhao, Lei; Mi, Dong; Sun, Yeqing
2017-05-07
The multitarget version of the traditional target theory based on the Poisson distribution is still used to describe the dose-survival curves of cells after ionizing radiation in radiobiology and radiotherapy. However, since the usual ionizing radiation damage is the result of two sequential stochastic processes, the probability distribution of the damage number per cell should follow a compound Poisson distribution, such as Neyman's distribution of type A (N. A.). Given that the Gaussian distribution can be regarded as an approximation of the N. A. in the case of high flux, a multitarget model based on the Gaussian distribution is proposed to describe the cell inactivation effects in low linear energy transfer (LET) radiation with high dose-rate. Theoretical analysis and experimental data fitting indicate that the present theory is superior to the traditional multitarget model and similar to the Linear-Quadratic (LQ) model in describing the biological effects of low-LET radiation with high dose-rate, and the parameter ratio in the present model can be used as an alternative indicator to reflect the radiation damage and radiosensitivity of the cells. Copyright © 2017 Elsevier Ltd. All rights reserved.
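For orientation, the two reference models named in this abstract have well-known textbook forms: the traditional multitarget single-hit curve S(D) = 1 - (1 - exp(-D/D0))^n and the Linear-Quadratic curve S(D) = exp(-alpha*D - beta*D^2). The sketch below evaluates only these two; the Gaussian-based model proposed in the paper is not given in the abstract and is not reproduced here. Parameter values are hypothetical.

```python
import math


def multitarget_survival(dose, d0, n_targets):
    """Traditional multitarget single-hit survival: 1 - (1 - exp(-D/D0))^n."""
    return 1.0 - (1.0 - math.exp(-dose / d0)) ** n_targets


def lq_survival(dose, alpha, beta):
    """Linear-Quadratic survival: exp(-alpha*D - beta*D^2)."""
    return math.exp(-alpha * dose - beta * dose ** 2)


# Hypothetical parameters, chosen only to produce a readable table.
for dose in [0.0, 2.0, 4.0, 6.0]:
    s_mt = multitarget_survival(dose, d0=1.5, n_targets=3)
    s_lq = lq_survival(dose, alpha=0.3, beta=0.03)
    print(f"D = {dose:.1f} Gy  multitarget = {s_mt:.4f}  LQ = {s_lq:.4f}")
```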
Segmenting Continuous Motions with Hidden Semi-markov Models and Gaussian Processes
Directory of Open Access Journals (Sweden)
Tomoaki Nakamura
2017-12-01
Full Text Available Humans divide perceived continuous information into segments to facilitate recognition. For example, humans can segment speech waves into recognizable morphemes. Analogously, continuous motions are segmented into recognizable unit actions. People can divide continuous information into segments without using explicit segment points. This capacity for unsupervised segmentation is also useful for robots, because it enables them to flexibly learn languages, gestures, and actions. In this paper, we propose a Gaussian process-hidden semi-Markov model (GP-HSMM) that can divide continuous time series data into segments in an unsupervised manner. Our proposed method consists of a generative model based on the hidden semi-Markov model (HSMM), the emission distributions of which are Gaussian processes (GPs). Continuous time series data is generated by connecting segments generated by the GP. Segmentation can be achieved by using forward filtering-backward sampling to estimate the model's parameters, including the lengths and classes of the segments. In an experiment using the CMU motion capture dataset, we tested GP-HSMM with motion capture data containing simple exercise motions; the results of this experiment showed that the proposed GP-HSMM was comparable with other methods. We also conducted an experiment using karate motion capture data, which is more complex than exercise motion capture data; in this experiment, the segmentation accuracy of GP-HSMM was 0.92, which outperformed other methods.
Lattice Boltzmann model for thermal binary-mixture gas flows.
Kang, Jinfen; Prasianakis, Nikolaos I; Mantzaras, John
2013-05-01
A lattice Boltzmann model for thermal gas mixtures is derived. The kinetic model is designed in a way that combines properties of two previous literature models, namely, (a) a single-component thermal model and (b) a multicomponent isothermal model. A comprehensive platform for the study of various practical systems involving multicomponent mixture flows with large temperature differences is constructed. The governing thermohydrodynamic equations include the mass, momentum, energy conservation equations, and the multicomponent diffusion equation. The present model is able to simulate mixtures with adjustable Prandtl and Schmidt numbers. Validation in several flow configurations with temperature and species concentration ratios up to nine is presented.
A Multilevel Mixture IRT Model with an Application to DIF
Cho, Sun-Joo; Cohen, Allan S.
2010-01-01
Mixture item response theory models have been suggested as a potentially useful methodology for identifying latent groups formed along secondary, possibly nuisance dimensions. In this article, we describe a multilevel mixture item response theory (IRT) model (MMixIRTM) that allows for the possibility that this nuisance dimensionality may function…
Modeling amplitude SAR image with the Cauchy-Rayleigh mixture
Peng, Qiangqiang; Du, Qingyu; Yao, Yinwei; Huang, Huang
2017-10-01
In this paper, we introduce a novel mixture model of the SAR amplitude image, which is proposed as an approximation to the heavy-tailed Rayleigh model. The limitation of the heavy-tailed Rayleigh model in SAR image application is discussed. We also present an expectation-maximization (EM) algorithm based parameter estimation method for the Cauchy-Rayleigh mixture. We test the new model on some simulated data in order to confirm that it is an appropriate approximation to the heavy-tailed Rayleigh model. The performance is evaluated by some statistic values (cumulative square errors (CSE) 0.99 and Kolmogorov-Smirnov distance (K-S) the performance of the proposed mixture model is tested on some real SAR images and compared with other models, including the heavy-tailed Rayleigh and Nakagami mixture models. The result indicates that the proposed model can be an optional statistical model for amplitude SAR images.
Fast fitting of non-Gaussian state-space models to animal movement data via Template Model Builder
DEFF Research Database (Denmark)
Albertsen, Christoffer Moesgaard; Whoriskey, Kim; Yurkowski, David
2015-01-01
recommend using the Laplace approximation combined with automatic differentiation (as implemented in the novel R package Template Model Builder; TMB) for the fast fitting of continuous-time multivariate non-Gaussian SSMs. Through Argos satellite tracking data, we demonstrate that the use of continuous...... are able to estimate additional parameters compared to previous methods, all without requiring a substantial increase in computational time. The model implementation is made available through the R package argosTrack....
Gaussian Process Model for Antarctic Surface Mass Balance and Ice Core Site Selection
White, P. A.; Reese, S.; Christensen, W. F.; Rupper, S.
2017-12-01
Surface mass balance (SMB) is an important factor in the estimation of sea level change, and data are collected to estimate models for prediction of SMB on the Antarctic ice sheet. Using Favier et al.'s (2013) quality-controlled aggregate data set of SMB field measurements, a fully Bayesian spatial model is posed to estimate Antarctic SMB and propose new field measurement locations. Utilizing Nearest-Neighbor Gaussian process (NNGP) models, SMB is estimated over the Antarctic ice sheet. An Antarctic SMB map is rendered using this model and is compared with previous estimates. A prediction uncertainty map is created to identify regions of high SMB uncertainty. The model estimates net SMB to be 2173 Gton yr$^{-1}$ with 95% credible interval (2021, 2331) Gton yr$^{-1}$. On average, these results suggest lower Antarctic SMB and higher uncertainty than previously purported [Vaughan et al. (1999); Van de Berg et al. (2006); Arthern, Winebrenner and Vaughan (2006); Bromwich et al. (2004); Lenaerts et al. (2012)], even though this model utilizes significantly more observations than previous models. Using the Gaussian process' uncertainty and model parameters, we propose 15 new measurement locations for field study utilizing a maximin space-filling, error-minimizing design; these potential measurements are identified to minimize future estimation uncertainty. Using currently accepted Antarctic mass balance estimates and our SMB estimate, we estimate net mass loss [Shepherd et al. (2012); Jacob et al. (2012)]. Furthermore, we discuss modeling details for both space-time data and combining field measurement data with output from mathematical models using the NNGP framework.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, as indicated by its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611
Using convex quadratic programming to model random media with Gaussian random fields
International Nuclear Information System (INIS)
Quintanilla, John A.; Jones, W. Max
2007-01-01
Excursion sets of Gaussian random fields (GRFs) have been frequently used in the literature to model two-phase random media with measurable phase autocorrelation functions. The goal of successful modeling is finding the optimal field autocorrelation function that best approximates the prescribed phase autocorrelation function. In this paper, we present a technique which uses convex quadratic programming to find the best admissible field autocorrelation function under a prescribed discretization. Unlike previous methods, this technique efficiently optimizes over all admissible field autocorrelation functions, instead of optimizing only over a predetermined parametrized family. The results from using this technique indicate that the GRF model is significantly more versatile than observed in previous studies. An application to modeling a base-catalyzed tetraethoxysilane aerogel system given small-angle neutron scattering data is also presented
Estimating Compressive Strength of High Performance Concrete with Gaussian Process Regression Model
Directory of Open Access Journals (Sweden)
Nhat-Duc Hoang
2016-01-01
Full Text Available This research carries out a comparative study to investigate a machine learning solution that employs Gaussian Process Regression (GPR) for modeling compressive strength of high-performance concrete (HPC). This machine learning approach is utilized to establish the nonlinear functional mapping between the compressive strength and HPC ingredients. To train and verify the aforementioned prediction model, a data set containing 239 HPC experimental tests, recorded from an overpass construction project in Danang City (Vietnam), has been collected for this study. Based on experimental outcomes, prediction results of the GPR model are superior to those of the Least Squares Support Vector Machine and the Artificial Neural Network. Furthermore, the GPR model is strongly recommended for estimating HPC strength because this method demonstrates good learning performance and can inherently express prediction outputs coupled with prediction intervals.
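The remark that GPR yields predictions together with prediction intervals can be illustrated with a very small one-dimensional example. This is a minimal sketch under strong assumptions (a single input, a squared-exponential kernel with fixed hyperparameters, a toy sine target); it is not the model or data from the study, and all names are hypothetical.

```python
import math


def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-2):
    """Posterior mean and approximate 95% interval for the latent function at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(x_new, a) for a in x_train]
    alpha = solve(K, y_train)   # (K + noise*I)^-1 y
    v = solve(K, k_star)        # (K + noise*I)^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    half_width = 1.96 * math.sqrt(max(var, 0.0))
    return mean, (mean - half_width, mean + half_width)


# Toy data: five samples of sin(x); not concrete-strength measurements.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]
mean_near, interval_near = gp_predict(x_train, y_train, 2.0)
mean_far, interval_far = gp_predict(x_train, y_train, 10.0)
print(mean_near, interval_near)
print(mean_far, interval_far)
```

The interval is narrow close to the training inputs and widens to roughly the prior range far from them, which is the behavior that makes the interval useful as a warning about extrapolation.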
Bayesian sensitivity analysis of a 1D vascular model with Gaussian process emulators.
Melis, Alessandro; Clayton, Richard H; Marzo, Alberto
2017-12-01
One-dimensional models of the cardiovascular system can capture the physics of pulse waves but involve many parameters. Since these may vary among individuals, patient-specific models are difficult to construct. Sensitivity analysis can be used to rank model parameters by their effect on outputs and to quantify how uncertainty in parameters influences output uncertainty. This type of analysis is often conducted with a Monte Carlo method, where large numbers of model runs are used to assess input-output relations. The aim of this study was to demonstrate the computational efficiency of variance-based sensitivity analysis of 1D vascular models using Gaussian process emulators, compared to a standard Monte Carlo approach. The methodology was tested on four vascular networks of increasing complexity to analyse its scalability. The computational time needed to perform the sensitivity analysis with an emulator was reduced by 99.96% compared to a Monte Carlo approach. Despite the reduced computational time, sensitivity indices obtained using the two approaches were comparable. The scalability study showed that the number of mechanistic simulations needed to train a Gaussian process for sensitivity analysis was of the order O(d), rather than the O(d×10^3) needed for Monte Carlo analysis (where d is the number of parameters in the model). The efficiency of this approach, combined with the capacity to estimate the impact of uncertain parameters on model outputs, will enable development of patient-specific models of the vascular system, and has the potential to produce results with clinical relevance. © 2017 The Authors International Journal for Numerical Methods in Biomedical Engineering Published by John Wiley & Sons Ltd.
laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R
Directory of Open Access Journals (Sweden)
Robert B. Gramacy
2016-08-01
Full Text Available Gaussian process (GP) regression models make for powerful predictors in out of sample exercises, but cubic runtimes for dense matrix decompositions severely limit the size of data - training and testing - on which they can be deployed. That means that in computer experiment, spatial/geo-physical, and machine learning contexts, GPs no longer enjoy privileged status as data sets continue to balloon in size. We discuss an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures. The laGP approach can be seen as an update on the spatial statistical method of local kriging neighborhoods. We briefly review the method, and provide extensive illustrations of the features in the package through worked-code examples. The appendix covers custom building options for symmetric multi-processor and graphical processing units, and built-in wrapper routines that automate distribution over a simple network of workstations.
Finite size scaling of the Higgs-Yukawa model near the Gaussian fixed point
Energy Technology Data Exchange (ETDEWEB)
Chu, David Y.J.; Lin, C.J. David [National Chiao-Tung Univ., Hsinchu, Taiwan (China); Jansen, Karl [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC; Knippschild, Bastian [HISKP, Bonn (Germany); Nagy, Attila [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC; Humboldt-Univ. Berlin (Germany)
2016-12-15
We study the scaling properties of Higgs-Yukawa models. Using the technique of Finite-Size Scaling, we are able to derive scaling functions that describe the observables of the model in the vicinity of a Gaussian fixed point. A feasibility study of our strategy is performed for the pure scalar theory in the weak-coupling regime. Choosing the on-shell renormalisation scheme allows us to fit the scaling functions to lattice data with only a small number of fit parameters. These formulae can be used to determine the universality of the observed phase transitions, and thus play an essential role in future investigations of Higgs-Yukawa models, in particular in the strong Yukawa coupling region.
Non-Gaussianity and statistical anisotropy from vector field populated inflationary models
Dimastrogiovanni, Emanuela; Matarrese, Sabino; Riotto, Antonio
2010-01-01
We present a review of vector field models of inflation and, in particular, of the statistical anisotropy and non-Gaussianity predictions of models with SU(2) vector multiplets. Non-Abelian gauge groups yield a richer set of predictions than Abelian ones, mostly because of the presence of vector-field self-interactions. Primordial vector fields can violate isotropy, leaving their imprint on the comoving curvature fluctuations zeta at late times. We provide the analytic expressions of the correlation functions of zeta up to fourth order and an analysis of their amplitudes and shapes. The statistical anisotropy signatures expected in these models are important and, potentially, the anisotropic contributions to the bispectrum and the trispectrum can overcome the isotropic parts.
Communication: Modeling electrolyte mixtures with concentration dependent dielectric permittivity
Chen, Hsieh; Panagiotopoulos, Athanassios Z.
2018-01-01
We report a new implicit-solvent simulation model for electrolyte mixtures based on the concept of concentration dependent dielectric permittivity. A combining rule is found to predict the dielectric permittivity of electrolyte mixtures based on the experimentally measured dielectric permittivity for pure electrolytes as well as the mole fractions of the electrolytes in mixtures. Using grand canonical Monte Carlo simulations, we demonstrate that this approach allows us to accurately reproduce the mean ionic activity coefficients of NaCl in NaCl-CaCl2 mixtures at ionic strengths up to I = 3M. These results are important for thermodynamic studies of geologically relevant brines and physiological fluids.
A Dirichlet process mixture model for brain MRI tissue classification.
Ferreira da Silva, Adelino R
2007-04-01
Accurate classification of magnetic resonance images according to tissue type or region of interest has become a critical requirement in diagnosis, treatment planning, and cognitive neuroscience. Several authors have shown that finite mixture models give excellent results in the automated segmentation of MR images of the normal human brain. However, the performance and robustness of finite mixture models deteriorate when the models have to deal with a variety of anatomical structures. In this paper, we propose a nonparametric Bayesian model for tissue classification of MR images of the brain. The model, known as a Dirichlet process mixture model, uses Dirichlet process priors to overcome the limitations of current parametric finite mixture models. To validate the accuracy and robustness of our method we present the results of experiments carried out on simulated MR brain scans, as well as on real MR image data. The results are compared with similar results from other well-known MRI segmentation methods.
Bayesian Sensitivity Analysis of a Cardiac Cell Model Using a Gaussian Process Emulator
Chang, Eugene T Y; Strong, Mark; Clayton, Richard H
2015-01-01
Models of electrical activity in cardiac cells have become important research tools as they can provide a quantitative description of detailed and integrative physiology. However, cardiac cell models have many parameters, and how uncertainties in these parameters affect the model output is difficult to assess without undertaking large numbers of model runs. In this study we show that a surrogate statistical model of a cardiac cell model (the Luo-Rudy 1991 model) can be built using Gaussian process (GP) emulators. Using this approach we examined how eight outputs describing the action potential shape and action potential duration restitution depend on six inputs, which we selected to be the maximum conductances in the Luo-Rudy 1991 model. We found that the GP emulators could be fitted to a small number of model runs, and behaved as would be expected based on the underlying physiology that the model represents. We have shown that an emulator approach is a powerful tool for uncertainty and sensitivity analysis in cardiac cell models. PMID:26114610
Directory of Open Access Journals (Sweden)
Robert B. Gramacy
2007-06-01
The tgp package for R is a tool for fully Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes with jumps to the limiting linear model. Special cases also implemented include Bayesian linear models, linear CART, stationary separable and isotropic Gaussian processes. In addition to inference and posterior prediction, the package supports the (sequential) design of experiments under these models paired with several objective criteria. 1-d and 2-d plotting, with higher dimension projection and slice capabilities, and tree drawing functions (requiring the maptree and combinat packages), are also provided for visualization of tgp objects.
Directory of Open Access Journals (Sweden)
Hyungsuk Tak
2017-06-01
Rgbp is an R package that provides estimates and verifiable confidence intervals for random effects in two-level conjugate hierarchical models for overdispersed Gaussian, Poisson, and binomial data. Rgbp models aggregate data from k independent groups summarized by observed sufficient statistics for each random effect, such as sample means, possibly with covariates. Rgbp uses approximate Bayesian machinery with unique improper priors for the hyper-parameters, which leads to good repeated sampling coverage properties for random effects. A special feature of Rgbp is an option that generates synthetic data sets to check whether the interval estimates for random effects actually meet the nominal confidence levels. Additionally, Rgbp provides inference statistics for the hyper-parameters, e.g., regression coefficients.
Modeling dynamic functional connectivity using a wishart mixture model
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer Hougaard; Schmidt, Mikkel Nørgaard
2017-01-01
Dynamic functional connectivity (dFC) has recently become a popular way of tracking the temporal evolution of the brain's functional integration. However, there does not seem to be a consensus on how to choose the complexity, i.e. number of brain states, and the time-scale of the dynamics, i.e. the window length. In this work we use the Wishart Mixture Model (WMM) as a probabilistic model for dFC based on variational inference. The framework admits arbitrary window lengths and number of dynamic components and includes the static one-component model as a special case. We exploit that the WMM framework provides model selection by quantifying models' generalization to new data. We use this to quantify the number of states within a prespecified window length. We further propose a heuristic procedure for choosing the window length based on contrasting for each window length the predictive
Directory of Open Access Journals (Sweden)
Yun Wang
2016-01-01
The gamma Gaussian inverse Wishart cardinalized probability hypothesis density (GGIW-CPHD) algorithm has been widely used to track group targets in the presence of cluttered measurements and missing detections. A multiple-model GGIW-CPHD algorithm based on the best-fitting Gaussian approximation method (BFG) and the strong tracking filter (STF) is proposed to address the increase in the tracking error of the GGIW-CPHD algorithm when the group targets are maneuvering. The best-fitting Gaussian approximation method is used to implement the fusion of multiple models, and the strong tracking filter is used to correct the predicted covariance matrix of the GGIW component. The corresponding likelihood functions are derived to update the probabilities of the multiple tracking models. The simulation results show that the proposed tracking algorithm, MM-GGIW-CPHD, can effectively deal with the combination/spawning of groups and that the tracking error of group targets in the maneuvering stage is decreased.
Schifferstein, H.N.J.
1996-01-01
The Equiratio Mixture Model predicts the psychophysical function for an equiratio mixture type on the basis of the psychophysical functions for the unmixed components. The model reliably estimates the sweetness of mixtures of sugars and sugar-alcohols, but is unable to predict intensity for
Vakanski, A; Ferguson, J M; Lee, S
2016-12-01
The objective of the proposed research is to develop a methodology for modeling and evaluation of human motions, which will potentially benefit patients undertaking physical rehabilitation therapy (e.g., following a stroke or due to other medical conditions). The ultimate aim is to allow patients to perform home-based rehabilitation exercises using a sensory system for capturing the motions, where an algorithm will retrieve the trajectories of a patient's exercises, will perform data analysis by comparing the performed motions to a reference model of prescribed motions, and will send the analysis results to the patient's physician with recommendations for improvement. The modeling approach employs an artificial neural network, consisting of layers of recurrent neuron units and layers of neuron units for estimating a mixture density function over the spatio-temporal dependencies within the human motion sequences. Input data are sequences of motions related to an exercise prescribed by a physiotherapist to a patient, and recorded with a motion capture system. An autoencoder subnet is employed for reducing the dimensionality of captured sequences of human motions, complemented with a mixture density subnet for probabilistic modeling of the motion data using a mixture of Gaussian distributions. The proposed neural network architecture produced a model for sets of human motions represented with a mixture of Gaussian density functions. The mean log-likelihood of observed sequences was employed as a performance metric in evaluating the consistency of a subject's performance relative to the reference dataset of motions. A publicly available dataset of human motions captured with Microsoft Kinect was used for validation of the proposed method. The article presents a novel approach for modeling and evaluation of human motions with a potential application in home-based physical therapy and rehabilitation. The described approach employs the recent progress in the field of
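The performance metric named in this abstract, the mean log-likelihood of observations under a Gaussian mixture, can be sketched in a few lines. This is an illustrative one-dimensional version with fixed mixture parameters; in the described approach the parameters would be produced by the mixture density subnet, which is not reproduced here. All function names are assumptions.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * z * z


def mixture_logpdf(x, weights, means, stds):
    """Log-density of a Gaussian mixture, computed with log-sum-exp
    so that very small component densities do not underflow."""
    terms = [
        math.log(w) + gaussian_logpdf(x, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def mean_log_likelihood(xs, weights, means, stds):
    """Average mixture log-density over a sequence of observations."""
    return sum(mixture_logpdf(x, weights, means, stds) for x in xs) / len(xs)
```

A sequence that stays close to the reference mixture components receives a higher (less negative) mean log-likelihood than one that drifts away from them, which is what makes the quantity usable as a consistency score.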
Vakanski, A; Ferguson, JM; Lee, S
2016-01-01
Objective: The objective of the proposed research is to develop a methodology for modeling and evaluation of human motions, which will potentially benefit patients undertaking physical rehabilitation therapy (e.g., following a stroke or due to other medical conditions). The ultimate aim is to allow patients to perform home-based rehabilitation exercises using a sensory system for capturing the motions, where an algorithm will retrieve the trajectories of a patient’s exercises, will perform data analysis by comparing the performed motions to a reference model of prescribed motions, and will send the analysis results to the patient’s physician with recommendations for improvement. Methods: The modeling approach employs an artificial neural network, consisting of layers of recurrent neuron units and layers of neuron units for estimating a mixture density function over the spatio-temporal dependencies within the human motion sequences. Input data are sequences of motions related to an exercise prescribed by a physiotherapist to a patient, and recorded with a motion capture system. An autoencoder subnet is employed for reducing the dimensionality of captured sequences of human motions, complemented with a mixture density subnet for probabilistic modeling of the motion data using a mixture of Gaussian distributions. Results: The proposed neural network architecture produced a model for sets of human motions represented with a mixture of Gaussian density functions. The mean log-likelihood of observed sequences was employed as a performance metric in evaluating the consistency of a subject’s performance relative to the reference dataset of motions. A publicly available dataset of human motions captured with Microsoft Kinect was used for validation of the proposed method. Conclusion: The article presents a novel approach for modeling and evaluation of human motions with a potential application in home-based physical therapy and rehabilitation. The described approach
Inference in Graphical Gaussian Models with Edge and Vertex Symmetries with the gRc Package for R
DEFF Research Database (Denmark)
Højsgaard, Søren; Lauritzen, Steffen L
2007-01-01
In this paper we present the R package gRc for statistical inference in graphical Gaussian models in which symmetry restrictions have been imposed on the concentration or partial correlation matrix. The models are represented by coloured graphs where parameters associated with edges or vertices...
Klein, A.A.B.; Melard, G.; Zahaf, T.
2000-01-01
The Fisher information matrix is of fundamental importance for the analysis of parameter estimation of time series models. In this paper the exact information matrix of a multivariate Gaussian time series model expressed in state space form is derived. A computationally efficient procedure is used
Optimal multigrid algorithms for the massive Gaussian model and path integrals
International Nuclear Information System (INIS)
Brandt, A.; Galun, M.
1996-01-01
Multigrid algorithms are presented which, in addition to eliminating the critical slowing down, can also eliminate the "volume factor". The elimination of the volume factor removes the need to produce many independent fine-grid configurations for averaging out their statistical deviations, by averaging over the many samples produced on coarse grids during the multigrid cycle. Thermodynamic limits of observables can be calculated to relative accuracy $\varepsilon_r$ in just $O(\varepsilon_r^{-2})$ computer operations, where $\varepsilon_r$ is the error relative to the standard deviation of the observable. In this paper, we describe in detail the calculation of the susceptibility in the one-dimensional massive Gaussian model, which is also a simple example of path integrals. Numerical experiments show that the susceptibility can be calculated to relative accuracy $\varepsilon_r$ in about $8\,\varepsilon_r^{-2}$ random number generations, independent of the mass size.
Statistical properties of a Laguerre-Gaussian Schell-model beam in turbulent atmosphere.
Chen, Rong; Liu, Lin; Zhu, Shijun; Wu, Gaofeng; Wang, Fei; Cai, Yangjian
2014-01-27
The Laguerre-Gaussian Schell-model (LGSM) beam was recently proposed in theory [Opt. Lett. 38, 91 (2013); Opt. Lett. 38, 1814 (2013)]. In this paper, we study the propagation of a LGSM beam in turbulent atmosphere. Analytical expressions for the cross-spectral density and the second-order moments of the Wigner distribution function of a LGSM beam in turbulent atmosphere are derived. The statistical properties, such as the degree of coherence and the propagation factor, of a LGSM beam in turbulent atmosphere are studied in detail. It is found that a LGSM beam with larger mode order n is less affected by turbulence than a LGSM beam with smaller mode order n or a GSM beam under certain conditions, which will be useful in free-space optical communications.
Fast Pencil Beam Dose Calculation for Proton Therapy Using a Double-Gaussian Beam Model.
da Silva, Joakim; Ansorge, Richard; Jena, Rajesh
2015-01-01
The highly conformal dose distributions produced by scanned proton pencil beams (PBs) are more sensitive to motion and anatomical changes than those produced by conventional radiotherapy. The ability to calculate the dose in real-time as it is being delivered would enable, for example, online dose monitoring, and is therefore highly desirable. We have previously described an implementation of a PB algorithm running on graphics processing units (GPUs) intended specifically for online dose calculation. Here, we present an extension to the dose calculation engine employing a double-Gaussian beam model to better account for the low-dose halo. To the best of our knowledge, it is the first such PB algorithm for proton therapy running on a GPU. We employ two different parameterizations for the halo dose, one describing the distribution of secondary particles from nuclear interactions found in the literature and one relying on directly fitting the model to Monte Carlo simulations of PBs in water. Despite the large width of the halo contribution, we show how in either case the second Gaussian can be included while prolonging the calculation of the investigated plans by no more than 16%, or the calculation of the most time-consuming energy layers by about 25%. Furthermore, the calculation time is relatively unaffected by the parameterization used, which suggests that these results should hold also for different systems. Finally, since the implementation is based on an algorithm employed by a commercial treatment planning system, it is expected that with adequate tuning, it should be able to reproduce the halo dose from a general beam line with sufficient accuracy.
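As an illustration of what a double-Gaussian beam model means for the lateral dose profile, the sketch below adds a wide, low-weight second Gaussian (the halo) to a narrow core Gaussian. The parameter names (`sigma_core`, `sigma_halo`, `halo_weight`) and the weighting convention are assumptions chosen for clarity; the abstract does not give the authors' parameterizations, and this is not their GPU implementation.

```python
import math


def gaussian_2d(r, sigma):
    """Normalized, radially symmetric 2D Gaussian evaluated at radius r."""
    return math.exp(-r * r / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)


def double_gaussian(r, sigma_core, sigma_halo, halo_weight):
    """Lateral profile as a weighted sum of a core and a halo Gaussian.

    halo_weight is the (assumed) fraction of the integral carried by the
    halo term, so the profile still integrates to 1 over the plane.
    """
    core = (1.0 - halo_weight) * gaussian_2d(r, sigma_core)
    halo = halo_weight * gaussian_2d(r, sigma_halo)
    return core + halo
```

Far from the beam axis the halo term dominates, which is the low-dose contribution a single-Gaussian model misses. Because the halo is much wider than the core, it also touches many more voxels per beam, which is the source of the extra calculation time discussed in the abstract.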
Fast pencil beam dose calculation for proton therapy using a double-Gaussian beam model
Directory of Open Access Journals (Sweden)
Joakim da Silva
2015-12-01
The highly conformal dose distributions produced by scanned proton pencil beams are more sensitive to motion and anatomical changes than those produced by conventional radiotherapy. The ability to calculate the dose in real time as it is being delivered would enable, for example, online dose monitoring, and is therefore highly desirable. We have previously described an implementation of a pencil beam algorithm running on graphics processing units (GPUs) intended specifically for online dose calculation. Here we present an extension to the dose calculation engine employing a double-Gaussian beam model to better account for the low-dose halo. To the best of our knowledge, it is the first such pencil beam algorithm for proton therapy running on a GPU. We employ two different parametrizations for the halo dose, one describing the distribution of secondary particles from nuclear interactions found in the literature and one relying on directly fitting the model to Monte Carlo simulations of pencil beams in water. Despite the large width of the halo contribution, we show how in either case the second Gaussian can be included whilst prolonging the calculation of the investigated plans by no more than 16%, or the calculation of the most time-consuming energy layers by about 25%. Further, the calculation time is relatively unaffected by the parametrization used, which suggests that these results should hold also for different systems. Finally, since the implementation is based on an algorithm employed by a commercial treatment planning system, it is expected that with adequate tuning, it should be able to reproduce the halo dose from a general beam line with sufficient accuracy.
Dynamic Socialized Gaussian Process Models for Human Behavior Prediction in a Health Social Network.
Shen, Yelong; Phan, NhatHai; Xiao, Xiao; Jin, Ruoming; Sun, Junfeng; Piniewski, Brigitte; Kil, David; Dou, Dejing
2016-11-01
Modeling and predicting human behaviors, such as the level and intensity of physical activity, is a key to preventing the cascade of obesity and helping spread healthy behaviors in a social network. In our conference paper, we have developed a social influence model, named Socialized Gaussian Process (SGP), for socialized human behavior modeling. Instead of explicitly modeling social influence as individuals' behaviors influenced by their friends' previous behaviors, SGP models the dynamic social correlation as the result of social influence. The SGP model naturally incorporates a personal behavior factor and a social correlation factor (i.e., the homophily principle: friends tend to perform similar behaviors) into a unified model. It also models the social influence factor (i.e., an individual's behavior can be affected by his/her friends) implicitly in dynamic social correlation schemes. The detailed experimental evaluation has shown that the SGP model achieves better prediction accuracy than most baseline methods. However, a Socialized Random Forest model may perform better at the beginning compared with the SGP model. One of the main reasons is that the dynamic social correlation function is purely based on the users' sequential behaviors without considering other physical activity-related features. To address this issue, we further propose a novel "multi-feature SGP model" (mfSGP) which improves the SGP model by using multiple physical activity-related features in the dynamic social correlation learning. Extensive experimental results illustrate that the mfSGP model clearly outperforms all other models in terms of prediction accuracy and running time.
International Nuclear Information System (INIS)
Barker, C.D.
1979-07-01
The Gaussian Plume Diffusion Model, using Smith's scheme for $\sigma_z$ and various models for $\sigma_y$, is compared with measured values of the location and strength of the maximum ground level concentration taken during the Tilbury and Northfleet experiments. The position of maximum ground level concentration ($x_m$) is found to be relatively insensitive to $\sigma_y$, and Smith's model for $\sigma_z$ is found to predict $x_m$ on average to within 50% for plume heights less than 200-400 m (depending on atmospheric stability). Several models for $\sigma_y$ are examined by comparing predicted and observed values for the normalised maximum ground level concentration ($X_m$), and a modified form of Moore's model for $\sigma_y$ is found to give the best overall fit, on average to within 30%. Gifford's release duration dependent model for $\sigma_y$ is found to consistently underestimate $X_m$ by 35-45%. This comparison is only a partial validation of the models described above and suggestions are made as to where further work is required. (author)
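The quantities compared in this abstract can be illustrated with the standard Gaussian plume expression for the ground-level concentration on the plume centreline, with full reflection at the ground. In the sketch below, `sigma_y_of_x` and `sigma_z_of_x` are hypothetical placeholder functions supplied by the caller; they are not Smith's, Moore's, or Gifford's schemes, which the abstract does not specify. The grid search simply locates the downwind distance and value of the maximum.

```python
import math


def ground_level_centerline(q, u, sigma_y, sigma_z, h):
    """Ground-level, centreline concentration of a Gaussian plume.

    q: emission rate, u: wind speed, h: effective plume height,
    sigma_y / sigma_z: lateral and vertical spread at the chosen distance.
    Standard textbook form assuming full reflection at the ground.
    """
    return q / (math.pi * u * sigma_y * sigma_z) * math.exp(
        -h * h / (2.0 * sigma_z * sigma_z)
    )


def locate_maximum(q, u, h, sigma_y_of_x, sigma_z_of_x, x_values):
    """Grid search for the distance and value of the maximum concentration."""
    def conc(x):
        return ground_level_centerline(q, u, sigma_y_of_x(x), sigma_z_of_x(x), h)

    x_max = max(x_values, key=conc)
    return x_max, conc(x_max)
```

When the two spreads are proportional to each other, the maximum falls where the vertical spread equals the plume height divided by √2, independent of the lateral spread. That is consistent with the reported insensitivity of the position of the maximum to the lateral spread model, while the magnitude of the maximum still depends on it.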
Duan, Leo L; Wang, Xia; Clancy, John P; Szczesniak, Rhonda D
2018-01-01
A two-level Gaussian process (GP) joint model is proposed to improve personalized prediction of medical monitoring data. The proposed model is applied to jointly analyze multiple longitudinal biomedical outcomes, including continuous measurements and binary outcomes, to achieve better prediction in disease progression. At the population level of the hierarchy, two independent GPs are used to capture the nonlinear trends in both the continuous biomedical marker and the binary outcome, respectively; at the individual level, a third GP, which is shared by the longitudinal measurement model and the longitudinal binary model, induces the correlation between these two model components and strengthens information borrowing across individuals. The proposed model is particularly advantageous in personalized prediction. It is applied to the motivating clinical data on cystic fibrosis disease progression, for which lung function measurements and onset of acute respiratory events are monitored jointly throughout each patient's clinical course. The results from both the simulation studies and the cystic fibrosis data application suggest that the inclusion of the shared individual-level GPs under the joint model framework leads to important improvements in personalized disease progression prediction.
Gaussian Process Regression (GPR) Representation in Predictive Model Markup Language (PMML).
Park, J; Lechevalier, D; Ak, R; Ferguson, M; Law, K H; Lee, Y-T T; Rachuri, S
2017-01-01
This paper describes Gaussian process regression (GPR) models presented in predictive model markup language (PMML). PMML is an extensible-markup-language (XML)-based standard language used to represent data-mining and predictive analytic models, as well as pre- and post-processed data. The previous PMML version, PMML 4.2, did not provide capabilities for representing probabilistic (stochastic) machine-learning algorithms that are widely used for constructing predictive models taking the associated uncertainties into consideration. The newly released PMML version 4.3, which includes the GPR model, provides new features: confidence bounds and distribution for the predictive estimations. Both features are needed to establish the foundation for uncertainty quantification analysis. Among various probabilistic machine-learning algorithms, GPR has been widely used for approximating a target function because of its capability of representing complex input and output relationships without predefining a set of basis functions, and predicting a target output with uncertainty quantification. GPR is being applied in various manufacturing data-analytics applications, which necessitates representing this model in a standardized form for easy and rapid employment. In this paper, we present a GPR model and its representation in PMML. Furthermore, we demonstrate a prototype using a real data set in the manufacturing domain.
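The two quantities a GPR model must expose for uncertainty quantification, the predictive mean and the predictive variance, can be sketched in pure Python for a one-dimensional input. This is a minimal illustration with an assumed squared-exponential kernel and fixed hyperparameters; it does not read or write PMML, and none of the names below come from the PMML specification.

```python
import math


def rbf(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice for this sketch)."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def solve_upper(low, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Zero-mean GP predictive mean and variance at a single new input."""
    n = len(x_train)
    k = [
        [rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    low = cholesky(k)
    alpha = solve_upper(low, solve_lower(low, y_train))  # K^{-1} y
    k_star = [rbf(x_new, xi) for xi in x_train]
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)
```

An approximate 95% confidence bound then follows as the mean plus or minus 1.96 times the square root of the variance. The variance shrinks toward the noise level at training inputs and returns to the prior variance far from the data.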
About a solvable mean field model of a Gaussian spin glass
Barra, Adriano; Genovese, Giuseppe; Guerra, Francesco; Tantari, Daniele
2014-04-01
In a series of papers, we have studied a modified Hopfield model of a neural network, with learned words characterized by a Gaussian distribution. The model can be represented as a bipartite spin glass, with one party described by dichotomic Ising spins, and the other party by continuous spin variables, with an a priori Gaussian distribution. By application of standard interpolation methods, we have found it useful to compare the neural network model (bipartite) on one side with two spin glass models, each monopartite, on the other. Of these, the first is the usual Sherrington-Kirkpatrick model, the second is a spin glass model with continuous spins and inbuilt highly nonlinear smooth cut-off interactions. This model is an invaluable laboratory for testing all techniques which have been useful in the study of spin glasses. The purpose of this paper is to give a synthetic description of the most peculiar aspects, by stressing the necessary novelties in the treatment. In particular, it will be shown that the control of the infinite volume limit, according to the well-known Guerra-Toninelli strategy, additionally requires considering the involvement of the cut-off interaction in the interpolation procedure. Moreover, the control of the ergodic region, the annealed case, cannot be directly achieved through the standard application of the Borel-Cantelli lemma, but requires prior modification of the interaction. This remark could find useful application in other cases. The replica symmetric expression for the free energy can be easily reached through a suitable version of the doubly stochastic interpolation technique. However, this model has the unique property that the fully broken replica symmetry ansatz can be explicitly calculated. A very simple sum rule connects the general expression of the fully broken free energy trial function with the replica symmetric one. The definite sign of the error term shows that the replica solution is optimal. Then
Dynamic Socialized Gaussian Process Models for Human Behavior Prediction in a Health Social Network
Shen, Yelong; Phan, NhatHai; Xiao, Xiao; Jin, Ruoming; Sun, Junfeng; Piniewski, Brigitte; Kil, David; Dou, Dejing
2016-01-01
Modeling and predicting human behaviors, such as the level and intensity of physical activity, is a key to preventing the cascade of obesity and helping spread healthy behaviors in a social network. In our conference paper, we have developed a social influence model, named Socialized Gaussian Process (SGP), for socialized human behavior modeling. Instead of explicitly modeling social influence as individuals' behaviors influenced by their friends' previous behaviors, SGP models the dynamic social correlation as the result of social influence. The SGP model naturally incorporates a personal behavior factor and a social correlation factor (i.e., the homophily principle: friends tend to perform similar behaviors) into a unified model. It also models the social influence factor (i.e., an individual's behavior can be affected by his/her friends) implicitly in dynamic social correlation schemes. The detailed experimental evaluation has shown that the SGP model achieves better prediction accuracy than most baseline methods. However, a Socialized Random Forest model may perform better at the beginning compared with the SGP model. One of the main reasons is that the dynamic social correlation function is purely based on the users' sequential behaviors without considering other physical activity-related features. To address this issue, we further propose a novel “multi-feature SGP model” (mfSGP) which improves the SGP model by using multiple physical activity-related features in the dynamic social correlation learning. Extensive experimental results illustrate that the mfSGP model clearly outperforms all other models in terms of prediction accuracy and running time. PMID:27746515
Beta Regression Finite Mixture Models of Polarization and Priming
Smithson, Michael; Merkle, Edgar C.; Verkuilen, Jay
2011-01-01
This paper describes the application of finite-mixture general linear models based on the beta distribution to modeling response styles, polarization, anchoring, and priming effects in probability judgments. These models, in turn, enhance our capacity for explicitly testing models and theories regarding the aforementioned phenomena. The mixture…
Dirichlet Process Gaussian Mixture Model for Activity Discovery in Smart Homes with Ambient Sensors
Nguyen, Thuong; Le Viet Duc, Duc Viet; Zhang, Quing; Karunanithi, Mohan
2017-01-01
Most existing approaches to activity recognition in smart homes rely on supervised learning with well-annotated sensor data. However, obtaining such labeled data is challenging and sometimes impossible, especially for senior citizens who may suffer various mental health
Directory of Open Access Journals (Sweden)
Yanuar Risah Prayogi
2015-07-01
The experiments used five noise types at seven SNR levels. The noise types were f16, hfchannel, pink, volvo, and white, and the SNR levels were clean, 25, 20, 15, 10, 5, and 0 dB. The results show that the proposed method is superior for the majority of speakers. The proposed method is also superior for all noise types and at almost all SNR levels. The proposed method shows an average accuracy 14.69% higher than the MFCC method, 2.74% higher than MFCC+Spectral Subtraction (SS), and 6.4% higher than MFCC+Wiener.
On-line signature verification using Gaussian Mixture Models and small-sample learning strategies
Directory of Open Access Journals (Sweden)
Gabriel Jaime Zapata-Zapata
2016-01-01
The article addresses the problem of training on-line signature verification systems when the number of samples available for training is low, since in most real situations the number of signatures available per user is very limited. The article evaluates nine different classification strategies based on Gaussian mixture models (GMM) and the strategy known as the universal background model (UBM), which is designed to work under conditions with fewer samples. The GMM learning strategies include the conventional Expectation-Maximization algorithm and a Bayesian approach based on variational learning. The signatures are characterized mainly in terms of the velocities and accelerations of the users' handwriting patterns. The results show that when the system is evaluated in a genuine vs. impostor configuration, the GMM-UBM method is able to maintain an accuracy above 93%, even in cases where only 20% of the available samples (equivalent to 5 signatures) are used for training, while the combination of a Bayesian UBM model with a Support Vector Machine (SVM), a model known as GMM-Supervector, achieves 99% accuracy when the training samples exceed 20. On the other hand, when a real environment is simulated in which impostor samples are not available and use is made of
A univocal definition of the neuronal soma morphology using Gaussian mixture models
Luengo-Sanchez, Sergio; Bielza, Concha; Benavides-Piccione, Ruth; Fernaud-Espinosa, Isabel; DeFelipe, Javier; Larrañaga, Pedro
2015-01-01
The definition of the soma is fuzzy, as there is no clear line demarcating the soma of the labeled neurons and the origin of the dendrites and axon. Thus, the morphometric analysis of the neuronal soma is highly subjective. In this paper, we provide a mathematical definition and an automatic segmentation method to delimit the neuronal soma. We applied this method to the characterization of pyramidal cells, which are the most abundant neurons in the cerebral cortex. Since there are no benchmarks with which to compare the proposed procedure, we validated the goodness of this automatic segmentation method against manual segmentation by neuroanatomists to set up a framework for comparison. We concluded that there were no significant differences between automatically and manually segmented somata, i.e., the proposed procedure segments the neurons similarly to how a neuroanatomist does. It also provides univocal, justifiable and objective cutoffs. Thus, this study is a means of characterizing pyramidal neurons in order to objectively compare the morphometry of the somata of these neurons in different cortical areas and species. PMID:26578898
Single-step emulation of nonlinear fiber-optic link with gaussian mixture model
DEFF Research Database (Denmark)
Borkowski, Robert; Doberstein, Andy; Haisch, Hansjörg
2015-01-01
We use a fast and low-complexity statistical signal processing method to emulate nonlinear noise in fiber links. The proposed emulation technique stands in good agreement with the numerical NLSE simulation for 32 Gbaud DP-16QAM nonlinear transmission.
On Diagnostic Checking of Vector ARMA-GARCH Models with Gaussian and Student-t Innovations
Directory of Open Access Journals (Sweden)
Yongning Wang
2013-04-01
This paper focuses on the diagnostic checking of vector ARMA (VARMA) models with multivariate GARCH errors. For a fitted VARMA-GARCH model with Gaussian or Student-t innovations, we derive the asymptotic distributions of autocorrelation matrices of the cross-product vector of standardized residuals. This is different from the traditional approach that employs only the squared series of standardized residuals. We then study two portmanteau statistics, called Q1(M) and Q2(M), for model checking. A residual-based bootstrap method is provided and demonstrated as an effective way to approximate the diagnostic checking statistics. Simulations are used to compare the performance of the proposed statistics with other methods available in the literature. In addition, we also investigate the effect of GARCH shocks on checking a fitted VARMA model. Empirical sizes and powers of the proposed statistics are investigated and the results suggest a procedure of using jointly Q1(M) and Q2(M) in diagnostic checking. The bivariate time series of FTSE 100 and DAX index returns is used to illustrate the performance of the proposed portmanteau statistics. The results show that it is important to consider the cross-product series of standardized residuals and GARCH effects in model checking.
Bayesian modeling of JET Li-BES for edge electron density profiles using Gaussian processes
Kwak, Sehyun; Svensson, Jakob; Brix, Mathias; Ghim, Young-Chul; JET Contributors Collaboration
2015-11-01
A Bayesian model for the JET lithium beam emission spectroscopy (Li-BES) system has been developed to infer edge electron density profiles. The 26 spatial channels measure emission profiles with ~15 ms temporal resolution and ~1 cm spatial resolution. The lithium I (2p-2s) line radiation in an emission spectrum is calculated using a multi-state model, which expresses collisions between the neutral lithium beam atoms and the plasma particles as a set of differential equations. The emission spectrum is described in the model including photon and electronic noise, spectral line shapes, interference filter curves, and relative calibrations. This spectral modeling removes the need for separate background measurements when calculating the intensity of the line radiation. Gaussian processes are applied to model both the emission spectrum and the edge electron density profile, and the electron temperature used to calculate all the rate coefficients is obtained from the JET high resolution Thomson scattering (HRTS) system. The posterior distributions of the edge electron density profile are explored via a numerical technique and Markov chain Monte Carlo (MCMC) sampling. See the Appendix of F. Romanelli et al., Proceedings of the 25th IAEA Fusion Energy Conference 2014, Saint Petersburg, Russia.
Bivariate Gaussian bridges: directional factorization of diffusion in Brownian bridge models.
Kranstauber, Bart; Safi, Kamran; Bartumeus, Frederic
2014-01-01
In recent years high resolution animal tracking data has become the standard in movement ecology. The Brownian Bridge Movement Model (BBMM) is a widely adopted approach to describe animal space use from such high resolution tracks. One of the underlying assumptions of the BBMM is isotropic diffusive motion between consecutive locations, i.e. invariant with respect to the direction. Here we propose to relax this often unrealistic assumption by separating the Brownian motion variance into two directional components, one parallel and one orthogonal to the direction of the motion. Our new model, the Bivariate Gaussian bridge (BGB), tracks movement heterogeneity across time. Using the BGB and identifying directed and non-directed movement within a trajectory resulted in more accurate utilisation distributions compared to dynamic Brownian bridges, especially for trajectories with a non-isotropic diffusion, such as directed movement or Lévy like movements. We evaluated our model with simulated trajectories and observed tracks, demonstrating that the improvement of our model scales with the directional correlation of a correlated random walk. We find that many of the animal trajectories do not adhere to the assumptions of the BBMM. The proposed model improves accuracy when describing the space use both in simulated correlated random walks as well as observed animal tracks. Our novel approach is implemented and available within the "move" package for R.
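As a rough illustration of the idea in the abstract above, the sketch below computes the expected position and the two directional variances of a bridge between two consecutive fixes. It relies on the standard Brownian bridge result that the variance at time t scales as t(T - t)/T; the function name, the argument names, and the omission of location error are assumptions made for illustration, not the parameterisation used by the "move" package.

```python
def bridge_mean_and_variances(start, end, t, T, var_parallel, var_orthogonal):
    """Mean position and directional variances of a bridge at time t in [0, T].

    start, end     : (x, y) locations at times 0 and T
    var_parallel   : diffusion variance along the direction of motion (assumed given)
    var_orthogonal : diffusion variance perpendicular to it (assumed given)
    Location error is ignored in this sketch.
    """
    alpha = t / T
    # The expected position moves linearly from start to end.
    mean = (
        start[0] + alpha * (end[0] - start[0]),
        start[1] + alpha * (end[1] - start[1]),
    )
    # Brownian bridge scaling: zero at both fixes, largest at the midpoint.
    scale = T * alpha * (1.0 - alpha)
    return mean, var_parallel * scale, var_orthogonal * scale


if __name__ == "__main__":
    mean, v_par, v_orth = bridge_mean_and_variances((0.0, 0.0), (4.0, 2.0), 5.0, 10.0, 2.0, 0.5)
    print(mean, v_par, v_orth)
```

Setting the two variances equal recovers the isotropic assumption that the abstract describes for the BBMM.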
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Kontogeorgis, Georgios; Michelsen, Michael Locht
2011-01-01
In Part I of this series of articles, the study of H2S mixtures has been presented with CPA. In this study the phase behavior of CO2 containing mixtures is modeled. Binary mixtures with water, alcohols, glycols and hydrocarbons are investigated. Both phase equilibria (vapor–liquid and liquid–liqu...
Narukawa, Masaki; Nohara, Katsuhito
2018-04-01
This study proposes an estimation approach to panel count data, truncated at zero, in order to apply a contingent behavior travel cost method to revealed and stated preference data collected via a web-based survey. We develop zero-truncated panel Poisson mixture models by focusing on respondents who visited a site. In addition, we introduce an inverse Gaussian distribution to unobserved individual heterogeneity as an alternative to a popular gamma distribution, making it possible to capture effectively the long tail typically observed in trip data. We apply the proposed method to estimate the impact on tourism benefits in Fukushima Prefecture as a result of the Fukushima Nuclear Power Plant No. 1 accident. Copyright © 2018 Elsevier Ltd. All rights reserved.
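A minimal sketch of the zero-truncation step mentioned in the abstract above: when only respondents with at least one visit are observed, the Poisson probabilities are renormalised over the positive counts. The function names are illustrative, and the sketch covers only a plain zero-truncated Poisson, not the panel mixture with inverse Gaussian heterogeneity that the study proposes.

```python
import math


def zero_truncated_poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y | Y > 0) for a Poisson(lam) count; nonzero only for y = 1, 2, ..."""
    if y < 1:
        return 0.0
    # Work in log space so large counts do not overflow the factorial.
    log_poisson = -lam + y * math.log(lam) - math.lgamma(y + 1)
    # Dividing by P(Y > 0) = 1 - exp(-lam) renormalises over positive counts.
    return math.exp(log_poisson) / (-math.expm1(-lam))


def zero_truncated_poisson_mean(lam: float) -> float:
    """E[Y | Y > 0] = lam / (1 - exp(-lam))."""
    return lam / (-math.expm1(-lam))


if __name__ == "__main__":
    print(zero_truncated_poisson_pmf(1, 2.5), zero_truncated_poisson_mean(2.5))
```

The truncated mean is always larger than the rate parameter itself, which is why ignoring the truncation would overstate trip demand.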
Supervised latent linear Gaussian process latent variable model for dimensionality reduction.
Jiang, Xinwei; Gao, Junbin; Wang, Tianjiang; Zheng, Lihong
2012-12-01
The Gaussian process (GP) latent variable model (GPLVM) has the capability of learning a low-dimensional manifold from highly nonlinear data of high dimensionality. As an unsupervised dimensionality reduction (DR) algorithm, the GPLVM has been successfully applied in many areas. However, in its current setting, GPLVM is unable to use label information, which is available for many tasks; therefore, researchers proposed many kinds of extensions to the GPLVM in order to utilize extra information, among which the supervised GPLVM (SGPLVM) has shown better performance compared with other SGPLVM extensions. However, the SGPLVM suffers from high computational complexity. Bearing in mind the issues of complexity and the need to incorporate additionally available information, in this paper, we propose a novel SGPLVM, called supervised latent linear GPLVM (SLLGPLVM). Our approach is motivated by both SGPLVM and supervised probabilistic principal component analysis (SPPCA). The proposed SLLGPLVM can be viewed as an appropriate compromise between the SGPLVM and the SPPCA. Furthermore, it is also appropriate to interpret the SLLGPLVM as a semiparametric regression model for supervised DR by making use of the GP to model the unknown smooth link function. Complexity analysis and experiments show that the developed SLLGPLVM outperforms the SGPLVM not only in computational complexity but also in accuracy. We also compared the SLLGPLVM with two classical supervised classifiers, i.e., a GP classifier and a support vector machine, to illustrate the advantages of the proposed model.
Schoups, G.; Vrugt, J.A.
2010-01-01
Estimation of parameter and predictive uncertainty of hydrologic models has traditionally relied on several simplifying assumptions. Residual errors are often assumed to be independent and to be adequately described by a Gaussian probability distribution with a mean of zero and a constant variance.
DEFF Research Database (Denmark)
Andreasen, Martin Møller; Christensen, Bent Jesper
This paper suggests a new and easy approach to estimate linear and non-linear dynamic term structure models with latent factors. We impose no distributional assumptions on the factors and they may therefore be non-Gaussian. The novelty of our approach is to use many observables (yields or bonds p...
Estimating the number of sources in a noisy convolutive mixture using BIC
DEFF Research Database (Denmark)
Olsson, Rasmus Kongsgaard; Hansen, Lars Kai
2004-01-01
of the sources. The algorithm, known as ‘KaBSS’, employs a Gaussian linear model for the mixture, i.e. AR models for the sources, linear mixing filters and a white Gaussian noise model. Using an EM algorithm, which invokes the Kalman smoother in the E-step, all model parameters are estimated and the exact...
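To make the model-selection step named in the title above concrete, the sketch below scores candidate source counts with the Bayesian information criterion, BIC = -2 log L + k log n, and picks the smallest score. The log-likelihoods and parameter counts in the example are made-up placeholders; in the paper they would come from the fitted model, which is not reproduced here.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion; lower values indicate a preferred model."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def select_by_bic(candidates, n_samples):
    """Return (best_order, scores).

    candidates maps a model order (e.g. a number of sources) to a tuple of
    (maximised log-likelihood, number of free parameters).
    """
    scores = {order: bic(ll, k, n_samples) for order, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical fits for 1, 2 and 3 sources on 1000 samples.
    fits = {1: (-1500.0, 10), 2: (-1400.0, 20), 3: (-1395.0, 30)}
    print(select_by_bic(fits, n_samples=1000))
```

In this placeholder example the third source improves the likelihood only slightly, so the penalty term outweighs the gain and the two-source model is selected.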
Normal Inverse Gaussian Model-Based Image Denoising in the NSCT Domain
Directory of Open Access Journals (Sweden)
Jian Jia
2015-01-01
The objective of image denoising is to retain useful details while removing as much noise as possible to recover an original image from its noisy version. This paper proposes a novel normal inverse Gaussian (NIG) model-based method that uses a Bayesian estimator to carry out image denoising in the nonsubsampled contourlet transform (NSCT) domain. In the proposed method, the NIG model is first used to describe the distributions of the image transform coefficients of each subband in the NSCT domain. Then, the corresponding threshold function is derived from the model using Bayesian maximum a posteriori probability estimation theory. Finally, the optimal linear interpolation thresholding algorithm (OLI-Shrink) is employed to guarantee a gentler thresholding effect. The results of comparative experiments indicate that the denoising performance of our proposed method in terms of peak signal-to-noise ratio is superior to that of several state-of-the-art methods, including BLS-GSM, K-SVD, BivShrink, and BM3D. Further, the proposed method achieves structural similarity (SSIM) index values that are comparable to those of the block-matching 3D transformation (BM3D) method.
Schoups, Gerrit; Vrugt, Jasper A.
2010-05-01
Estimation of parameter and predictive uncertainty of hydrologic models usually relies on the assumption of additive residual errors that are independent and identically distributed according to a normal distribution with a mean of zero and a constant variance. Here, we investigate to what extent estimates of parameter and predictive uncertainty are affected when these assumptions are relaxed. Parameter and predictive uncertainty are estimated by Markov chain Monte Carlo sampling from a generalized likelihood function that accounts for correlation, heteroscedasticity, and non-normality of residual errors. Application to rainfall-runoff modeling using daily data from a humid basin reveals that: (i) residual errors are much better described by a heteroscedastic, first-order auto-correlated error model with a Laplacian density characterized by heavier tails than a Gaussian density, and (ii) proper representation of the statistical distribution of residual errors yields tighter predictive uncertainty bands and more physically realistic parameter estimates that are less sensitive to the particular time period used for inference. The latter is especially useful for regionalization and extrapolation of parameter values to ungauged basins. Application to daily rainfall-runoff data from a semi-arid basin shows that allowing skew in the error distribution yields improved estimates of predictive uncertainty when flows are close to zero.
Neural network-based nonlinear model predictive control vs. linear quadratic gaussian control
Cho, C.; Vance, R.; Mardi, N.; Qian, Z.; Prisbrey, K.
1997-01-01
One problem with the application of neural networks to the multivariable control of mineral and extractive processes is determining whether and how to use them. The objective of this investigation was to compare neural network control to more conventional strategies and to determine if there are any advantages in using neural network control in terms of set-point tracking, rise time, settling time, disturbance rejection and other criteria. The procedure involved developing neural network controllers using both historical plant data and simulation models. Various control patterns were tried, including both inverse and direct neural network plant models. These were compared to state space controllers that are, by nature, linear. For grinding and leaching circuits, a nonlinear neural network-based model predictive control strategy was superior to a state space-based linear quadratic gaussian controller. The investigation pointed out the importance of incorporating state space into neural networks by making them recurrent, i.e., feeding certain output state variables into input nodes in the neural network. It was concluded that neural network controllers can have better disturbance rejection, set-point tracking, rise time, settling time and lower set-point overshoot, and it was also concluded that neural network controllers can be more reliable and easy to implement in complex, multivariable plants.
Dipole saturated absorption modeling in gas phase: Dealing with a Gaussian beam
Dupré, Patrick
2018-01-01
With the advent of new accurate and sensitive spectrometers, cf. combining optical cavities (for absorption enhancement), the requirement for reliable molecular transition modeling is becoming more pressing. Unfortunately, there is no trivial approach which can provide a definitive formalism allowing us to solve the coupled systems of equations associated with nonlinear absorption. Here, we propose a general approach to deal with any spectral shape of the electromagnetic field interacting with a molecular species under saturation conditions. The development is specifically applied to Gaussian-shaped beams. To make the analytical expressions tractable, approximations are proposed. Finally, two or three numerical integrations are required for describing the Lamb-dip profile. The implemented model allows us to describe the saturated absorption under low pressure conditions where the broadening by the transit-time may dominate the collision rates. The model is applied to two specific overtone transitions of the molecular acetylene. The simulated line shapes are discussed versus the collision and the transit-time rates. The specific collisional and collision-free regimes are illustrated, while the Rabi frequency controls the intermediate regime. We illustrate how to recover the input parameters by fitting the simulated profiles.
Regularized estimation of large-scale gene association networks using graphical Gaussian models.
Krämer, Nicole; Schäfer, Juliane; Boulesteix, Anne-Laure
2009-11-24
Graphical Gaussian models are popular tools for the estimation of (undirected) gene association networks from microarray data. A key issue when the number of variables greatly exceeds the number of samples is the estimation of the matrix of partial correlations. Since the (Moore-Penrose) inverse of the sample covariance matrix leads to poor estimates in this scenario, standard methods are inappropriate and adequate regularization techniques are needed. Popular approaches include biased estimates of the covariance matrix and high-dimensional regression schemes, such as the Lasso and Partial Least Squares. In this article, we investigate a general framework for combining regularized regression methods with the estimation of Graphical Gaussian models. This framework includes various existing methods as well as two new approaches based on ridge regression and adaptive lasso, respectively. These methods are extensively compared both qualitatively and quantitatively within a simulation study and through an application to six diverse real data sets. In addition, all proposed algorithms are implemented in the R package "parcor", available from the R repository CRAN. In our simulation studies, the investigated non-sparse regression methods, i.e. Ridge Regression and Partial Least Squares, exhibit rather conservative behavior when combined with (local) false discovery rate multiple testing in order to decide whether or not an edge is present in the network. For networks with higher densities, the difference in performance of the methods decreases. For sparse networks, we confirm the Lasso's well known tendency towards selecting too many edges, whereas the two-stage adaptive Lasso is an interesting alternative that provides sparser solutions. In our simulations, both sparse and non-sparse methods are able to reconstruct networks with cluster structures. On six real data sets, we also clearly distinguish the results obtained using the non-sparse methods and those obtained
Regularized estimation of large-scale gene association networks using graphical Gaussian models
Directory of Open Access Journals (Sweden)
Schäfer Juliane
2009-11-01
Background: Graphical Gaussian models are popular tools for the estimation of (undirected) gene association networks from microarray data. A key issue when the number of variables greatly exceeds the number of samples is the estimation of the matrix of partial correlations. Since the (Moore-Penrose) inverse of the sample covariance matrix leads to poor estimates in this scenario, standard methods are inappropriate and adequate regularization techniques are needed. Popular approaches include biased estimates of the covariance matrix and high-dimensional regression schemes, such as the Lasso and Partial Least Squares. Results: In this article, we investigate a general framework for combining regularized regression methods with the estimation of Graphical Gaussian models. This framework includes various existing methods as well as two new approaches based on ridge regression and adaptive lasso, respectively. These methods are extensively compared both qualitatively and quantitatively within a simulation study and through an application to six diverse real data sets. In addition, all proposed algorithms are implemented in the R package "parcor", available from the R repository CRAN. Conclusion: In our simulation studies, the investigated non-sparse regression methods, i.e. Ridge Regression and Partial Least Squares, exhibit rather conservative behavior when combined with (local) false discovery rate multiple testing in order to decide whether or not an edge is present in the network. For networks with higher densities, the difference in performance of the methods decreases. For sparse networks, we confirm the Lasso's well known tendency towards selecting too many edges, whereas the two-stage adaptive Lasso is an interesting alternative that provides sparser solutions. In our simulations, both sparse and non-sparse methods are able to reconstruct networks with cluster structures. On six real data sets, we also clearly distinguish the results
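The matrix of partial correlations that the abstract refers to is linked to the inverse covariance (precision) matrix by a standard identity: the partial correlation between variables i and j is -w_ij / sqrt(w_ii * w_jj), where w denotes precision entries. The sketch below applies that identity to a precision matrix that is assumed to be already estimated; it is not the "parcor" implementation, and the regularized estimation that the paper studies is outside its scope.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix (list of lists) into partial correlations.

    Uses rho_ij = -w_ij / sqrt(w_ii * w_jj) off the diagonal and 1 on it.
    The precision matrix is assumed to be symmetric and positive definite.
    """
    p = len(precision)
    result = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                result[i][j] = -precision[i][j] / math.sqrt(
                    precision[i][i] * precision[j][j]
                )
    return result


if __name__ == "__main__":
    # Hypothetical precision matrix for a chain network 0 - 1 - 2.
    omega = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    for row in partial_correlations(omega):
        print(row)
```

A zero entry in the precision matrix gives a zero partial correlation, which corresponds to a missing edge in the graphical Gaussian model.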
Modeling self-assembly and phase behavior in complex mixtures.
Balazs, Anna C
2007-01-01
Using a variety of computational techniques, I investigate how the self-assembly of complex mixtures can be guided by surfaces or external stimuli to form spatially regular or temporally periodic patterns. Focusing on mixtures in confined geometries, I examine how thermodynamic and hydrodynamic effects can be exploited to create regular arrays of nanowires or monodisperse, particle-filled droplets. I also show that an applied light source and chemical reaction can be harnessed to create hierarchically ordered patterns in ternary, phase-separating mixtures. Finally, I consider the combined effects of confining walls and a chemical reaction to demonstrate that a swollen polymer gel can be driven to form dynamically periodic structures. In addition to illustrating the effectiveness of external factors in directing the self-organization of multicomponent mixtures, the selected examples illustrate how coarse-grained models can be used to capture both the equilibrium phase behavior and the dynamics of these complex systems.
Bianchi, Davide; Chiesa, Matteo; Guzzo, Luigi
2015-01-01
As a step towards a more accurate modelling of redshift-space distortions (RSD) in galaxy surveys, we develop a general description of the probability distribution function of galaxy pairwise velocities within the framework of the so-called streaming model. For a given galaxy separation r, such function can be described as a superposition of virtually infinite local distributions. We characterize these in terms of their moments and then consider the specific case in which they are Gaussian functions, each with its own mean μ and dispersion σ. Based on physical considerations, we make the further crucial assumption that these two parameters are in turn distributed according to a bivariate Gaussian, with its own mean and covariance matrix. Tests using numerical simulations explicitly show that with this compact description one can correctly model redshift-space distortions on all scales, fully capturing the overall linear and non-linear dynamics of the galaxy flow at different separations. In particular, we naturally obtain Gaussian/exponential, skewed/unskewed distribution functions, depending on separation as observed in simulations and data. Also, the recently proposed single-Gaussian description of RSD is included in this model as a limiting case, when the bivariate Gaussian is collapsed to a two-dimensional Dirac delta function. We also show how this description naturally allows for the Taylor expansion of 1 + ξS(s) around 1 + ξR(r), which leads to the Kaiser linear formula when truncated to second order, explicating its connection with the moments of the velocity distribution functions. More work is needed, but these results indicate a very promising path to make definitive progress in our programme to improve RSD estimators.
A Gaussian process regression model for walking speed estimation using a head-worn IMU.
Zihajehzadeh, Shaghayegh; Park, Edward J
2017-07-01
Miniature inertial sensors mainly worn on waist, ankle and wrist have been widely used to measure walking speed of the individuals for lifestyle and/or health monitoring. Recent emergence of head-worn inertial sensors in the form of a smart eyewear (e.g. Recon Jet) or a smart ear-worn device (e.g. Sensixa e-AR) provides an opportunity to use these sensors for estimation of walking speed in real-world environment. This work studies the feasibility of using a head-worn inertial sensor for estimation of walking speed. A combination of time-domain and frequency-domain features of tri-axial acceleration norm signal were used in a Gaussian process regression model to estimate walking speed. An experimental evaluation was performed on 15 healthy subjects during free walking trials in an indoor environment. The results show that the proposed method can provide accuracies of better than around 10% for various walking speed regimes. Additionally, further evaluation of the model for long (15-minutes) outdoor walking trials reveals high correlation of the estimated walking speed values to the ones obtained from fusion of GPS with inertial sensors.
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model, both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogeneous nature of the partial ranking data.
Directory of Open Access Journals (Sweden)
Robert B. Gramacy
2010-02-01
This document describes the new features in version 2.x of the tgp package for R, implementing treed Gaussian process (GP) models. The topics covered include methods for dealing with categorical inputs and excluding inputs from the tree or GP part of the model; fully Bayesian sensitivity analysis for inputs/covariates; sequential optimization of black-box functions; and a new Monte Carlo method for inference in multi-modal posterior distributions that combines simulated tempering and importance sampling. These additions extend the functionality of tgp across all models in the hierarchy: from Bayesian linear models, to classification and regression trees (CART), to treed Gaussian processes with jumps to the limiting linear model. It is assumed that the reader is familiar with the baseline functionality of the package, outlined in the first vignette (Gramacy 2007).
A MIXTURE LIKELIHOOD APPROACH FOR GENERALIZED LINEAR-MODELS
WEDEL, M; DESARBO, WS
1995-01-01
A mixture model approach is developed that simultaneously estimates the posterior membership probabilities of observations to a number of unobservable groups or latent classes, and the parameters of a generalized linear model which relates the observations, distributed according to some member of
A Gamma Model for Mixture STR Samples
DEFF Research Database (Denmark)
Christensen, Susanne; Bøttcher, Susanne Gammelgaard; Morling, Niels
This project investigates the behavior of the PCR Amplification Kit. A number of known DNA-profiles are mixed two by two in "known" proportions and analyzed. Gamma distribution models are fitted to the resulting data to learn to what extent actual mixing proportions can be rediscovered in the amp...
Atomic forces for geometry-dependent point multipole and gaussian multipole models.
Elking, Dennis M; Perera, Lalith; Duke, Robert; Darden, Thomas; Pedersen, Lee G
2010-11-30
In standard treatments of atomic multipole models, interaction energies, total molecular forces, and total molecular torques are given for multipolar interactions between rigid molecules. However, if the molecules are assumed to be flexible, two additional multipolar atomic forces arise because of (1) the transfer of torque between neighboring atoms and (2) the dependence of multipole moment on internal geometry (bond lengths, bond angles, etc.) for geometry-dependent multipole models. In this study, atomic force expressions for geometry-dependent multipoles are presented for use in simulations of flexible molecules. The atomic forces are derived by first proposing a new general expression for Wigner function derivatives $\partial D^{l}_{m'm}/\partial \Omega$. The force equations can be applied to electrostatic models based on atomic point multipoles or gaussian multipole charge density. Hydrogen-bonded dimers are used to test the intermolecular electrostatic energies and atomic forces calculated by geometry-dependent multipoles fit to the ab initio electrostatic potential. The electrostatic energies and forces are compared with their reference ab initio values. It is shown that both static and geometry-dependent multipole models are able to reproduce total molecular forces and torques with respect to ab initio, whereas geometry-dependent multipoles are needed to reproduce ab initio atomic forces. The expressions for atomic force can be used in simulations of flexible molecules with atomic multipoles. In addition, the results presented in this work should lead to further development of next generation force fields composed of geometry-dependent multipole models. 2010 Wiley Periodicals, Inc.
Schifferstein, H N
1996-02-01
The Equiratio Mixture Model predicts the psychophysical function for an equiratio mixture type on the basis of the psychophysical functions for the unmixed components. The model reliably estimates the sweetness of mixtures of sugars and sugar-alcohols, but is unable to predict intensity for aspartame/sucrose mixtures. In this paper, the sweetness of aspartame/acesulfame-K mixtures in aqueous and acidic solutions is investigated. These two intensive sweeteners probably do not comply with the model's original assumption of sensory dependency among components. However, they reveal how the Equiratio Mixture Model could be modified to describe and predict mixture functions for non-additive substances. To predict equiratio functions for all similar tasting substances, a new Equiratio Mixture Model should yield accurate predictions for components eliciting similar intensities at widely differing concentration levels, and for substances exhibiting hypo- or hyperadditivity. In addition, it should be able to correct violations of Stevens's power law. These three problems are resolved in a model that uses equi-intense units as the measure of physical concentration. An interaction index in the formula for the constant accounts for the degree of interaction between mixture components. Deviations from the power law are corrected by a nonlinear response output transformation, assuming a two-stage model of psychophysical judgment.
Microbial comparative pan-genomics using binomial mixture models
DEFF Research Database (Denmark)
Ussery, David; Snipen, L; Almøy, T
2009-01-01
The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter...... occurring genes in the population. CONCLUSION: Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes....
Identifying Clusters with Mixture Models that Include Radial Velocity Observations
Czarnatowicz, Alexis; Ybarra, Jason E.
2018-01-01
The study of stellar clusters plays an integral role in the study of star formation. We present a cluster mixture model that considers radial velocity data in addition to spatial data. Maximum likelihood estimation through the Expectation-Maximization (EM) algorithm is used for parameter estimation. Our mixture model analysis can be used to distinguish adjacent or overlapping clusters, and estimate properties for each cluster. Work supported by awards from the Virginia Foundation for Independent Colleges (VFIC) Undergraduate Science Research Fellowship and The Research Experience @Bridgewater (TREB).
Monitoring the trajectory of urban nighttime light hotspots using a Gaussian volume model
Zheng, Qiming; Jiang, Ruowei; Wang, Ke; Huang, Lingyan; Ye, Ziran; Gan, Muye; Ji, Biyong
2018-03-01
Urban nighttime light hotspots are an ideal representation of the spatial heterogeneity of human activities within a city, and are sensitive to regional urban expansion patterns. However, most previous studies of nighttime light imagery focused on extracting urban extent, leaving the spatial variation of radiance intensity insufficiently explored. With the help of global radiance calibrated DMSP-OLS datasets (NTLgrc), we propose a framework to explore the spatio-temporal trajectory of polycentric urban nighttime light hotspots. First, NTLgrc was inter-annually calibrated to improve consistency. Second, multi-resolution segmentation and region-growing SVM classification were employed to remove the blooming effect and to extract potential clusters. Finally, the urban hotspots were identified by a Gaussian volume model, and the resulting parameters were used to quantitatively depict hotspot features (i.e., intensity, morphology and centroid dynamics). The results show that our framework successfully captures hotspots in polycentric urban areas, with adjusted R² values ($R_a^2$) above 0.9. Meanwhile, the spatio-temporal dynamics of the hotspot features intuitively reveal the impact of the regional urban growth pattern and planning strategies on human activities. Compared to previous studies, our framework is more robust and offers an effective way to describe hotspot patterns. It also provides a more comprehensive and spatially explicit understanding of the interaction between urbanization patterns and human activities. Our findings are expected to benefit policymakers in terms of sustainable urban planning and decision making.
Fast Kalman-like filtering for large-dimensional linear and Gaussian state-space models
Ait-El-Fquih, Boujemaa
2015-08-13
This paper considers the filtering problem for linear and Gaussian state-space models with large dimensions, a setup in which the optimal Kalman Filter (KF) might not be applicable owing to the excessive cost of manipulating huge covariance matrices. Among the most popular alternatives that enable cheaper and reasonable computation is the Ensemble KF (EnKF), a Monte Carlo-based approximation. In this paper, we consider a class of a posteriori distributions with diagonal covariance matrices and propose fast approximate deterministic-based algorithms based on the Variational Bayesian (VB) approach. More specifically, we derive two iterative KF-like algorithms that differ in the way they operate between two successive filtering estimates; one involves a smoothing estimate and the other involves a prediction estimate. Despite its iterative nature, the prediction-based algorithm provides a computational cost that is, on the one hand, independent of the number of iterations in the limit of very large state dimensions, and on the other hand, always much smaller than the cost of the EnKF. The cost of the smoothing-based algorithm depends on the number of iterations that may, in some situations, make this algorithm slower than the EnKF. The performances of the proposed filters are studied and compared to those of the KF and EnKF through a numerical example.
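For orientation, the standard Kalman filter recursion that the abstract uses as its baseline can be sketched in the scalar case. This is a minimal illustration and not the authors' variational algorithm; the names `kalman_step` and `run_filter`, and the parameters `a`, `q`, `h`, `r` (transition coefficient, process noise variance, observation coefficient, observation noise variance), are hypothetical choices made for this example. In the matrix version the same two steps operate on an n x n covariance, which is the cost the abstract refers to for large state dimensions.

```python
def kalman_step(mean, var, y, a=1.0, q=0.01, h=1.0, r=0.25):
    """One predict/update cycle of a scalar linear-Gaussian Kalman filter.

    Assumed model (illustrative): x_t = a * x_{t-1} + N(0, q), y_t = h * x_t + N(0, r).
    """
    # Prediction: propagate the previous estimate through the state equation.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update: correct the prediction with the new observation y.
    innovation = y - h * mean_pred
    innovation_var = h * h * var_pred + r
    gain = var_pred * h / innovation_var
    mean_new = mean_pred + gain * innovation
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new


def run_filter(observations, mean=0.0, var=1.0, **params):
    """Apply kalman_step to a sequence and return the list of (mean, var) estimates."""
    estimates = []
    for y in observations:
        mean, var = kalman_step(mean, var, y, **params)
        estimates.append((mean, var))
    return estimates
```

Calling `run_filter([1.0] * 200)` drives the filtered mean toward the constant observation, while the filtered variance settles to a steady-state value that depends only on `a`, `q`, `h`, and `r`.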
Directory of Open Access Journals (Sweden)
Saito Shigeru
2007-01-01
Hepatocellular carcinoma (HCC) in a liver with advanced-stage chronic hepatitis C (CHC) is induced by hepatitis C virus, which chronically infects about 170 million people worldwide. To elucidate the associations between gene groups in hepatocellular carcinogenesis, we analyzed the profiles of the genes characteristically expressed in the CHC and HCC cell stages by a statistical method for inferring the network between gene systems based on the graphical Gaussian model. A systematic evaluation of the inferred network in terms of the biological knowledge revealed that the inferred network was strongly involved in the known gene-gene interactions with high significance, and that the clusters characterized by different cancer-related responses were associated with those of the gene groups related to metabolic pathways and morphological events. Although some relationships in the network remain to be interpreted, the analyses revealed a snapshot of the orchestrated expression of cancer-related groups and some pathways related to metabolism and morphological events in hepatocellular carcinogenesis, and thus provide possible clues on the disease mechanism and insights that address the gap between molecular and clinical assessments.
Spectrum recovery method based on sparse representation for segmented multi-Gaussian model
Teng, Yidan; Zhang, Ye; Ti, Chunli; Su, Nan
2016-09-01
Hyperspectral images (HSIs) offer excellent feature discriminability because they supply diagnostic characteristics at high spectral resolution. However, various degradations, including water absorption and bands-continuous noise, can damage the spectral information. In addition, the huge data volume and strong redundancy among spectra create a strong demand for compressing HSIs in the spectral dimension, which also leads to loss of spectral information. Reconstructing the spectral diagnostic characteristics is therefore essential for the subsequent application of HSIs. This paper introduces a spectrum restoration method for HSIs that uses a segmented multi-Gaussian model (SMGM) and sparse representation. An SMGM is established to describe the asymmetric spectral absorption and reflection characteristics, and its rationality and sparsity are discussed. Applying compressed sensing (CS) theory, we implement a sparse representation of the SMGM. The degraded and compressed HSIs can then be reconstructed from the undamaged or key bands. Finally, we apply a low rank matrix recovery (LRMR) algorithm as post-processing to restore the spatial details. The proposed method was tested on spectral data captured on the ground under artificial water absorption conditions and on an AVIRIS HSI data set. Qualitative and quantitative assessments of the experimental results demonstrate the effectiveness of the method in recovering spectral information lost to both degradation and lossy compression. The spectral diagnostic characteristics and the spatial geometric features are well preserved.
Singh, Kunal; Kumar, Sanjay; Goel, Ekta; Singh, Balraj; Kumar, Mirgender; Dubey, Sarvesh; Jit, Satyabrata
2017-01-01
This paper proposes a new model for the subthreshold current and swing of the short-channel symmetric underlap ultrathin double gate metal oxide field effect transistors with a source/drain lateral Gaussian doping profile. The channel potential model already reported earlier has been utilized to formulate the closed form expression for the subthreshold current and swing of the device. The effects of the lateral straggle and geometrical parameters such as the channel length, channel thickness, and oxide thickness on the off current and subthreshold slope have been demonstrated. The devices with source/drain lateral Gaussian doping profiles in the underlap structure are observed to be highly resistant to short channel effects while improving the current drive. The proposed model is validated by comparing the results with the numerical simulation data obtained by using the commercially available ATLAS™, a two-dimensional (2-D) device simulator from SILVACO.
Models for the computation of opacity of mixtures
International Nuclear Information System (INIS)
Klapisch, Marcel; Busquet, Michel
2013-01-01
We compare four models for the partial densities of the components of mixtures. These models yield different opacities as shown on polystyrene, acrylic and polyimide in local thermodynamical equilibrium (LTE). Two of these models, the ‘whole volume partial pressure’ model (M1) and its modification (M2) are not thermodynamically consistent (TC). The other two models are TC and minimize free energy. M3, the ‘partial volume equal pressure’ model, uses equality of chemical potential. M4 uses commonality of free electron density. The latter two give essentially identical results in LTE, but M4’s convergence is slower. M4 is easily generalized to non-LTE conditions. Non-LTE effects are shown by the variation of the Planck mean opacity of the mixtures with temperature and density. (paper)
Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models
Martin Burda; Artem Prokhorov
2012-01-01
Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...
The R Package bgmm : Mixture Modeling with Uncertain Knowledge
Directory of Open Access Journals (Sweden)
Przemysław Biecek
2012-04-01
Classical supervised learning enjoys the luxury of accessing the true known labels for the observations in a modeled dataset. Real life, however, poses an abundance of problems, where the labels are only partially defined, i.e., are uncertain and given only for a subset of observations. Such partial labels can occur regardless of the knowledge source. For example, an experimental assessment of labels may have limited capacity and is prone to measurement errors. Also expert knowledge is often restricted to a specialized area and is thus unlikely to provide trustworthy labels for all observations in the dataset. Partially supervised mixture modeling is able to process such sparse and imprecise input. Here, we present an R package called bgmm, which implements two partially supervised mixture modeling methods: soft-label and belief-based modeling. For completeness, we equipped the package also with the functionality of unsupervised, semi- and fully supervised mixture modeling. On real data we present the usage of bgmm for basic model-fitting in all modeling variants. The package can be applied also to selection of the best-fitting from a set of models with different component numbers or constraints on their structures. This functionality is presented on an artificial dataset, which can be simulated in bgmm from a distribution defined by a given model.
Parameter Estimation and Model Selection for Mixtures of Truncated Exponentials
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2010-01-01
Bayesian networks with mixtures of truncated exponentials (MTEs) support efficient inference algorithms and provide a flexible way of modeling hybrid domains (domains containing both discrete and continuous variables). On the other hand, estimating an MTE from data has turned out to be a difficult...
Comparing State SAT Scores Using a Mixture Modeling Approach
Kim, YoungKoung Rachel
2009-01-01
Presented at the national conference for AERA (American Educational Research Association) in April 2009. The large variability of SAT taker population across states makes state-by-state comparisons of the SAT scores challenging. Using a mixture modeling approach, therefore, the current study presents a method of identifying subpopulations in terms…
Polymer mixtures in confined geometries: Model systems to explore ...
Indian Academy of Sciences (India)
Pramana – Journal of Physics, Volume 64, Issue 6. Polymer mixtures in confined geometries: Model systems to explore phase transitions. K Binder, M Müller, A Cavallo, E V Albano. Invited Talks: Topic 7. Soft condensed matter (colloids, polymers, liquid crystals, microemulsions, foams, membranes, etc.) ...
Option Pricing with Asymmetric Heteroskedastic Normal Mixture Models
DEFF Research Database (Denmark)
Rombouts, Jeroen V. K; Stentoft, Lars
2015-01-01
We propose an asymmetric GARCH in mean mixture model and provide a feasible method for option pricing within this general framework by deriving the appropriate risk neutral dynamics. We forecast the out-of-sample prices of a large sample of options on the S&P 500 index from January 2006 to December...
The Semiparametric Normal Variance-Mean Mixture Model
DEFF Research Database (Denmark)
Korsholm, Lars
1997-01-01
We discuss the normal variance-mean mixture model from a semi-parametric point of view, i.e. we let the mixing distribution belong to a non parametric family. The main results are consistency of the non parametric maximum likelihood estimator in this case, and construction of an asymptotically...
Bayesian parameter estimation for the Wnt pathway: an infinite mixture models approach.
Koutroumpas, Konstantinos; Ballarini, Paolo; Votsi, Irene; Cournède, Paul-Henry
2016-09-01
Likelihood-free methods, like Approximate Bayesian Computation (ABC), have been extensively used in model-based statistical inference with intractable likelihood functions. When combined with Sequential Monte Carlo (SMC) algorithms they constitute a powerful approach for parameter estimation and model selection of mathematical models of complex biological systems. A crucial step in the ABC-SMC algorithms, significantly affecting their performance, is the propagation of a set of parameter vectors through a sequence of intermediate distributions using Markov kernels. In this article, we employ Dirichlet process mixtures (DPMs) to design optimal transition kernels and we present an ABC-SMC algorithm with DPM kernels. We illustrate the use of the proposed methodology using real data for the canonical Wnt signaling pathway. A multi-compartment model of the pathway is developed and it is compared to an existing model. The results indicate that DPMs are more efficient in the exploration of the parameter space and can significantly improve ABC-SMC performance. In comparison to alternative sampling schemes that are commonly used, the proposed approach can bring potential benefits in the estimation of complex multimodal distributions. The method is used to estimate the parameters and the initial state of two models of the Wnt pathway and it is shown that the multi-compartment model fits better the experimental data. Python scripts for the Dirichlet Process Gaussian Mixture model and the Gibbs sampler are available at https://sites.google.com/site/kkoutroumpas/software. Contact: konstantinos.koutroumpas@ecp.fr. © The Author 2016. Published by Oxford University Press.
Option Pricing with Asymmetric Heteroskedastic Normal Mixture Models
DEFF Research Database (Denmark)
Rombouts, Jeroen V.K.; Stentoft, Lars
This paper uses asymmetric heteroskedastic normal mixture models to fit return data and to price options. The models can be estimated straightforwardly by maximum likelihood, have high statistical fit when used on S&P 500 index return data, and allow for substantial negative skewness and time var....... Overall, the dollar root mean squared error of the best performing benchmark component model is 39% larger than for the mixture model. When considering the recent financial crisis this difference increases to 69%....... varying higher order moments of the risk neutral distribution. When forecasting out-of-sample a large set of index options between 1996 and 2009, substantial improvements are found compared to several benchmark models in terms of dollar losses and the ability to explain the smirk in implied volatilities...
Gilthorpe, M S; Dahly, D L; Tu, Y K; Kubzansky, L D; Goodman, E
2014-06-01
Lifecourse trajectories of clinical or anthropological attributes are useful for identifying how our early-life experiences influence later-life morbidity and mortality. Researchers often use growth mixture models (GMMs) to estimate such phenomena. It is common to place constraints on the random part of the GMM to improve parsimony or to aid convergence, but this can lead to an autoregressive structure that distorts the nature of the mixtures and subsequent model interpretation. This is especially true if changes in the outcome within individuals are gradual compared with the magnitude of differences between individuals. This is not widely appreciated, nor is its impact well understood. Using repeat measures of body mass index (BMI) for 1528 US adolescents, we estimated GMMs that required variance-covariance constraints to attain convergence. We contrasted constrained models with and without an autocorrelation structure to assess the impact this had on the ideal number of latent classes, their size and composition. We also contrasted model options using simulations. When the GMM variance-covariance structure was constrained, a within-class autocorrelation structure emerged. When not modelled explicitly, this led to poorer model fit and models that differed substantially in the ideal number of latent classes, as well as class size and composition. Failure to carefully consider the random structure of data within a GMM framework may lead to erroneous model inferences, especially for outcomes with greater within-person than between-person homogeneity, such as BMI. It is crucial to reflect on the underlying data generation processes when building such models.
A general mixture model for sediment laden flows
Liang, Lixin; Yu, Xiping; Bombardelli, Fabián
2017-09-01
A mixture model for general description of sediment-laden flows is developed based on an Eulerian-Eulerian two-phase flow theory, with the aim of gaining computational speed in the prediction while preserving the accuracy of the complete two-fluid model. The basic equations of the model include the mass and momentum conservation equations for the sediment-water mixture, and the mass conservation equation for sediment. However, a newly-obtained expression for the slip velocity between phases allows for the computation of the sediment motion, without the need of solving the momentum equation for sediment. The turbulent motion is represented for both the fluid and the particulate phases. A modified k-ε model is used to describe the fluid turbulence while an algebraic model is adopted for turbulent motion of particles. A two-dimensional finite difference method based on the SMAC scheme was used to numerically solve the mathematical model. The model is validated through simulations of fluid and suspended sediment motion in steady open-channel flows, both in equilibrium and non-equilibrium states, as well as in oscillatory flows. The computed sediment concentrations, horizontal velocity and turbulent kinetic energy of the mixture are all shown to be in good agreement with available experimental data, and importantly, this is done at a fraction of the computational effort required by the complete two-fluid model.
Pernin, Jérôme; Vrac, Mathieu; Crevoisier, Cyril; Chédin, Alain
2017-04-01
Air mass classification has become an important area in synoptic climatology, simplifying the complexity of the atmosphere by dividing the atmosphere into discrete similar thermodynamic patterns. However, the constant growth of atmospheric databases in both size and complexity implies the need to develop new adaptive classifications. Here, we propose a robust unsupervised and supervised classification methodology of a large thermodynamic dataset, on a global scale and over several years, into discrete air mass groups homogeneous in both temperature and humidity that also provides underlying probability laws. Temperature and humidity at different pressure levels are aggregated into a set of cumulative distribution function (CDF) values instead of classical ones. The method is based on a Gaussian mixture model and uses the expectation-maximization (EM) algorithm to estimate the parameters of the mixture. Spatially gridded thermodynamic profiles come from ECMWF reanalyses spanning the period 2000-2009. Different aspects are investigated, such as the sensitivity of the classification process to both temporal and spatial samplings of the training dataset. Comparisons of the classifications made either by the EM algorithm or by the widely used k-means algorithm show that the former can be viewed as a generalization of the latter. Moreover, the EM algorithm delivers, for each observation, the probabilities of belonging to each class, as well as the associated uncertainty. Finally, a decision tree is proposed as a tool for interpreting the different classes, highlighting the relative importance of temperature and humidity in the classification process.
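The EM procedure mentioned in the abstract can be illustrated with a one-dimensional Gaussian mixture. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy data, and the variance floor of `1e-6` are all hypothetical. It shows the soft class probabilities (responsibilities) that EM returns for each observation; replacing them with hard 0/1 assignments while holding equal variances fixed gives the k-means update, which is the sense in which EM is usually said to generalize k-means.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given initial parameters.

    Returns (means, variances, weights, resp), where resp[i][j] is the probability
    from the final E-step that observation i belongs to component j.
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every observation.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate each component from its weighted observations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)  # floor avoids a degenerate component
    return means, variances, weights, resp


# Toy data with two well-separated groups (purely illustrative values).
toy_data = [-2.1, -1.9, -2.0, -1.8, -2.2, 3.0, 3.1, 2.9, 3.2, 2.8]
fit_means, fit_vars, fit_weights, fit_resp = em_gmm_1d(
    toy_data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

Each row of `fit_resp` sums to one, so it can be read directly as the per-observation class probabilities the abstract describes.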
Gacal, G. F. B.; Lagrosas, N.
2016-12-01
Nowadays, cameras are commonly used by students. In this study, we use this instrument to look at moon signals and relate these signals to Gaussian functions. To implement this as a classroom activity, students need computers, computer software to visualize signals, and moon images. A normalized Gaussian function is often used to represent probability density functions of normal distribution. It is described by its mean m and standard deviation s. A smaller standard deviation implies less spread from the mean. For the 2-dimensional Gaussian function, the mean can be described by coordinates (x0, y0), while the standard deviations can be described by sx and sy. In modelling moon signals obtained from sky-cameras, the position of the mean (x0, y0) is solved by locating the coordinates of the maximum signal of the moon. The two standard deviations are the mean square weighted deviation based on the sum of total pixel values of all rows/columns. If visualized in three dimensions, the 2D Gaussian function appears as a 3D bell surface (Fig. 1a). This shape is similar to the pixel value distribution of moon signals as captured by a sky-camera. An example of this is illustrated in Fig. 1b taken around 22:20 (local time) of January 31, 2015. The local time is 8 hours ahead of coordinated universal time (UTC). This image is produced by a commercial camera (Canon Powershot A2300) with 1 s exposure time, f-stop of f/2.8, and 5 mm focal length. One has to choose a camera with high sensitivity when operated at nighttime to effectively detect these signals. Fig. 1b is obtained by converting the red-green-blue (RGB) photo to grayscale values. The grayscale values are then converted to a double data type matrix. The last conversion process is implemented for the purpose of having the same scales for both Gaussian model and pixel distribution of raw signals. Subtraction of the Gaussian model from the raw data produces a moonless image as shown in Fig. 1c. This moonless image can be
Mixture model for biomagnetic separation in microfluidic systems
Khashan, S. A.; Alazzam, A.; Mathew, B.; Hamdan, M.
2017-11-01
In this paper, we show that a mixture model, with an algebraic slip velocity relating to the magnetophoresis, provides a continuum-based and cost-effective tool to simulate biomagnetic separations in microfluidics. The model is most effective for simulating magnetic separation protocols in which magnetic or magnetically labeled biological targets are within naturally dilute or diluted samples. The transport of these samples is characterized as mixtures in which the dispersed magnetic microparticles establish their magnetophoretic mobility quickly in response to the acting forces. Our simulations demonstrate the coupled particle-fluid transport and the High Gradient Magnetic Capture (HGMC) of magnetic beads flowing through a microchannel. Also, we show that the mixture model, and accordingly the modeling of the slip velocity, can be further simplified by ignoring the gravitational and granular parameters, unlike the case with dense and/or macro-scale systems. Furthermore, we show, by conducting comparative simulations, that the developed model provides an easier and viable alternative to the commonly used Lagrangian-Eulerian (particle-based) models.
Estimation of biogas produced by the landfill of Palermo, applying a Gaussian model.
Aronica, S; Bonanno, A; Piazza, V; Pignato, L; Trapani, S
2009-01-01
In this work, a procedure is suggested to assess the rate of biogas emitted by the Bellolampo landfill (Palermo, Italy), starting from the data acquired by two of the stations for monitoring meteorological parameters and polluting gases. The data used refer to the period November 2005-July 2006. The methane concentration, measured in the CEP suburb of Palermo, has been analysed together with the meteorological data collected by the station situated inside the landfill area. In the present study, the methane has been chosen as a tracer of the atmospheric pollutants produced by the dump. The data used for assessing the biogas emission refer to night time periods characterized by weak wind blowing from the hill toward the city. The methane rate emitted by the Bellolampo dump has been evaluated using a Gaussian model and considering the landfill both as a single point source and as a multiple point one. The comparison of the results shows that for a first approximation it is sufficient to consider the landfill of Palermo as a single point source. Starting from the monthly percentage composition of the biogas, estimated for the study period, the rate of biogas produced by the dump was evaluated. The total biogas produced by the landfill, obtained as the sum of the emitted component and the recovered one, ranged from 7519.97 to 10,153.7 m³/h. For the study period the average monthly estimations of biogas emissions into the atmosphere amount to about 60% of the total biogas produced by the landfill, a little higher than the one estimated by the company responsible for the biogas recovery plant at the landfill.
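The abstract does not give the formula of its Gaussian model. As an assumption, the sketch below uses the textbook Gaussian plume equation for a continuous point source with ground reflection, and inverts it to recover an emission rate from a measured concentration, which is the kind of back-calculation the abstract describes. All names and numerical values are hypothetical; in practice the dispersion parameters `sigma_y` and `sigma_z` depend on downwind distance and atmospheric stability.

```python
import math


def plume_concentration(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration (g/m^3) at crosswind offset y and height z (m).

    q: emission rate (g/s); u: wind speed (m/s); h: effective source height (m);
    sigma_y, sigma_z: dispersion parameters (m) at the receptor's downwind distance.
    """
    lateral = math.exp(-0.5 * (y / sigma_y) ** 2)
    # The second exponential is the image source that models reflection at the ground.
    vertical = (math.exp(-0.5 * ((z - h) / sigma_z) ** 2)
                + math.exp(-0.5 * ((z + h) / sigma_z) ** 2))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


def emission_rate(c_measured, u, y, z, h, sigma_y, sigma_z):
    """Invert the plume equation: concentration is linear in q, so q = C / C(q=1)."""
    unit_concentration = plume_concentration(1.0, u, y, z, h, sigma_y, sigma_z)
    return c_measured / unit_concentration


# Hypothetical receptor geometry and weak-wind conditions, for illustration only.
example_args = dict(u=1.5, y=50.0, z=2.0, h=10.0, sigma_y=120.0, sigma_z=40.0)
example_c = plume_concentration(100.0, **example_args)
example_q = emission_rate(example_c, **example_args)  # recovers the 100 g/s input
```

Because the concentration is linear in the source strength, treating the landfill as several point sources only requires summing one such term per source before inverting.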
A hybrid sampler for Poisson-Kingman mixture models
Lomeli, M.; Favaro, S.; Teh, Y. W.
2015-01-01
This paper concerns the introduction of a new Markov Chain Monte Carlo scheme for posterior sampling in Bayesian nonparametric mixture models with priors that belong to the general Poisson-Kingman class. We present a novel compact way of representing the infinite dimensional component of the model such that while explicitly representing this infinite component it has less memory and storage requirements than previous MCMC schemes. We describe comparative simulation results demonstrating the e...
The Research of Indoor Positioning Based on Double-peak Gaussian Model
Directory of Open Access Journals (Sweden)
Lina Chen
2014-04-01
Location fingerprinting using Wi-Fi signals is a popular and well-accepted indoor positioning method. The key issue of the fingerprinting approach is generating the fingerprint radio map. Limited by the practical workload, only a few samples of the received signal strength are collected at each reference point. Unfortunately, fewer samples cannot accurately represent the actual distribution of the signal strength from each access point. This study finds that most Wi-Fi signals have two peaks. Based on this finding, a double-peak Gaussian algorithm is proposed to generate a fingerprint radio map. This approach requires little time to receive Wi-Fi signals, and it is easy to estimate the parameters of the double-peak Gaussian function. Compared to the Gaussian function and histogram methods of generating a fingerprint radio map, this method better approximates the distribution of signal occurrences. This paper also compares the positioning accuracy obtained with K-Nearest Neighbour matching for the three radio maps; the test results show that the positioning distance error with the double-peak Gaussian function is smaller than with the other two methods.
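The abstract does not define the double-peak Gaussian function. One plausible reading, assumed here, is a weighted sum of two normal densities. The sketch below compares the log-likelihood of such a function with that of a single Gaussian on bimodal signal-strength samples; the sample values (in dBm), the parameter values, and all function names are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Single-peak normal density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def double_peak_pdf(x, w, mu1, sigma1, mu2, sigma2):
    """Assumed double-peak form: w * N(mu1, sigma1^2) + (1 - w) * N(mu2, sigma2^2)."""
    return w * gaussian_pdf(x, mu1, sigma1) + (1.0 - w) * gaussian_pdf(x, mu2, sigma2)


def log_likelihood(samples, pdf):
    """Sum of log densities of the samples under the given density function."""
    return sum(math.log(pdf(x)) for x in samples)


# Hypothetical received-signal-strength samples with two clusters.
rss = [-70, -71, -69, -70, -72, -60, -61, -59, -60, -62]

ll_single = log_likelihood(rss, lambda x: gaussian_pdf(x, -65.4, 5.1))
ll_double = log_likelihood(rss, lambda x: double_peak_pdf(x, 0.5, -70.4, 1.02, -60.4, 1.02))
```

On samples like these, `ll_double` exceeds `ll_single`, which matches the abstract's claim that a two-peak function approximates the signal distribution better than a single Gaussian when the data really are bimodal.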
A non‐Gaussian model of turbulence (soccer‐ball integrals)
DEFF Research Database (Denmark)
Betchov, Robert; Larsen, Poul Scheel
1981-01-01
The statistics of the time evolution of a nonlinearly coupled system of first-order equations representing the Euler equations is studied. The probability distribution of functions is nearly Gaussian, while that of their time derivatives has exponential tails and moments of order 4, 6, and 8...
Matzke, D.; Wagenmakers, E.-J.
2009-01-01
A growing number of researchers use descriptive distributions such as the ex-Gaussian and the shifted Wald to summarize response time data for speeded two-choice tasks. Some of these researchers also assume that the parameters of these distributions uniquely correspond to specific cognitive
Phylogenetic mixtures and linear invariants for equal input models.
Casanellas, Marta; Steel, Mike
2017-04-01
The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the structure and dimension of the vector spaces of phylogenetic mixtures and of linear invariants for any fixed phylogenetic tree (and for all trees-the so called 'model invariants'), on any number n of leaves. We also provide a precise description of the space of mixtures and linear invariants for the special case of [Formula: see text] leaves. By combining techniques from discrete random processes and (multi-) linear algebra, our results build on a classic result that was first established by James Lake (Mol Biol Evol 4:167-191, 1987).
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
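As a hedged illustration of the kind of model this abstract describes, a mixture cure model is commonly written as below. The symbols are assumptions introduced here, not taken from the paper: $\pi(z)$ is the cured proportion given covariates $z$, $S_u$ is the survival function of uncured patients given covariates $x$, $S_0$ is a baseline survival function left unspecified in the nonparametric setting, and the logistic form of $\pi(z)$ is one common choice rather than necessarily the authors' own.

```latex
% Population survival: cured patients never fail, uncured patients follow S_u
\[
  S(t \mid x, z) \;=\; \pi(z) \;+\; \bigl(1 - \pi(z)\bigr)\, S_u(t \mid x)
\]
% Proportional hazards assumption for the uncured patients
\[
  S_u(t \mid x) \;=\; S_0(t)^{\exp(\beta^{\top} x)}
\]
% One common (assumed) link for the cured proportion
\[
  \pi(z) \;=\; \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)}
\]
```

Setting $\pi(z) = 0$ recovers the ordinary proportional hazards model, which is the sense in which a cure model extends Cox regression.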
Modeling adsorption of binary and ternary mixtures on microporous media
DEFF Research Database (Denmark)
Monsalvo, Matias Alfonso; Shapiro, Alexander
2007-01-01
The goal of this work is to analyze the adsorption of binary and ternary mixtures on the basis of the multicomponent potential theory of adsorption (MPTA). In the MPTA, the adsorbate is considered as a segregated mixture in the external potential field emitted by the solid adsorbent. This makes...... it possible using the same equation of state to describe the thermodynamic properties of the segregated and the bulk phases. For comparison, we also used the ideal adsorbed solution theory (IAST) to describe adsorption equilibria. The main advantage of these two models is their capabilities to predict...... multicomponent adsorption equilibria on the basis of single-component adsorption data. We compare the MPTA and IAST models to a large set of experimental data, obtaining reasonably good agreement with experimental data and a high degree of predictability. Some limitations of both models are also discussed....
Hydrogenic ionization model for mixtures in non-LTE plasmas
International Nuclear Information System (INIS)
Djaoui, A.
1999-01-01
The Hydrogenic Ionization Model for Mixtures (HIMM) is a non-Local Thermodynamic Equilibrium (non-LTE), time-dependent ionization model for laser-produced plasmas containing mixtures of elements (species). In this version, both collisional and radiative rates are taken into account. An ionization distribution for each species which is consistent with the ambient electron density is obtained by use of an iterative procedure in a single calculation for all species. Energy levels for each shell having a given principal quantum number and for each ion stage of each species in the mixture are calculated using screening constants. Steady-state non-LTE as well as LTE solutions are also provided. The non-LTE rate equations converge to the LTE solution at sufficiently high densities or as the radiation temperature approaches the electron temperature. The model is particularly useful at low temperatures where convergence problems are usually encountered in our previous models. We apply our model to typical situation in x-ray laser research, laser-produced plasmas and inertial confinement fusion. Our results compare well with previously published results for a selenium plasma. (author)
Estimation and Model Selection for Finite Mixtures of Latent Interaction Models
Hsu, Jui-Chen
2011-01-01
Latent interaction models and mixture models have received considerable attention in social science research recently, but little is known about how to handle unobserved population heterogeneity in the endogenous latent variables of nonlinear structural equation models. The current study estimates a mixture of latent interaction…
Lanzafame, S; Giannelli, M; Garaci, F; Floris, R; Duggento, A; Guerrisi, M; Toschi, N
2016-05-01
/RK/AK values, indicating substantial anatomical variability of these discrepancies. In the HCP dataset, the median voxelwise percentage differences across the whole white matter skeleton were (nonlinear least squares algorithm) 14.5% (8.2%-23.1%) for MD, 4.3% (1.4%-17.3%) for FA, -5.2% (-48.7% to -0.8%) for MO, 12.5% (6.4%-21.2%) for RD, and 16.1% (9.9%-25.6%) for AD (all ranges computed as 0.01 and 0.99 quantiles). All differences/trends were consistent between the discovery (HCP) and replication (local) datasets and between estimation algorithms. However, the relationships between such trends, estimated diffusion tensor invariants, and kurtosis estimates were impacted by the choice of fitting routine. Model-dependent differences in the estimation of conventional indexes of MD/FA/MO/RD/AD can be well beyond commonly seen disease-related alterations. While estimating diffusion tensor-derived indexes using the DKI model may be advantageous in terms of mitigating b-value dependence of diffusivity estimates, such estimates should not be referred to as conventional DTI-derived indexes in order to avoid confusion in interpretation as well as multicenter comparisons. In order to assess the potential and advantages of DKI with respect to DTI as well as to standardize diffusion-weighted imaging methods between centers, both conventional DTI-derived indexes and diffusion tensor invariants derived by fitting the non-Gaussian DKI model should be separately estimated and analyzed using the same combination of fitting routines.
New Flexible Models and Design Construction Algorithms for Mixtures and Binary Dependent Variables
Ruseckaite, Aiste
2017-01-01
This thesis discusses new mixture(-amount) models, choice models and the optimal design of experiments. Two chapters of the thesis relate to the so-called mixture, which is a product or service whose ingredients’ proportions sum to one. The thesis begins by introducing mixture models in the choice context and develops new optimal design construction algorithms for choice experiments involving mixtures. Building further, varying the total amount of a mixture, and not only its i...
Directory of Open Access Journals (Sweden)
Carlo Baldassi
Full Text Available In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partner in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code.
Baldassi, Carlo; Zamparo, Marco; Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea
2014-01-01
In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partner in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code.
Buchari, M. A.; Mardiyanto, S.; Hendradjaya, B.
2018-03-01
The purpose of research on software defect prediction is to find software defects as early as possible. Software defect prediction is required not only to state the existence of defects, but also to give a prioritized list of the modules that require more intensive testing, so that test resources can be allocated efficiently. Learning to rank is one of the approaches that can provide defect module ranking data for the purposes of software testing. In this study, we propose a meta-heuristic chaotic Gaussian particle swarm optimization to improve the accuracy of the learning-to-rank software defect prediction approach. We used 11 public benchmark data sets as experimental data. Our overall results demonstrate that the prediction models constructed using Chaotic Gaussian Particle Swarm Optimization achieve better accuracy on 5 data sets, tie on 5 data sets and perform worse on 1 data set. Thus, we conclude that the application of Chaotic Gaussian Particle Swarm Optimization in the Learning-to-Rank approach can improve the accuracy of the defect module ranking in data sets that have high-dimensional features.
Variable selection for mixture and promotion time cure rate models.
Masud, Abdullah; Tu, Wanzhu; Yu, Zhangsheng
2016-11-16
Failure-time data with cured patients are common in clinical studies. Data from these studies are typically analyzed with cure rate models. Variable selection methods have not been well developed for cure rate models. In this research, we propose two least absolute shrinkage and selection operators based methods, for variable selection in mixture and promotion time cure models with parametric or nonparametric baseline hazards. We conduct an extensive simulation study to assess the operating characteristics of the proposed methods. We illustrate the use of the methods using data from a study of childhood wheezing. © The Author(s) 2016.
KONVERGENSI ESTIMATOR DALAM MODEL MIXTURE BERBASIS MISSING DATA [Convergence of the estimator in a mixture model based on missing data]
Directory of Open Access Journals (Sweden)
N Dwidayati
2014-11-01
Full Text Available A mixture model can estimate the proportion of cured patients and the survival function of uncured patients. In this study, a mixture model is developed for cure rate analysis based on missing data. Several methods can be used to analyze missing data; one of them is the EM algorithm, which is based on two steps: (1) the Expectation step and (2) the Maximization step. The EM algorithm is an iterative approach to learning a model from data with missing values in four steps: (1) choose an initial set of parameters for the model, (2) determine the expected values of the missing data, (3) induce new model parameters from the combination of the expected values and the original data, and (4) if the parameters have not converged, repeat step 2 using the new model. The study shows that, in the EM algorithm, the log-likelihood for the missing data increases after every iteration. Consequently, the sequence of likelihoods produced by the EM algorithm converges if the likelihood is bounded.
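The four EM steps listed in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' cure rate model: it assumes a two-component one-dimensional Gaussian mixture in which the component label of each observation plays the role of the missing data. The function names, starting values and tolerance are hypothetical choices made for this sketch.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, w, mu, sigma):
    """Observed-data log-likelihood of the two-component mixture."""
    return sum(
        math.log(w * normal_pdf(x, mu[0], sigma[0])
                 + (1.0 - w) * normal_pdf(x, mu[1], sigma[1]))
        for x in data
    )


def em_two_gaussians(data, mu=(-1.0, 1.0), sigma=(1.0, 1.0), w=0.5,
                     tol=1e-8, max_iter=200):
    """EM for a two-component Gaussian mixture (illustrative sketch only)."""
    # Step 1: choose an initial set of parameters.
    mu, sigma = list(mu), list(sigma)
    history = [log_likelihood(data, w, mu, sigma)]
    for _ in range(max_iter):
        # Step 2 (E-step): expected value of the missing labels,
        # i.e. the probability that each point belongs to component 0.
        resp = []
        for x in data:
            a = w * normal_pdf(x, mu[0], sigma[0])
            b = (1.0 - w) * normal_pdf(x, mu[1], sigma[1])
            resp.append(a / (a + b))
        # Step 3 (M-step): new parameters from expectations plus original data.
        n0 = sum(resp)
        n1 = len(data) - n0
        w = n0 / len(data)
        mu[0] = sum(r * x for r, x in zip(resp, data)) / n0
        mu[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        var0 = sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0
        var1 = sum((1.0 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1
        # A small floor keeps the standard deviations away from zero.
        sigma[0] = max(math.sqrt(var0), 1e-6)
        sigma[1] = max(math.sqrt(var1), 1e-6)
        # Step 4: stop once the log-likelihood no longer improves, else repeat.
        history.append(log_likelihood(data, w, mu, sigma))
        if history[-1] - history[-2] < tol:
            break
    return w, mu, sigma, history
```

The returned `history` list records the log-likelihood after every iteration, so the monotone increase mentioned in the abstract can be inspected directly on any data set.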
Directory of Open Access Journals (Sweden)
Mingyu Liu
2016-12-01
Full Text Available Nowadays, the use of freeform surfaces in various functional applications has become more widespread. Multi-sensor coordinate measuring machines (CMMs) are becoming popular and are produced by many CMM manufacturers since their measurement ability can be significantly improved with the help of different kinds of sensors. Moreover, the measurement accuracy after data fusion for multiple sensors can be improved. However, the improvement is affected by many issues in practice, especially when the measurement results have bias and there exists uncertainty regarding the data modelling method. This paper proposes a generic data modelling and data fusion method for the measurement of freeform surfaces using multi-sensor CMMs and attempts to study the factors which affect the fusion result. Based on the data modelling method for the original measurement datasets and the statistical Bayesian inference data fusion method, this paper presents a Gaussian process data modelling and maximum likelihood data fusion method for supporting multi-sensor CMM measurement of freeform surfaces. The datasets from different sensors are firstly modelled with the Gaussian process to obtain the mean surfaces and covariance surfaces, which represent the underlying surfaces and associated measurement uncertainties. Hence, the mean surfaces and the covariance surfaces are fused together with the maximum likelihood principle so as to obtain the statistically best estimated underlying surface and associated measurement uncertainty. With this fusion method, the overall measurement uncertainty after fusion is smaller than each of the single-sensor measurements. The capability of the proposed method is demonstrated through a series of simulations and real measurements of freeform surfaces on a multi-sensor CMM. The accuracy of the Gaussian process data modelling and the influence of the form error and measurement noise are also discussed and demonstrated in a series of experiments
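The claim that the fused uncertainty is smaller than that of either sensor can be illustrated with the textbook maximum likelihood fusion of two Gaussian estimates. The sketch below is a simplification under stated assumptions, not the paper's method: it treats each surface point independently, uses only pointwise variances (ignoring the full covariance surfaces), and assumes unbiased, mutually independent sensor errors. The function names are hypothetical.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Maximum likelihood fusion of two independent Gaussian estimates
    of the same quantity (inverse-variance weighting)."""
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused_var = 1.0 / (w_a + w_b)
    fused_mean = fused_var * (w_a * mean_a + w_b * mean_b)
    return fused_mean, fused_var


def fuse_surfaces(means_a, vars_a, means_b, vars_b):
    """Apply the pointwise fusion to two sampled mean/variance surfaces
    given as equally long sequences of heights and variances."""
    return [
        fuse(ma, va, mb, vb)
        for ma, va, mb, vb in zip(means_a, vars_a, means_b, vars_b)
    ]
```

Because the fused variance is `1 / (1/var_a + 1/var_b)`, it is always below the smaller of the two input variances, which matches the behaviour reported in the abstract for the idealized unbiased case.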
Non-Gaussian statistics, classical field theory, and realizable Langevin models
International Nuclear Information System (INIS)
Krommes, J.A.
1995-11-01
The direct-interaction approximation (DIA) to the fourth-order statistic $Z \sim \langle (\lambda\psi^2)^2 \rangle$, where $\lambda$ is a specified operator and $\psi$ is a random field, is discussed from several points of view distinct from that of Chen et al. [Phys. Fluids A 1, 1844 (1989)]. It is shown that the formula for $Z_{\mathrm{DIA}}$ already appeared in the seminal work of Martin, Siggia, and Rose [Phys. Rev. A 8, 423 (1973)] on the functional approach to classical statistical dynamics. It does not follow from the original generalized Langevin equation (GLE) of Leith [J. Atmos. Sci. 28, 145 (1971)] and Kraichnan [J. Fluid Mech. 41, 189 (1970)] (frequently described as an amplitude representation for the DIA), in which the random forcing is realized by a particular superposition of products of random variables. The relationship of that GLE to renormalized field theories with non-Gaussian corrections ("spurious vertices") is described. It is shown how to derive an improved representation, that realizes cumulants through $O(\psi^4)$, by adding to the GLE a particular non-Gaussian correction. A Markovian approximation $Z_{\mathrm{DIA}}^{M}$ to $Z_{\mathrm{DIA}}$ is derived. Both $Z_{\mathrm{DIA}}$ and $Z_{\mathrm{DIA}}^{M}$ incorrectly predict a Gaussian kurtosis for the steady state of a solvable three-mode example
Byrnes, Christian T; Tasinato, Gianmassimo; Wands, David
2012-01-01
We propose a method to probe higher-order correlators of the primordial density field through the inhomogeneity of local non-Gaussian parameters, such as f_NL, measured within smaller patches of the sky. Correlators between n-point functions measured in one patch of the sky and k-point functions measured in another patch depend upon the (n+k)-point functions over the entire sky. The inhomogeneity of non-Gaussian parameters may be a feasible way to detect or constrain higher-order correlators in local models of non-Gaussianity, as well as to distinguish between single and multiple-source scenarios for generating the primordial density perturbation, and more generally to probe the details of inflationary physics.
Learning conditional Gaussian networks
DEFF Research Database (Denmark)
Bøttcher, Susanne Gammelgaard
This paper considers conditional Gaussian networks. The parameters in the network are learned by using conjugate Bayesian analysis. As conjugate local priors, we apply the Dirichlet distribution for discrete variables and the Gaussian-inverse gamma distribution for continuous variables, given...... a configuration of the discrete parents. We assume parameter independence and complete data. Further, to learn the structure of the network, the network score is deduced. We then develop a local master prior procedure, for deriving parameter priors in these networks. This procedure satisfies parameter...... independence, parameter modularity and likelihood equivalence. Bayes factors to be used in model search are introduced. Finally the methods derived are illustrated by a simple example....
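For the discrete variables, the conjugate update mentioned in this abstract reduces to adding observed counts to the Dirichlet prior parameters. The sketch below shows only that textbook step for a single discrete node with complete data; it is not the paper's master prior procedure, and the function names are hypothetical.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate update: Dirichlet prior plus multinomial counts
    gives a Dirichlet posterior with parameters alpha + counts."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior and counts must have the same length")
    return [a + c for a, c in zip(prior_alpha, counts)]


def posterior_mean(alpha):
    """Expected state probabilities under a Dirichlet(alpha) distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]
```

For example, a uniform prior `[1, 1, 1]` combined with counts `[3, 0, 1]` gives posterior parameters `[4, 1, 2]`, so no state is assigned zero probability even though it was never observed.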
Directory of Open Access Journals (Sweden)
Fateh Nassim Melzi
2017-09-01
Full Text Available The large amount of data collected by smart meters is a valuable resource that can be used to better understand consumer behavior and optimize electricity consumption in cities. This paper presents an unsupervised classification approach for extracting typical consumption patterns from data generated by smart electric meters. The proposed approach is based on a constrained Gaussian mixture model whose parameters vary according to the day type (weekday, Saturday or Sunday). The proposed methodology is applied to a real dataset of Irish households collected by smart meters over one year. For each cluster, the model provides three consumption profiles that depend on the day type. In the first instance, the model is applied on the electricity consumption of users during one month to extract groups of consumers who exhibit similar consumption behaviors. The clustering results are then crossed with contextual variables available for the households to show the close links between electricity consumption and household socio-economic characteristics. At the second instance, the evolution of the consumer behavior from one month to another is assessed through variations of cluster sizes over time. The results show that the consumer behavior evolves over time depending on the contextual variables such as temperature fluctuations and calendar events.
KONVERGENSI ESTIMATOR DALAM MODEL MIXTURE BERBASIS MISSING DATA [Convergence of the estimator in a mixture model based on missing data]
Directory of Open Access Journals (Sweden)
N Dwidayati
2014-06-01
Full Text Available A mixture model can estimate the proportion of cured patients and the survival function of uncured patients. In this study, a mixture model is developed for cure rate analysis based on missing data. Several methods can be used to analyze missing data; one of them is the EM algorithm, which is based on two steps: (1) the Expectation step and (2) the Maximization step. The EM algorithm is an iterative approach to learning a model from data with missing values in four steps: (1) choose an initial set of parameters for the model, (2) determine the expected values of the missing data, (3) induce new model parameters from the combination of the expected values and the original data, and (4) if the parameters have not converged, repeat step 2 using the new model. The study shows that, in the EM algorithm, the log-likelihood for the missing data increases after every iteration. Consequently, the sequence of likelihoods produced by the EM algorithm converges if the likelihood is bounded.
Thresholding functional connectomes by means of mixture modeling.
Bielczyk, Natalia Z; Walocha, Fabian; Ebel, Patrick W; Haak, Koen V; Llera, Alberto; Buitelaar, Jan K; Glennon, Jeffrey C; Beckmann, Christian F
2018-05-01
Functional connectivity has been shown to be a very promising tool for studying the large-scale functional architecture of the human brain. In network research in fMRI, functional connectivity is considered as a set of pair-wise interactions between the nodes of the network. These interactions are typically operationalized through the full or partial correlation between all pairs of regional time series. Estimating the structure of the latent underlying functional connectome from the set of pair-wise partial correlations remains an open research problem though. Typically, this thresholding problem is approached by proportional thresholding, or by means of parametric or non-parametric permutation testing across a cohort of subjects at each possible connection. As an alternative, we propose a data-driven thresholding approach for network matrices on the basis of mixture modeling. This approach allows for creating subject-specific sparse connectomes by modeling the full set of partial correlations as a mixture of low correlation values associated with weak or unreliable edges in the connectome and a sparse set of reliable connections. Consequently, we propose to use an alternative thresholding strategy based on the model fit using pseudo-False Discovery Rates derived on the basis of the empirical null estimated as part of the mixture distribution. We evaluate the method on synthetic benchmark fMRI datasets where the underlying network structure is known, and demonstrate that it gives improved performance with respect to the alternative methods for thresholding connectomes, given the canonical thresholding levels. We also demonstrate that mixture modeling gives highly reproducible results when applied to the functional connectomes of the visual system derived from the n-back Working Memory task in the Human Connectome Project. The sparse connectomes obtained from mixture modeling are further discussed in the light of the previous knowledge of the functional architecture
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...
Scaled unscented transform Gaussian sum filter: Theory and application
Luo, Xiaodong
2010-05-01
In this work we consider the state estimation problem in nonlinear/non-Gaussian systems. We introduce a framework, called the scaled unscented transform Gaussian sum filter (SUT-GSF), which combines two ideas: the scaled unscented Kalman filter (SUKF) based on the concept of scaled unscented transform (SUT) (Julier and Uhlmann (2004) [16]), and the Gaussian mixture model (GMM). The SUT is used to approximate the mean and covariance of a Gaussian random variable which is transformed by a nonlinear function, while the GMM is adopted to approximate the probability density function (pdf) of a random variable through a set of Gaussian distributions. With these two tools, a framework can be set up to assimilate nonlinear systems in a recursive way. Within this framework, one can treat a nonlinear stochastic system as a mixture model of a set of sub-systems, each of which takes the form of a nonlinear system driven by a known Gaussian random process. Then, for each sub-system, one applies the SUKF to estimate the mean and covariance of the underlying Gaussian random variable transformed by the nonlinear governing equations of the sub-system. Incorporating the estimations of the sub-systems into the GMM gives an explicit (approximate) form of the pdf, which can be regarded as a "complete" solution to the state estimation problem, as all of the statistical information of interest can be obtained from the explicit form of the pdf (Arulampalam et al. (2002) [7]). In applications, a potential problem of a Gaussian sum filter is that the number of Gaussian distributions may increase very rapidly. To this end, we also propose an auxiliary algorithm to conduct pdf re-approximation so that the number of Gaussian distributions can be reduced. With the auxiliary algorithm, in principle the SUT-GSF can achieve almost the same computational speed as the SUKF if the SUT-GSF is implemented in parallel. As an example, we will use the SUT-GSF to assimilate a 40-dimensional system due to
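The role of the scaled unscented transform described in this abstract, approximating the mean and covariance of a Gaussian variable after a nonlinear map, can be sketched in one dimension. The code below uses the standard sigma-point weights of the scaled unscented transform; it is not the SUT-GSF itself, and the default values of `alpha`, `beta` and `kappa` are illustrative assumptions rather than settings from the paper.

```python
import math


def scaled_unscented_transform(mean, var, f, alpha=1.0, beta=2.0, kappa=2.0):
    """Propagate a 1-D Gaussian N(mean, var) through f using the
    scaled unscented transform (three sigma points for n = 1)."""
    n = 1
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    # Sigma points: the mean and two symmetric points around it.
    sigma_points = [mean, mean + spread, mean - spread]
    # Weights for the mean and for the variance estimates.
    w_side = 0.5 / (n + lam)
    w_mean = [lam / (n + lam), w_side, w_side]
    w_cov = [w_mean[0] + (1.0 - alpha ** 2 + beta), w_side, w_side]
    # Push every sigma point through the nonlinear function.
    ys = [f(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(w_mean, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(w_cov, ys))
    return y_mean, y_var
```

For a linear map such as `f(x) = 2 * x + 1` the transform reproduces the exact mean and variance, which is a convenient sanity check before applying it to a genuinely nonlinear model. In a Gaussian sum filter, a transform of this kind would be applied separately to each mixture component.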
Modeling viscosity and diffusion of plasma mixtures across coupling regimes
Arnault, Philippe
2014-10-01
Viscosity and diffusion of plasma for pure elements and multicomponent mixtures are modeled from the high-temperature low-density weakly coupled regime to the low-temperature high-density strongly coupled regime. Thanks to an atom in jellium modeling, the effect of electron screening on the ion-ion interaction is incorporated through a self-consistent definition of the ionization. This defines an effective One Component Plasma, or an effective Binary Ionic Mixture, that is representative of the strength of the interaction. For the viscosity and the interdiffusion of mixtures, approximate kinetic expressions are supplemented by mixing laws applied to the excess viscosity and self-diffusion of pure elements. The comparisons with classical and quantum molecular dynamics results reveal deviations in the range 20--40% on average with almost no predictions further than a factor of 2 over many decades of variation. Applications in the inertial confinement fusion context could help in predicting the growth of hydrodynamic instabilities.
Bayesian mixture models for source separation in MEG
International Nuclear Information System (INIS)
Calvetti, Daniela; Homa, Laura; Somersalo, Erkki
2011-01-01
This paper discusses the problem of imaging electromagnetic brain activity from measurements of the induced magnetic field outside the head. This imaging modality, magnetoencephalography (MEG), is known to be severely ill posed, and in order to obtain useful estimates for the activity map, complementary information needs to be used to regularize the problem. In this paper, a particular emphasis is on finding non-superficial focal sources that induce a magnetic field that may be confused with noise due to external sources and with distributed brain noise. The data are assumed to come from a mixture of a focal source and a spatially distributed possibly virtual source; hence, to differentiate between those two components, the problem is solved within a Bayesian framework, with a mixture model prior encoding the information that different sources may be concurrently active. The mixture model prior combines one density that favors strongly focal sources and another that favors spatially distributed sources, interpreted as clutter in the source estimation. Furthermore, to address the challenge of localizing deep focal sources, a novel depth sounding algorithm is suggested, and it is shown with simulated data that the method is able to distinguish between a signal arising from a deep focal source and a clutter signal. (paper)
Sand - rubber mixtures submitted to isotropic loading: a minimal model
Platzer, Auriane; Rouhanifar, Salman; Richard, Patrick; Cazacliu, Bogdan; Ibraim, Erdin
2017-06-01
The volume of scrap tyres, an undesired urban waste, is increasing rapidly in every country. Mixing sand and rubber particles as a lightweight backfill is one of the possible alternatives to avoid stockpiling them in the environment. This paper presents a minimal model aiming to capture the evolution of the void ratio of sand-rubber mixtures undergoing an isotropic compression loading. It is based on the idea that, submitted to a pressure, the rubber chips deform and partially fill the porous space of the system, leading to a decrease of the void ratio with increasing pressure. Our simple approach is capable of reproducing experimental data for two types of sand (a rounded one and a sub-angular one) and up to mixtures composed of 50% of rubber.
Multicomponent gas mixture air bearing modeling via lattice Boltzmann method
Tae Kim, Woo; Kim, Dehee; Hari Vemuri, Sesha; Kang, Soo-Choon; Seung Chung, Pil; Jhon, Myung S.
2011-04-01
As the demand for ultrahigh recording density increases, development of an integrated head disk interface (HDI) modeling tool, which considers the air bearing and lubricant film morphology simultaneously is of paramount importance. To overcome the shortcomings of the existing models based on the modified Reynolds equation (MRE), the lattice Boltzmann method (LBM) is a natural choice in modeling high Knudsen number (Kn) flows owing to its advantages over conventional methods. The transient and parallel nature makes this LBM an attractive tool for the next generation air bearing design. Although LBM has been successfully applied to single component systems, a multicomponent system analysis has been thwarted because of the complexity in coupling the terms for each component. Previous studies have shown good results in modeling immiscible component mixtures by use of an interparticle potential. In this paper, we extend our LBM model to predict the flow rate of high Kn pressure-driven flows in multicomponent gas mixture air bearings, such as the air-helium system. For accurate modeling of slip conditions near the wall, we adopt our LBM scheme with spatially dependent relaxation times for air bearings in HDIs. To verify the accuracy of our code, we tested our scheme via simple two-dimensional benchmark flows. In the pressure-driven flow of an air-helium mixture, we found that the simple linear combination of pure helium and pure air flow rates, based on helium and air mole fraction, gives considerable error when compared to our LBM calculation. Hybridization with the existing MRE database can be adopted with the procedure reported here to develop the state-of-the-art slider design software.
Tokuda, Tomoki; Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Directory of Open Access Journals (Sweden)
Tomoki Tokuda
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong
2015-01-01
semi-parametric Dirichlet process mixture (DPM) of normal distributions for three individual VOCs (chloroform, 1,4-DCB, and styrene). Goodness of fit for these full distribution models was also evaluated using simulated data. Specific Aim 2. Mixtures in the RIOPA VOC data set were identified using positive matrix factorization (PMF) and by toxicologic mode of action. Dependency structures of a mixture’s components were examined using mixture fractions and were modeled using copulas, which address correlations of multiple components across their entire distributions. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) were evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks were calculated for mixtures, and results from copulas and multivariate lognormal models were compared with risks based on RIOPA observations. Specific Aim 3. Exposure determinants were identified using stepwise regressions and linear mixed-effects models (LMMs). RESULTS Specific Aim 1. Extreme value exposures in RIOPA typically were best fitted by three-parameter generalized extreme value (GEV) distributions, and sometimes by the two-parameter Gumbel distribution. In contrast, lognormal distributions significantly underestimated both the level and likelihood of extreme values. Among the VOCs measured in RIOPA, 1,4-dichlorobenzene (1,4-DCB) was associated with the greatest cancer risks; for example, for the highest 10% of measurements of 1,4-DCB, all individuals had risk levels above 10^-4, and 13% of all participants had risk levels above 10^-2. Of the full-distribution models, the finite mixture of normal distributions with two to four clusters and the DPM of normal distributions had superior performance in comparison with the lognormal models. DPM distributions provided slightly better fit than the finite mixture distributions; the advantages of the DPM model were avoiding certain convergence issues associated
Davoodi, H.; Noori, M.
1990-07-01
The work presented in this paper constitutes the second phase of on-going research aimed at developing mathematical models for representing general hysteretic behavior of structures and approximation techniques for the computation and analysis of the response of hysteretic systems to random excitations. In this second part, the technique previously developed by the authors for the Gaussian response analysis of non-linear systems with general hysteretic behavior is extended for the non-Gaussian analysis of these systems. This approximation technique is based on the approach proposed independently by Ibrahim and Wu-Lin. In this work up to fourth order moments of the response co-ordinates are obtained for the Bouc-Baber-Wen smooth hysteresis model. These higher order statistics previously have not been made available for general hysteresis models by using existing approximation methods. Second order moments obtained for the model by this non-Gaussian closure scheme are compared with equivalent linearization and Gaussian closure results via Monte Carlo simulation (MCS). Higher order moments are compared with the simulation results. The study performed for a wide range of degradation parameters and input power spectral density (PSD) levels shows that the non-Gaussian responses obtained by this approach are in better agreement with the MCS results than the linearized and Gaussian ones. This approximation technique can provide information on higher order moments for general hysteretic systems. This information is valuable in random vibration and the reliability analysis of hysteretically yielding structures.
Kumari, Vandana; Kumar, Ayush; Saxena, Manoj; Gupta, Mridula
2018-01-01
The sub-threshold model formulation of Gaussian Doped Double Gate JunctionLess (GD-DG-JL) FET including source/drain depletion length is reported in the present work under the assumption that the ungated regions are fully depleted. To provide deeper insight into the device performance, the impact of Gaussian straggle, channel length, oxide and channel thickness, and high-k gate dielectric has been studied using extensive TCAD device simulation.
Detecting Multiple Random Changepoints in Bayesian Piecewise Growth Mixture Models.
Lock, Eric F; Kohli, Nidhi; Bose, Maitreyee
2017-11-17
Piecewise growth mixture models are a flexible and useful class of methods for analyzing segmented trends in individual growth trajectory over time, where the individuals come from a mixture of two or more latent classes. These models allow each segment of the overall developmental process within each class to have a different functional form; examples include two linear phases of growth, or a quadratic phase followed by a linear phase. The changepoint (knot) is the time of transition from one developmental phase (segment) to another. Inferring the location of the changepoint(s) is often of practical interest, along with inference for other model parameters. A random changepoint allows for individual differences in the transition time within each class. The primary objectives of our study are as follows: (1) to develop a PGMM using a Bayesian inference approach that allows the estimation of multiple random changepoints within each class; (2) to develop a procedure to empirically detect the number of random changepoints within each class; and (3) to empirically investigate the bias and precision of the estimation of the model parameters, including the random changepoints, via a simulation study. We have developed the user-friendly package BayesianPGMM for R to facilitate the adoption of this methodology in practice, which is available at https://github.com/lockEF/BayesianPGMM . We describe an application to mouse-tracking data for a visual recognition task.
Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong
2014-06-01
-parametric Dirichlet process mixture (DPM) of normal distributions for three individual VOCs (chloroform, 1,4-DCB, and styrene). Goodness of fit for these full distribution models was also evaluated using simulated data. Specific Aim 2. Mixtures in the RIOPA VOC data set were identified using positive matrix factorization (PMF) and by toxicologic mode of action. Dependency structures of a mixture's components were examined using mixture fractions and were modeled using copulas, which address correlations of multiple components across their entire distributions. Five candidate copulas (Gaussian, t, Gumbel, Clayton, and Frank) were evaluated, and the performance of fitted models was evaluated using simulation and mixture fractions. Cumulative cancer risks were calculated for mixtures, and results from copulas and multivariate lognormal models were compared with risks based on RIOPA observations. Specific Aim 3. Exposure determinants were identified using stepwise regressions and linear mixed-effects models (LMMs). Specific Aim 1. Extreme value exposures in RIOPA typically were best fitted by three-parameter generalized extreme value (GEV) distributions, and sometimes by the two-parameter Gumbel distribution. In contrast, lognormal distributions significantly underestimated both the level and likelihood of extreme values. Among the VOCs measured in RIOPA, 1,4-dichlorobenzene (1,4-DCB) was associated with the greatest cancer risks; for example, for the highest 10% of measurements of 1,4-DCB, all individuals had risk levels above 10^-4, and 13% of all participants had risk levels above 10^-2. Of the full-distribution models, the finite mixture of normal distributions with two to four clusters and the DPM of normal distributions had superior performance in comparison with the lognormal models. DPM distributions provided slightly better fit than the finite mixture distributions; the advantages of the DPM model were avoiding certain convergence issues associated with the finite mixture
International Nuclear Information System (INIS)
Rawat, Gopal; Kumar, Sanjay; Goel, Ekta; Kumar, Mirgender; Jit, S.; Dubey, Sarvesh
2014-01-01
This paper presents the analytical modeling of subthreshold current and subthreshold swing of short-channel fully-depleted (FD) strained-Si-on-insulator (SSOI) MOSFETs having vertical Gaussian-like doping profile in the channel. The subthreshold current and subthreshold swing have been derived using the parabolic approximation method. In addition to the effect of strain on silicon layer, various other device parameters such as channel length (L), gate-oxide thickness (t_ox), strained-Si channel thickness (t_s-Si), peak doping concentration (N_P), projected range (R_p) and straggle (σ_p) of the Gaussian profile have been considered while predicting the device characteristics. The present work may help to overcome the degradation in subthreshold characteristics with strain engineering. These subthreshold current and swing models provide valuable information for strained-Si MOSFET design. Accuracy of the proposed models is verified using the commercially available ATLAS™, a two-dimensional (2D) device simulator from SILVACO. (semiconductor devices)
On population size estimators in the Poisson mixture model.
Mao, Chang Xuan; Yang, Nan; Zhong, Jinhua
2013-09-01
Estimating population sizes via capture-recapture experiments has enormous applications. The Poisson mixture model can be adopted for those applications with a single list in which individuals appear one or more times. We compare several nonparametric estimators, including the Chao estimator, the Zelterman estimator, two jackknife estimators and the bootstrap estimator. The target parameter of the Chao estimator is a lower bound of the population size. Those of the other four estimators are not lower bounds, and they may produce lower confidence limits for the population size with poor coverage probabilities. A simulation study is reported and two examples are investigated. © 2013, The International Biometric Society.
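As a rough illustration of two of the estimators compared above, the sketch below computes the Chao and Zelterman estimates from single-list count data using their standard textbook forms. The function names and the toy counts are hypothetical and are not taken from the paper; the jackknife and bootstrap estimators are not shown.

```python
import math
from collections import Counter


def _f1_f2(captures):
    """Return (f1, f2): how many individuals were seen exactly once / twice."""
    freq = Counter(captures)
    return freq.get(1, 0), freq.get(2, 0)


def chao_estimator(captures):
    """Chao estimate: n + f1^2 / (2 * f2), where n is the number observed.

    `captures` holds one positive integer per observed individual
    (how many times that individual appears in the list).
    """
    n = len(captures)
    f1, f2 = _f1_f2(captures)
    if f2 == 0:
        raise ValueError("Chao estimator needs at least one doubleton (f2 > 0)")
    return n + f1 * f1 / (2 * f2)


def zelterman_estimator(captures):
    """Zelterman estimate: n / (1 - exp(-lam)) with lam = 2 * f2 / f1."""
    n = len(captures)
    f1, f2 = _f1_f2(captures)
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman estimator needs f1 > 0 and f2 > 0")
    lam = 2 * f2 / f1
    return n / (1 - math.exp(-lam))


# Hypothetical toy data: 50 singletons, 20 doubletons, 5 tripletons.
toy_counts = [1] * 50 + [2] * 20 + [3] * 5
chao = chao_estimator(toy_counts)
zelterman = zelterman_estimator(toy_counts)
```

Both functions use only the singleton and doubleton counts beyond the number observed, which is why they are sensitive to how rarely seen individuals are recorded.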
Modeling adsorption of liquid mixtures on porous materials
DEFF Research Database (Denmark)
Monsalvo, Matias Alfonso; Shapiro, Alexander
2009-01-01
The multicomponent potential theory of adsorption (MPTA), which was previously applied to adsorption from gases, is extended onto adsorption of liquid mixtures on porous materials. In the MPTA, the adsorbed fluid is considered as an inhomogeneous liquid with thermodynamic properties that depend...... of the MPTA onto liquids has been tested on experimental binary and ternary adsorption data. We show that, for the set of experimental data considered in this work, the MPTA model is capable of correlating binary adsorption equilibria. Based on binary adsorption data, the theory can then predict ternary...
Experiments with Mixtures Designs, Models, and the Analysis of Mixture Data
Cornell, John A
2011-01-01
The most comprehensive, single-volume guide to conducting experiments with mixtures. "If one is involved, or heavily interested, in experiments on mixtures of ingredients, one must obtain this book. It is, as was the first edition, the definitive work." -Short Book Reviews (Publication of the International Statistical Institute). "The text contains many examples with worked solutions and with its extensive coverage of the subject matter will prove invaluable to those in the industrial and educational sectors whose work involves the design and analysis of mixture experiments." -Journal of the Royal S
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Directory of Open Access Journals (Sweden)
L. Li
2012-02-01
The normal-score ensemble Kalman filter (NS-EnKF) is tested on a synthetic aquifer characterized by the presence of channels with a bimodal distribution of its hydraulic conductivities. This is a clear example of an aquifer that cannot be characterized by a multiGaussian distribution. Fourteen scenarios are analyzed which differ among them in one or various of the following aspects: the prior random function model, the boundary conditions of the flow problem, the number of piezometers used in the assimilation process, or the use of covariance localization in the implementation of the Kalman filter. The performance of the NS-EnKF is evaluated through the ensemble mean and variance maps, the connectivity patterns of the individual conductivity realizations and the degree of reproduction of the piezometric heads. The results show that (i) the localized NS-EnKF can characterize the non-multiGaussian underlying hydraulic distribution even when an erroneous prior random function model is used, (ii) localization plays an important role to prevent filter inbreeding and results in a better logconductivity characterization, and (iii) the NS-EnKF works equally well under very different flow configurations.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Microbial comparative pan-genomics using binomial mixture models
Directory of Open Access Journals (Sweden)
Ussery David W
2009-08-01
Background: The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results: We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families in Buchnera aphidicola) to large (around 43000 gene families in Escherichia coli). Results for Escherichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion: Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find are always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.
New Flexible Models and Design Construction Algorithms for Mixtures and Binary Dependent Variables
A. Ruseckaite (Aiste)
2017-01-01
This thesis discusses new mixture(-amount) models, choice models and the optimal design of experiments. Two chapters of the thesis relate to the so-called mixture, which is a product or service whose ingredients’ proportions sum to one. The thesis begins by introducing mixture
Metin, Baris; Wiersema, Jan R; Verguts, Tom; Gasthuys, Roos; van Der Meere, Jacob J; Roeyers, Herbert; Sonuga-Barke, Edmund
2014-12-06
According to the state regulation deficit (SRD) account, ADHD is associated with a problem using effort to maintain an optimal activation state under demanding task settings such as very fast or very slow event rates. This leads to a prediction of disrupted performance at event rate extremes reflected in higher Gaussian response variability that is a putative marker of activation during motor preparation. In the current study, we tested this hypothesis using ex-Gaussian modeling, which distinguishes Gaussian from non-Gaussian variability. Twenty-five children with ADHD and 29 typically developing controls performed a simple Go/No-Go task under four different event-rate conditions. There was an accentuated quadratic relationship between event rate and Gaussian variability in the ADHD group compared to the controls. The children with ADHD had greater Gaussian variability at very fast and very slow event rates but not at moderate event rates. The results provide evidence for the SRD account of ADHD. However, given that this effect did not explain all group differences (some of which were independent of event rate) other cognitive and/or motivational processes are also likely implicated in ADHD performance deficits.
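The ex-Gaussian distribution used above is the sum of a Gaussian component (mean mu, standard deviation sigma) and an independent exponential component (mean tau), so sigma carries the Gaussian variability and tau the non-Gaussian tail. The sketch below shows the moment relations and a method-of-moments inversion. The abstract does not state the fitting procedure, so this is only an illustrative alternative; the function names and the millisecond values are hypothetical.

```python
import math
import random


def ex_gaussian_moments(mu, sigma, tau):
    """Mean, variance and skewness of Normal(mu, sigma) + Exponential(mean=tau)."""
    mean = mu + tau
    var = sigma ** 2 + tau ** 2
    skew = 2 * tau ** 3 / var ** 1.5
    return mean, var, skew


def moments_to_params(mean, var, skew):
    """Invert the moment relations (method of moments); needs 0 < skew < 2."""
    tau = math.sqrt(var) * (skew / 2) ** (1 / 3)
    sigma = math.sqrt(var - tau ** 2)
    mu = mean - tau
    return mu, sigma, tau


def sample_ex_gaussian(mu, sigma, tau, n, rng):
    """Draw n simulated response times as Gaussian plus exponential parts."""
    return [rng.gauss(mu, sigma) + rng.expovariate(1 / tau) for _ in range(n)]


# Hypothetical response-time parameters in milliseconds.
true_params = (400.0, 40.0, 100.0)
recovered = moments_to_params(*ex_gaussian_moments(*true_params))
simulated = sample_ex_gaussian(*true_params, n=1000, rng=random.Random(1))
```

In practice sample skewness is noisy, which is one reason maximum likelihood is often preferred over moment matching for small trial counts.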
Tractography segmentation using a hierarchical Dirichlet processes mixture model.
Wang, Xiaogang; Grimson, W Eric L; Westin, Carl-Fredrik
2011-01-01
In this paper, we propose a new nonparametric Bayesian framework to cluster white matter fiber tracts into bundles using a hierarchical Dirichlet processes mixture (HDPM) model. The number of clusters is automatically learned driven by data with a Dirichlet process (DP) prior instead of being manually specified. After the models of bundles have been learned from training data without supervision, they can be used as priors to cluster/classify fibers of new subjects for comparison across subjects. When clustering fibers of new subjects, new clusters can be created for structures not observed in the training data. Our approach does not require computing pairwise distances between fibers and can cluster a huge set of fibers across multiple subjects. We present results on several data sets, the largest of which has more than 120,000 fibers. Copyright © 2010 Elsevier Inc. All rights reserved.
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
Clustering disaggregated load profiles using a Dirichlet process mixture model
International Nuclear Information System (INIS)
Granell, Ramon; Axon, Colin J.; Wallom, David C.H.
2015-01-01
Highlights: • We show that the Dirichlet process mixture model is scalable. • Our model does not require the number of clusters as an input. • Our model creates clusters only by the features of the demand profiles. • We have used both residential and commercial data sets. - Abstract: The increasing availability of substantial quantities of power-use data in both the residential and commercial sectors raises the possibility of mining the data to the advantage of both consumers and network operations. We present a Bayesian non-parametric model to cluster load profiles from households and business premises. Evaluators show that our model performs as well as other popular clustering methods, but unlike most other methods it does not require the number of clusters to be predetermined by the user. We used the so-called ‘Chinese restaurant process’ method to solve the model, making use of the Dirichlet-multinomial distribution. The number of clusters grew logarithmically with the quantity of data, making the technique suitable for scaling to large data sets. We were able to show that the model could distinguish features such as the nationality, household size, and type of dwelling between the cluster memberships.
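The logarithmic growth in the number of clusters can be seen already in the Chinese restaurant process prior, without any data. The sketch below simulates only that prior, with a hypothetical concentration parameter alpha; it does not include the Dirichlet-multinomial likelihood that the paper's model uses to assign load profiles, and the function names are illustrative.

```python
import random


def crp_cluster_sizes(n, alpha, rng):
    """Seat n items by the Chinese restaurant process; return cluster sizes.

    Item i (0-based) opens a new cluster with probability alpha / (i + alpha)
    and otherwise joins an existing cluster with probability proportional
    to that cluster's current size.
    """
    sizes = []
    for i in range(n):
        r = rng.random() * (i + alpha)
        if r < alpha:
            sizes.append(1)
            continue
        r -= alpha
        for k, size in enumerate(sizes):
            if r < size:
                sizes[k] += 1
                break
            r -= size
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


def expected_clusters(n, alpha):
    """Exact prior expectation of the number of clusters after n items."""
    return sum(alpha / (alpha + i) for i in range(n))


sizes = crp_cluster_sizes(1000, alpha=1.0, rng=random.Random(0))
growth = [expected_clusters(n, 1.0) for n in (100, 1000, 10000)]
```

Each tenfold increase in n adds roughly alpha * ln(10) expected clusters, which is the logarithmic behaviour the abstract refers to.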
Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering
Xiang, Sijia; Yao, Weixin
2017-01-01
In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...
Energy Technology Data Exchange (ETDEWEB)
Thienpont, Benedicte; Barata, Carlos [Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA, CSIC), Jordi Girona, 18-26, 08034 Barcelona (Spain); Raldúa, Demetrio, E-mail: drpqam@cid.csic.es [Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA, CSIC), Jordi Girona, 18-26, 08034 Barcelona (Spain); Maladies Rares: Génétique et Métabolisme (MRGM), University of Bordeaux, EA 4576, F-33400 Talence (France)
2013-06-01
Maternal thyroxine (T4) plays an essential role in fetal brain development, and even mild and transitory deficits in free-T4 in pregnant women can produce irreversible neurological effects in their offspring. Women of childbearing age are daily exposed to mixtures of chemicals disrupting the thyroid gland function (TGFDs) through the diet, drinking water, air and pharmaceuticals, which has raised the highest concern for the potential additive or synergic effects on the development of mild hypothyroxinemia during early pregnancy. Recently we demonstrated that zebrafish eleutheroembryos provide a suitable alternative model for screening chemicals impairing the thyroid hormone synthesis. The present study used the intrafollicular T4-content (IT4C) of zebrafish eleutheroembryos as integrative endpoint for testing the hypotheses that the effect of mixtures of TGFDs with a similar mode of action [inhibition of thyroid peroxidase (TPO)] was well predicted by a concentration addition concept (CA) model, whereas the response addition concept (RA) model predicted better the effect of dissimilarly acting binary mixtures of TGFDs [TPO-inhibitors and sodium-iodide symporter (NIS)-inhibitors]. However, the CA model provided better prediction of joint effects than RA in five out of the six tested mixtures. The exception was the mixture MMI (TPO-inhibitor)-KClO4 (NIS-inhibitor), dosed at a fixed ratio of EC10, which provided similar CA and RA predictions, so that no conclusive result could be drawn. These results support the phenomenological similarity criterion stating that the concept of concentration addition could be extended to mixture constituents having common apical endpoints or common adverse outcomes. - Highlights: • Potential synergic or additive effect of mixtures of chemicals on thyroid function. • Zebrafish as alternative model for testing the effect of mixtures of goitrogens. • Concentration addition seems to predict better the effect of
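The two reference models compared above can be sketched for a binary mixture as follows. The sketch assumes Hill-type concentration-response curves with hypothetical EC50 and slope values; the abstract does not give the fitted curves, so none of the numbers below come from the study. Concentration addition (CA) finds the effect level x at which the toxic units c_i / ECx_i sum to one; response addition (RA, also called independent action) combines the individual effects as probabilities.

```python
def hill_effect(conc, ec50, slope):
    """Fractional effect (0..1) of one chemical at concentration conc."""
    if conc <= 0:
        return 0.0
    return conc ** slope / (ec50 ** slope + conc ** slope)


def hill_ecx(x, ec50, slope):
    """Concentration of one chemical that produces fractional effect x."""
    return ec50 * (x / (1 - x)) ** (1 / slope)


def response_addition(concs, curves):
    """RA: 1 - prod(1 - E_i(c_i)) over the mixture components."""
    unaffected = 1.0
    for conc, (ec50, slope) in zip(concs, curves):
        unaffected *= 1 - hill_effect(conc, ec50, slope)
    return 1 - unaffected


def concentration_addition(concs, curves, iterations=200):
    """CA: solve sum(c_i / ECx_i) = 1 for the effect x by bisection."""
    if all(c <= 0 for c in concs):
        return 0.0
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iterations):
        mid = (lo + hi) / 2
        toxic_units = sum(c / hill_ecx(mid, e, s) for c, (e, s) in zip(concs, curves))
        if toxic_units > 1:
            lo = mid  # mixture is stronger than effect level mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical mixture of two chemicals with identical curves (EC50 = 1, slope = 2).
curves = [(1.0, 2.0), (1.0, 2.0)]
ca = concentration_addition([0.5, 0.5], curves)
ra = response_addition([0.5, 0.5], curves)
```

With identical curves, CA reduces to the single-chemical response at the summed concentration, while RA gives a different value; the gap between the two predictions is what makes a mixture experiment informative about which model applies.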
A smooth mixture of Tobits model for healthcare expenditure.
Keane, Michael; Stavrunova, Olena
2011-09-01
This paper develops a smooth mixture of Tobits (SMTobit) model for healthcare expenditure. The model is a generalization of the smoothly mixing regressions framework of Geweke and Keane (J Econometrics 2007; 138: 257-290) to the case of a Tobit-type limited dependent variable. A Markov chain Monte Carlo algorithm with data augmentation is developed to obtain the posterior distribution of model parameters. The model is applied to the US Medicare Current Beneficiary Survey data on total medical expenditure. The results suggest that the model can capture the overall shape of the expenditure distribution very well, and also provide a good fit to a number of characteristics of the conditional (on covariates) distribution of expenditure, such as the conditional mean, variance and probability of extreme outcomes, as well as the 50th, 90th, and 95th, percentiles. We find that healthier individuals face an expenditure distribution with lower mean, variance and probability of extreme outcomes, compared with their counterparts in a worse state of health. Males have an expenditure distribution with higher mean, variance and probability of an extreme outcome, compared with their female counterparts. The results also suggest that heart and cardiovascular diseases affect the expenditure of males more than that of females. Copyright © 2011 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Arghya Chakravorty
2018-03-01
Conventional modeling techniques to model macromolecular solvation and its effect on binding in the framework of Poisson-Boltzmann based implicit solvent models make use of a geometrically defined surface to depict the separation of macromolecular interior (low dielectric constant) from the solvent phase (high dielectric constant). Though this simplification saves time and computational resources without significantly compromising the accuracy of free energy calculations, it bypasses some of the key physico-chemical properties of the solute-solvent interface, e.g., the altered flexibility of water molecules and that of side chains at the interface, which results in dielectric properties different from both bulk water and macromolecular interior, respectively. Here we present a Gaussian-based smooth dielectric model, an inhomogeneous dielectric distribution model that mimics the effect of macromolecular flexibility and captures the altered properties of surface bound water molecules. Thus, the model delivers a smooth transition of dielectric properties from the macromolecular interior to the solvent phase, eliminating any unphysical surface separating the two phases. Using various examples of macromolecular binding, we demonstrate its utility and illustrate the comparison with the conventional 2-dielectric model. We also showcase some additional abilities of this model, viz. to account for the effect of electrolytes in the solution and to render the distribution profile of water across a lipid membrane.
Extracting Spurious Latent Classes in Growth Mixture Modeling with Nonnormal Errors
Guerra-Peña, Kiero; Steinley, Douglas
2016-01-01
Growth mixture modeling is generally used for two purposes: (1) to identify mixtures of normal subgroups and (2) to approximate oddly shaped distributions by a mixture of normal components. Often in applied research this methodology is applied to both of these situations indistinctly: using the same fit statistics and likelihood ratio tests. This…
Zheng, Qiang; Li, Honglun; Fan, Baode; Wu, Shuanhu; Xu, Jindong
2017-12-01
The active contour model (ACM) has been one of the most widely utilized methods in magnetic resonance (MR) brain image segmentation because of its ability to capture topology changes. However, most of the existing ACMs only consider single-slice information in MR brain image data, i.e., the information used in ACM-based segmentation methods is extracted from only one slice of the MR brain image, which cannot take full advantage of the adjacent slice images' information, and cannot satisfy the local segmentation of MR brain images. In this paper, a novel ACM is proposed to solve the problem discussed above, which is based on a multivariate local Gaussian distribution and combines the adjacent slice images' information in MR brain image data to satisfy segmentation. The segmentation is finally achieved through maximum likelihood estimation. Experiments demonstrate the advantages of the proposed ACM over the single-slice ACM in local segmentation of MR brain image series.
Induced polarization of clay-sand mixtures: experiments and modeling
International Nuclear Information System (INIS)
Okay, G.; Leroy, P.; Tournassat, C.; Ghorbani, A.; Jougnot, D.; Cosenza, P.; Camerlynck, C.; Cabrera, J.; Florsch, N.; Revil, A.
2012-01-01
were performed with a cylindrical four-electrode sample-holder (cylinder made of PVC, 30 cm in length and 19 cm in diameter) associated with a SIP-Fuchs II impedance meter and non-polarizing Cu/CuSO4 electrodes. These electrodes were installed at 10 cm from the base of the sample holder and regularly spaced (every 90 degrees). The results illustrate the strong impact of the Cation Exchange Capacity (CEC) of the clay minerals upon the complex conductivity. The amplitude of the in-phase conductivity of the kaolinite-clay samples is strongly dependent on the saturating fluid salinity for all volumetric clay fractions, whereas the in-phase conductivity of the smectite-clay samples is quite independent of the salinity, except at low clay content (5% and 1% of clay in volume). This is due to the strong and constant surface conductivity of smectite associated with its very high CEC. The quadrature conductivity increases steadily with the CEC and the clay content. We observe that the dependence on frequency of the quadrature conductivity of sand-kaolinite mixtures is more important than for sand-bentonite mixtures. For both types of clay, the quadrature conductivity seems to be fairly independent of the pore fluid salinity except at very low clay contents (1% in volume of kaolinite-clay). This is due to the constant surface site density of Na counter-ions in the Stern layer of clay materials. At the lowest clay content (1%), the magnitude of the quadrature conductivity increases with the salinity, as expected for silica sands. In this case, the surface site density of Na counter-ions in the Stern layer increases with salinity. The experimental data show good agreement with predicted values given by our Spectral Induced Polarization (SIP) model. This complex conductivity model considers the electrochemical polarization of the Stern layer coating the clay particles and the Maxwell-Wagner polarization.
We use the differential effective medium theory to calculate the complex
Additivity of statistical moments in the exponentially modified Gaussian model of chromatography
International Nuclear Information System (INIS)
Howerton, Samuel B.; Lee Chomin; McGuffin, Victoria L.
2002-01-01
A homologous series of saturated fatty acids ranging from C10 to C22 was separated by reversed-phase capillary liquid chromatography. The resultant zone profiles were found to be fit best by an exponentially modified Gaussian (EMG) function. To compare the EMG function and statistical moments for the analysis of the experimental zone profiles, a series of simulated profiles was generated by using fixed values for retention time and different values for the symmetrical (σ) and asymmetrical (τ) contributions to the variance. The simulated profiles were modified with respect to the integration limits, the number of points, and the signal-to-noise ratio. After modification, each profile was analyzed by using statistical moments and an iteratively fit EMG equation. These data indicate that the statistical moment method is much more susceptible to error when the degree of asymmetry is large, when the integration limits are inappropriately chosen, when the number of points is small, and when the signal-to-noise ratio is small. The experimental zone profiles were then analyzed by using the statistical moment and EMG methods. Although care was taken to minimize the sources of error discussed above, significant differences were found between the two methods. The differences in the second moment suggest that the symmetrical and asymmetrical contributions to broadening in the experimental zone profiles are not independent. As a consequence, the second moment is not equal to the sum of σ² and τ², as is commonly assumed. This observation has important implications for the elucidation of thermodynamic and kinetic information from chromatographic zone profiles.
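As a point of reference for the comparison described above: for an ideal EMG profile the first moment is the Gaussian centre plus τ, and the second central moment is σ² + τ². The sketch below checks this numerically and shows how an upper integration limit set too early shrinks the measured second moment. It is a minimal illustration, not the authors' procedure; the parameter values, the integration limits, and the function names `emg_pdf` and `moments` are assumptions.

```python
import math

def emg_pdf(t, mu, sigma, tau):
    """EMG density: a Gaussian (mu, sigma) convolved with an exponential decay (tau)."""
    z = (sigma / tau - (t - mu) / sigma) / math.sqrt(2.0)
    return math.exp(sigma**2 / (2.0 * tau**2) - (t - mu) / tau) * math.erfc(z) / (2.0 * tau)

def moments(mu, sigma, tau, lo, hi, n=20000):
    """Area, first moment, and second central moment by trapezoidal integration on [lo, hi]."""
    dt = (hi - lo) / n
    ts = [lo + i * dt for i in range(n + 1)]
    fs = [emg_pdf(t, mu, sigma, tau) for t in ts]
    ws = [0.5 if i in (0, n) else 1.0 for i in range(n + 1)]
    area = sum(w * f for w, f in zip(ws, fs)) * dt
    m1 = sum(w * f * t for w, f, t in zip(ws, fs, ts)) * dt / area
    m2 = sum(w * f * (t - m1) ** 2 for w, f, t in zip(ws, fs, ts)) * dt / area
    return area, m1, m2

# Hypothetical profile parameters (arbitrary time units)
mu, sigma, tau = 10.0, 0.5, 1.0

# Wide limits: moments agree with mu + tau and sigma^2 + tau^2
area, m1, m2 = moments(mu, sigma, tau, mu - 8 * sigma, mu + 8 * sigma + 30 * tau)

# Upper limit set too early: the tail is cut off and the second moment is underestimated
_, _, m2_cut = moments(mu, sigma, tau, mu - 8 * sigma, mu + 3.0)
```

The abstract's finding is that real zone profiles depart from this additivity, so the identity holds only for the idealized function used here.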
Toxicological risk assessment of complex mixtures through the Wtox model
Directory of Open Access Journals (Sweden)
William Gerson Matias
2015-01-01
Mathematical models are important tools for environmental management and risk assessment. Predictions about the toxicity of chemical mixtures must be enhanced due to the complexity of effects that can be caused to the living species. In this work, the environmental risk was assessed addressing the need to study the relationship between the organism and xenobiotics. Therefore, five toxicological endpoints were applied through the WTox Model, and with this methodology we obtained the risk classification of potentially toxic substances. Acute and chronic toxicity, cytotoxicity and genotoxicity were observed in the organisms Daphnia magna, Vibrio fischeri and Oreochromis niloticus. A case study was conducted with solid wastes from textile, metal-mechanic and pulp and paper industries. The results have shown that several industrial wastes induced mortality, reproductive effects, micronucleus formation and increases in the rate of lipid peroxidation and DNA methylation of the organisms tested. These results, analyzed together through the WTox Model, allowed the classification of the environmental risk of industrial wastes. The evaluation showed that the toxicological environmental risk of the samples analyzed can be classified as significant or critical.
Directory of Open Access Journals (Sweden)
E. Simon
2009-11-01
We consider the application of the Ensemble Kalman Filter (EnKF) to a coupled ocean ecosystem model (HYCOM-NORWECOM). Such models, especially the ecosystem models, are characterized by strongly non-linear interactions active in ocean blooms and present important difficulties for the use of data assimilation methods based on linear statistical analysis. Besides the non-linearity of the model, one is confronted with the model constraints, the analysis state having to be consistent with the model, especially with respect to the constraints that some of the variables have to be positive. Furthermore the non-Gaussian distributions of the biogeochemical variables break an important assumption of the linear analysis, leading to a loss of optimality of the filter. We present an extension of the EnKF dealing with these difficulties by introducing a non-linear change of variables (anamorphosis function) in order to execute the analysis step in a Gaussian space, namely a space where the distributions of the transformed variables are Gaussian. We present also the initial results of the application of this non-Gaussian extension of the EnKF to the assimilation of simulated chlorophyll surface concentration data in a North Atlantic configuration of the HYCOM-NORWECOM coupled model.
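The idea of an anamorphosis function can be illustrated with a rank-based construction: each ensemble value is mapped to the standard normal quantile of its empirical rank, so a skewed, positive variable becomes approximately Gaussian before the analysis step. The abstract does not say how the authors build their transformation, so the sketch below is only one common empirical choice; the function name `empirical_anamorphosis` and the sample values are hypothetical.

```python
from statistics import NormalDist

def empirical_anamorphosis(values):
    """Map each value to the standard normal quantile of its empirical rank.

    Ties are not treated specially; this sketch assumes distinct values.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # Plotting position (rank + 0.5) / n keeps the quantile strictly inside (0, 1)
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores

# Hypothetical ensemble of a positive, skewed variable (e.g. a surface concentration)
chl = [0.01, 0.02, 0.05, 0.3, 2.5]
scores = empirical_anamorphosis(chl)
```

Because the map is monotonic, it can be inverted (by interpolation between ensemble values) to bring the analysed state back to physical space, which also keeps positive variables positive.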
Bayesian Kernel Mixtures for Counts.
Canale, Antonio; Dunson, David B
2011-12-01
Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.
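To make the rounded Gaussian kernel concrete, the sketch below computes the probability mass function obtained by rounding a latent normal variable to a count, and shows that its variance can fall below its mean, which a Poisson kernel cannot do. The thresholds used here (zero for a non-positive latent value, otherwise rounding up to the next integer) are an assumption for illustration, as are the parameter values and the function name.

```python
from statistics import NormalDist

def rounded_gaussian_pmf(j, mu, sigma):
    """P(y = j) for y = 0 if y* <= 0 and y = j if j-1 < y* <= j, with y* ~ N(mu, sigma^2)."""
    latent = NormalDist(mu, sigma)
    if j == 0:
        return latent.cdf(0.0)
    return latent.cdf(float(j)) - latent.cdf(float(j - 1))

# Hypothetical kernel parameters giving an underdispersed count distribution
mu, sigma = 10.0, 0.5
pmf = [rounded_gaussian_pmf(j, mu, sigma) for j in range(51)]

mean = sum(j * p for j, p in enumerate(pmf))
var = sum((j - mean) ** 2 * p for j, p in enumerate(pmf))
```

A mixture of such kernels, with weights and kernel parameters given a nonparametric prior, is the model class the abstract proposes; only the single-kernel building block is sketched here.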
Maximum Likelihood in a Generalized Linear Finite Mixture Model by Using the EM Algorithm
Jansen, R.C.
A generalized linear finite mixture model and an EM algorithm to fit the model to data are described. By this approach the finite mixture model is embedded within the general framework of generalized linear models (GLMs). Implementation of the proposed EM algorithm can be readily done in statistical
Directory of Open Access Journals (Sweden)
Viet Dung Cao
2013-10-01
Background: We extend the "Wedding Ring" agent-based model of marriage formation to include some empirical information on the natural population change for the United Kingdom together with behavioural explanations that drive the observed nuptiality trends. Objective: We propose a method to explore statistical properties of agent-based demographic models. By coupling rule-based explanations driving the agent-based model with observed data we wish to bring agent-based modelling and demographic analysis closer together. Methods: We present a Semi-Artificial Model of Population, which aims to bridge demographic micro-simulation and agent-based traditions. We then utilise a Gaussian process emulator - a statistical model of the base model - to analyse the impact of selected model parameters on two key model outputs: population size and share of married agents. A sensitivity analysis is attempted, aiming to assess the relative importance of different inputs. Results: The resulting multi-state model of population dynamics has enhanced predictive capacity as compared to the original specification of the Wedding Ring, but there are some trade-offs between the outputs considered. The sensitivity analysis allows identification of the most important parameters in the modelled marriage formation process. Conclusions: The proposed methods allow for generating coherent, multi-level agent-based scenarios aligned with some aspects of empirical demographic reality. Emulators permit a statistical analysis of their properties and help select plausible parameter values. Comments: Given non-linearities in agent-based models such as the Wedding Ring, and the presence of feedback loops, the uncertainty in the model may not be directly computable by using traditional statistical methods. The use of statistical emulators offers a way forward.
Energy Technology Data Exchange (ETDEWEB)
Li, F; Park, J; Barraclough, B; Lu, B; Li, J; Liu, C; Yan, G [University Florida, Gainesville, FL (United States)
2016-06-15
Purpose: To develop an efficient and accurate independent dose calculation algorithm with a simplified analytical source model for the quality assurance and safe delivery of Flattening Filter Free (FFF)-IMRT on an Elekta Versa HD. Methods: The source model consisted of a point source and a 2D bivariate Gaussian source, respectively modeling the primary photons and the combined effect of head scatter, monitor chamber backscatter and collimator exchange effect. The in-air fluence was firstly calculated by back-projecting the edges of beam defining devices onto the source plane and integrating the visible source distribution. The effect of the rounded MLC leaf end, tongue-and-groove and interleaf transmission was taken into account in the back-projection. The in-air fluence was then modified with a fourth degree polynomial modeling the cone-shaped dose distribution of FFF beams. Planar dose distribution was obtained by convolving the in-air fluence with a dose deposition kernel (DDK) consisting of the sum of three 2D Gaussian functions. The parameters of the source model and the DDK were commissioned using measured in-air output factors (Sc) and cross beam profiles, respectively. A novel method was used to eliminate the volume averaging effect of ion chambers in determining the DDK. Planar dose distributions of five head-and-neck FFF-IMRT plans were calculated and compared against measurements performed with a 2D diode array (MapCHECK™) to validate the accuracy of the algorithm. Results: The proposed source model predicted Sc for both 6MV and 10MV with an accuracy better than 0.1%. With a stringent gamma criterion (2%/2mm/local difference), the passing rate of the FFF-IMRT dose calculation was 97.2±2.6%. Conclusion: The removal of the flattening filter represents a simplification of the head structure which allows the use of a simpler source model for very accurate dose calculation. The proposed algorithm offers an effective way to ensure the safe delivery of FFF-IMRT.
Wright, Aidan G C; Hallquist, Michael N
2014-01-01
Studying personality and its pathology as it changes, develops, or remains stable over time offers exciting insight into the nature of individual differences. Researchers interested in examining personal characteristics over time have a number of time-honored analytic approaches at their disposal. In recent years there have also been considerable advances in person-oriented analytic approaches, particularly longitudinal mixture models. In this methodological primer we focus on mixture modeling approaches to the study of normative and individual change in the form of growth mixture models and ipsative change in the form of latent transition analysis. We describe the conceptual underpinnings of each of these models, outline approaches for their implementation, and provide accessible examples for researchers studying personality and its assessment.
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
Identifying Mixtures of Mixtures Using Bayesian Estimation
Malsiner-Walli, Gertraud; Frühwirth-Schnatter, Sylvia; Grün, Bettina
2017-01-01
ABSTRACT The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows us to estimate the model using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach allows us to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semiparametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark datasets. Supplementary materials for this article are available online. PMID:28626349
International Nuclear Information System (INIS)
Marrel, A.
2008-01-01
In studies of environmental transfer and risk assessment, numerical models are used to simulate, understand and predict the transfer of pollutants. These computer codes can depend on a high number of uncertain input parameters (geophysical variables, chemical parameters, etc.) and are often too computationally expensive to run many times. To conduct uncertainty propagation studies and to measure the importance of each input on the response variability, the computer code has to be approximated by a metamodel which is built on an acceptable number of simulations of the code and requires a negligible calculation time. We focused our research work on the use of a Gaussian process metamodel to carry out the sensitivity analysis of the code. We proposed a methodology with estimation and input selection procedures in order to build the metamodel in the case of a high number of inputs and with few simulations available. Then, we compared two approaches to compute the sensitivity indices with the metamodel and proposed an algorithm to build prediction intervals for these indices. Afterwards, we were interested in the choice of the code simulations. We studied the influence of different sampling strategies on the predictiveness of the Gaussian process metamodel. Finally, we extended our statistical tools to a functional output of a computer code. We combined a decomposition on a wavelet basis with the Gaussian process modelling before computing the functional sensitivity indices. All the tools and statistical methodologies that we developed were applied to the real case of a complex hydrogeological computer code, simulating radionuclide transport in groundwater. (author)
Energy Technology Data Exchange (ETDEWEB)
Martínez-Tossas, L. A. [Department of Mechanical Engineering, Johns Hopkins University, Baltimore 21218 MD USA; Churchfield, M. J. [National Renewable Energy Laboratory, Golden 80401 CO USA; Meneveau, C. [Department of Mechanical Engineering, Johns Hopkins University, Baltimore 21218 MD USA
2017-01-20
The actuator line model (ALM) is a commonly used method to represent lifting surfaces such as wind turbine blades within large-eddy simulations (LES). In the ALM, the lift and drag forces are replaced by an imposed body force that is typically smoothed over several grid points using a Gaussian kernel with some prescribed smoothing width ε. To date, the choice of ε has most often been based on numerical considerations related to the grid spacing used in LES. However, especially for finely resolved LES with grid spacings on the order of or smaller than the chord length of the blade, the best choice of ε is not known. In this work, a theoretical approach is followed to determine the most suitable value of ε, based on an analytical solution to the linearized inviscid flow response to a Gaussian force. We find that the optimal smoothing width ε_opt is on the order of 14%-25% of the chord length of the blade, and the center of force is located at about 13%-26% downstream of the leading edge of the blade for the cases considered. These optimal values do not depend on angle of attack and depend only weakly on the type of lifting surface. It is then shown that an even more realistic velocity field can be induced by a 2-D elliptical Gaussian lift-force kernel. Some results are also provided regarding drag force representation.
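The smoothing step can be sketched with the isotropic 2-D Gaussian kernel commonly used in actuator line implementations, exp(-r²/eps²) / (π eps²), where `eps` stands for the smoothing width. The abstract does not state the kernel formula, so this normalization is an assumption, as are the unit chord and the function names. The sketch confirms that the kernel integrates to one over the plane, so smoothing spreads the force without changing its total, and lists the width range the abstract reports as optimal.

```python
import math

def gaussian_kernel_2d(r, eps):
    """Isotropic 2-D Gaussian smoothing kernel with width eps (assumed normalization)."""
    return math.exp(-(r / eps) ** 2) / (math.pi * eps**2)

def kernel_integral(eps, n=4000):
    """Integrate the kernel over the plane in polar coordinates (midpoint rule, r up to 6*eps)."""
    r_max = 6.0 * eps
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 2.0 * math.pi * r * gaussian_kernel_2d(r, eps) * dr
    return total

chord = 1.0  # hypothetical chord length, arbitrary units
# Width range reported in the abstract: about 14%-25% of the chord
eps_range = (0.14 * chord, 0.25 * chord)
total = kernel_integral(eps_range[0])
```

A body force per unit volume at a grid point is then the sectional force multiplied by the kernel evaluated at the distance from the actuator point.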
Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling.
Nord, Camilla L; Valton, Vincent; Wood, John; Roiser, Jonathan P
2017-08-23
Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents, so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered, some very seriously so, but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially
Lamont, A.E.; Vermunt, J.K.; Van Horn, M.L.
2016-01-01
Regression mixture models are increasingly used as an exploratory approach to identify heterogeneity in the effects of a predictor on an outcome. In this simulation study, we tested the effects of violating an implicit assumption often made in these models; that is, independent variables in the
Matacchiera, F; Manes, C; Beaven, R P; Rees-White, T C; Boano, F; Mønster, J; Scheutz, C
2018-02-13
The measurement of methane emissions from landfills is important to the understanding of landfills' contribution to greenhouse gas emissions. The Tracer Dispersion Method (TDM) is becoming widely accepted as a technique which allows landfill emissions to be quantified accurately, provided that measurements are taken where the plumes of a released tracer-gas and landfill-gas are well-mixed. However, the distance at which full mixing of the gases occurs is generally unknown prior to any experimental campaign. To overcome this problem the present paper demonstrates that, for any specific TDM application, a simple Gaussian dispersion model (AERMOD) can be run beforehand to help determine the distance from the source at which full mixing conditions occur, and the likely associated measurement errors. An AERMOD model was created to simulate a series of TDM trials carried out at a UK landfill, and was benchmarked against the experimental data obtained. The model was used to investigate the impact of different factors (e.g. tracer cylinder placements, wind directions, atmospheric stability parameters) on TDM results to identify appropriate experimental setups for different conditions. The contribution of incomplete vertical mixing of tracer and landfill gas to TDM measurement error was explored using the model. It was observed that full mixing conditions at ground level do not imply full mixing over the entire plume height. However, when full mixing conditions were satisfied at ground level, the error introduced by variations in mixing higher up was always less than 10%. Copyright © 2018. Published by Elsevier Ltd.
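The reason full mixing matters can be seen from the usual form of the TDM estimate: the known tracer release rate is scaled by the ratio of plume-integrated, background-subtracted mixing ratios and by the ratio of molar masses. The sketch below applies that ratio to two synthetic cross-plume transects that share the same shape, which is the fully mixed case. The choice of acetylene as tracer, all rates and plume dimensions, and the function names are assumptions for illustration and are not taken from the abstract.

```python
import math

M_CH4 = 16.04   # g/mol
M_C2H2 = 26.04  # g/mol; acetylene is assumed as the tracer here

def integrate(values, dx):
    """Trapezoidal integral of equally spaced samples."""
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))

def tdm_emission(tracer_rate, gas_ppb, tracer_ppb, dx, m_gas=M_CH4, m_tracer=M_C2H2):
    """Emission rate from plume-integrated mixing ratios above background."""
    ratio = integrate(gas_ppb, dx) / integrate(tracer_ppb, dx)
    return tracer_rate * ratio * (m_gas / m_tracer)

# Synthetic transect: both gases follow the same Gaussian cross-wind shape (fully mixed)
dx = 5.0                                   # metres between samples
ys = [i * dx for i in range(-60, 61)]
shape = [math.exp(-0.5 * (y / 40.0) ** 2) for y in ys]

true_ch4, tracer_rate = 20.0, 1.0          # kg/h, hypothetical
k = 100.0                                  # arbitrary dilution factor common to both gases
ch4_ppb = [k * true_ch4 / M_CH4 * s for s in shape]
tracer_ppb = [k * tracer_rate / M_C2H2 * s for s in shape]

estimate = tdm_emission(tracer_rate, ch4_ppb, tracer_ppb, dx)
```

When the two plumes do not share the same dilution, for example because they are not yet mixed vertically, the common factor `k` no longer cancels and the estimate is biased, which is the error the dispersion modelling in the abstract is used to bound.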
RIM: A Random Item Mixture Model to Detect Differential Item Functioning
Frederickx, Sofie; Tuerlinckx, Francis; De Boeck, Paul; Magis, David
2010-01-01
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties such that the…
RIM: A random item mixture model to detect Differential Item Functioning
Frederickx, S.; Tuerlinckx, T.; de Boeck, P.; Magis, D.
2010-01-01
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is
Identifiability in N-mixture models: a large-scale screening test with bird data.
Kéry, Marc
2018-02-01
Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike's information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models or the use of external information via informative priors or penalized likelihoods, may help. © 2017 by the Ecological Society of America.
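For readers unfamiliar with the model, the likelihood of a binomial N-mixture model at one site sums over the unknown abundance N: a Poisson prior on N with mean `lam`, times a binomial detection term with probability `p` for each repeated count. The sketch below is a minimal single-site version with a Poisson mixture; the truncation bound `n_max`, the parameter values, and the function names are illustrative assumptions.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability, computed on the log scale for stability."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binom_pmf(y, n, p):
    """Binomial probability of y detections out of n individuals."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site, summing over latent abundance N."""
    total = 0.0
    for n in range(max(counts), n_max + 1):
        detect = 1.0
        for y in counts:
            detect *= binom_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detect
    return total

# With a single visit the count is marginally Poisson(lam * p)
lik_single = site_likelihood([3], lam=5.0, p=0.4)

# Repeated visits add the information that separates abundance from detection
lik_repeat = site_likelihood([3, 2, 4], lam=5.0, p=0.4)
```

The single-visit case depends on `lam` and `p` only through their product, which is one way to see why repeated counts are needed before abundance and detection can be estimated separately.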
Wetting kinetics of oil mixtures on fluorinated model cellulose surfaces.
Aulin, Christian; Shchukarev, Andrei; Lindqvist, Josefina; Malmström, Eva; Wågberg, Lars; Lindström, Tom
2008-01-15
The wetting of two different model cellulose surfaces has been studied: a regenerated cellulose (RG) surface prepared by spin-coating, and a novel multilayer film of poly(ethyleneimine) and a carboxymethylated microfibrillated cellulose (MFC). The cellulose films were characterized in detail using atomic force microscopy (AFM) and X-ray photoelectron spectroscopy (XPS). AFM indicates smooth and continuous films on a nanometer scale and the RMS roughness of the RG cellulose and MFC surfaces was determined to be 3 and 6 nm, respectively. The cellulose films were modified by coating with various amounts of an anionic fluorosurfactant, perfluorooctadecanoic acid, or covalently modified with pentadecafluorooctanyl chloride. The fluorinated cellulose films were used to follow the spreading mechanisms of three different oil mixtures. The viscosity and surface tension of the oils were found to be essential parameters governing the spreading kinetics on these surfaces. XPS and dispersive surface energy measurements were made on the cellulose films coated with perfluorooctadecanoic acid. A strong correlation was found between the surface concentration of fluorine, the dispersive surface energy and the contact angle of castor oil on the surface. A dispersive surface energy less than 18 mN/m was required in order for the cellulose surface to be non-wetting (θe > 90°) by castor oil.
Theoretical models for fluid thermodynamics based on the quasi-Gaussian entropy theory
Amadei, Andrea
1998-01-01
The theoretical modeling of fluid thermodynamics is one of the most challenging fields in physical chemistry. In fact, fluid behavior, except at very low density conditions, is still extremely difficult to model from a statistical mechanical point of view, as for any realistic model
Modeling abundance using N-mixture models: the importance of considering ecological mechanisms.
Joseph, Liana N; Elkin, Ché; Martin, Tara G; Possingham, Hugh P
2009-04-01
Predicting abundance across a species' distribution is useful for studies of ecology and biodiversity management. Modeling of survey data in relation to environmental variables can be a powerful method for extrapolating abundances across a species' distribution and, consequently, calculating total abundances and ultimately trends. Research in this area has demonstrated that models of abundance are often unstable and produce spurious estimates, and until recently our ability to remove detection error limited the development of accurate models. The N-mixture model accounts for detection and abundance simultaneously and has been a significant advance in abundance modeling. Case studies that have tested these new models have demonstrated success for some species, but doubt remains over the appropriateness of standard N-mixture models for many species. Here we develop the N-mixture model to accommodate zero-inflated data, a common occurrence in ecology, by employing zero-inflated count models. To our knowledge, this is the first application of this method to modeling count data. We use four variants of the N-mixture model (Poisson, zero-inflated Poisson, negative binomial, and zero-inflated negative binomial) to model abundance, occupancy (zero-inflated models only) and detection probability of six birds in South Australia. We assess models by their statistical fit and the ecological realism of the parameter estimates. Specifically, we assess the statistical fit with AIC and assess the ecological realism by comparing the parameter estimates with expected values derived from literature, ecological theory, and expert opinion. We demonstrate that, despite being frequently ranked the "best model" according to AIC, the negative binomial variants of the N-mixture often produce ecologically unrealistic parameter estimates. The zero-inflated Poisson variant is preferable to the negative binomial variants of the N-mixture, as it models an ecological mechanism rather than a
Directory of Open Access Journals (Sweden)
Douglas A. Fynan
2016-06-01
The Gaussian process model (GPM) is a flexible surrogate model that can be used for nonparametric regression for multivariate problems. A unique feature of the GPM is that a prediction variance is automatically provided with the regression function. In this paper, we estimate the safety margin of a nuclear power plant by performing regression on the output of best-estimate simulations of a large-break loss-of-coolant accident with sampling of safety system configuration, sequence timing, technical specifications, and thermal hydraulic parameter uncertainties. The key aspect of our approach is that the GPM regression is only performed on the dominant input variables, the safety injection flow rate and the delay time for AC powered pumps to start representing sequence timing uncertainty, providing a predictive model for the peak clad temperature during a reflood phase. Other uncertainties are interpreted as contributors to the measurement noise of the code output and are implicitly treated in the GPM in the noise variance term, providing local uncertainty bounds for the peak clad temperature. We discuss the applicability of the foregoing method to reduce the use of conservative assumptions in best estimate plus uncertainty (BEPU) and Level 1 probabilistic safety assessment (PSA) success criteria definitions while dealing with a large number of uncertainties.
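The two features the abstract relies on, a prediction variance that comes with the regression function and a noise variance term that absorbs unmodelled scatter, can be seen in a small one-dimensional sketch. It uses a squared-exponential covariance with fixed hyperparameters and a plain linear solver; the kernel choice, the hyperparameter values, the toy data, and all function names are assumptions, and the paper's actual regression uses two input variables.

```python
import math

def rbf(a, b, length, amp):
    """Squared-exponential covariance between two scalar inputs."""
    return amp**2 * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_predict(xs, ys, x_new, length=1.0, amp=1.0, noise=0.1):
    """Posterior mean and variance of the latent function at x_new (zero prior mean)."""
    n = len(xs)
    # Noise variance on the diagonal plays the role of the "measurement noise" term
    K = [[rbf(xs[i], xs[j], length, amp) + (noise**2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length, amp) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new, length, amp) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, var

# Toy training data standing in for code outputs at sampled input values
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.sin(x) for x in xs]

mean_near, var_near = gp_predict(xs, ys, 1.0)    # at a training input: small variance
mean_far, var_far = gp_predict(xs, ys, 10.0)     # far from the data: variance returns to amp^2
```

Adding `noise**2` to the returned variance gives the predictive variance of a new noisy output, which is the quantity that would correspond to local uncertainty bounds on a simulated response.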
Extensions of D-optimal Minimal Designs for Symmetric Mixture Models
Li, Yanyan; Raghavarao, Damaraju; Chervoneva, Inna
2016-01-01
The purpose of mixture experiments is to explore the optimum blends of mixture components, which will provide desirable response characteristics in finished products. D-optimal minimal designs have been considered for a variety of mixture models, including Scheffé's linear, quadratic, and cubic models. Usually, these D-optimal designs are minimally supported since they have just as many design points as the number of parameters. Thus, they lack the degrees of freedom to perform the Lack of Fi...
A Note on the Use of Mixture Models for Individual Prediction.
Cole, Veronica T; Bauer, Daniel J
Mixture models capture heterogeneity in data by decomposing the population into latent subgroups, each of which is governed by its own subgroup-specific set of parameters. Despite the flexibility and widespread use of these models, most applications have focused solely on making inferences for whole or sub-populations, rather than individual cases. The current article presents a general framework for computing marginal and conditional predicted values for individuals using mixture model results. These predicted values can be used to characterize covariate effects, examine the fit of the model for specific individuals, or forecast future observations from previous ones. Two empirical examples are provided to demonstrate the usefulness of individual predicted values in applications of mixture models. The first example examines the relative timing of initiation of substance use using a multiple event process survival mixture model whereas the second example evaluates changes in depressive symptoms over adolescence using a growth mixture model.
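The distinction between marginal and conditional predicted values can be illustrated with a univariate normal mixture. This is a hedged sketch rather than the framework of the article: class-specific predictions are taken to be fixed numbers, and the function names are hypothetical. A marginal prediction weights class-specific predictions by the class proportions, whereas a conditional prediction weights them by the posterior class probabilities given an individual's earlier observation.

```python
import math

def normal_pdf(y, mu, sd):
    # Density of a normal distribution with mean mu and standard deviation sd
    z = (y - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_class_probs(y, weights, means, sds):
    # Bayes' rule: P(class k | y) is proportional to weight_k * density_k(y)
    joint = [w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def predicted_value(class_probs, class_predictions):
    # Probability-weighted average of the class-specific predictions
    return sum(p * pred for p, pred in zip(class_probs, class_predictions))

# Hypothetical two-class example
weights, means, sds = [0.7, 0.3], [0.0, 4.0], [1.0, 1.0]
future = [1.0, 6.0]                       # class-specific forecasts
marginal = predicted_value(weights, future)
conditional = predicted_value(
    posterior_class_probs(3.5, weights, means, sds), future)
```

Here an earlier observation of 3.5 shifts most of the weight to the second class, so the conditional forecast sits much closer to 6.0 than the marginal one does.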
A Merton-Like Approach to Pricing Debt Based on a Non-Gaussian Asset Model
Borland, Lisa; Evnine, Jeremy; Pochart, Benoit
2005-09-01
We propose a generalization to Merton's model for evaluating credit spreads. In his original work, a company's assets were assumed to follow a log-normal process. We introduce fat tails and skew into this model, along the same lines as in the option pricing model of Borland and Bouchaud (2004, Quantitative Finance 4) and illustrate the effects of each component. Preliminary empirical results indicate that this model fits well to empirically observed credit spreads with a parameterization that also matched observed stock return distributions and option prices.
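For reference, the log-normal baseline that this work generalizes can be sketched as follows. The code implements the standard Merton credit spread only; it contains none of the fat-tail or skew extensions proposed in the paper, and the function and parameter names are assumptions for the example. Debt is valued as the firm's assets minus a European call on those assets struck at the face value of debt.

```python
import math

def norm_cdf(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def merton_credit_spread(assets, face_value, rate, sigma, maturity):
    """Credit spread of zero-coupon debt under log-normal asset dynamics.

    assets: current asset value; face_value: debt due at maturity;
    rate: risk-free rate; sigma: asset volatility; maturity: in years.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(assets / face_value)
          + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    # Debt value = assets - equity, with equity priced as a call option
    debt = (assets * norm_cdf(-d1)
            + face_value * math.exp(-rate * maturity) * norm_cdf(d2))
    debt_yield = -math.log(debt / face_value) / maturity
    return debt_yield - rate
```

For example, `merton_credit_spread(100.0, 80.0, 0.05, 0.3, 1.0)` gives the spread for a firm whose assets exceed its one-year debt by 25 percent; raising `sigma` widens the spread.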
How Non-Gaussian Shocks Affect Risk Premia in Non-Linear DSGE Models
DEFF Research Database (Denmark)
Andreasen, Martin Møller
premia in a wide class of DSGE models. To quantify these effects, we then set up a standard New Keynesian DSGE model where total factor productivity includes rare disasters, stochastic volatility, and GARCH. We find that rare disasters increase the mean level of the 10-year nominal term premium, whereas...
An integrated numerical model for the prediction of Gaussian and billet shapes
DEFF Research Database (Denmark)
Hattel, Jesper; Pryds, Nini; Pedersen, Trine Bjerre
2004-01-01
Separate models for the atomisation and the deposition stages were recently integrated by the authors to form a unified model describing the entire spray-forming process. In the present paper, the focus is on describing the shape of the deposited material during the spray-forming process, obtaine...
A direct derivation of the exact Fisher information matrix of Gaussian vector state space models
Klein, A.A.B.; Neudecker, H.
2000-01-01
This paper deals with a direct derivation of Fisher's information matrix of vector state space models for the general case, by which is meant the establishment of the matrix as a whole and not element by element. The method to be used is matrix differentiation, see [4]. We assume the model to be
CSIR Research Space (South Africa)
Miya, WS
2008-10-01
method. The first stage detects whether the bushing is faulty or normal while the second stage classifies the fault. Experimentation is conducted using dissolved gas-in-oil analysis (DGA) data collected from bushings based on IEEE C57.104, IEC 60599 and IEEE...
de Jong, Martijn G.; Steenkamp, Jan-Benedict E. M.
2010-01-01
We present a class of finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Our model is proposed for confirmatory research settings. Our prior for item parameters is a mixture distribution to accommodate situations where different groups of countries have different measurement operations, while…
An NCME Instructional Module on Latent DIF Analysis Using Mixture Item Response Models
Cho, Sun-Joo; Suh, Youngsuk; Lee, Woo-yeol
2016-01-01
The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called…
Modelling of phase equilibria of glycol ethers mixtures using an association model
DEFF Research Database (Denmark)
Garrido, Nuno M.; Folas, Georgios; Kontogeorgis, Georgios
2008-01-01
Vapor-liquid and liquid-liquid equilibria of glycol ethers (surfactant) mixtures with hydrocarbons, polar compounds and water are calculated using an association model, the Cubic-Plus-Association Equation of State. Parameters are estimated for several non-ionic surfactants of the polyoxyethylene...
Jin, Ick Hoon; Yuan, Ying; Bandyopadhyay, Dipankar
2016-01-01
Research in dental caries generates data with two levels of hierarchy: that of a tooth overall and that of the different surfaces of the tooth. The outcomes often exhibit spatial referencing among neighboring teeth and surfaces, i.e., the disease status of a tooth or surface might be influenced by the status of a set of proximal teeth/surfaces. Assessments of dental caries (tooth decay) at the tooth level yield binary outcomes indicating the presence/absence of teeth, and trinary outcomes at the surface level indicating healthy, decayed, or filled surfaces. The presence of these mixed discrete responses complicates the data analysis under a unified framework. To mitigate complications, we develop a Bayesian two-level hierarchical model under suitable (spatial) Markov random field assumptions that accommodates the natural hierarchy within the mixed responses. At the first level, we utilize an autologistic model to accommodate the spatial dependence for the tooth-level binary outcomes. For the second level and conditioned on a tooth being non-missing, we utilize a Potts model to accommodate the spatial referencing for the surface-level trinary outcomes. The regression models at both levels were controlled for plausible covariates (risk factors) of caries, and remain connected through shared parameters. To tackle the computational challenges in our Bayesian estimation scheme caused due to the doubly-intractable normalizing constant, we employ a double Metropolis-Hastings sampler. We compare and contrast our model performances to the standard non-spatial (naive) model using a small simulation study, and illustrate via an application to a clinical dataset on dental caries.
Application of association models to mixtures containing alkanolamines
DEFF Research Database (Denmark)
Avlund, Ane Søgaard; Eriksen, Daniel Kunisch; Kontogeorgis, Georgios
2011-01-01
. The role of association schemes is investigated in connection with CPA, while for sPC-SAFT emphasis is given to the role of different types of data in the determination of pure compound parameters suitable for mixture calculations. Moreover, the performance of CPA and sPC-SAFT for MEA-containing systems...... is compared. The investigation showed that vapor pressures and liquid densities were not sufficient for obtaining reliable parameters with either CPA or sPC-SAFT, but that at least one other type of information is needed. LLE data for a binary mixture of the associating component with an inert compound is very...
Gaussian wave packet dynamics and the Landau-Zener model for nonadiabatic transitions
DEFF Research Database (Denmark)
Henriksen, Niels Engholm
1992-01-01
The Landau-Zener model for transitions between two linear diabatic potentials is examined. We derive, in the weak-coupling limit, an expression for the transition probability where the classical trajectory and the constant velocity approximations are abandoned and replaced by quantum dynamics...
Copula Gaussian graphical models with penalized ascent Monte Carlo EM algorithm
Abegaz, Fentaw; Wit, Ernst
2015-01-01
Typical data that arise from surveys, experiments, and observational studies include continuous and discrete variables. In this article, we study the interdependence among a mixed (continuous, count, ordered categorical, and binary) set of variables via graphical models. We propose an ℓ1-penalized
Self-organising mixture autoregressive model for non-stationary time series modelling.
Ni, He; Yin, Hujun
2008-12-01
Modelling non-stationary time series has been a difficult task for both parametric and nonparametric methods. One promising solution is to combine the flexibility of nonparametric models with the simplicity of parametric models. In this paper, the self-organising mixture autoregressive (SOMAR) network is adopted as such a mixture model. It breaks time series into underlying segments and at the same time fits local linear regressive models to the clusters of segments. In such a way, a global non-stationary time series is represented by a dynamic set of local linear regressive models. Neural gas is used for a more flexible structure of the mixture model. Furthermore, a new similarity measure has been introduced in the self-organising network to better quantify the similarity of time series segments. The network can be used naturally in modelling and forecasting non-stationary time series. Experiments on artificial, benchmark time series (e.g. Mackey-Glass) and real-world data (e.g. numbers of sunspots and Forex rates) are presented and the results show that the proposed SOMAR network is effective and superior to other similar approaches.
Gassmann Modeling of Acoustic Properties of Sand-clay Mixtures
Gurevich, B.; Carcione, J. M.
The feasibility of modeling elastic properties of a fluid-saturated sand-clay mixture rock is analyzed by assuming that the rock is composed of macroscopic regions of sand and clay. The elastic properties of such a composite rock are computed using two alternative schemes. The first scheme, which we call the composite Gassmann (CG) scheme, uses Gassmann equations to compute elastic moduli of the saturated sand and clay from their respective dry moduli. The effective elastic moduli of the fluid-saturated composite rock are then computed by applying one of the mixing laws commonly used to estimate elastic properties of composite materials. In the second scheme, which we call the Berryman-Milton (BM) scheme, the elastic moduli of the dry composite rock matrix are computed from the moduli of dry sand and clay matrices using the same composite mixing law used in the first scheme. Next, the saturated composite rock moduli are computed using the equations of Brown and Korringa, which, together with the expressions for the coefficients derived by Berryman and Milton, provide an extension of Gassmann equations to rocks with a heterogeneous solid matrix. For both schemes, the moduli of the dry homogeneous sand and clay matrices are assumed to obey Krief's velocity-porosity relationship. As a mixing law we use the self-consistent coherent potential approximation proposed by Berryman. The calculated dependence of compressional and shear velocities on porosity and clay content for a given set of parameters using the two schemes depends on the distribution of total porosity between the sand and clay regions. If the distribution of total porosity between sand and clay is relatively uniform, the predictions of the two schemes in the porosity range up to 0.3 are very similar to each other. For higher porosities and medium-to-large clay content the elastic moduli predicted by the CG scheme are significantly higher than those predicted by the BM scheme. This difference is explained by the fact
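The first step of the CG scheme, going from a dry modulus to a saturated one for a single homogeneous region, can be sketched as below. The code covers only the classical Gassmann equation for the bulk modulus together with the commonly quoted form of Krief's relation, K_dry = K_mineral (1 - phi)^(3 / (1 - phi)); the mixing law and the Brown-Korringa extension are not reproduced, and the function names and sample values are assumptions for the example.

```python
def krief_dry_bulk(k_mineral, porosity):
    # Commonly quoted form of Krief's relation for the dry-frame bulk modulus
    return k_mineral * (1.0 - porosity) ** (3.0 / (1.0 - porosity))

def gassmann_saturated_bulk(k_dry, k_mineral, k_fluid, porosity):
    """Saturated bulk modulus from Gassmann's equation (requires porosity > 0).

    All moduli must share the same units (for example GPa).
    The shear modulus is unchanged by fluid saturation in this theory.
    """
    numerator = (1.0 - k_dry / k_mineral) ** 2
    denominator = (porosity / k_fluid
                   + (1.0 - porosity) / k_mineral
                   - k_dry / k_mineral ** 2)
    return k_dry + numerator / denominator

# Hypothetical quartz sand region saturated with water (moduli in GPa)
k_dry_sand = krief_dry_bulk(37.0, 0.2)
k_sat_sand = gassmann_saturated_bulk(k_dry_sand, 37.0, 2.25, 0.2)
```

A useful sanity check is that setting the fluid modulus equal to the mineral modulus returns the mineral modulus itself, since the rock is then elastically homogeneous.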
Polymer mixtures in confined geometries: Model systems to explore ...
Indian Academy of Sciences (India)
to mean field behavior for very long chains, the critical behavior of mixtures confined into thin film geometry falls in the 2d Ising class irrespective of chain length. The critical temperature always scales .... tive monomer blocks all the eight sites of an elementary cube, and these monomers are connected by bond vectors b ...
International Nuclear Information System (INIS)
Kretzschmar, J.G.; Mertens, I.; Vanderborght, B.
1984-01-01
A computer code CAERS (Computer Aided Emergency Response System) has been developed for the simulation of the short-term concentrations caused by an atmospheric emission. The concentration calculations are based on the bi-Gaussian model, with a choice of twelve different sets of turbulence typing schemes and dispersion parameters; alternatively, the plume can be simulated with a bi-dimensional puff trajectory model with tri-Gaussian diffusion of the puffs. With the puff trajectory model the emission and the wind conditions can be variable in time. Sixteen SF6 tracer dispersion experiments, with mobile as well as stationary time-averaging sampling, have been carried out for the validation of the on-line and off-line models of CAERS. The tracer experiments of this study have shown that the CAERS system, using the bi-Gaussian model and the SCK/CEN turbulence typing scheme, can simulate short-time concentration levels very well. The variations of the plume under non-steady emission and meteorological conditions are well simulated by the puff trajectory model. This leads to the general conclusion that the atmospheric dispersion models of the CAERS system can make a significant contribution to the management and the interpretation of air pollution concentration measurements in emergency situations.
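A bi-Gaussian plume calculation of the general kind referred to above can be sketched as follows. This is the textbook steady-state formula with total reflection at the ground, not the CAERS code: the dispersion parameters `sigma_y` and `sigma_z` are passed in directly rather than derived from one of the twelve turbulence typing schemes, and all names are assumptions for the example.

```python
import math

def plume_concentration(q, u, y, z, h, sigma_y, sigma_z):
    """Steady-state bi-Gaussian plume concentration.

    q: emission rate (g/s); u: wind speed (m/s);
    y: crosswind distance from the plume axis (m); z: receptor height (m);
    h: effective release height (m);
    sigma_y, sigma_z: dispersion parameters (m) at the downwind distance
    of interest, supplied by the caller.
    """
    lateral = math.exp(-0.5 * (y / sigma_y) ** 2)
    # The second term mirrors the source below ground (total reflection)
    vertical = (math.exp(-0.5 * ((z - h) / sigma_z) ** 2)
                + math.exp(-0.5 * ((z + h) / sigma_z) ** 2))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical
```

For a ground-level release observed on the plume axis at ground level, the expression reduces to q / (pi * u * sigma_y * sigma_z).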
Meyfroidt, Geert; Güiza, Fabian; Cottem, Dominiek; De Becker, Wilfried; Van Loon, Kristien; Aerts, Jean-Marie; Berckmans, Daniël; Ramon, Jan; Bruynooghe, Maurice; Van den Berghe, Greet
2011-10-25
The intensive care unit (ICU) length of stay (LOS) of patients undergoing cardiac surgery may vary considerably, and is often difficult to predict within the first hours after admission. The early clinical evolution of a cardiac surgery patient might be predictive for his LOS. The purpose of the present study was to develop a predictive model for ICU discharge after non-emergency cardiac surgery, by analyzing the first 4 hours of data in the computerized medical record of these patients with Gaussian processes (GP), a machine learning technique. Non-interventional study. Predictive modeling, separate development (n = 461) and validation (n = 499) cohort. GP models were developed to predict the probability of ICU discharge the day after surgery (classification task), and to predict the day of ICU discharge as a discrete variable (regression task). GP predictions were compared with predictions by EuroSCORE, nurses and physicians. The classification task was evaluated using aROC for discrimination, and Brier Score, Brier Score Scaled, and Hosmer-Lemeshow test for calibration. The regression task was evaluated by comparing median actual and predicted discharge, loss penalty function (LPF) ((actual-predicted)/actual) and calculating root mean squared relative errors (RMSRE). Median (P25-P75) ICU length of stay was 3 (2-5) days. For classification, the GP model showed an aROC of 0.758 which was significantly higher than the predictions by nurses, but not better than EuroSCORE and physicians. The GP had the best calibration, with a Brier Score of 0.179 and Hosmer-Lemeshow p-value of 0.382. For regression, GP had the highest proportion of patients with a correctly predicted day of discharge (40%), which was significantly better than the EuroSCORE (p < 0.001) and nurses (p = 0.044) but equivalent to physicians. GP had the lowest RMSRE (0.408) of all predictive models. A GP model that uses PDMS data of the first 4 hours after admission in the ICU of scheduled adult cardiac
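Two of the evaluation measures named above are simple enough to state in code. The sketch below computes the Brier score for the classification task and the root mean squared relative error for the regression task, using the loss penalty function (actual - predicted) / actual quoted in the abstract; the function names and the sample values are illustrative assumptions, not data from the study.

```python
import math

def brier_score(probabilities, outcomes):
    # Mean squared difference between predicted probability and 0/1 outcome
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

def rmsre(actual, predicted):
    # Root mean squared relative error; actual values must be non-zero
    rel = [((a - p) / a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(rel) / len(rel))

# Hypothetical predictions for four patients
discharge_prob = [0.9, 0.2, 0.6, 0.4]     # predicted P(discharge next day)
discharged = [1, 0, 1, 0]                 # observed outcome
actual_day = [2, 5, 3, 4]                 # observed day of discharge
predicted_day = [2, 4, 3, 6]              # predicted day of discharge

print(brier_score(discharge_prob, discharged))
print(rmsre(actual_day, predicted_day))
```

Lower values are better for both measures; a Brier score of 0.25 corresponds to always predicting a probability of 0.5.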
Hyper-Fit: Fitting Linear Models to Multidimensional Data with Multivariate Gaussian Uncertainties
Robotham, A. S. G.; Obreschkow, D.
2015-09-01
Astronomical data is often uncertain with errors that are heteroscedastic (different for each data point) and covariant between different dimensions. Assuming that a set of D-dimensional data points can be described by a (D - 1)-dimensional plane with intrinsic scatter, we derive the general likelihood function to be maximised to recover the best fitting model. Alongside the mathematical description, we also release the hyper-fit package for the R statistical language (http://github.com/asgr/hyper.fit) and a user-friendly web interface for online fitting (http://hyperfit.icrar.org). The hyper-fit package offers access to a large number of fitting routines, includes visualisation tools, and is fully documented in an extensive user manual. Most of the hyper-fit functionality is accessible via the web interface. In this paper, we include applications to toy examples and to real astronomical data from the literature: the mass-size, Tully-Fisher, Fundamental Plane, and mass-spin-morphology relations. In most cases, the hyper-fit solutions are in good agreement with published values, but uncover more information regarding the fitted model.
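For the two-dimensional case (a line with intrinsic scatter), a likelihood of the kind described can be sketched as below. This is an illustration written for this listing, not the interface of the hyper-fit package: the parameterisation by slope, intercept and scatter perpendicular to the line is an assumption, and additive constants are dropped. Each point's covariance is projected onto the normal of the line and added to the intrinsic variance.

```python
import math

def line_log_likelihood(points, covariances, slope, intercept, scatter_perp):
    """Log-likelihood (up to a constant) of a line y = slope * x + intercept.

    points: list of (x, y); covariances: list of (var_x, cov_xy, var_y);
    scatter_perp: intrinsic Gaussian scatter perpendicular to the line.
    """
    norm2 = 1.0 + slope ** 2
    total = 0.0
    for (x, y), (var_x, cov_xy, var_y) in zip(points, covariances):
        # Measurement variance projected onto the unit normal of the line
        projected = (slope ** 2 * var_x - 2.0 * slope * cov_xy + var_y) / norm2
        variance = scatter_perp ** 2 + projected
        # Squared perpendicular distance from the point to the line
        distance2 = (y - slope * x - intercept) ** 2 / norm2
        total += math.log(variance) + distance2 / variance
    return -0.5 * total
```

Maximising this function over `slope`, `intercept` and `scatter_perp` with any general-purpose optimiser would give the best-fitting line under these assumptions; heteroscedastic and correlated errors enter only through the per-point covariance tuples.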
Growth Mixture Modeling of Depression Symptoms Following Traumatic Brain Injury
Directory of Open Access Journals (Sweden)
Rapson Gomez
2017-08-01
Growth Mixture Modeling (GMM) was used to investigate the longitudinal trajectory of groups (classes) of depression symptoms, and how these groups were predicted by the covariates of age, sex, severity, and length of hospitalization following Traumatic Brain Injury (TBI) in a group of 1074 individuals (696 males and 378 females) from the Royal Hobart Hospital who sustained a TBI. The study began in late December 2003 and recruitment continued until early 2007. Ages ranged from 14 to 90 years, with a mean of 35.96 years (SD = 16.61). The study also examined the associations between the groups and causes of TBI. Symptoms of depression were assessed using the Hospital Anxiety and Depression Scale within 3 weeks of injury, and at 1, 3, 6, 12, and 24 months post-injury. The results revealed three groups: low, high, and delayed depression. In the low group depression scores remained below the clinical cut-off at all assessment points during the 24-months post-TBI, and in the high group, depression scores were above the clinical cut-off at all assessment points. The delayed group showed an increase in depression symptoms to 12 months after injury, followed by a return to initial assessment level during the following 12 months. Covariates were found to be differentially associated with the three groups. For example, relative to the low group, the high depression group was associated with more severe TBI, being female, and a shorter period of hospitalization. The delayed group also had a shorter period of hospitalization, were younger, and sustained less severe TBI. Our findings show considerable fluctuation of depression over time, and that a non-clinical level of depression at any one point in time does not necessarily mean that the person will continue to have non-clinical levels in the future. As we used GMM, we were able to show new findings and also bring clarity to contradictory past findings on depression and TBI. Consequently, we recommend the use
Sound speed models for a noncondensible gas-steam-water mixture
International Nuclear Information System (INIS)
Ransom, V.H.; Trapp, J.A.
1984-01-01
An analytical expression is derived for the homogeneous equilibrium speed of sound in a mixture of noncondensible gas, steam, and water. The expression is based on the Gibbs free energy interphase equilibrium condition for a Gibbs-Dalton mixture in contact with a pure liquid phase. Several simplified models are discussed including the homogeneous frozen model. These idealized models can be used as a reference for data comparison and also serve as a basis for empirically corrected nonhomogeneous and nonequilibrium models
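The homogeneous frozen model mentioned above is the simplest of these idealisations. As a hedged illustration, the sketch below uses the classical volume-weighted compressibility form (often called Wood's equation) for a two-phase gas-liquid mixture with no mass transfer between phases; it is not the equilibrium expression derived in the report, it lumps noncondensible gas and steam into a single gas phase, and the function name and sample property values are assumptions for the example.

```python
import math

def frozen_sound_speed(alpha_gas, rho_gas, c_gas, rho_liquid, c_liquid):
    """Homogeneous frozen sound speed of a gas-liquid mixture (m/s).

    alpha_gas: gas volume fraction (0 to 1);
    rho_*: phase densities (kg/m^3); c_*: phase sound speeds (m/s).
    Assumes equal phase velocities and no interphase mass or heat transfer.
    """
    rho_mix = alpha_gas * rho_gas + (1.0 - alpha_gas) * rho_liquid
    # Mixture compressibility is the volume-weighted sum of phase values
    compressibility = (alpha_gas / (rho_gas * c_gas ** 2)
                       + (1.0 - alpha_gas) / (rho_liquid * c_liquid ** 2))
    return 1.0 / math.sqrt(rho_mix * compressibility)

# Illustrative air-water values near atmospheric conditions
c_half = frozen_sound_speed(0.5, 1.2, 340.0, 1000.0, 1500.0)
```

The mixture speed at intermediate void fractions falls well below the sound speed of either phase, because the mixture combines the high density of the liquid with the high compressibility of the gas.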
Short communication: Alteration of priors for random effects in Gaussian linear mixed model
DEFF Research Database (Denmark)
Vandenplas, Jérémie; Christensen, Ole Fredslund; Gengler, Nicholas
2014-01-01
such alterations. Therefore, the aim of this study was to propose a method to alter both the mean and (co)variance of the prior multivariate normal distributions of random effects of linear mixed models while using currently available software packages. The proposed method was tested on simulated examples with 3......, multiple-trait predictions of lactation yields, and Bayesian approaches integrating external information into genetic evaluations) need to alter both the mean and (co)variance of the prior distributions and, to our knowledge, most software packages available in the animal breeding community do not permit...... different software packages available in animal breeding. The examples showed the possibility of the proposed method to alter both the mean and (co)variance of the prior distributions with currently available software packages through the use of an extended data file and a user-supplied (co)variance matrix....
Directory of Open Access Journals (Sweden)
Meyfroidt Geert
2011-10-01
Background: The intensive care unit (ICU) length of stay (LOS) of patients undergoing cardiac surgery may vary considerably, and is often difficult to predict within the first hours after admission. The early clinical evolution of a cardiac surgery patient might be predictive for his LOS. The purpose of the present study was to develop a predictive model for ICU discharge after non-emergency cardiac surgery, by analyzing the first 4 hours of data in the computerized medical record of these patients with Gaussian processes (GP), a machine learning technique. Methods: Non-interventional study. Predictive modeling, separate development (n = 461) and validation (n = 499) cohort. GP models were developed to predict the probability of ICU discharge the day after surgery (classification task), and to predict the day of ICU discharge as a discrete variable (regression task). GP predictions were compared with predictions by EuroSCORE, nurses and physicians. The classification task was evaluated using aROC for discrimination, and Brier Score, Brier Score Scaled, and Hosmer-Lemeshow test for calibration. The regression task was evaluated by comparing median actual and predicted discharge, loss penalty function (LPF) ((actual-predicted)/actual) and calculating root mean squared relative errors (RMSRE). Results: Median (P25-P75) ICU length of stay was 3 (2-5) days. For classification, the GP model showed an aROC of 0.758 which was significantly higher than the predictions by nurses, but not better than EuroSCORE and physicians. The GP had the best calibration, with a Brier Score of 0.179 and Hosmer-Lemeshow p-value of 0.382. For regression, GP had the highest proportion of patients with a correctly predicted day of discharge (40%), which was significantly better than the EuroSCORE (p < 0.001) and nurses (p = 0.044) but equivalent to physicians. GP had the lowest RMSRE (0.408) of all predictive models. Conclusions: A GP model that uses PDMS data of the first 4 hours after admission in the ICU of scheduled adult cardiac surgery patients was able to predict discharge from the ICU as a
Model-based experimental design for assessing effects of mixtures of chemicals
Energy Technology Data Exchange (ETDEWEB)
Baas, Jan, E-mail: jan.baas@falw.vu.n [Vrije Universiteit of Amsterdam, Dept of Theoretical Biology, De Boelelaan 1085, 1081 HV Amsterdam (Netherlands); Stefanowicz, Anna M., E-mail: anna.stefanowicz@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Klimek, Beata, E-mail: beata.klimek@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Laskowski, Ryszard, E-mail: ryszard.laskowski@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Kooijman, Sebastiaan A.L.M., E-mail: bas@bio.vu.n [Vrije Universiteit of Amsterdam, Dept of Theoretical Biology, De Boelelaan 1085, 1081 HV Amsterdam (Netherlands)
2010-01-15
We exposed flour beetles (Tribolium castaneum) to a mixture of four poly aromatic hydrocarbons (PAHs). The experimental setup was chosen such that the emphasis was on assessing partial effects. We interpreted the effects of the mixture by a process-based model, with a threshold concentration for effects on survival. The behavior of the threshold concentration was one of the key features of this research. We showed that the threshold concentration is shared by toxicants with the same mode of action, which gives a mechanistic explanation for the observation that toxic effects in mixtures may occur in concentration ranges where the individual components do not show effects. Our approach gives reliable predictions of partial effects on survival and allows for a reduction of experimental effort in assessing effects of mixtures, extrapolations to other mixtures, other points in time, or in a wider perspective to other organisms. - We show a mechanistic approach to assess effects of mixtures in low concentrations.
Energy Technology Data Exchange (ETDEWEB)
Piepel, Gregory F.
2007-12-01
A mixture experiment involves combining two or more components in various proportions or amounts and then measuring one or more responses for the resulting end products. Other factors that affect the response(s), such as process variables and/or the total amount of the mixture, may also be studied in the experiment. A mixture experiment design specifies the combinations of mixture components and other experimental factors (if any) to be studied and the response variable(s) to be measured. Mixture experiment data analyses are then used to achieve the desired goals, which may include (i) understanding the effects of components and other factors on the response(s), (ii) identifying components and other factors with significant and nonsignificant effects on the response(s), (iii) developing models for predicting the response(s) as functions of the mixture components and any other factors, and (iv) developing end-products with desired values and uncertainties of the response(s). Given a mixture experiment problem, a practitioner must consider the possible approaches for designing the experiment and analyzing the data, and then select the approach best suited to the problem. Eight possible approaches include 1) component proportions, 2) mathematically independent variables, 3) slack variable, 4) mixture amount, 5) component amounts, 6) mixture process variable, 7) mixture of mixtures, and 8) multi-factor mixture. The article provides an overview of the mixture experiment designs, models, and data analyses for these approaches.
Using mixture models to characterize disease-related traits
Ye Kenny Q; Chase Gary A; Finch Stephen J; Duan Tao; Mendell Nancy R
2005-01-01
We consider 12 event-related potentials and one electroencephalogram measure as disease-related traits to compare alcohol-dependent individuals (cases) to unaffected individuals (controls). We use two approaches: 1) two-way analysis of variance (with sex and alcohol dependency as the factors), and 2) likelihood ratio tests comparing sex adjusted values of cases to controls assuming that within each group the trait has a 2 (or 3) component normal mixture distribution. In the second ap...
Metal Mixture Modeling Evaluation project: 2. Comparison of four modeling approaches
Farley, Kevin J.; Meyer, Joe; Balistrieri, Laurie S.; DeSchamphelaere, Karl; Iwasaki, Yuichi; Janssen, Colin; Kamo, Masashi; Lofts, Steve; Mebane, Christopher A.; Naito, Wataru; Ryan, Adam C.; Santore, Robert C.; Tipping, Edward
2015-01-01
As part of the Metal Mixture Modeling Evaluation (MMME) project, models were developed by the National Institute of Advanced Industrial Science and Technology (Japan), the U.S. Geological Survey (USA), HDR|HydroQual, Inc. (USA), and the Centre for Ecology and Hydrology (UK) to address the effects of metal mixtures on biological responses of aquatic organisms. A comparison of the 4 models, as they were presented at the MMME Workshop in Brussels, Belgium (May 2012), is provided herein. Overall, the models were found to be similar in structure (free ion activities computed by WHAM; specific or non-specific binding of metals/cations in or on the organism; specification of metal potency factors and/or toxicity response functions to relate metal accumulation to biological response). Major differences in modeling approaches are attributed to various modeling assumptions (e.g., single versus multiple types of binding site on the organism) and specific calibration strategies that affected the selection of model parameters. The models provided a reasonable description of additive (or nearly additive) toxicity for a number of individual toxicity test results. Less-than-additive toxicity was more difficult to describe with the available models. Because of limitations in the available datasets and the strong inter-relationships among the model parameters (log KM values, potency factors, toxicity response parameters), further evaluation of specific model assumptions and calibration strategies is needed.
Singh, Kunal; Kumar, Mirgender; Goel, Ekta; Singh, Balraj; Dubey, Sarvesh; Kumar, Sanjay; Jit, Satyabrata
2016-04-01
This paper reports a new two-dimensional (2D) analytical model for the potential distribution and threshold voltage of the short-channel symmetric gate underlap ultrathin DG MOSFETs with a lateral Gaussian doping profile in the source (S)/drain (D) region. The parabolic approximation and conformal mapping techniques have been explored for solving the 2D Poisson's equation to obtain the channel potential function of the device. The effects of straggle parameter (of the lateral Gaussian doping profile in the S/D region), underlap length, gate length, channel thickness and oxide thickness on the surface potential and threshold voltage have been investigated. The loss of switching speed due to the drain-induced barrier lowering (DIBL) has also been reported. The proposed model results have been validated by comparing them with their corresponding TCAD simulation data obtained by using the commercially available 2D ATLAS™ simulation software.
Directory of Open Access Journals (Sweden)
Yoon Soo Park
2016-02-01
This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.
Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan
2016-01-01
This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.
DEFF Research Database (Denmark)
Hadrup, Niels; Taxvig, Camilla; Pedersen, Mikael
2013-01-01
and compared to the experimental mixture data. Mixture 1 contained environmental chemicals adjusted in ratio according to human exposure levels. Mixture 2 was a potency adjusted mixture containing five pesticides. Prediction of testosterone effects coincided with the experimental Mixture 1 data. In contrast...... was the predominant but not sole driver of the mixtures, suggesting that one chemical alone was not responsible for the mixture effects. In conclusion, the GCA model seemed to be superior to the CA and IA models for the prediction of testosterone effects. A situation with chemicals exerting opposing effects...
Bellos, V.; Mahmoodian, M.; Leopold, U.; Torres-Matallana, J. A.; Schutz, G.; Clemens, F.
2017-12-01
Surrogate models help to decrease the run-time of computationally expensive, detailed models. Recent studies show that Gaussian Process Emulators (GPE) are promising techniques in the field of urban drainage modelling. This study focuses on developing a GPE-based surrogate model for later application in Real Time Control (RTC) using input and output time series of a complex simulator. The case study is an urban drainage catchment in Luxembourg. A detailed simulator, implemented in InfoWorks ICM, is used to generate 120 input-output ensembles, of which 100 are used for training the emulator and 20 for validation of the results. An ensemble of historical rainfall events with a duration of 2 hours and time steps of 10 minutes is considered as the input data. Two example outputs are selected: wastewater volume and total COD concentration in a storage tank in the network. The results of the emulator are tested with unseen random rainfall events from the ensemble dataset. The emulator is approximately 1000 times faster than the original simulator for this small case study. Whereas the overall patterns of the simulator are matched by the emulator, in some cases the emulator deviates from the simulator. To quantify the accuracy of the emulator in comparison with the original simulator, the Nash-Sutcliffe efficiency (NSE) between the emulator and simulator is calculated for unseen rainfall scenarios. The NSE for tank volume ranges from 0.88 to 0.99 with a mean value of 0.95, whereas for COD it ranges from 0.71 to 0.99 with a mean value of 0.92. The emulator is able to predict the tank volume with higher accuracy as the relationship between rainfall intensity and tank volume is linear. For COD, which has a non-linear behaviour, the predictions are less accurate and more uncertain, in particular when rainfall intensity increases. These predictions were improved by including a larger amount of training data for the higher rainfall intensities. It was observed
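The Nash-Sutcliffe efficiency used above to score the emulator against the simulator can be sketched as follows. This is a minimal illustration assuming the standard definition, NSE = 1 - SS_res / SS_tot, with the simulator output as the reference series; the function name and the toy numbers are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(reference, predicted):
    """Nash-Sutcliffe efficiency of `predicted` against `reference`.

    Assumption: `reference` plays the role of the detailed simulator output
    and `predicted` the emulator output; both are equal-length sequences.
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    # Squared error of the prediction relative to the reference series.
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    # Variability of the reference series around its own mean.
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference series is constant; NSE is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical tank-volume series (arbitrary units), for illustration only.
simulator = [1.0, 2.0, 3.0, 4.0]
emulator = [1.1, 1.9, 3.2, 3.9]
score = nash_sutcliffe(simulator, emulator)
```

An NSE of 1 means the two series coincide, and an NSE of 0 means the prediction is no better than the mean of the reference series.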
Directory of Open Access Journals (Sweden)
Emanuela Colasante
2008-12-01
Full Text Available
Introduction: The aim of this study is to evaluate, even if partially, how much the drug use phenomenon impacts on the Italian National Health System, through the estimation at local level (Local Health Unit) of the hospitalization rate caused by use and abuse of substances such as opiates, barbiturates-sedatives-hypnotics, cocaine and cannabis, while keeping in mind the spatial distribution of the phenomenon, that is, the fact that what happens in a specific area depends on what is happening in the neighbouring areas (spatial autocorrelation).
Methods: Data from the hospital discharge database were provided by the Ministry of Health and an auto-Gaussian model was fitted. The spatial trend can be a function of other explanatory variables or can simply be modeled as a function of spatial location. Both models were fitted and compared using the number of subjects kept in charge by Drug Addiction Services and the number of beds held by hospitals as covariates.
Results: Concerning opiates use related hospitalizations, results show areas where the phenomenon was less prominent in 2001 (Lombardy, part of Liguria, Umbria, part of Latium, Campania, Apulia and Sicily). In the following years, the hospitalization rates increased in some areas, such as the north of Apulia, part of Campania and Latium. A dependence of the opiates related hospitalization rates on the rate of subjects kept in charge by the Drug Addiction Services is highlighted. Concerning barbiturates-sedatives-hypnotics consumption, the best model is the one without covariates and estimated hospitalization rates are lower than 3 per thousand. The model with only the covariate “rate of subjects kept in charge by Drug Addiction Services” has been used both for cocaine and cannabis. In these two cases, more than a half of the Local Health Units report hospitalization rates lower than 0.5 per thousand
How Gaussian can our Universe be?
Cabass, Giovanni; Pajer, Enrico; Schmidt, Fabian
Gravity is a non-linear theory, and hence, barring cancellations, the initial super-horizon perturbations produced by inflation must contain some minimum amount of mode coupling, or primordial non-Gaussianity. In single-field slow-roll models, where this lower bound is saturated, non-Gaussianity is
Directory of Open Access Journals (Sweden)
Rodolphe Marion
2018-01-01
Full Text Available The identification and mapping of the mineral composition of by-products and residues on industrial sites is a topic of growing interest because it may provide information on plant-processing activities and their impact on the surrounding environment. Imaging spectroscopy can provide such information based on the spectral signatures of soil mineral markers. In this study, we use the automatized Gaussian model (AGM), an automated, physically based method relying on spectral deconvolution. Originally developed for the short-wavelength infrared (SWIR) range, it has been extended to include information from the visible and near-infrared (VNIR) range to take iron oxides/hydroxides into account. We present the results of its application to two French industrial sites: (i) the Altéo Environnement site in Gardanne, southern France, dedicated to the extraction of alumina from bauxite; and (ii) the Millennium Inorganic Chemicals site in Thann, eastern France, which produces titanium dioxide from ilmenite and rutile, and its associated Séché Éco Services site used to neutralize the resulting effluents, producing gypsum. HySpex hyperspectral images were acquired over Gardanne in September 2013 and an APEX image was acquired over Thann in June 2013. In both cases, reflectance spectra were measured and samples were collected in the field and analyzed for mineralogical and chemical composition. When applying the AGM to the images, both in the VNIR and SWIR ranges, we successfully identified and mapped minerals of interest characteristic of each site: bauxite, Bauxaline® and alumina for Gardanne; and red and white gypsum and calcite for Thann. Identifications and maps were consistent with in situ measurements.
Ueda, Teruyuki; Honda, Masao; Horimoto, Katsuhisa; Aburatani, Sachiyo; Saito, Shigeru; Yamashita, Taro; Sakai, Yoshio; Nakamura, Mikiko; Takatori, Hajime; Sunagozaka, Hajime; Kaneko, Shuichi
2013-04-01
Gene expression profiling of hepatocellular carcinoma (HCC) and background liver has been studied extensively; however, the relationship between the gene expression profiles of different lesions has not been assessed. We examined the expression profiles of 34 HCC specimens (17 hepatitis B virus [HBV]-related and 17 hepatitis C virus [HCV]-related) and 71 non-tumor liver specimens (36 chronic hepatitis B [CH-B] and 35 chronic hepatitis C [CH-C]) using an in-house cDNA microarray consisting of liver-predominant genes. Graphical Gaussian modeling (GGM) was applied to elucidate the interactions of gene clusters among the HCC and non-tumor lesions. In CH-B-related HCC, the expression of vascular endothelial growth factor-family signaling and regulation of T cell differentiation, apoptosis, and survival, as well as development-related genes was up-regulated. In CH-C-related HCC, the expression of ectodermal development and cell proliferation, wnt receptor signaling, cell adhesion, and defense response genes was also up-regulated. Many of the metabolism-related genes were down-regulated in both CH-B- and CH-C-related HCC. GGM analysis of the HCC and non-tumor lesions revealed that DNA damage response genes were associated with AP1 signaling in non-tumor lesions, which mediates the expression of many genes in CH-B-related HCC. In contrast, signal transducer and activator of transcription 1 and phosphatase and tensin homolog were associated with early growth response protein 1 signaling in non-tumor lesions, which potentially promotes angiogenesis, fibrogenesis, and tumorigenesis in CH-C-related HCC. Gene expression profiling of HCC and non-tumor lesions revealed the predisposing changes of gene expression in HCC. This approach has potential for the early diagnosis and possible prevention of HCC. Copyright © 2013 Elsevier Inc. All rights reserved.
On Gaussian conditional independence structures
Czech Academy of Sciences Publication Activity Database
Lněnička, Radim; Matúš, František
2007-01-01
Roč. 43, č. 3 (2007), s. 327-342 ISSN 0023-5954 R&D Projects: GA AV ČR IAA100750603 Institutional research plan: CEZ:AV0Z10750506 Keywords : multivariate Gaussian distribution * positive definite matrices * determinants * gaussoids * covariance selection models * Markov perfectness Subject RIV: BA - General Mathematics Impact factor: 0.552, year: 2007
Influence of high power ultrasound on rheological and foaming properties of model ice-cream mixtures
Directory of Open Access Journals (Sweden)
Verica Batur
2010-03-01
Full Text Available This paper presents research on the effect of high power ultrasound on the rheological and foaming properties of ice cream model mixtures. Ice cream model mixtures were prepared according to specific recipes and then subjected to different homogenization techniques: mechanical mixing, ultrasound treatment, and a combination of mechanical and ultrasound treatment. An ultrasound probe tip with a diameter of 12.7 mm was used for the ultrasound treatment, which lasted 5 minutes at 100 percent amplitude. Rheological parameters were determined using a rotational rheometer and expressed as flow index, consistency coefficient and apparent viscosity. From the results it can be concluded that all model mixtures have non-Newtonian, dilatant type behavior. The highest viscosities were observed for model mixtures that were homogenized with mechanical mixing, and significantly lower values of viscosity were observed for ultrasound treated ones. Foaming properties are expressed as percentage of increase in foam volume, foam stability index and minimal viscosity. It was determined that ice cream model mixtures treated only with ultrasound had minimal increase in foam volume, while the highest increase in foam volume was observed for the ice cream mixture treated with the combination of mechanical and ultrasound treatment. Also, ice cream mixtures having a higher amount of proteins in their composition showed higher foam stability. It was determined that the optimal treatment time is 10 minutes.
Self-reciprocal M. Born's model and the dynamic picture in the conformal-flat gaussian-type metric
International Nuclear Information System (INIS)
Tomil'chik, L.M.
2010-01-01
It is shown that Born's self-reciprocal equation coincides in form with the D'Alembert general covariant equation in the conformal-flat Gaussian-type metric and the geodesic equations reproduce the hyperbolic motion of the probe particle. The relation between the energy-momentum tensor generating such a metric and the corresponding Minkowski force. (authors)
International Nuclear Information System (INIS)
Dubey, Sarvesh; Jit, S.; Tiwari Pramod Kumar
2013-01-01
An analytic drain current model is presented for doped short-channel double-gate MOSFETs with a Gaussian-like doping profile in the vertical direction of the channel. The present model is valid in linear and saturation regions of device operation. The drain current variation with various device parameters has been demonstrated. The model is made more physical by incorporating the channel length modulation effect. Parameters like transconductance and drain conductance that are important in assessing the analog performance of the device have also been formulated. The model results are validated by numerical simulation results obtained by using the commercially available ATLAS™, a two dimensional device simulator from SILVACO. (semiconductor devices)
Linking network usage patterns to traffic Gaussianity fit
de Oliveira Schmidt, R.; Sadre, R.; Melnikov, Nikolay; Schönwälder, Jürgen; Pras, Aiko
Gaussian traffic models are widely used in the domain of network traffic modeling. The central assumption is that traffic aggregates are Gaussian distributed. Due to its importance, the Gaussian character of network traffic has been extensively assessed by researchers in the past years. In 2001,
Gaussian processes for machine learning.
Seeger, Matthias
2004-04-01
Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countably or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible non-parametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback of heavy computational scaling has recently been alleviated by the introduction of generic sparse approximations [13, 78, 31]. The mathematical literature on GPs is large and often uses deep concepts which are not required to fully understand most machine learning applications. In this tutorial paper, we aim to present characteristics of GPs relevant to machine learning and to show up precise connections to other "kernel machines" popular in the community. Our focus is on a simple presentation, but references to more detailed sources are provided.
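To make the idea of GP prediction with uncertainty estimates concrete, here is a minimal one-dimensional regression sketch. It assumes a zero-mean prior, a squared-exponential kernel with fixed hyperparameters, and a small noise term on the diagonal; these choices and all function names are illustrative assumptions and do not come from the tutorial itself. The cubic cost of the Cholesky factorisation is the heavy computational scaling the abstract refers to.

```python
import math


def rbf(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed, commonly used choice)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_chol(low, b):
    """Solve (low @ low.T) x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single input x_new."""
    n = len(x_train)
    gram = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    low = cholesky(gram)
    alpha = solve_chol(low, y_train)              # K^{-1} y
    k_star = [rbf(x_new, xi) for xi in x_train]   # covariances with x_new
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_chol(low, k_star)                   # K^{-1} k_star
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)


# Toy data: three noiseless samples of sin(x).
xs = [0.0, 1.0, 2.0]
ys = [math.sin(x) for x in xs]
mean_near, var_near = gp_predict(xs, ys, 1.0)    # at a training input
mean_far, var_far = gp_predict(xs, ys, 10.0)     # far from all data
```

At a training input the predictive variance collapses towards the noise level, while far from the data the prediction reverts to the prior mean with close to the prior variance.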
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies
DEFF Research Database (Denmark)
Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.
2015-01-01
minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local...... for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome...... analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While...
Study of the Internal Mechanical response of an asphalt mixture by 3-D Discrete Element Modeling
DEFF Research Database (Denmark)
Feng, Huan; Pettinari, Matteo; Hofko, Bernhard
2015-01-01
In this paper the viscoelastic behavior of asphalt mixture was investigated by employing a three-dimensional Discrete Element Method (DEM). The cylinder model was filled with cubic array of spheres with a specified radius, and was considered as a whole mixture with uniform contact properties...... for all the distinct elements. The dynamic modulus and phase angle from uniaxial complex modulus tests of the asphalt mixtures in the laboratory have been collected. A macro-scale Burger’s model was first established and the input parameters of Burger’s contact model were calibrated by fitting...... with the lab test data of the complex modulus of the asphalt mixture. The Burger’s contact model parameters are usually calibrated for each frequency. While in this research a constant set of Burger’s parameters has been calibrated and used for all the test frequencies, the calibration procedure...
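The dynamic modulus and phase angle that a macro-scale Burgers model predicts at a given frequency can be sketched as below. This assumes the textbook form of the Burgers model (a Maxwell element in series with a Kelvin-Voigt element), whose complex compliance is the sum of the compliances of its parts; the parameter names and values are hypothetical and are not the calibrated values from the paper.

```python
import cmath
import math


def burgers_complex_modulus(omega, e_m, eta_m, e_k, eta_k):
    """Complex modulus E*(omega) of a Burgers model.

    Assumed layout: Maxwell spring e_m and dashpot eta_m in series with a
    Kelvin-Voigt unit (spring e_k in parallel with dashpot eta_k).
    omega is the angular frequency in rad/s and must be positive.
    """
    compliance = (1.0 / e_m
                  + 1.0 / (1j * omega * eta_m)
                  + 1.0 / (e_k + 1j * omega * eta_k))
    return 1.0 / compliance


def dynamic_modulus_and_phase(omega, e_m, eta_m, e_k, eta_k):
    """Return (|E*|, phase angle in degrees) at angular frequency omega."""
    e_star = burgers_complex_modulus(omega, e_m, eta_m, e_k, eta_k)
    return abs(e_star), math.degrees(cmath.phase(e_star))


# Hypothetical parameter set (consistent but arbitrary units).
params = dict(e_m=1000.0, eta_m=100.0, e_k=500.0, eta_k=50.0)
modulus_10hz, phase_10hz = dynamic_modulus_and_phase(2 * math.pi * 10.0, **params)
```

Fitting one parameter set to measured modulus and phase values across all test frequencies, as the abstract describes, would amount to minimising the mismatch between these two outputs and the laboratory data.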
Modeling Hydrodynamic State of Oil and Gas Condensate Mixture in a Pipeline
Directory of Open Access Journals (Sweden)
Dudin Sergey
2016-01-01
Based on the developed model a calculation method was obtained which is used to analyze hydrodynamic state and composition of hydrocarbon mixture in each ith section of the pipeline when temperature-pressure and hydraulic conditions change.
A Linear Gradient Theory Model for Calculating Interfacial Tensions of Mixtures
DEFF Research Database (Denmark)
Zou, You-Xiang; Stenby, Erling Halfdan
1996-01-01
containing supercritical methane, argon, nitrogen, and carbon dioxide gases at high pressure. With this model it is unnecessary to solve the time-consuming density profile equations of the gradient theory model. The model has been tested on a number of mixtures at low and high pressures. The results show...... with proper scaling behavior at the critical point is at least required. Key words: linear gradient theory; interfacial tension; equation of state; influence parameter; density profile....... In this research work, we assumed that the densities of each component in a mixture are linearly distributed across the interface between the coexisting vapor and liquid phases, and we developed a linear gradient theory model for computing interfacial tensions of mixtures, especially mixtures...
A predictive model of natural gas mixture combustion in internal combustion engines
Directory of Open Access Journals (Sweden)
Henry Espinoza
2007-05-01
Full Text Available This study shows the development of a predictive natural gas mixture combustion model for conventional combustion (ignition) engines. The model was based on resolving two areas; one having unburned combustion mixture and another having combustion products. Energy and matter conservation equations were solved for each crankshaft turn angle for each area. Nonlinear differential equations for each phase’s energy (considering compression, combustion and expansion) were solved by applying the fourth-order Runge-Kutta method. The model also enabled studying different natural gas components’ composition and evaluating combustion in the presence of dry and humid air. Validation results are shown with experimental data, demonstrating the software’s precision and accuracy in the results so produced. The results showed cylinder pressure, unburned and burned mixture temperature, burned mass fraction and combustion reaction heat for the engine being modelled using a natural gas mixture.
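The fourth-order Runge-Kutta scheme mentioned above can be sketched for a scalar ordinary differential equation as follows. The engine energy equations themselves are not given in the abstract, so the sketch integrates a simple stand-in problem, dy/dt = -y, whose exact solution is known; the function names are illustrative.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2.0, y + h / 2.0 * k1)
    k3 = f(t + h / 2.0, y + h / 2.0 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(f, t0, y0, t_end, steps):
    """Integrate from t0 to t_end with a fixed number of equal steps."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


# Stand-in test problem (not the engine model): exponential decay.
y_final = integrate(lambda t, y: -y, 0.0, 1.0, 1.0, 100)
```

In an engine model of the kind described, the independent variable would be the crankshaft angle and the state would hold the energy-related quantities of each zone, with one such step taken per angle increment.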
Mixture estimation with state-space components and Markov model of switching
Czech Academy of Sciences Publication Activity Database
Nagy, Ivan; Suzdaleva, Evgenia
2013-01-01
Roč. 37, č. 24 (2013), s. 9970-9984 ISSN 0307-904X R&D Projects: GA TA ČR TA01030123 Institutional support: RVO:67985556 Keywords : probabilistic dynamic mixtures, * probability density function * state-space models * recursive mixture estimation * Bayesian dynamic decision making under uncertainty * Kerridge inaccuracy Subject RIV: BC - Control Systems Theory Impact factor: 2.158, year: 2013 http://library.utia.cas.cz/separaty/2013/AS/nagy-mixture estimation with state-space components and markov model of switching.pdf
Latent variable mixture modeling in psychiatric research--a review and application.
Miettunen, J; Nordström, T; Kaakinen, M; Ahmed, A O
2016-02-01
Latent variable mixture modeling represents a flexible approach to investigating population heterogeneity by sorting cases into latent but non-arbitrary subgroups that are more homogeneous. The purpose of this selective review is to provide a non-technical introduction to mixture modeling in a cross-sectional context. Latent class analysis is used to classify individuals into homogeneous subgroups (latent classes). Factor mixture modeling represents a newer approach that represents a fusion of latent class analysis and factor analysis. Factor mixture models are adaptable to representing categorical and dimensional states of affairs. This article provides an overview of latent variable mixture models and illustrates the application of these methods by applying them to the study of the latent structure of psychotic experiences. The flexibility of latent variable mixture models makes them adaptable to the study of heterogeneity in complex psychiatric and psychological phenomena. They also allow researchers to address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.
Feng, Jianfeng; Gao, Yongfei; Ji, Yijun; Zhu, Lin
2018-03-05
Predicting the toxicity of chemical mixtures is difficult because of the additive, antagonistic, or synergistic interactions among the mixture components. Antagonistic and synergistic interactions are dominant in metal mixtures, and their distributions may correlate with exposure concentrations. However, whether the interaction types of metal mixtures change at different time points during toxicodynamic (TD) processes is undetermined because of insufficient appropriate models and metal bioaccumulation data at different time points. In the present study, the generalized linear model (GLM) was used to illustrate the combined toxicities of binary metal mixtures, such as Cu-Zn, Cu-Cd, and Cd-Pb, to zebrafish larvae (Danio rerio). GLM was also used to identify possible interaction types among these method for the traditional concentration addition (CA) and independent action (IA) models. Then the GLM were applied to quantify the different possible interaction types for metal mixture toxicity (Cu-Zn, Cu-Cd, and Cd-Pb to D. rerio and Ni-Co to Oligochaeta Enchytraeus crypticus) during the TD process at different exposure times. We found different metal interaction responses in the TD process and interactive coefficients significantly changed at different exposure times (pmixture toxicology on organisms. Moreover, care should be taken when evaluating interactions in toxicity prediction because results may vary at different time points. The GLM could be an alternative or complementary approach for BLM to analyze and predict metal mixture toxicity. Copyright © 2017 Elsevier B.V. All rights reserved.
Latent Transition Analysis with a Mixture Item Response Theory Measurement Model
Cho, Sun-Joo; Cohen, Allan S.; Kim, Seock-Ho; Bottge, Brian
2010-01-01
A latent transition analysis (LTA) model was described with a mixture Rasch model (MRM) as the measurement model. Unlike the LTA, which was developed with a latent class measurement model, the LTA-MRM permits within-class variability on the latent variable, making it more useful for measuring treatment effects within latent classes. A simulation…
A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling.
Bouguila, Nizar; Ziou, Djemel
2010-01-01
In this paper, we propose a clustering algorithm based on both Dirichlet processes and generalized Dirichlet distribution which has been shown to be very flexible for proportional data modeling. Our approach can be viewed as an extension of the finite generalized Dirichlet mixture model to the infinite case. The extension is based on nonparametric Bayesian analysis. This clustering algorithm does not require the specification of the number of mixture components to be given in advance and estimates it in a principled manner. Our approach is Bayesian and relies on the estimation of the posterior distribution of clusterings using Gibbs sampler. Through some applications involving real-data classification and image databases categorization using visual words, we show that clustering via infinite mixture models offers a more powerful and robust performance than classic finite mixtures.
Extensions of D-optimal Minimal Designs for Symmetric Mixture Models.
Li, Yanyan; Raghavarao, Damaraju; Chervoneva, Inna
2017-01-01
The purpose of mixture experiments is to explore the optimum blends of mixture components, which will provide desirable response characteristics in finished products. D-optimal minimal designs have been considered for a variety of mixture models, including Scheffé's linear, quadratic, and cubic models. Usually, these D-optimal designs are minimally supported since they have just as many design points as the number of parameters. Thus, they lack the degrees of freedom to perform the Lack of Fit tests. Also, the majority of the design points in D-optimal minimal designs are on the boundary: vertices, edges, or faces of the design simplex. Also a new strategy for adding multiple interior points for symmetric mixture models is proposed. We compare the proposed designs with Cornell (1986) two ten-point designs for the Lack of Fit test by simulations.
International Nuclear Information System (INIS)
Maevskii, K. K.; Kinelovskii, S. A.
2015-01-01
The numerical results of modeling of shock wave loading of mixtures with the SiO2 component are presented. The TEC (thermodynamic equilibrium component) model is employed to describe the behavior of solid and porous multicomponent mixtures and alloys under shock wave loading. State equations of a Mie–Grüneisen type are used to describe the behavior of condensed phases, taking into account the temperature dependence of the Grüneisen coefficient; the gas in pores is treated as one of the components of the medium. The model is based on the assumption that all components of the mixture under shock-wave loading are in thermodynamic equilibrium. The calculation results are compared with the experimental data derived by various authors. The behavior of the mixture containing components with a phase transition under high dynamic loads is described.
Existence, uniqueness and positivity of solutions for BGK models for mixtures
Klingenberg, C.; Pirner, M.
2018-01-01
We consider kinetic models for a multi component gas mixture without chemical reactions. In the literature, one can find two types of BGK models in order to describe gas mixtures. One type has a sum of BGK type interaction terms in the relaxation operator, for example the model described by Klingenberg, Pirner and Puppo [20] which contains well-known models of physicists and engineers for example Hamel [16] and Gross and Krook [15] as special cases. The other type contains only one collision term on the right-hand side, for example the well-known model of Andries, Aoki and Perthame [1]. For each of these two models [20] and [1], we prove existence, uniqueness and positivity of solutions in the first part of the paper. In the second part, we use the first model [20] in order to determine an unknown function in the energy exchange of the macroscopic equations for gas mixtures described by Dellacherie [11].
Non-Gaussian signatures of tachyacoustic cosmology
Energy Technology Data Exchange (ETDEWEB)
Bessada, Dennis, E-mail: dennis.bessada@unifesp.br [UNIFESP — Universidade Federal de São Paulo, Laboratório de Física Teórica e Computação Científica, Rua São Nicolau, 210, 09913-030, Diadema, SP (Brazil)
2012-09-01
I investigate non-Gaussian signatures in the context of tachyacoustic cosmology, that is, a noninflationary model with superluminal speed of sound. I calculate the full non-Gaussian amplitude A, its size f_NL, and corresponding shapes for a red-tilted spectrum of primordial scalar perturbations. Specifically, for cuscuton-like models I show that f_NL ∼ O(1), and the shape of its non-Gaussian amplitude peaks for both equilateral and local configurations, the latter being dominant. These results, albeit similar, are quantitatively distinct from the corresponding ones obtained by Magueijo et al. in the context of superluminal bimetric models.
Kittisuwan, Pichid
2015-03-01
The application of image processing in industry has shown remarkable success over the last decade, for example, in security and telecommunication systems. The denoising of natural image corrupted by Gaussian noise is a classical problem in image processing. So, image denoising is an indispensable step during image processing. This paper is concerned with dual-tree complex wavelet-based image denoising using Bayesian techniques. One of the cruxes of the Bayesian image denoising algorithms is to estimate the statistical parameter of the image. Here, we employ maximum a posteriori (MAP) estimation to calculate local observed variance with generalized Gamma density prior for local observed variance and Laplacian or Gaussian distribution for noisy wavelet coefficients. Evidently, our selection of prior distribution is motivated by efficient and flexible properties of generalized Gamma density. The experimental results show that the proposed method yields good denoising results.
Evaluation of fecal mRNA reproducibility via a marginal transformed mixture modeling approach
Directory of Open Access Journals (Sweden)
Davidson Laurie A
2010-01-01
Full Text Available Background: Developing and evaluating new technology that enables researchers to recover gene-expression levels of colonic cells from fecal samples could be key to a non-invasive screening tool for early detection of colon cancer. The current study, to the best of our knowledge, is the first to investigate and report the reproducibility of fecal microarray data. Using the intraclass correlation coefficient (ICC) as a measure of reproducibility and the preliminary analysis of fecal and mucosal data, we assessed the reliability of mixture density estimation and the reproducibility of fecal microarray data. Using Monte Carlo-based methods, we explored whether ICC values should be modeled as a beta-mixture or transformed first and fitted with a normal-mixture. We used outcomes from bootstrapped goodness-of-fit tests to determine which approach is less sensitive toward potential violation of distributional assumptions. Results: The graphical examination of both the distributions of ICC and probit-transformed ICC (PT-ICC) clearly shows that there are two components in the distributions. For ICC measurements, which are between 0 and 1, the practice in literature has been to assume that the data points are from a beta-mixture distribution. Nevertheless, in our study we show that the use of a normal-mixture modeling approach on PT-ICC could provide superior performance. Conclusions: When modeling ICC values of gene expression levels, using mixture of normals in the probit-transformed (PT) scale is less sensitive toward model mis-specification than using mixture of betas. We show that a biased conclusion could be made if we follow the traditional approach and model the two sets of ICC values using the mixture of betas directly. The problematic estimation arises from the sensitivity of beta-mixtures toward model mis-specification, particularly when there are observations in the neighborhood of the boundary points, 0 or 1. Since beta-mixture modeling
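The probit transformation that maps ICC values from the unit interval onto the real line, where a normal mixture can then be fitted, can be sketched as below. The clipping constant `eps`, which keeps values at exactly 0 or 1 finite, is an assumption added for this illustration and is not described in the abstract; the ICC values are made up.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def probit(icc, eps=1e-6):
    """Probit-transform an ICC value in [0, 1] to the real line.

    `eps` is a hypothetical safeguard: values are clipped to
    [eps, 1 - eps] so that boundary observations stay finite.
    """
    clipped = min(max(icc, eps), 1.0 - eps)
    return _STANDARD_NORMAL.inv_cdf(clipped)


# Made-up ICC values, including points near the boundaries 0 and 1.
icc_values = [0.02, 0.35, 0.5, 0.8, 0.999, 1.0]
pt_icc = [probit(v) for v in icc_values]
```

On the transformed scale, values near 0 or 1 are spread out over the tails of the real line rather than crowded against a boundary, which matches the sensitivity issue the abstract raises for beta mixtures.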
Thermodiffusion in Multicomponent Mixtures Thermodynamic, Algebraic, and Neuro-Computing Models
Srinivasan, Seshasai
2013-01-01
Thermodiffusion in Multicomponent Mixtures presents the computational approaches that are employed in the study of thermodiffusion in various types of mixtures, namely, hydrocarbons, polymers, water-alcohol, molten metals, and so forth. We present a detailed formalism of these methods, which are based on non-equilibrium thermodynamics, algebraic correlations, or the principles of artificial neural networks. The book will serve as a single complete reference for understanding the theoretical derivations of thermodiffusion models and their application to different types of multicomponent mixtures. An exhaustive discussion of these is used to give a complete perspective of the principles and the key factors that govern the thermodiffusion process.
MCEM algorithm for the log-Gaussian Cox process
Delmas, Celine; Dubois-Peyrard, Nathalie; Sabbadin, Regis
2014-01-01
Log-Gaussian Cox processes are an important class of models for aggregated point patterns. They have been widely used in spatial epidemiology (Diggle et al., 2005), in agronomy (Bourgeois et al., 2012), in forestry (Moller et al.), in ecology (sightings of wild animals) or in environmental sciences (radioactivity counts). A log-Gaussian Cox process is a Poisson process with a stochastic intensity depending on a Gaussian random field. We consider the case where this Gaussian random field is ...
On mixture model complexity estimation for music recommender systems
Balkema, W.; van der Heijden, Ferdinand; Meijerink, B.
2006-01-01
Content-based music navigation systems are in need of robust music similarity measures. Current similarity measures model each song with the same model parameters. We propose methods to efficiently estimate the required number of model parameters of each individual song. First results of a study on
Mowlavi, N.; Lecoeur-Taïbi, I.; Holl, B.; Rimoldini, L.; Barblan, F.; Prša, A.; Kochoska, A.; Süveges, M.; Eyer, L.; Nienartowicz, K.; Jevardat, G.; Charnas, J.; Guy, L.; Audard, M.
2017-10-01
Context. The advent of large-scale multi-epoch surveys raises the need for automated light curve (LC) processing. This is particularly true for eclipsing binaries (EBs), which form one of the most populated types of variable objects. The Gaia mission, launched at the end of 2013, is expected to detect on the order of a few million EBs over a five-year mission. Aims: We present an automated procedure to characterize EBs based on the geometric morphology of their LCs with two aims: first to study an ensemble of EBs on a statistical ground without the need to model the binary system, and second to enable the automated identification of EBs that display atypical LCs. Methods: We modeled the folded LC geometry of EBs using up to two Gaussian functions for the eclipses and a cosine function for any ellipsoidal-like variability that may be present between the eclipses. The procedure is applied to the OGLE-III data set of EBs in the Large Magellanic Cloud (LMC) as a proof of concept. The Bayesian information criterion is used to select the best model among models containing various combinations of those components, as well as to estimate the significance of the components. Results: Based on the two-Gaussian models, EBs with atypical LC geometries are successfully identified in two diagrams, using the Abbe values of the original and residual folded LCs, and the reduced χ². After cleaning the data set of the atypical cases and further filtering out LCs that contain non-significant eclipse candidates, the ensemble of EBs can be studied on a statistical ground using the two-Gaussian model parameters. For illustrative purposes, we present the distribution of projected eccentricities as a function of orbital period for the OGLE-III set of EBs in the LMC, as well as the distribution of their primary versus secondary eclipse widths. The two-Gaussian models for all the OGLE-III LMC EBs table is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130
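The model family and selection step described in this abstract can be sketched as follows: a constant level, up to two Gaussian dips in phase, an optional cosine term, and a BIC comparison between candidate models. The exact parameterisation is an assumption (in particular the phase wrapping and the `4*pi` cosine argument, chosen so the term has two cycles per orbit), and the BIC form `chi2 + k*ln(n)` assumes Gaussian errors with known uncertainties. Parameters are supplied by hand here; a real pipeline would fit them.

```python
import math


def gaussian_dip(phase, mu, sigma, depth):
    """Gaussian eclipse profile, wrapped so that phase is periodic on [0, 1)."""
    d = (phase - mu + 0.5) % 1.0 - 0.5  # shortest signed phase distance
    return depth * math.exp(-0.5 * (d / sigma) ** 2)


def model(phase, const, eclipses=(), cos_amp=0.0, cos_ref=0.0):
    """Constant + up to two Gaussian dips + optional cosine term.

    `eclipses` is a sequence of (mu, sigma, depth) tuples; all names
    are hypothetical and chosen for this illustration."""
    value = const + sum(gaussian_dip(phase, mu, s, d) for mu, s, d in eclipses)
    value += cos_amp * math.cos(4.0 * math.pi * (phase - cos_ref))
    return value


def bic(phases, mags, errs, n_params, **model_kwargs):
    """BIC up to an additive constant: chi-square plus k * ln(n)."""
    chi2 = sum(((m - model(p, **model_kwargs)) / e) ** 2
               for p, m, e in zip(phases, mags, errs))
    return chi2 + n_params * math.log(len(phases))


if __name__ == "__main__":
    phases = [i / 100 for i in range(100)]
    truth = dict(const=15.0, eclipses=[(0.0, 0.03, 0.5), (0.5, 0.03, 0.3)])
    mags = [model(p, **truth) for p in phases]
    errs = [0.01] * len(phases)
    print("constant only :", bic(phases, mags, errs, 1, const=15.0))
    print("two Gaussians :", bic(phases, mags, errs, 7, **truth))
```

The candidate with the lower BIC is preferred; the penalty term `k * ln(n)` is what lets a simpler combination of components win when an extra Gaussian or the cosine term does not reduce the chi-square enough.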
Directory of Open Access Journals (Sweden)
F. C. PEIXOTO
1999-09-01
Full Text Available Fragmentation kinetics is employed to model a continuous reactive mixture of alkanes under catalytic cracking conditions. Standard moment analysis techniques are employed, and a dynamic system for the time evolution of moments of the mixture's dimensionless concentration distribution function (DCDF) is found. The time behavior of the DCDF is recovered with successive estimations of scaled gamma distributions using the moments time data.
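The last step of this abstract, recovering a distribution from its moments, can be illustrated with a standard method-of-moments fit of a scaled gamma density. The sketch below assumes that the zeroth, first and second moments are available at a given time; the abstract does not state which moments or which fitting procedure the authors used, so the function names and the three-moment choice are assumptions.

```python
import math


def scaled_gamma_from_moments(m0, m1, m2):
    """Match a scaled gamma density to moments M0, M1, M2.

    Returns (amplitude, shape, scale) such that
    f(x) = amplitude * Gamma(shape, scale).pdf(x) reproduces the moments."""
    mean = m1 / m0
    var = m2 / m0 - mean ** 2
    if var <= 0:
        raise ValueError("moments do not describe a proper distribution")
    shape = mean ** 2 / var  # gamma shape parameter k
    scale = var / mean       # gamma scale parameter theta
    return m0, shape, scale


def scaled_gamma_pdf(x, amplitude, shape, scale):
    """Evaluate amplitude * gamma density at x > 0."""
    return (amplitude * x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))


if __name__ == "__main__":
    # Moments of 0.5 * Gamma(shape=3, scale=2): M0 = 0.5, M1 = 3, M2 = 24.
    print(scaled_gamma_from_moments(0.5, 3.0, 24.0))
```

Repeating this fit at each time step of the moment equations gives a sequence of gamma approximations, which is one plausible reading of "successive estimations" in the abstract.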
Karagiannis, Georgios; Lin, Guang
2017-08-01
For many real systems, several computer models may exist with different physics and predictive abilities. To achieve more accurate simulations/predictions, it is desirable for these models to be properly combined and calibrated. We propose the Bayesian calibration of computer model mixture method, which relies on the idea of representing the real system output as a mixture of the available computer model outputs with unknown input-dependent weight functions. The method builds a fully Bayesian predictive model as an emulator for the real system output by combining, weighting, and calibrating the available models in the Bayesian framework. Moreover, it fits a mixture of calibrated computer models that can be used by the domain scientist as a means to combine the available computer models, in a flexible and principled manner, and perform reliable simulations. It can address realistic cases where one model may be more accurate than the others at different input values because the mixture weights, indicating the contribution of each model, are functions of the input. Inference on the calibration parameters can consider multiple computer models associated with different physics. The method does not require knowledge of the fidelity order of the models. We provide a technique able to mitigate the computational overhead due to the consideration of multiple computer models that is suitable to the mixture model framework. We implement the proposed method in a real-world application involving the Weather Research and Forecasting large-scale climate model.
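The central idea of this abstract, a prediction formed as a mixture of model outputs whose weights depend on the input, can be sketched in a few lines. The softmax-of-linear-scores form of the weight functions is purely an assumption for illustration; the abstract treats the weight functions as unknown and infers them in a Bayesian framework, which this sketch does not attempt.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_prediction(x, models, weight_params):
    """Combine model outputs with input-dependent weights.

    `models` is a list of callables f_k(x); `weight_params` is a list of
    hypothetical (intercept, slope) pairs defining the score a_k + b_k * x."""
    weights = softmax([a + b * x for a, b in weight_params])
    prediction = sum(w * f(x) for w, f in zip(weights, models))
    return prediction, weights


if __name__ == "__main__":
    models = [lambda x: x, lambda x: x ** 2]   # two toy "computer models"
    params = [(0.0, -3.0), (0.0, 3.0)]         # model 0 dominates for x < 0
    for x in (-2.0, 0.0, 2.0):
        print(x, mixture_prediction(x, models, params))
```

Because the weights vary with `x`, one model can dominate in one region of the input space and another elsewhere, matching the abstract's point that no fidelity ordering of the models is needed.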
Piecewise Linear-Linear Latent Growth Mixture Models with Unknown Knots
Kohli, Nidhi; Harring, Jeffrey R.; Hancock, Gregory R.
2013-01-01
Latent growth curve models with piecewise functions are flexible and useful analytic models for investigating individual behaviors that exhibit distinct phases of development in observed variables. As an extension of this framework, this study considers a piecewise linear-linear latent growth mixture model (LGMM) for describing segmented change of…
Modeling and Computation of Thermodynamic Equilibrium for Mixtures of Inorganic and Organic Species
Caboussat, A.; Amundson, N. R.; He, J.; Martynenko, A. V.; Seinfeld, J. H.
2007-05-01
A series of modules has been developed in the atmospheric modeling community to predict the phase transition, crystallization and evaporation of inorganic aerosols. Modules for the computation of the thermodynamics of pure organic-containing aerosols have been developed more recently; however, the modeling of aerosols containing mixtures of inorganic and organic compounds has received less attention. We present here a model (UHAERO) that is flexible and efficient and that rigorously computes the thermodynamic equilibrium of atmospheric particles containing inorganic and organic compounds. It is applied first to mixtures of inorganic electrolytes and dicarboxylic acids, and then to thermodynamic equilibria including crystallization and liquid-liquid phase separation. The model does not rely on any a priori specification of the phases present in certain atmospheric conditions. The multicomponent phase equilibrium for a closed organic aerosol system at constant temperature and pressure and for specified feeds is the solution to the equilibrium problem arising from the constrained minimization of the Gibbs free energy. For mixtures of inorganic electrolytes and dissociated organics, organic salts appear at equilibrium in the aqueous phase. In the general case, liquid-liquid phase separations happen and electrolytes dissociate in both aqueous and organic liquid phases. The Gibbs free energy is modeled by the UNIFAC model for the organic compounds, the PSC model for the inorganic constituents and a Pitzer model for interactions. The difficulty comes from the accurate estimation of interactions in the modeling of the activity coefficients. An accurate and efficient method for the computation of the minimum of energy is used to compute phase diagrams for mixtures of inorganic and organic species. Numerical results show the efficiency of the model for mixtures of inorganic electrolytes and organic acids, which makes it suitable for insertion in global three-dimensional air quality
Structure-reactivity modeling using mixture-based representation of chemical reactions.
Polishchuk, Pavel; Madzhidov, Timur; Gimadiev, Timur; Bodrov, Andrey; Nugmanov, Ramil; Varnek, Alexandre
2017-09-01
We describe a novel approach to reaction representation as a combination of two mixtures: a mixture of reactants and a mixture of products. In turn, each mixture can be encoded using an earlier reported approach involving simplex descriptors (SiRMS). The feature vector representing these two mixtures results from either concatenated product and reactant descriptors or the difference between descriptors of products and reactants. This reaction representation does not need an explicit labeling of a reaction center. A rigorous "product-out" cross-validation (CV) strategy is suggested. Unlike the naïve "reaction-out" CV approach based on a random selection of items, the proposed one provides a more realistic estimate of prediction accuracy for reactions resulting in novel products. The new methodology has been applied to model rate constants of E2 reactions. It has been demonstrated that the use of the fragment control applicability domain approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
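The two feature constructions and the validation split described in this abstract can be sketched as follows. The sketch assumes that a descriptor vector for each mixture is already available (the SiRMS calculation itself is not reproduced), and it interprets "product-out" as holding out all reactions that share a product together. That interpretation, and every function name here, is an assumption based on the abstract's wording rather than on the authors' implementation.

```python
def reaction_vector(reactant_desc, product_desc, mode="difference"):
    """Build a reaction feature vector from two mixture descriptor vectors."""
    if mode == "concat":
        return list(reactant_desc) + list(product_desc)
    if mode == "difference":
        return [p - r for r, p in zip(reactant_desc, product_desc)]
    raise ValueError("mode must be 'concat' or 'difference'")


def product_out_folds(product_ids, n_folds):
    """Yield (train_indices, test_indices) with whole products held out.

    Every reaction leading to a given product lands in the same fold, so
    test reactions always produce products unseen during training."""
    unique = sorted(set(product_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    for fold in range(n_folds):
        test = [i for i, pid in enumerate(product_ids) if fold_of[pid] == fold]
        train = [i for i, pid in enumerate(product_ids) if fold_of[pid] != fold]
        yield train, test


if __name__ == "__main__":
    print(reaction_vector([1, 2], [3, 5]))             # difference form
    print(reaction_vector([1, 2], [3, 5], "concat"))   # concatenated form
    for train, test in product_out_folds(["A", "A", "B", "C", "B"], 2):
        print(train, test)
```

A random "reaction-out" split could place two reactions with the same product on opposite sides of the split, which is why it tends to give a more optimistic accuracy estimate than the grouped split above.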
Identification and separation of DNA mixtures using peak area information
DEFF Research Database (Denmark)
Cowell, R.G.; Lauritzen, Steffen Lilholt; Mortera, J.
We show how probabilistic expert systems can be used to analyse forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area is modelled with conditional Gaussian distributions. The expert system can be used for ascertaining whether individuals......, whose profiles have been measured, have contributed to the mixture, but also to predict DNA profiles of unknown contributors by separating the mixture into its individual components. The potential of our methodology is illustrated on case data examples and compared with alternative approaches...
Directory of Open Access Journals (Sweden)
D. Béal
2010-02-01
Full Text Available In biogeochemical models coupled to ocean circulation models, vertical mixing is an important physical process which governs the nutrient supply and the plankton residence in the euphotic layer. However, vertical mixing is often poorly represented in numerical simulations because of approximate parameterizations of sub-grid scale turbulence, wind forcing errors and other mis-represented processes such as restratification by mesoscale eddies. Getting a sufficient knowledge of the nature and structure of these errors is necessary to implement appropriate data assimilation methods and to evaluate if they can be controlled by a given observation system.
In this paper, Monte Carlo simulations are conducted to study mixing errors induced by approximate wind forcings in a three-dimensional coupled physical-biogeochemical model of the North Atlantic with a 1/4° horizontal resolution. An ensemble forecast involving 200 members is performed during the 1998 spring bloom, by prescribing perturbations of the wind forcing to generate mixing errors. The biogeochemical response is shown to be rather complex because of nonlinearities and threshold effects in the coupled model. The response of the surface phytoplankton depends on the region of interest and is particularly sensitive to the local stratification. In addition, the statistical relationships computed between the various physical and biogeochemical variables reflect the signature of the non-Gaussian behaviour of the system. It is shown that significant information on the ecosystem can be retrieved from observations of chlorophyll concentration or sea surface temperature if a simple nonlinear change of variables (anamorphosis) is performed by mapping separately and locally the ensemble percentiles of the distributions of each state variable on the Gaussian percentiles. The results of idealized observational updates (performed with perfect observations and neglecting horizontal correlations
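The change of variables described in this abstract can be sketched for a single state variable at a single location: take the ensemble percentiles at fixed probability levels, pair them with the standard normal quantiles at the same levels, and interpolate linearly between the pairs. The choice of deciles, the linear interpolation and extrapolation rule, and the function name are assumptions for illustration, not details taken from the paper.

```python
from bisect import bisect_right
from statistics import NormalDist


def build_anamorphosis(ensemble, probs=tuple(i / 10 for i in range(1, 10))):
    """Return a function mapping a physical value to a Gaussian-like value.

    Ensemble percentiles at `probs` are mapped onto the standard normal
    quantiles at the same probabilities, with piecewise-linear interpolation
    in between and linear extrapolation beyond the end segments."""
    xs = sorted(ensemble)
    n = len(xs)

    def percentile(p):
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    src = [percentile(p) for p in probs]             # ensemble percentiles
    dst = [NormalDist().inv_cdf(p) for p in probs]   # Gaussian percentiles

    def transform(x):
        j = min(max(bisect_right(src, x) - 1, 0), len(src) - 2)
        width = src[j + 1] - src[j]
        if width == 0:  # tied percentiles: no slope to interpolate along
            return dst[j]
        t = (x - src[j]) / width
        return dst[j] + t * (dst[j + 1] - dst[j])

    return transform


if __name__ == "__main__":
    import math
    # A skewed (lognormal-like) synthetic ensemble of 200 members.
    ens = [math.exp(NormalDist().inv_cdf((i + 0.5) / 200)) for i in range(200)]
    to_gauss = build_anamorphosis(ens)
    print([round(to_gauss(v), 3) for v in (0.3, 1.0, 3.0)])
```

In the setting of the abstract, one such transform would be built for each variable at each grid point, so that observational updates operate on variables that are closer to Gaussian.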
Gaussian and Non-Gaussian operations on non-Gaussian state: engineering non-Gaussianity
Directory of Open Access Journals (Sweden)
Olivares Stefano
2014-03-01
Full Text Available Multiple photon subtraction applied to a displaced phase-averaged coherent state, which is a non-Gaussian classical state, produces conditional states with a nontrivial (positive) Glauber-Sudarshan P-representation. We theoretically and experimentally demonstrate that, despite its simplicity, this class of conditional states cannot be fully characterized by direct detection of photon numbers. In particular, the non-Gaussianity of the state is a characteristic that must be assessed by phase-sensitive measurements. We also show that the non-Gaussianity of conditional states can be manipulated by choosing suitable conditioning values and composition of phase-averaged states.
Molenaar, Dylan; de Boeck, Paul
2018-02-01
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.
Fazli Shahri, Hamid Reza; Mahdavinejad, Ramezanali
2018-02-01
Thermal-based processes with a Gaussian heat source often produce excessive temperatures, which can impose thermally affected layers in specimens. Therefore, the temperature distribution and Heat Affected Zone (HAZ) of materials are two critical factors that are influenced by different process parameters. Measuring the HAZ thickness and temperature distribution within the processes is not only difficult but also expensive. This research aims to gain insight into these factors by predicting the process with a novel combinatory model. In this study, an integrated Artificial Neural Network (ANN) and genetic algorithm (GA) was used to predict the HAZ and temperature distribution of the specimens. To this end, a full factorial series of experiments was first conducted by applying a Gaussian heat flux to Ti-6Al-4V, and the temperature of the specimen was measured by infrared thermography. The HAZ width of each sample was investigated by measuring the microhardness. Secondly, the experimental data were used to create a GA-ANN model. The efficiency of GA in the design and optimization of the architecture of ANN was investigated. The GA was used to determine the optimal number of neurons in the hidden layer, and the learning rate and momentum coefficient of both the output and hidden layers of the ANN. Finally, the reliability of the models was assessed according to the experimental results and statistical indicators. The results demonstrated that the combinatory model predicted the HAZ and temperature more effectively than a trial-and-error ANN model.
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Kontogeorgis, Georgios; Michelsen, Michael Locht
2010-01-01
The Cubic-Plus-Association (CPA) equation of state is applied to a large variety of mixtures containing H2S, which are of interest in the oil and gas industry. Binary H2S mixtures with alkanes, CO2, water, methanol, and glycols are first considered. The interactions of H2S with polar compounds...
Latent Partially Ordered Classification Models and Normal Mixtures
Tatsuoka, Curtis; Varadi, Ferenc; Jaeger, Judith
2013-01-01
Latent partially ordered sets (posets) can be employed in modeling cognitive functioning, such as in the analysis of neuropsychological (NP) and educational test data. Posets are cognitively diagnostic in the sense that classification states in these models are associated with detailed profiles of cognitive functioning. These profiles allow for…
Detecting Housing Submarkets using Unsupervised Learning of Finite Mixture Models
DEFF Research Database (Denmark)
Ntantamis, Christos
ability of the model to identify spatial heterogeneity is validated through a set of simulations. The model was applied to Los Angeles county housing prices data for the year 2002. The results suggest that the statistically identified number of submarkets, after taking into account the dwellings......The problem of modeling housing prices has attracted considerable attention due to its importance in terms of households' wealth and in terms of public revenues through taxation. One of the main concerns raised in both the theoretical and the empirical literature is the existence of spatial...... association between prices that can be attributed, among others, to unobserved neighborhood effects. In this paper, a model of spatial association for housing markets is introduced. Spatial association is treated in the context of spatial heterogeneity, which is explicitly modeled in both a global and a local...
Modelling time course gene expression data with finite mixtures of linear additive models.
Grün, Bettina; Scharl, Theresa; Leisch, Friedrich
2012-01-15
A model class of finite mixtures of linear additive models is presented. The component-specific parameters in the regression models are estimated using regularized likelihood methods. The advantages of the regularization are that (i) the pre-specified maximum degrees of freedom for the splines is less crucial than for unregularized estimation and that (ii) for each component individually a suitable degree of freedom is selected in an automatic way. The performance is evaluated in a simulation study with artificial data as well as on a yeast cell cycle dataset of gene expression levels over time. The latest release version of the R package flexmix is available from CRAN (http://cran.r-project.org/).