Mapping geogenic radon potential by regression kriging
Energy Technology Data Exchange (ETDEWEB)
Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)
2016-02-15
Radon ({sup 222}Rn) gas is produced in the radioactive decay chain of uranium ({sup 238}U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method
Mapping geogenic radon potential by regression kriging
International Nuclear Information System (INIS)
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-01-01
Radon ( 222 Rn) gas is produced in the radioactive decay chain of uranium ( 238 U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method, regression
Semiparametric regression analysis of interval-censored competing risks data.
Mao, Lu; Lin, Dan-Yu; Zeng, Donglin
2017-09-01
Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes. © 2017, The International Biometric Society.
Analyzing Big Data with the Hybrid Interval Regression Methods
Directory of Open Access Journals (Sweden)
Chia-Hui Huang
2014-01-01
Full Text Available Big data is a new trend at present, forcing the significant impacts on information technologies. In big data applications, one of the most concerned issues is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a big challenge. In this paper, we collaborate interval regression with the smooth support vector machine (SSVM to analyze big data. Recently, the smooth support vector machine (SSVM was proposed as an alternative of the standard SVM that has been proved more efficient than the traditional SVM in processing large-scale data. In addition the soft margin method is proposed to modify the excursion of separation margin and to be effective in the gray zone that the distribution of data becomes hard to be described and the separation margin between classes.
Learning Inverse Rig Mappings by Nonlinear Regression.
Holden, Daniel; Saito, Jun; Komura, Taku
2017-03-01
We present a framework to design inverse rig-functions-functions that map low level representations of a character's pose such as joint positions or surface geometry to the representation used by animators called the animation rig. Animators design scenes using an animation rig, a framework widely adopted in animation production which allows animators to design character poses and geometry via intuitive parameters and interfaces. Yet most state-of-the-art computer animation techniques control characters through raw, low level representations such as joint angles, joint positions, or vertex coordinates. This difference often stops the adoption of state-of-the-art techniques in animation production. Our framework solves this issue by learning a mapping between the low level representations of the pose and the animation rig. We use nonlinear regression techniques, learning from example animation sequences designed by the animators. When new motions are provided in the skeleton space, the learned mapping is used to estimate the rig controls that reproduce such a motion. We introduce two nonlinear functions for producing such a mapping: Gaussian process regression and feedforward neural networks. The appropriate solution depends on the nature of the rig and the amount of data available for training. We show our framework applied to various examples including articulated biped characters, quadruped characters, facial animation rigs, and deformable characters. With our system, animators have the freedom to apply any motion synthesis algorithm to arbitrary rigging and animation pipelines for immediate editing. This greatly improves the productivity of 3D animation, while retaining the flexibility and creativity of artistic input.
Lo, Ching F.
1999-01-01
The integration of Radial Basis Function Networks and Back Propagation Neural Networks with the Multiple Linear Regression has been accomplished to map nonlinear response surfaces over a wide range of independent variables in the process of the Modem Design of Experiments. The integrated method is capable to estimate the precision intervals including confidence and predicted intervals. The power of the innovative method has been demonstrated by applying to a set of wind tunnel test data in construction of response surface and estimation of precision interval.
Zero entropy continuous interval maps and MMLS-MMA property
Jiang, Yunping
2018-06-01
We prove that the flow generated by any continuous interval map with zero topological entropy is minimally mean-attractable and minimally mean-L-stable. One of the consequences is that any oscillating sequence is linearly disjoint from all flows generated by all continuous interval maps with zero topological entropy. In particular, the Möbius function is linearly disjoint from all flows generated by all continuous interval maps with zero topological entropy (Sarnak’s conjecture for continuous interval maps). Another consequence is a non-trivial example of a flow having discrete spectrum. We also define a log-uniform oscillating sequence and show a result in ergodic theory for comparison. This material is based upon work supported by the National Science Foundation. It is also partially supported by a collaboration grant from the Simons Foundation (grant number 523341) and PSC-CUNY awards and a grant from NSFC (grant number 11571122).
Voltage interval mappings for an elliptic bursting model
Wojcik, Jeremy; Shilnikov, Andrey
2013-01-01
We employed Poincar\\'e return mappings for a parameter interval to an exemplary elliptic bursting model, the FitzHugh-Nagumo-Rinzel model. Using the interval mappings, we were able to examine in detail the bifurcations that underlie the complex activity transitions between: tonic spiking and bursting, bursting and mixed-mode oscillations, and finally, mixed-mode oscillations and quiescence in the FitzHugh-Nagumo-Rinzel model. We illustrate the wealth of information, qualitative and quantitati...
Bootstrap Prediction Intervals in Non-Parametric Regression with Applications to Anomaly Detection
Kumar, Sricharan; Srivistava, Ashok N.
2012-01-01
Prediction intervals provide a measure of the probable interval in which the outputs of a regression model can be expected to occur. Subsequently, these prediction intervals can be used to determine if the observed output is anomalous or not, conditioned on the input. In this paper, a procedure for determining prediction intervals for outputs of nonparametric regression models using bootstrap methods is proposed. Bootstrap methods allow for a non-parametric approach to computing prediction intervals with no specific assumptions about the sampling distribution of the noise or the data. The asymptotic fidelity of the proposed prediction intervals is theoretically proved. Subsequently, the validity of the bootstrap based prediction intervals is illustrated via simulations. Finally, the bootstrap prediction intervals are applied to the problem of anomaly detection on aviation data.
Iterates of piecewise monotone mappings on an interval
Preston, Chris
1988-01-01
Piecewise monotone mappings on an interval provide simple examples of discrete dynamical systems whose behaviour can be very complicated. These notes are concerned with the properties of the iterates of such mappings. The material presented can be understood by anyone who has had a basic course in (one-dimensional) real analysis. The account concentrates on the topological (as opposed to the measure theoretical) aspects of the theory of piecewise monotone mappings. As well as offering an elementary introduction to this theory, these notes also contain a more advanced treatment of the problem of classifying such mappings up to topological conjugacy.
Coexistence of uniquely ergodic subsystems of interval mapping
International Nuclear Information System (INIS)
Ye Xiangdong.
1991-10-01
The purpose of this paper is to show that uniquely ergodic subsystems of interval mapping also coexist in the same way as minimal sets do. To do this we give some notations in section 2. In section 3 we define D-function of a uniquely ergodic system and show its basic properties. We prove the coexistence of uniquely ergodic subsystems of interval mapping in section 4. Lastly we give the examples of uniquely ergodic systems with given D-functions in section 5. 27 refs
Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.
Hu, Yi-Chung
2014-01-01
On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets.
Confidence intervals for distinguishing ordinal and disordinal interactions in multiple regression.
Lee, Sunbok; Lei, Man-Kit; Brody, Gene H
2015-06-01
Distinguishing between ordinal and disordinal interaction in multiple regression is useful in testing many interesting theoretical hypotheses. Because the distinction is made based on the location of a crossover point of 2 simple regression lines, confidence intervals of the crossover point can be used to distinguish ordinal and disordinal interactions. This study examined 2 factors that need to be considered in constructing confidence intervals of the crossover point: (a) the assumption about the sampling distribution of the crossover point, and (b) the possibility of abnormally wide confidence intervals for the crossover point. A Monte Carlo simulation study was conducted to compare 6 different methods for constructing confidence intervals of the crossover point in terms of the coverage rate, the proportion of true values that fall to the left or right of the confidence intervals, and the average width of the confidence intervals. The methods include the reparameterization, delta, Fieller, basic bootstrap, percentile bootstrap, and bias-corrected accelerated bootstrap methods. The results of our Monte Carlo simulation study suggest that statistical inference using confidence intervals to distinguish ordinal and disordinal interaction requires sample sizes more than 500 to be able to provide sufficiently narrow confidence intervals to identify the location of the crossover point. (c) 2015 APA, all rights reserved).
DEFF Research Database (Denmark)
Carstensen, Bendix
1996-01-01
This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men.......This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men....
Mapping the results of local statistics: Using geographically weighted regression
Directory of Open Access Journals (Sweden)
Stephen A. Matthews
2012-03-01
Full Text Available BACKGROUND The application of geographically weighted regression (GWR - a local spatial statistical technique used to test for spatial nonstationarity - has grown rapidly in the social, health, and demographic sciences. GWR is a useful exploratory analytical tool that generates a set of location-specific parameter estimates which can be mapped and analysed to provide information on spatial nonstationarity in the relationships between predictors and the outcome variable. OBJECTIVE A major challenge to users of GWR methods is how best to present and synthesize the large number of mappable results, specifically the local parameter parameter estimates and local t-values, generated from local GWR models. We offer an elegant solution. METHODS This paper introduces a mapping technique to simultaneously display local parameter estimates and local t-values on one map based on the use of data selection and transparency techniques. We integrate GWR software and GIS software package (ArcGIS and adapt earlier work in cartography on bivariate mapping. We compare traditional mapping strategies (i.e., side-by-side comparison and isoline overlay maps with our method using an illustration focusing on US county infant mortality data. CONCLUSIONS The resultant map design is more elegant than methods used to date. This type of map presentation can facilitate the exploration and interpretation of nonstationarity, focusing map reader attention on the areas of primary interest.
The Initial Regression Statistical Characteristics of Intervals Between Zeros of Random Processes
Directory of Open Access Journals (Sweden)
V. K. Hohlov
2014-01-01
Full Text Available The article substantiates the initial regression statistical characteristics of intervals between zeros of realizing random processes, studies their properties allowing the use these features in the autonomous information systems (AIS of near location (NL. Coefficients of the initial regression (CIR to minimize the residual sum of squares of multiple initial regression views are justified on the basis of vector representations associated with a random vector notion of analyzed signal parameters. It is shown that even with no covariance-based private CIR it is possible to predict one random variable through another with respect to the deterministic components. The paper studies dependences of CIR interval sizes between zeros of the narrowband stationary in wide-sense random process with its energy spectrum. Particular CIR for random processes with Gaussian and rectangular energy spectra are obtained. It is shown that the considered CIRs do not depend on the average frequency of spectra, are determined by the relative bandwidth of the energy spectra, and weakly depend on the type of spectrum. CIR properties enable its use as an informative parameter when implementing temporary regression methods of signal processing, invariant to the average rate and variance of the input implementations. We consider estimates of the average energy spectrum frequency of the random stationary process by calculating the length of the time interval corresponding to the specified number of intervals between zeros. It is shown that the relative variance in estimation of the average energy spectrum frequency of stationary random process with increasing relative bandwidth ceases to depend on the last process implementation in processing above ten intervals between zeros. The obtained results can be used in the AIS NL to solve the tasks of detection and signal recognition, when a decision is made in conditions of unknown mathematical expectations on a limited observation
Chaoticity of interval self-maps with positive entropy
International Nuclear Information System (INIS)
Xiong Jincheng.
1988-12-01
Li and Yorke originally introduced the notion of chaos for continuous self-map of the interval I = (0,1). In the present paper we show that an interval self-map with positive topological entropy has a chaoticity more complicated than the chaoticity in the sense of Li and Yorke. The main result is that if f:I → I is continuous and has a periodic point with odd period > 1 then there exists a closed subset K of I invariant with respect to f such that the periodic points are dense in K, the periods of periodic points in K form an infinite set and f|K is topologically mixing. (author). 9 refs
Recurrence determinism and Li-Yorke chaos for interval maps
Špitalský, Vladimír
2017-01-01
Recurrence determinism, one of the fundamental characteristics of recurrence quantification analysis, measures predictability of a trajectory of a dynamical system. It is tightly connected with the conditional probability that, given a recurrence, following states of the trajectory will be recurrences. In this paper we study recurrence determinism of interval dynamical systems. We show that recurrence determinism distinguishes three main types of $\\omega$-limit sets of zero entropy maps: fini...
Tridimensional Regression for Comparing and Mapping 3D Anatomical Structures
Directory of Open Access Journals (Sweden)
Kendra K. Schmid
2012-01-01
Full Text Available Shape analysis is useful for a wide variety of disciplines and has many applications. There are many approaches to shape analysis, one of which focuses on the analysis of shapes that are represented by the coordinates of predefined landmarks on the object. This paper discusses Tridimensional Regression, a technique that can be used for mapping images and shapes that are represented by sets of three-dimensional landmark coordinates, for comparing and mapping 3D anatomical structures. The degree of similarity between shapes can be quantified using the tridimensional coefficient of determination (2. An experiment was conducted to evaluate the effectiveness of this technique to correctly match the image of a face with another image of the same face. These results were compared to the 2 values obtained when only two dimensions are used and show that using three dimensions increases the ability to correctly match and discriminate between faces.
Landslide Hazard Mapping in Rwanda Using Logistic Regression
Piller, A.; Anderson, E.; Ballard, H.
2015-12-01
Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.
Dynamical zeta functions for piecewise monotone maps of the interval
Ruelle, David
2004-01-01
Consider a space M, a map f:M\\to M, and a function g:M \\to {\\mathbb C}. The formal power series \\zeta (z) = \\exp \\sum ^\\infty _{m=1} \\frac {z^m}{m} \\sum _{x \\in \\mathrm {Fix}\\,f^m} \\prod ^{m-1}_{k=0} g (f^kx) yields an example of a dynamical zeta function. Such functions have unexpected analytic properties and interesting relations to the theory of dynamical systems, statistical mechanics, and the spectral theory of certain operators (transfer operators). The first part of this monograph presents a general introduction to this subject. The second part is a detailed study of the zeta functions associated with piecewise monotone maps of the interval [0,1]. In particular, Ruelle gives a proof of a generalized form of the Baladi-Keller theorem relating the poles of \\zeta (z) and the eigenvalues of the transfer operator. He also proves a theorem expressing the largest eigenvalue of the transfer operator in terms of the ergodic properties of (M,f,g).
Mapping urban environmental noise: a land use regression method.
Xie, Dan; Liu, Yi; Chen, Jining
2011-09-01
Forecasting and preventing urban noise pollution are major challenges in urban environmental management. Most existing efforts, including experiment-based models, statistical models, and noise mapping, however, have limited capacity to explain the association between urban growth and corresponding noise change. Therefore, these conventional methods can hardly forecast urban noise at a given outlook of development layout. This paper, for the first time, introduces a land use regression method, which has been applied for simulating urban air quality for a decade, to construct an urban noise model (LUNOS) in Dalian Municipality, Northwest China. The LUNOS model describes noise as a dependent variable of surrounding various land areas via a regressive function. The results suggest that a linear model performs better in fitting monitoring data, and there is no significant difference of the LUNOS's outputs when applied to different spatial scales. As the LUNOS facilitates a better understanding of the association between land use and urban environmental noise in comparison to conventional methods, it can be regarded as a promising tool for noise prediction for planning purposes and aid smart decision-making.
The best of both worlds: Phylogenetic eigenvector regression and mapping
Directory of Open Access Journals (Sweden)
José Alexandre Felizola Diniz Filho
2015-09-01
Full Text Available Eigenfunction analyses have been widely used to model patterns of autocorrelation in time, space and phylogeny. In a phylogenetic context, Diniz-Filho et al. (1998 proposed what they called Phylogenetic Eigenvector Regression (PVR, in which pairwise phylogenetic distances among species are submitted to a Principal Coordinate Analysis, and eigenvectors are then used as explanatory variables in regression, correlation or ANOVAs. More recently, a new approach called Phylogenetic Eigenvector Mapping (PEM was proposed, with the main advantage of explicitly incorporating a model-based warping in phylogenetic distance in which an Ornstein-Uhlenbeck (O-U process is fitted to data before eigenvector extraction. Here we compared PVR and PEM in respect to estimated phylogenetic signal, correlated evolution under alternative evolutionary models and phylogenetic imputation, using simulated data. Despite similarity between the two approaches, PEM has a slightly higher prediction ability and is more general than the original PVR. Even so, in a conceptual sense, PEM may provide a technique in the best of both worlds, combining the flexibility of data-driven and empirical eigenfunction analyses and the sounding insights provided by evolutionary models well known in comparative analyses.
Microbiome Data Accurately Predicts the Postmortem Interval Using Random Forest Regression Models
Directory of Open Access Journals (Sweden)
Aeriel Belk
2018-02-01
Full Text Available Death investigations often include an effort to establish the postmortem interval (PMI in cases in which the time of death is uncertain. The postmortem interval can lead to the identification of the deceased and the validation of witness statements and suspect alibis. Recent research has demonstrated that microbes provide an accurate clock that starts at death and relies on ecological change in the microbial communities that normally inhabit a body and its surrounding environment. Here, we explore how to build the most robust Random Forest regression models for prediction of PMI by testing models built on different sample types (gravesoil, skin of the torso, skin of the head, gene markers (16S ribosomal RNA (rRNA, 18S rRNA, internal transcribed spacer regions (ITS, and taxonomic levels (sequence variants, species, genus, etc.. We also tested whether particular suites of indicator microbes were informative across different datasets. Generally, results indicate that the most accurate models for predicting PMI were built using gravesoil and skin data using the 16S rRNA genetic marker at the taxonomic level of phyla. Additionally, several phyla consistently contributed highly to model accuracy and may be candidate indicators of PMI.
Flexible regression models for estimating postmortem interval (PMI) in forensic medicine.
Muñoz Barús, José Ignacio; Febrero-Bande, Manuel; Cadarso-Suárez, Carmen
2008-10-30
Correct determination of time of death is an important goal in forensic medicine. Numerous methods have been described for estimating postmortem interval (PMI), but most are imprecise, poorly reproducible and/or have not been validated with real data. In recent years, however, some progress in PMI estimation has been made, notably through the use of new biochemical methods for quantifying relevant indicator compounds in the vitreous humour. The best, but unverified, results have been obtained with [K+] and hypoxanthine [Hx], using simple linear regression (LR) models. The main aim of this paper is to offer more flexible alternatives to LR, such as generalized additive models (GAMs) and support vector machines (SVMs) in order to obtain improved PMI estimates. The present study, based on detailed analysis of [K+] and [Hx] in more than 200 vitreous humour samples from subjects with known PMI, compared classical LR methodology with GAM and SVM methodologies. Both proved better than LR for estimation of PMI. SVM showed somewhat greater precision than GAM, but GAM offers a readily interpretable graphical output, facilitating understanding of findings by legal professionals; there are thus arguments for using both types of models. R code for these methods is available from the authors, permitting accurate prediction of PMI from vitreous humour [K+], [Hx] and [U], with confidence intervals and graphical output provided. Copyright 2008 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Lins, Isis Didier; Droguett, Enrique López; Moura, Márcio das Chagas; Zio, Enrico; Jacinto, Carlos Magno
2015-01-01
Data-driven learning methods for predicting the evolution of the degradation processes affecting equipment are becoming increasingly attractive in reliability and prognostics applications. Among these, we consider here Support Vector Regression (SVR), which has provided promising results in various applications. Nevertheless, the predictions provided by SVR are point estimates whereas in order to take better informed decisions, an uncertainty assessment should be also carried out. For this, we apply bootstrap to SVR so as to obtain confidence and prediction intervals, without having to make any assumption about probability distributions and with good performance even when only a small data set is available. The bootstrapped SVR is first verified on Monte Carlo experiments and then is applied to a real case study concerning the prediction of degradation of a component from the offshore oil industry. The results obtained indicate that the bootstrapped SVR is a promising tool for providing reliable point and interval estimates, which can inform maintenance-related decisions on degrading components. - Highlights: • Bootstrap (pairs/residuals) and SVR are used as an uncertainty analysis framework. • Numerical experiments are performed to assess accuracy and coverage properties. • More bootstrap replications does not significantly improve performance. • Degradation of equipment of offshore oil wells is estimated by bootstrapped SVR. • Estimates about the scale growth rate can support maintenance-related decisions
Gao, Yong-Ming; Wan, Ping
2002-06-01
Screening markers efficiently is the foundation of mapping QTLs by composite interval mapping. Main and interaction markers distinguished, besides using background control for genetic variation, could also be used to construct intervals of two-way searching for mapping QTLs with epistasis, which can save a lot of calculation time. Therefore, the efficiency of marker screening would affect power and precision of QTL mapping. A doubled haploid population with 200 individuals and 5 chromosomes was constructed, with 50 markers evenly distributed at 10 cM space. Among a total of 6 QTLs, one was placed on chromosome I, two linked on chromosome II, and the other three linked on chromosome IV. QTL setting included additive effects and epistatic effects of additive x additive, the corresponding QTL interaction effects were set if data were collected under multiple environments. The heritability was assumed to be 0.5 if no special declaration. The power of marker screening by stepwise regression, forward regression, and three methods for random effect prediction, e.g. best linear unbiased prediction (BLUP), linear unbiased prediction (LUP) and adjusted unbiased prediction (AUP), was studied and compared through 100 Monte Carlo simulations. The results indicated that the marker screening power by stepwise regression at 0.1, 0.05 and 0.01 significant level changed from 2% to 68%, the power changed from 2% to 72% by forward regression. The larger the QTL effects, the higher the marker screening power. While the power of marker screening by three random effect prediction was very low, the maximum was only 13%. That suggested that regression methods were much better than those by using the approaches of random effect prediction to identify efficient markers flanking QTLs, and forward selection method was more simple and efficient. The results of simulation study on heritability showed that heightening of both general heritability and interaction heritability of genotype x
Li, Ming; Peng, Qiang; Xiao, Man
2016-01-01
Fortnightly investigations at 12 sampling sites in Meiliang Bay and Gonghu Bay of Lake Taihu (China) were carried out from June to early November 2010. The relationship between abiotic factors and cell density of different Microcystis species was analyzed using the interval maxima regression (IMR) to determine the optimum temperature and nutrient concentrations for growth of different Microcystis species. Our results showed that cell density of all the Microcystis species increased along with the increase of water temperature, but Microcystis aeruginosa adapted to a wide range of temperatures. The optimum total dissolved nitrogen concentrations for M. aeruginosa, Microcystis wesenbergii, Microcystis ichthyoblabe, and unidentified Microcystis were 3.7, 2.0, 2.4, and 1.9 mg L(-1), respectively. The optimum total dissolved phosphorus concentrations for different species were M. wesenbergii (0.27 mg L(-1)) > M. aeruginosa (0.1 mg L(-1)) > M. ichthyoblabe (0.06 mg L(-1)) ≈ unidentified Microcystis, and the iron (Fe(3+)) concentrations were M. wesenbergii (0.73 mg L(-1)) > M. aeruginosa (0.42 mg L(-1)) > M. ichthyoblabe (0.35 mg L(-1)) > unidentified Microcystis (0.09 mg L(-1)). The above results suggest that if phosphorus concentration was reduced to 0.06 mg L(-1) or/and iron concentration was reduced to 0.35 mg L(-1) in Lake Taihu, the large colonial M. wesenbergii and M. aeruginosa would be replaced by small colonial M. ichthyoblabe and unidentified Microcystis. Thereafter, the intensity and frequency of the occurrence of Microcystis blooms would be reduced by changing Microcystis species composition.
Directory of Open Access Journals (Sweden)
Somaye Yeylaghi
2017-06-01
Full Text Available In this paper, a novel hybrid method based on interval-valued fuzzy neural network for approximate of interval-valued fuzzy regression models, is presented. The work of this paper is an expansion of the research of real fuzzy regression models. In this paper interval-valued fuzzy neural network (IVFNN can be trained with crisp and interval-valued fuzzy data. Here a neural network is considered as a part of a large field called neural computing or soft computing. Moreover, in order to find the approximate parameters, a simple algorithm from the cost function of the fuzzy neural network is proposed. Finally, we illustrate our approach by some numerical examples and compare this method with existing methods.
Complexity of a kind of interval continuous self-map of finite type
International Nuclear Information System (INIS)
Wang Lidong; Chu Zhenyan; Liao Gongfu
2011-01-01
Highlights: → We find the Hausdorff dimension for an interval continuous self-map f of finite type is s element of (0,1) on a non-wandering set. → f| Ω(f) has positive topological entropy. → f| Ω(f) is chaotic such as Devaney chaos, Kato chaos, two point distributional chaos and so on. - Abstract: An interval map is called finitely typal, if the restriction of the map to non-wandering set is topologically conjugate with a subshift of finite type. In this paper, we prove that there exists an interval continuous self-map of finite type such that the Hausdorff dimension is an arbitrary number in the interval (0, 1), discuss various chaotic properties of the map and the relations between chaotic set and the set of recurrent points.
Complexity of a kind of interval continuous self-map of finite type
Energy Technology Data Exchange (ETDEWEB)
Wang Lidong, E-mail: wld@dlnu.edu.cn [Institute of Mathematics, Dalian Nationalities University, Dalian 116600 (China); Institute of Mathematics, Jilin Normal University, Siping 136000 (China); Chu Zhenyan, E-mail: chuzhenyan8@163.com [Institute of Mathematics, Dalian Nationalities University, Dalian 116600 (China) and Institute of Mathematics, Jilin University, Changchun 130023 (China); Liao Gongfu, E-mail: liaogf@email.jlu.edu.cn [Institute of Mathematics, Jilin University, Changchun 130023 (China)
2011-10-15
Highlights: > We find the Hausdorff dimension for an interval continuous self-map f of finite type is s element of (0,1) on a non-wandering set. > f|{sub {Omega}(f)} has positive topological entropy. > f|{sub {Omega}(f)} is chaotic such as Devaney chaos, Kato chaos, two point distributional chaos and so on. - Abstract: An interval map is called finitely typal, if the restriction of the map to non-wandering set is topologically conjugate with a subshift of finite type. In this paper, we prove that there exists an interval continuous self-map of finite type such that the Hausdorff dimension is an arbitrary number in the interval (0, 1), discuss various chaotic properties of the map and the relations between chaotic set and the set of recurrent points.
Sun, Jianguo; Feng, Yanqin; Zhao, Hui
2015-01-01
Interval-censored failure time data occur in many fields including epidemiological and medical studies as well as financial and sociological studies, and many authors have investigated their analysis (Sun, The statistical analysis of interval-censored failure time data, 2006; Zhang, Stat Modeling 9:321-343, 2009). In particular, a number of procedures have been developed for regression analysis of interval-censored data arising from the proportional hazards model (Finkelstein, Biometrics 42:845-854, 1986; Huang, Ann Stat 24:540-568, 1996; Pan, Biometrics 56:199-203, 2000). For most of these procedures, however, one drawback is that they involve estimation of both regression parameters and baseline cumulative hazard function. In this paper, we propose two simple estimation approaches that do not need estimation of the baseline cumulative hazard function. The asymptotic properties of the resulting estimates are given, and an extensive simulation study is conducted and indicates that they work well for practical situations.
Semiparametric regression analysis of failure time data with dependent interval censoring.
Chen, Chyong-Mei; Shen, Pao-Sheng
2017-09-20
Interval-censored failure-time data arise when subjects are examined or observed periodically such that the failure time of interest is not examined exactly but only known to be bracketed between two adjacent observation times. The commonly used approaches assume that the examination times and the failure time are independent or conditionally independent given covariates. In many practical applications, patients who are already in poor health or have a weak immune system before treatment usually tend to visit physicians more often after treatment than those with better health or immune system. In this situation, the visiting rate is positively correlated with the risk of failure due to the health status, which results in dependent interval-censored data. While some measurable factors affecting health status such as age, gender, and physical symptom can be included in the covariates, some health-related latent variables cannot be observed or measured. To deal with dependent interval censoring involving unobserved latent variable, we characterize the visiting/examination process as recurrent event process and propose a joint frailty model to account for the association of the failure time and visiting process. A shared gamma frailty is incorporated into the Cox model and proportional intensity model for the failure time and visiting process, respectively, in a multiplicative way. We propose a semiparametric maximum likelihood approach for estimating model parameters and show the asymptotic properties, including consistency and weak convergence. Extensive simulation studies are conducted and a data set of bladder cancer is analyzed for illustrative purposes. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Wang, Peijie; Zhao, Hui; Sun, Jianguo
2016-12-01
Interval-censored failure time data occur in many fields such as demography, economics, medical research, and reliability and many inference procedures on them have been developed (Sun, 2006; Chen, Sun, and Peace, 2012). However, most of the existing approaches assume that the mechanism that yields interval censoring is independent of the failure time of interest and it is clear that this may not be true in practice (Zhang et al., 2007; Ma, Hu, and Sun, 2015). In this article, we consider regression analysis of case K interval-censored failure time data when the censoring mechanism may be related to the failure time of interest. For the problem, an estimated sieve maximum-likelihood approach is proposed for the data arising from the proportional hazards frailty model and for estimation, a two-step procedure is presented. In the addition, the asymptotic properties of the proposed estimators of regression parameters are established and an extensive simulation study suggests that the method works well. Finally, we apply the method to a set of real interval-censored data that motivated this study. © 2016, The International Biometric Society.
Directory of Open Access Journals (Sweden)
Soyoung Park
2017-07-01
Full Text Available This study mapped and analyzed groundwater potential using two different models, logistic regression (LR and multivariate adaptive regression splines (MARS, and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m3/d were extracted, and 859 locations (70% were used for model training, whereas the other 365 locations (30% were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
DEFF Research Database (Denmark)
Adhikari, Kabindra; Bou Kheir, Rania; Greve, Mette Balslev
2013-01-01
Information on the spatial variability of soil texture including soil clay content in a landscape is very important for agricultural and environmental use. Different prediction techniques are available to assess and map spatial variability of soil properties, but selecting the most suitable techn...... the prediction in OKst compared with that in OK, whereas RT showed the lowest performance of all (R2 = 0.52; RMSE = 0.52; and RPD = 1.17). We found RKrr to be an effective prediction method and recommend this method for any future soil mapping activities in Denmark....... technique at a given site has always been a major issue in all soil mapping applications. We studied the prediction performance of ordinary kriging (OK), stratified OK (OKst), regression trees (RT), and rule-based regression kriging (RKrr) for digital mapping of soil clay content at 30.4-m grid size using 6...
Differential properties and attracting sets of a simplest skew product of interval maps
International Nuclear Information System (INIS)
Efremova, Lyudmila S
2010-01-01
For a skew product of interval maps with a closed set of periodic points, the dependence of the structure of its ω-limit sets on its differential properties is investigated. An example of a map in this class is constructed which has the maximal differentiability properties (within a certain subclass) with respect to the variable x, is C 1 -smooth in the y-variable and has one-dimensional ω-limit sets. Theorems are proved that give necessary conditions for one-dimensional ω-limit sets to exist. One of them is formulated in terms of the divergence of the series consisting of the values of a function of x; this function is the C 0 -norm of the deviation of the restrictions of the fibre maps to some nondegenerate closed interval from the identity on the same interval. Another theorem is formulated in terms of the properties of the partial derivative with respect to x of the fibre maps. A complete description is given of the ω-limit sets of certain class of C 1 -smooth skew products satisfying some natural conditions. Bibliography: 33 titles.
Directory of Open Access Journals (Sweden)
Ebrahim Karimi Sangchini
2015-01-01
Full Text Available Landslides are amongst the most damaging natural hazards in mountainous regions. Every year, hundreds of people all over the world lose their lives in landslides; furthermore, there are large impacts on the local and global economy from these events. In this study, landslide hazard zonation in Babaheydar watershed using logistic regression was conducted to determine landslide hazard areas. At first, the landslide inventory map was prepared using aerial photograph interpretations and field surveys. The next step, ten landslide conditioning factors such as altitude, slope percentage, slope aspect, lithology, distance from faults, rivers, settlement and roads, land use, and precipitation were chosen as effective factors on landsliding in the study area. Subsequently, landslide susceptibility map was constructed using the logistic regression model in Geographic Information System (GIS. The ROC and Pseudo-R2 indexes were used for model assessment. Results showed that the logistic regression model provided slightly high prediction accuracy of landslide susceptibility maps in the Babaheydar Watershed with ROC equal to 0.876. Furthermore, the results revealed that about 44% of the watershed areas were located in high and very high hazard classes. The resultant landslide susceptibility maps can be useful in appropriate watershed management practices and for sustainable development in the region.
Sandberg, D.T.
1986-01-01
This thickness map of a Late Cretaceous interval in the northwestern part of the San Juan Basin is part of a study of the relationship between ancient shore 1ines and coal-forming swamps during the filial regression of the Cretaceous epicontinental sea. The top of the thickness interval is the top of the Pictured Cliffs Sands tone. The base of the interval is a thin time marker, the Huerfanito Bentonite Bed of the Lewis Shale. The interval includes all of the Pictured Cliffs Sandstone and the upper part of the Lewis Shale. The northwest boundary of the map area is the outcrop of the Pictured Cliffs Sandstone and the Lewis Shale.
Single Image Super-Resolution Using Global Regression Based on Multiple Local Linear Mappings.
Choi, Jae-Seok; Kim, Munchurl
2017-03-01
Super-resolution (SR) has become more vital, because of its capability to generate high-quality ultra-high definition (UHD) high-resolution (HR) images from low-resolution (LR) input images. Conventional SR methods entail high computational complexity, which makes them difficult to be implemented for up-scaling of full-high-definition input images into UHD-resolution images. Nevertheless, our previous super-interpolation (SI) method showed a good compromise between Peak-Signal-to-Noise Ratio (PSNR) performances and computational complexity. However, since SI only utilizes simple linear mappings, it may fail to precisely reconstruct HR patches with complex texture. In this paper, we present a novel SR method, which inherits the large-to-small patch conversion scheme from SI but uses global regression based on local linear mappings (GLM). Thus, our new SR method is called GLM-SI. In GLM-SI, each LR input patch is divided into 25 overlapped subpatches. Next, based on the local properties of these subpatches, 25 different local linear mappings are applied to the current LR input patch to generate 25 HR patch candidates, which are then regressed into one final HR patch using a global regressor. The local linear mappings are learned cluster-wise in our off-line training phase. The main contribution of this paper is as follows: Previously, linear-mapping-based conventional SR methods, including SI only used one simple yet coarse linear mapping to each patch to reconstruct its HR version. On the contrary, for each LR input patch, our GLM-SI is the first to apply a combination of multiple local linear mappings, where each local linear mapping is found according to local properties of the current LR patch. Therefore, it can better approximate nonlinear LR-to-HR mappings for HR patches with complex texture. Experiment results show that the proposed GLM-SI method outperforms most of the state-of-the-art methods, and shows comparable PSNR performance with much lower
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.
Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam
2017-11-01
The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). This modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
Monopole and dipole estimation for multi-frequency sky maps by linear regression
Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.
2017-01-01
We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.
Francq, Bernard G; Govaerts, Bernadette
2016-06-30
Two main methodologies for assessing equivalence in method-comparison studies are presented separately in the literature. The first one is the well-known and widely applied Bland-Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors-in-variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated-errors-in-variables regression is introduced as the errors are shown to be correlated in the Bland-Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland-Atman plot with excellent coverage probabilities. We conclude that the (correlated)-errors-in-variables regressions should not be avoided in method comparison studies, although the Bland-Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method comparison studies. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Farid Djeddaoui
2017-10-01
Full Text Available The main goal of this work was to identify the areas that are most susceptible to desertification in a part of the Algerian steppe, and to quantitatively assess the key factors that contribute to this desertification. In total, 139 desertified zones were mapped using field surveys and photo-interpretation. We selected 16 spectral and geomorphic predictive factors, which a priori play a significant role in desertification. They were mainly derived from Landsat 8 imagery and Shuttle Radar Topographic Mission digital elevation model (SRTM DEM. Some factors, such as the topographic position index (TPI and curvature, were used for the first time in this kind of study. For this purpose, we adapted the logistic regression algorithm for desertification susceptibility mapping, which has been widely used for landslide susceptibility mapping. The logistic model was evaluated using the area under the receiver operating characteristic (ROC curve. The model accuracy was 87.8%. We estimated the model uncertainties using a bootstrap method. Our analysis suggests that the predictive model is robust and stable. Our results indicate that land cover factors, including normalized difference vegetation index (NDVI and rangeland classes, play a major role in determining desertification occurrence, while geomorphological factors have a limited impact. The predictive map shows that 44.57% of the area is classified as highly to very highly susceptible to desertification. The developed approach can be used to assess desertification in areas with similar characteristics and to guide possible actions to combat desertification.
On Bayesian shared component disease mapping and ecological regression with errors in covariates.
MacNab, Ying C
2010-05-20
Recent literature on Bayesian disease mapping presents shared component models (SCMs) for joint spatial modeling of two or more diseases with common risk factors. In this study, Bayesian hierarchical formulations of shared component disease mapping and ecological models are explored and developed in the context of ecological regression, taking into consideration errors in covariates. A review of multivariate disease mapping models (MultiVMs) such as the multivariate conditional autoregressive models that are also part of the more recent Bayesian disease mapping literature is presented. Some insights into the connections and distinctions between the SCM and MultiVM procedures are communicated. Important issues surrounding (appropriate) formulation of shared- and disease-specific components, consideration/choice of spatial or non-spatial random effects priors, and identification of model parameters in SCMs are explored and discussed in the context of spatial and ecological analysis of small area multivariate disease or health outcome rates and associated ecological risk factors. The methods are illustrated through an in-depth analysis of four-variate road traffic accident injury (RTAI) data: gender-specific fatal and non-fatal RTAI rates in 84 local health areas in British Columbia (Canada). Fully Bayesian inference via Markov chain Monte Carlo simulations is presented. Copyright 2010 John Wiley & Sons, Ltd.
GIS-based rare events logistic regression for mineral prospectivity mapping
Xiong, Yihui; Zuo, Renguang
2018-02-01
Mineralization is a special type of singularity event, and can be considered as a rare event, because within a specific study area the number of prospective locations (1s) are considerably fewer than the number of non-prospective locations (0s). In this study, GIS-based rare events logistic regression (RELR) was used to map the mineral prospectivity in the southwestern Fujian Province, China. An odds ratio was used to measure the relative importance of the evidence variables with respect to mineralization. The results suggest that formations, granites, and skarn alterations, followed by faults and aeromagnetic anomaly are the most important indicators for the formation of Fe-related mineralization in the study area. The prediction rate and the area under the curve (AUC) values show that areas with higher probability have a strong spatial relationship with the known mineral deposits. Comparing the results with original logistic regression (OLR) demonstrates that the GIS-based RELR performs better than OLR. The prospectivity map obtained in this study benefits the search for skarn Fe-related mineralization in the study area.
Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh
2015-12-01
Binary phenotypes commonly arise due to multiple underlying quantitative precursors and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al. []), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with the genotype-level test MultiPhen's. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE
Subpixel Snow Cover Mapping from MODIS Data by Nonparametric Regression Splines
Akyurek, Z.; Kuter, S.; Weber, G. W.
2016-12-01
Spatial extent of snow cover is often considered as one of the key parameters in climatological, hydrological and ecological modeling due to its energy storage, high reflectance in the visible and NIR regions of the electromagnetic spectrum, significant heat capacity and insulating properties. A significant challenge in snow mapping by remote sensing (RS) is the trade-off between the temporal and spatial resolution of satellite imageries. In order to tackle this issue, machine learning-based subpixel snow mapping methods, like Artificial Neural Networks (ANNs), from low or moderate resolution images have been proposed. Multivariate Adaptive Regression Splines (MARS) is a nonparametric regression tool that can build flexible models for high dimensional and complex nonlinear data. Although MARS is not often employed in RS, it has various successful implementations such as estimation of vertical total electron content in ionosphere, atmospheric correction and classification of satellite images. This study is the first attempt in RS to evaluate the applicability of MARS for subpixel snow cover mapping from MODIS data. Total 16 MODIS-Landsat ETM+ image pairs taken over European Alps between March 2000 and April 2003 were used in the study. MODIS top-of-atmospheric reflectance, NDSI, NDVI and land cover classes were used as predictor variables. Cloud-covered, cloud shadow, water and bad-quality pixels were excluded from further analysis by a spatial mask. MARS models were trained and validated by using reference fractional snow cover (FSC) maps generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also developed. The mutual comparison of obtained MARS and ANN models was accomplished on independent test areas. The MARS model performed better than the ANN model with an average RMSE of 0.1288 over the independent test areas; whereas the average RMSE of the ANN model
Landslide susceptibility mapping on a global scale using the method of logistic regression
Directory of Open Access Journals (Sweden)
L. Lin
2017-08-01
Full Text Available This paper proposes a statistical model for mapping global landslide susceptibility based on logistic regression. After investigating explanatory factors for landslides in the existing literature, five factors were selected for model landslide susceptibility: relative relief, extreme precipitation, lithology, ground motion and soil moisture. When building the model, 70 % of landslide and nonlandslide points were randomly selected for logistic regression, and the others were used for model validation. To evaluate the accuracy of predictive models, this paper adopts several criteria including a receiver operating characteristic (ROC curve method. Logistic regression experiments found all five factors to be significant in explaining landslide occurrence on a global scale. During the modeling process, percentage correct in confusion matrix of landslide classification was approximately 80 % and the area under the curve (AUC was nearly 0.87. During the validation process, the above statistics were about 81 % and 0.88, respectively. Such a result indicates that the model has strong robustness and stable performance. This model found that at a global scale, soil moisture can be dominant in the occurrence of landslides and topographic factor may be secondary.
Directory of Open Access Journals (Sweden)
Mohsen Omidvar
2015-12-01
Full Text Available Background & objective: Due to the features such as intuitive graphical appearance, ease of perception and straightforward applicability, risk matrix has become as one of the most used risk assessment tools. On the other hand, features such as the lack of precision in the classification of risk index, as well as subjective computational process, has limited its use. In order to solve this problem, in the current study we used fuzzy logic inference systems and mathematical operators (interval numbers and mapping operator. Methods: In this study, first 10 risk scenarios in the excavation and piping process were selected, then the outcome of the risk assessment were studied using four types of matrix including traditional (ORM, displaced cells (RCM , extended (ERM and fuzzy (FRM risk matrixes. Results: The results showed that the use of FRM and ERM matrix have prority, due to the high level of " Risk Tie Density" (RTD and "Risk Level Density" (RLD in the ORM and RCM matrix, as well as more accurate results presented in FRM and ERM, in risk assessment. While, FRM matrix provides more reliable results due to the application of fuzzy membership functions. Conclusion: Using new mathematical issues such as fuzzy sets and arithmetic and mapping operators for risk assessment could improve the accuracy of risk matrix and increase the reliability of the risk assessment results, when the accurate data are not available, or its data are avaliable in a limit range.
Kuiper, Gerhardus J A J M; Houben, Rik; Wetzels, Rick J H; Verhezen, Paul W M; Oerle, Rene van; Ten Cate, Hugo; Henskens, Yvonne M C; Lancé, Marcus D
2017-11-01
Low platelet counts and hematocrit levels hinder whole blood point-of-care testing of platelet function. Thus far, no reference ranges for MEA (multiple electrode aggregometry) and PFA-100 (platelet function analyzer 100) devices exist for low ranges. Through dilution methods of volunteer whole blood, platelet function at low ranges of platelet count and hematocrit levels was assessed on MEA for four agonists and for PFA-100 in two cartridges. Using (multiple) regression analysis, 95% reference intervals were computed for these low ranges. Low platelet counts affected MEA in a positive correlation (all agonists showed r 2 ≥ 0.75) and PFA-100 in an inverse correlation (closure times were prolonged with lower platelet counts). Lowered hematocrit did not affect MEA testing, except for arachidonic acid activation (ASPI), which showed a weak positive correlation (r 2 = 0.14). Closure time on PFA-100 testing was inversely correlated with hematocrit for both cartridges. Regression analysis revealed different 95% reference intervals in comparison with originally established intervals for both MEA and PFA-100 in low platelet or hematocrit conditions. Multiple regression analysis of ASPI and both tests on the PFA-100 for combined low platelet and hematocrit conditions revealed that only PFA-100 testing should be adjusted for both thrombocytopenia and anemia. 95% reference intervals were calculated using multiple regression analysis. However, coefficients of determination of PFA-100 were poor, and some variance remained unexplained. Thus, in this pilot study using (multiple) regression analysis, we could establish reference intervals of platelet function in anemia and thrombocytopenia conditions on PFA-100 and in thrombocytopenia conditions on MEA.
Towards molecular design using 2D-molecular contour maps obtained from PLS regression coefficients
Borges, Cleber N.; Barigye, Stephen J.; Freitas, Matheus P.
2017-12-01
The multivariate image analysis descriptors used in quantitative structure-activity relationships are direct representations of chemical structures as they are simply numerical decodifications of pixels forming the 2D chemical images. These MDs have found great utility in the modeling of diverse properties of organic molecules. Given the multicollinearity and high dimensionality of the data matrices generated with the MIA-QSAR approach, modeling techniques that involve the projection of the data space onto orthogonal components e.g. Partial Least Squares (PLS) have been generally used. However, the chemical interpretation of the PLS-based MIA-QSAR models, in terms of the structural moieties affecting the modeled bioactivity has not been straightforward. This work describes the 2D-contour maps based on the PLS regression coefficients, as a means of assessing the relevance of single MIA predictors to the response variable, and thus allowing for the structural, electronic and physicochemical interpretation of the MIA-QSAR models. A sample study to demonstrate the utility of the 2D-contour maps to design novel drug-like molecules is performed using a dataset of some anti-HIV-1 2-amino-6-arylsulfonylbenzonitriles and derivatives, and the inferences obtained are consistent with other reports in the literature. In addition, the different schemes for encoding atomic properties in molecules are discussed and evaluated.
Lambert, Ronald J W; Mytilinaios, Ioannis; Maitland, Luke; Brown, Angus M
2012-08-01
This study describes a method to obtain parameter confidence intervals from the fitting of non-linear functions to experimental data, using the SOLVER and Analysis ToolPaK Add-In of the Microsoft Excel spreadsheet. Previously we have shown that Excel can fit complex multiple functions to biological data, obtaining values equivalent to those returned by more specialized statistical or mathematical software. However, a disadvantage of using the Excel method was the inability to return confidence intervals for the computed parameters or the correlations between them. Using a simple Monte-Carlo procedure within the Excel spreadsheet (without recourse to programming), SOLVER can provide parameter estimates (up to 200 at a time) for multiple 'virtual' data sets, from which the required confidence intervals and correlation coefficients can be obtained. The general utility of the method is exemplified by applying it to the analysis of the growth of Listeria monocytogenes, the growth inhibition of Pseudomonas aeruginosa by chlorhexidine and the further analysis of the electrophysiological data from the compound action potential of the rodent optic nerve. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps
International Nuclear Information System (INIS)
Thimmisetty, Charanraj A.; Ghanem, Roger G.; White, Joshua A.; Chen, Xiao
2017-01-01
This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recast this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring non-linear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance metric function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.
Jović, Ozren; Smrečki, Neven; Popović, Zora
2016-04-01
A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is performed on six data sets of FTIR, two data sets of UV-vis and one data set of DSC. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperfom models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperfomed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for poil, a well known health beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEPoil (R(2)>0.99). Copyright © 2015 Elsevier B.V. All rights reserved.
Vogel, L.; Sihler, H.; Lampel, J.; Wagner, T.; Platt, U.
2012-06-01
Remote sensing via differential optical absorption spectroscopy (DOAS) has become a standard technique to identify and quantify trace gases in the atmosphere. The technique is applied in a variety of configurations, commonly classified into active and passive instruments using artificial and natural light sources, respectively. Platforms range from ground based to satellite instruments and trace-gases are studied in all kinds of different environments. Due to the wide range of measurement conditions, atmospheric compositions and instruments used, a specific challenge of a DOAS retrieval is to optimize the parameters for each specific case and particular trace gas of interest. This becomes especially important when measuring close to the detection limit. A well chosen evaluation wavelength range is crucial to the DOAS technique. It should encompass strong absorption bands of the trace gas of interest in order to maximize the sensitivity of the retrieval, while at the same time minimizing absorption structures of other trace gases and thus potential interferences. Also, instrumental limitations and wavelength depending sources of errors (e.g. insufficient corrections for the Ring effect and cross correlations between trace gas cross sections) need to be taken into account. Most often, not all of these requirements can be fulfilled simultaneously and a compromise needs to be found depending on the conditions at hand. Although for many trace gases the overall dependence of common DOAS retrieval on the evaluation wavelength interval is known, a systematic approach to find the optimal retrieval wavelength range and qualitative assessment is missing. Here we present a novel tool to determine the optimal evaluation wavelength range. It is based on mapping retrieved values in the retrieval wavelength space and thus visualize the consequence of different choices of retrieval spectral ranges, e.g. caused by slightly erroneous absorption cross sections, cross correlations and
Andrew T. Hudak; Nicholas L. Crookston; Jeffrey S. Evans; Michael K. Falkowski; Alistair M. S. Smith; Paul E. Gessler; Penelope Morgan
2006-01-01
We compared the utility of discrete-return light detection and ranging (lidar) data and multispectral satellite imagery, and their integration, for modeling and mapping basal area and tree density across two diverse coniferous forest landscapes in north-central Idaho. We applied multiple linear regression models subset from a suite of 26 predictor variables derived...
Moghimbeigi, Abbas
2015-05-07
Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regression may be useful for QTL mapping of count traits. Added genetic variables to the negative binomial part equation, may also affect extra zero data. In this study, to overcome these challenges, I apply two-part ZINB model. The EM algorithm with Newton-Raphson method in the M-step uses for estimating parameters. An application of the two-part ZINB model for QTL mapping is considered to detect associations between the formation of gallstone and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
Langella, Giuliano; Basile, Angelo; Bonfante, Antonello; Manna, Piero; Terribile, Fabio
2013-04-01
Digital soil mapping procedures are widespread used to build two-dimensional continuous maps about several pedological attributes. Our work addressed a regression kriging (RK) technique and a bootstrapped artificial neural network approach in order to evaluate and compare (i) the accuracy of prediction, (ii) the susceptibility of being included in automatic engines (e.g. to constitute web processing services), and (iii) the time cost needed for calibrating models and for making predictions. Regression kriging is maybe the most widely used geostatistical technique in the digital soil mapping literature. Here we tried to apply the EBLUP regression kriging as it is deemed to be the most statistically sound RK flavor by pedometricians. An unusual multi-parametric and nonlinear machine learning approach was accomplished, called BAGAP (Bootstrap aggregating Artificial neural networks with Genetic Algorithms and Principal component regression). BAGAP combines a selected set of weighted neural nets having specified characteristics to yield an ensemble response. The purpose of applying these two particular models is to ascertain whether and how much a more cumbersome machine learning method could be much promising in making more accurate/precise predictions. Being aware of the difficulty to handle objects based on EBLUP-RK as well as BAGAP when they are embedded in environmental applications, we explore the susceptibility of them in being wrapped within Web Processing Services. Two further kinds of aspects are faced for an exhaustive evaluation and comparison: automaticity and time of calculation with/without high performance computing leverage.
Yilmaz, Isik; Keskin, Inan; Marschalko, Marian; Bednarik, Martin
2010-05-01
This study compares the GIS based collapse susceptibility mapping methods such as; conditional probability (CP), logistic regression (LR) and artificial neural networks (ANN) applied in gypsum rock masses in Sivas basin (Turkey). Digital Elevation Model (DEM) was first constructed using GIS software. Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index- TWI, stream power index- SPI, Normalized Difference Vegetation Index (NDVI) by means of vegetation cover, distance from roads and settlements were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from CP, LR and ANN models, and they were then compared by means of their validations. Area Under Curve (AUC) values obtained from all three methodologies showed that the map obtained from ANN model looks like more accurate than the other models, and the results also showed that the artificial neural networks is a usefull tool in preparation of collapse susceptibility map and highly compatible with GIS operating features. Key words: Collapse; doline; susceptibility map; gypsum; GIS; conditional probability; logistic regression; artificial neural networks.
Directory of Open Access Journals (Sweden)
Robinson Nicholas
2007-04-01
Full Text Available Abstract This study represents the first attempt at an empirical evaluation of the DNA pooling methodology by comparing it to individual genotyping and interval mapping to detect QTL in a dairy half-sib design. The findings indicated that the use of peak heights from the pool electropherograms without correction for stutter (shadow product and preferential amplification performed as well as corrected estimates of frequencies. However, errors were found to decrease the power of the experiment at every stage of the pooling and analysis. The main sources of errors include technical errors from DNA quantification, pool construction, inconsistent differential amplification, and from the prevalence of sire alleles in the dams. Additionally, interval mapping using individual genotyping gains information from phenotypic differences between individuals in the same pool and from neighbouring markers, which is lost in a DNA pooling design. These errors cause some differences between the markers detected as significant by pooling and those found significant by interval mapping based on individual selective genotyping. Therefore, it is recommended that pooled genotyping only be used as part of an initial screen with significant results to be confirmed by individual genotyping. Strategies for improving the efficiency of the DNA pooling design are also presented.
International Nuclear Information System (INIS)
Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios, Martha; Baechler, Sébastien
2015-01-01
Purpose: According to estimations around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). Results: The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional difference of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the influence of a variable evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables
Translation of Bernstein Coefficients Under an Affine Mapping of the Unit Interval
Alford, John A., II
2012-01-01
We derive an expression connecting the coefficients of a polynomial expanded in the Bernstein basis to the coefficients of an equivalent expansion of the polynomial under an affine mapping of the domain. The expression may be useful in the calculation of bounds for multi-variate polynomials.
High Resolution of Quantitative Traits Into Multiple Loci via Interval Mapping
Jansen, Ritsert C.; Stam, Piet
1994-01-01
A very general method is described for multiple linear regression of a quantitative phenotype on genotype [putative quantitative trait loci (QTLs) and markers] in segregating generations obtained from line crosses. The method exploits two features, (a) the use of additional parental and F1 data, which fixes the joint QTL effects and the environmental error, and (b) the use of markers as cofactors, which reduces the genetic background noise. As a result, a significant increase of QTL detection...
Demir, Gökhan; aytekin, mustafa; banu ikizler, sabriye; angın, zekai
2013-04-01
The North Anatolian Fault is know as one of the most active and destructive fault zone which produced many earthquakes with high magnitudes. Along this fault zone, the morphology and the lithological features are prone to landsliding. However, many earthquake induced landslides were recorded by several studies along this fault zone, and these landslides caused both injuiries and live losts. Therefore, a detailed landslide susceptibility assessment for this area is indispancable. In this context, a landslide susceptibility assessment for the 1445 km2 area in the Kelkit River valley a part of North Anatolian Fault zone (Eastern Black Sea region of Turkey) was intended with this study, and the results of this study are summarized here. For this purpose, geographical information system (GIS) and a bivariate statistical model were used. Initially, Landslide inventory maps are prepared by using landslide data determined by field surveys and landslide data taken from General Directorate of Mineral Research and Exploration. The landslide conditioning factors are considered to be lithology, slope gradient, slope aspect, topographical elevation, distance to streams, distance to roads and distance to faults, drainage density and fault density. ArcGIS package was used to manipulate and analyze all the collected data Logistic regression method was applied to create a landslide susceptibility map. Landslide susceptibility maps were divided into five susceptibility regions such as very low, low, moderate, high and very high. The result of the analysis was verified using the inventoried landslide locations and compared with the produced probability model. For this purpose, Area Under Curvature (AUC) approach was applied, and a AUC value was obtained. Based on this AUC value, the obtained landslide susceptibility map was concluded as satisfactory. Keywords: North Anatolian Fault Zone, Landslide susceptibility map, Geographical Information Systems, Logistic Regression Analysis.
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2017-12-01
Full Text Available We evaluated the performance of nine machine learning regression algorithms and their ensembles for sub-pixel estimation of impervious areas coverages from Landsat imagery. The accuracy of imperviousness mapping in individual time points was assessed based on RMSE, MAE and R2. These measures were also used for the assessment of imperviousness change intensity estimations. The applicability for detection of relevant changes in impervious areas coverages at sub-pixel level was evaluated using overall accuracy, F-measure and ROC Area Under Curve. The results proved that Cubist algorithm may be advised for Landsat-based mapping of imperviousness for single dates. Stochastic gradient boosting of regression trees (GBM may be also considered for this purpose. However, Random Forest algorithm is endorsed for both imperviousness change detection and mapping of its intensity. In all applications the heterogeneous model ensembles performed at least as well as the best individual models or better. They may be recommended for improving the quality of sub-pixel imperviousness and imperviousness change mapping. The study revealed also limitations of the investigated methodology for detection of subtle changes of imperviousness inside the pixel. None of the tested approaches was able to reliably classify changed and non-changed pixels if the relevant change threshold was set as one or three percent. Also for fi ve percent change threshold most of algorithms did not ensure that the accuracy of change map is higher than the accuracy of random classifi er. For the threshold of relevant change set as ten percent all approaches performed satisfactory.
Drzewiecki, Wojciech
2017-12-01
We evaluated the performance of nine machine learning regression algorithms and their ensembles for sub-pixel estimation of impervious areas coverages from Landsat imagery. The accuracy of imperviousness mapping in individual time points was assessed based on RMSE, MAE and R2. These measures were also used for the assessment of imperviousness change intensity estimations. The applicability for detection of relevant changes in impervious areas coverages at sub-pixel level was evaluated using overall accuracy, F-measure and ROC Area Under Curve. The results proved that Cubist algorithm may be advised for Landsat-based mapping of imperviousness for single dates. Stochastic gradient boosting of regression trees (GBM) may be also considered for this purpose. However, Random Forest algorithm is endorsed for both imperviousness change detection and mapping of its intensity. In all applications the heterogeneous model ensembles performed at least as well as the best individual models or better. They may be recommended for improving the quality of sub-pixel imperviousness and imperviousness change mapping. The study revealed also limitations of the investigated methodology for detection of subtle changes of imperviousness inside the pixel. None of the tested approaches was able to reliably classify changed and non-changed pixels if the relevant change threshold was set as one or three percent. Also for fi ve percent change threshold most of algorithms did not ensure that the accuracy of change map is higher than the accuracy of random classifi er. For the threshold of relevant change set as ten percent all approaches performed satisfactory.
Krepper, Gabriela; Romeo, Florencia; Fernandes, David Douglas de Sousa; Diniz, Paulo Henrique Gonçalves Dias; de Araújo, Mário César Ugulino; Di Nezio, María Susana; Pistonesi, Marcelo Fabián; Centurión, María Eugenia
2018-01-01
Determining fat content in hamburgers is very important to minimize or control the negative effects of fat on human health, effects such as cardiovascular diseases and obesity, which are caused by the high consumption of saturated fatty acids and cholesterol. This study proposed an alternative analytical method based on Near Infrared Spectroscopy (NIR) and Successive Projections Algorithm for interval selection in Partial Least Squares regression (iSPA-PLS) for fat content determination in commercial chicken hamburgers. For this, 70 hamburger samples with a fat content ranging from 14.27 to 32.12 mg kg- 1 were prepared based on the upper limit recommended by the Argentinean Food Codex, which is 20% (w w- 1). NIR spectra were then recorded and then preprocessed by applying different approaches: base line correction, SNV, MSC, and Savitzky-Golay smoothing. For comparison, full-spectrum PLS and the Interval PLS are also used. The best performance for the prediction set was obtained for the first derivative Savitzky-Golay smoothing with a second-order polynomial and window size of 19 points, achieving a coefficient of correlation of 0.94, RMSEP of 1.59 mg kg- 1, REP of 7.69% and RPD of 3.02. The proposed methodology represents an excellent alternative to the conventional Soxhlet extraction method, since waste generation is avoided, yet without the use of either chemical reagents or solvents, which follows the primary principles of Green Chemistry. The new method was successfully applied to chicken hamburger analysis, and the results agreed with those with reference values at a 95% confidence level, making it very attractive for routine analysis.
Fine-mapping and initial characterization of QT interval loci in African Americans.
Directory of Open Access Journals (Sweden)
Christy L Avery
Full Text Available The QT interval (QT is heritable and its prolongation is a risk factor for ventricular tachyarrhythmias and sudden death. Most genetic studies of QT have examined European ancestral populations; however, the increased genetic diversity in African Americans provides opportunities to narrow association signals and identify population-specific variants. We therefore evaluated 6,670 SNPs spanning eleven previously identified QT loci in 8,644 African American participants from two Population Architecture using Genomics and Epidemiology (PAGE studies: the Atherosclerosis Risk in Communities study and Women's Health Initiative Clinical Trial. Of the fifteen known independent QT variants at the eleven previously identified loci, six were significantly associated with QT in African American populations (P≤1.20×10(-4: ATP1B1, PLN1, KCNQ1, NDRG4, and two NOS1AP independent signals. We also identified three population-specific signals significantly associated with QT in African Americans (P≤1.37×10(-5: one at NOS1AP and two at ATP1B1. Linkage disequilibrium (LD patterns in African Americans assisted in narrowing the region likely to contain the functional variants for several loci. For example, African American LD patterns showed that 0 SNPs were in LD with NOS1AP signal rs12143842, compared with European LD patterns that indicated 87 SNPs, which spanned 114.2 Kb, were in LD with rs12143842. Finally, bioinformatic-based characterization of the nine African American signals pointed to functional candidates located exclusively within non-coding regions, including predicted binding sites for transcription factors such as TBX5, which has been implicated in cardiac structure and conductance. In this detailed evaluation of QT loci, we identified several African Americans SNPs that better define the association with QT and successfully narrowed intervals surrounding established loci. These results demonstrate that the same loci influence variation in QT
Directory of Open Access Journals (Sweden)
Wang Xiaoqiang
2012-04-01
Full Text Available Abstract Background Quantitative trait loci (QTL detection on a huge amount of phenotypes, like eQTL detection on transcriptomic data, can be dramatically impaired by the statistical properties of interval mapping methods. One of these major outcomes is the high number of QTL detected at marker locations. The present study aims at identifying and specifying the sources of this bias, in particular in the case of analysis of data issued from outbred populations. Analytical developments were carried out in a backcross situation in order to specify the bias and to propose an algorithm to control it. The outbred population context was studied through simulated data sets in a wide range of situations. The likelihood ratio test was firstly analyzed under the "one QTL" hypothesis in a backcross population. Designs of sib families were then simulated and analyzed using the QTL Map software. On the basis of the theoretical results in backcross, parameters such as the population size, the density of the genetic map, the QTL effect and the true location of the QTL, were taken into account under the "no QTL" and the "one QTL" hypotheses. A combination of two non parametric tests - the Kolmogorov-Smirnov test and the Mann-Whitney-Wilcoxon test - was used in order to identify the parameters that affected the bias and to specify how much they influenced the estimation of QTL location. Results A theoretical expression of the bias of the estimated QTL location was obtained for a backcross type population. We demonstrated a common source of bias under the "no QTL" and the "one QTL" hypotheses and qualified the possible influence of several parameters. Simulation studies confirmed that the bias exists in outbred populations under both the hypotheses of "no QTL" and "one QTL" on a linkage group. The QTL location was systematically closer to marker locations than expected, particularly in the case of low QTL effect, small population size or low density of markers, i
International Nuclear Information System (INIS)
Althuwaynee, Omar F; Pradhan, Biswajeet; Ahmad, Noordin
2014-01-01
This article uses methodology based on chi-squared automatic interaction detection (CHAID), as a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict rainfall-induced susceptibility map in Kuala Lumpur city and surrounding areas using geographic information system (GIS). The main objective of this article is to use CHi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, then, combining it with logistic regression (LR). LR model was used to find the corresponding coefficients of best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in previous study using nearest neighbor index (NNI), which were then used to identify the clustered landslide locations range. Clustered locations were used as model training data with 14 landslide conditioning factors such as; topographic derived parameters, lithology, NDVI, land use and land cover maps. Pearson chi-squared value was used to find the best classification fit between the dependent variable and conditioning factors. Finally the relationship between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations respectively. This study proved the efficiency and reliability of decision tree (DT) model in landslide susceptibility mapping. Also it provided a valuable scientific basis for spatial decision making in planning and urban management studies
Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin
2014-06-01
This article uses methodology based on chi-squared automatic interaction detection (CHAID), as a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict rainfall-induced susceptibility map in Kuala Lumpur city and surrounding areas using geographic information system (GIS). The main objective of this article is to use CHi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, then, combining it with logistic regression (LR). LR model was used to find the corresponding coefficients of best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in previous study using nearest neighbor index (NNI), which were then used to identify the clustered landslide locations range. Clustered locations were used as model training data with 14 landslide conditioning factors such as; topographic derived parameters, lithology, NDVI, land use and land cover maps. Pearson chi-squared value was used to find the best classification fit between the dependent variable and conditioning factors. Finally the relationship between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations respectively. This study proved the efficiency and reliability of decision tree (DT) model in landslide susceptibility mapping. Also it provided a valuable scientific basis for spatial decision making in planning and urban management studies.
Motlagh, Mohadeseh Ghanbari; Kafaky, Sasan Babaie; Mataji, Asadollah; Akhavan, Reza
2018-05-21
Hyrcanian forests of North of Iran are of great importance in terms of various economic and environmental aspects. In this study, Spot-6 satellite images and regression models were applied to estimate above-ground biomass in these forests. This research was carried out in six compartments in three climatic (semi-arid to humid) types and two altitude classes. In the first step, ground sampling methods at the compartment level were used to estimate aboveground biomass (Mg/ha). Then, by reviewing the results of other studies, the most appropriate vegetation indices were selected. In this study, three indices of NDVI, RVI, and TVI were calculated. We investigated the relationship between the vegetation indices and aboveground biomass measured at sample-plot level. Based on the results, the relationship between aboveground biomass values and vegetation indices was a linear regression with the highest level of significance for NDVI in all compartments. Since at the compartment level the correlation coefficient between NDVI and aboveground biomass was the highest, NDVI was used for mapping aboveground biomass. According to the results of this study, biomass values were highly different in various climatic and altitudinal classes with the highest biomass value observed in humid climate and high-altitude class.
Clow, D. W.; Nanus, L.; Huggett, B. W.
2010-12-01
An abundance of exposed bedrock, sparse soil and vegetation, and fast hydrologic flushing rates make aquatic ecosystems in Yosemite National Park susceptible to nutrient enrichment and episodic acidification due to atmospheric deposition of nitrogen (N) and sulfur (S). In this study, multiple-linear regression (MLR) models were created to estimate fall-season nitrate and acid neutralizing capacity (ANC) in surface water in Yosemite wilderness. Input data included estimated winter N deposition, fall-season surface-water chemistry measurements at 52 sites, and basin characteristics derived from geographic information system layers of topography, geology, and vegetation. The MLR models accounted for 84% and 70% of the variance in surface-water nitrate and ANC, respectively. Explanatory variables (and the sign of their coefficients) for nitrate included elevation (positive) and the abundance of neoglacial and talus deposits (positive), unvegetated terrain (positive), alluvium (negative), and riparian (negative) areas in the basins. Explanatory variables for ANC included basin area (positive) and the abundance of metamorphic rocks (positive), unvegetated terrain (negative), water (negative), and winter N deposition (negative) in the basins. The MLR equations were applied to 1407 stream reaches delineated in the National Hydrography Dataset for Yosemite, and maps of predicted surface-water nitrate and ANC concentrations were created. Predicted surface-water nitrate concentrations were highest in small, high-elevation cirques, and concentrations declined downstream. Predicted ANC concentrations showed the opposite pattern, except in high-elevation areas underlain by metamorphic rocks along the Sierran Crest, which had relatively high predicted ANC (>200 µeq L-1). Maps were created to show where basin characteristics predispose aquatic resources to nutrient enrichment and acidification effects from N and S deposition. The maps can be used to help guide development of
Zou, Rui; Riverson, John; Liu, Yong; Murphy, Ryan; Sim, Youn
2015-03-01
Integrated continuous simulation-optimization models can be effective predictors of a process-based responses for cost-benefit optimization of best management practices (BMPs) selection and placement. However, practical application of simulation-optimization model is computationally prohibitive for large-scale systems. This study proposes an enhanced Nonlinearity Interval Mapping Scheme (NIMS) to solve large-scale watershed simulation-optimization problems several orders of magnitude faster than other commonly used algorithms. An efficient interval response coefficient (IRC) derivation method was incorporated into the NIMS framework to overcome a computational bottleneck. The proposed algorithm was evaluated using a case study watershed in the Los Angeles County Flood Control District. Using a continuous simulation watershed/stream-transport model, Loading Simulation Program in C++ (LSPC), three nested in-stream compliance points (CP)—each with multiple Total Maximum Daily Loads (TMDL) targets—were selected to derive optimal treatment levels for each of the 28 subwatersheds, so that the TMDL targets at all the CP were met with the lowest possible BMP implementation cost. Genetic Algorithm (GA) and NIMS were both applied and compared. The results showed that the NIMS took 11 iterations (about 11 min) to complete with the resulting optimal solution having a total cost of 67.2 million, while each of the multiple GA executions took 21-38 days to reach near optimal solutions. The best solution obtained among all the GA executions compared had a minimized cost of 67.7 million—marginally higher, but approximately equal to that of the NIMS solution. The results highlight the utility for decision making in large-scale watershed simulation-optimization formulations.
Hamon, David; Rajendran, Pradeep S.; Chui, Ray W.; Ajijola, Olujimi A.; Irie, Tadanobu; Talebi, Ramin; Salavatian, Siamak; Vaseghi, Marmar; Bradfield, Jason S.; Armour, J. Andrew; Ardell, Jeffrey L.; Shivkumar, Kalyanam
2017-01-01
Background Variability in premature ventricular contraction (PVC) coupling interval (CI) increases the risk of cardiomyopathy and sudden death. The autonomic nervous system regulates cardiac electrical and mechanical indices, and its dysregulation plays an important role in cardiac disease pathogenesis. The impact of PVCs on the intrinsic cardiac nervous system (ICNS), a neural network on the heart, remains unknown. The objective was to determine the effect of PVCs and CI on ICNS function in generating cardiac neuronal and electrical instability using a novel cardio-neural mapping approach. Methods and Results In a porcine model (n=8) neuronal activity was recorded from a ventricular ganglion using a microelectrode array, and cardiac electrophysiological mapping was performed. Neurons were functionally classified based on their response to afferent and efferent cardiovascular stimuli, with neurons that responded to both defined as convergent (local reflex processors). Dynamic changes in neuronal activity were then evaluated in response to right ventricular outflow tract PVCs with fixed short, fixed long, and variable CI. PVC delivery elicited a greater neuronal response than all other stimuli (P<0.001). Compared to fixed short and long CI, PVCs with variable CI had a greater impact on neuronal response (P<0.05 versus short CI), particularly on convergent neurons (P<0.05), as well as neurons receiving sympathetic (P<0.05) and parasympathetic input (P<0.05). The greatest cardiac electrical instability was also observed following variable (short) CI PVCs. Conclusions Variable CI PVCs affect critical populations of ICNS neurons and alter cardiac repolarization. These changes may be critical for arrhythmogenesis and remodeling leading to cardiomyopathy. PMID:28408652
Hamon, David; Rajendran, Pradeep S; Chui, Ray W; Ajijola, Olujimi A; Irie, Tadanobu; Talebi, Ramin; Salavatian, Siamak; Vaseghi, Marmar; Bradfield, Jason S; Armour, J Andrew; Ardell, Jeffrey L; Shivkumar, Kalyanam
2017-04-01
Variability in premature ventricular contraction (PVC) coupling interval (CI) increases the risk of cardiomyopathy and sudden death. The autonomic nervous system regulates cardiac electrical and mechanical indices, and its dysregulation plays an important role in cardiac disease pathogenesis. The impact of PVCs on the intrinsic cardiac nervous system, a neural network on the heart, remains unknown. The objective was to determine the effect of PVCs and CI on intrinsic cardiac nervous system function in generating cardiac neuronal and electric instability using a novel cardioneural mapping approach. In a porcine model (n=8), neuronal activity was recorded from a ventricular ganglion using a microelectrode array, and cardiac electrophysiological mapping was performed. Neurons were functionally classified based on their response to afferent and efferent cardiovascular stimuli, with neurons that responded to both defined as convergent (local reflex processors). Dynamic changes in neuronal activity were then evaluated in response to right ventricular outflow tract PVCs with fixed short, fixed long, and variable CI. PVC delivery elicited a greater neuronal response than all other stimuli ( P <0.001). Compared with fixed short and long CI, PVCs with variable CI had a greater impact on neuronal response ( P <0.05 versus short CI), particularly on convergent neurons ( P <0.05), as well as neurons receiving sympathetic ( P <0.05) and parasympathetic input ( P <0.05). The greatest cardiac electric instability was also observed after variable (short) CI PVCs. Variable CI PVCs affect critical populations of intrinsic cardiac nervous system neurons and alter cardiac repolarization. These changes may be critical for arrhythmogenesis and remodeling, leading to cardiomyopathy. © 2017 American Heart Association, Inc.
International Nuclear Information System (INIS)
Briggs, D.J.; De Hoogh, C.; Elliot, P.; Gulliver, J.; Wills, J.; Kingham, S.; Smallbone, K.
2000-01-01
Accurate, high-resolution maps of traffic-related air pollution are needed both as a basis for assessing exposures as part of epidemiological studies, and to inform urban air-quality policy and traffic management. This paper assesses the use of a GIS-based, regression mapping technique to model spatial patterns of traffic-related air pollution. The model - developed using data from 80 passive sampler sites in Huddersfield, as part of the SAVIAH (Small Area Variations in Air Quality and Health) project - uses data on traffic flows and land cover in the 300-m buffer zone around each site, and altitude of the site, as predictors of NO 2 concentrations. It was tested here by application in four urban areas in the UK: Huddersfield (for the year following that used for initial model development), Sheffield, Northampton, and part of London. In each case, a GIS was built in ArcInfo, integrating relevant data on road traffic, urban land use and topography. Monitoring of NO 2 was undertaken using replicate passive samplers (in London, data were obtained from surveys carried out as part of the London network). In Huddersfield, Sheffield and Northampton, the model was first calibrated by comparing modelled results with monitored NO 2 concentrations at 10 randomly selected sites; the calibrated model was then validated against data from a further 10-28 sites. In London, where data for only 11 sites were available, validation was not undertaken. Results showed that the model performed well in all cases. After local calibration, the model gave estimates of mean annual NO 2 concentrations within a factor of 1.5 of the actual mean (approx. 70-90%) of the time and within a factor of 2 between 70 and 100% of the time. r 2 values between modelled and observed concentrations are in the range of 0.58-0.76. These results are comparable to those achieved by more sophisticated dispersion models. The model also has several advantages over dispersion modelling. It is able, for example, to
Directory of Open Access Journals (Sweden)
Brian A. Johnson
2018-01-01
Full Text Available The advent of very high resolution (VHR satellite imagery and the development of Geographic Object-Based Image Analysis (GEOBIA have led to many new opportunities for fine-scale land cover mapping, especially in urban areas. Image segmentation is an important step in the GEOBIA framework, so great time/effort is often spent to ensure that computer-generated image segments closely match real-world objects of interest. In the remote sensing community, segmentation is frequently performed using the multiresolution segmentation (MRS algorithm, which is tuned through three user-defined parameters (the scale, shape/color, and compactness/smoothness parameters. The scale parameter (SP is the most important parameter and governs the average size of generated image segments. Existing automatic methods to determine suitable SPs for segmentation are scene-specific and often computationally intensive, so an approach to estimating appropriate SPs that is generalizable (i.e., not scene-specific could speed up the GEOBIA workflow considerably. In this study, we attempted to identify generalizable SPs for five common urban land cover types (buildings, vegetation, roads, bare soil, and water through meta-analysis and nonlinear regression tree (RT modeling. First, we performed a literature search of recent studies that employed GEOBIA for urban land cover mapping and extracted the MRS parameters used, the image properties (i.e., spatial and radiometric resolutions, and the land cover classes mapped. Using this data extracted from the literature, we constructed RT models for each land cover class to predict suitable SP values based on the: image spatial resolution, image radiometric resolution, shape/color parameter, and compactness/smoothness parameter. Based on a visual and quantitative analysis of results, we found that for all land cover classes except water, relatively accurate SPs could be identified using our RT modeling results. The main advantage of our
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, using multivariate regression for mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R2) and adjusted determination coefficient (Radj2), the second regression model (which consistent of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
A simple proof of the exactness of expanding maps of the interval with an indifferent fixed point
International Nuclear Information System (INIS)
Lenci, Marco
2016-01-01
Expanding maps with indifferent fixed points, a.k.a. intermittent maps, are popular models in nonlinear dynamics and infinite ergodic theory. We present a simple proof of the exactness of a wide class of expanding maps of [0, 1], with countably many surjective branches and a strongly neutral fixed point in 0.
Directory of Open Access Journals (Sweden)
Gwo-Fong Lin
2016-01-01
Full Text Available This study describes the development of a reservoir inflow forecasting model for typhoon events to improve short lead-time flood forecasting performance. To strengthen the forecasting ability of the original support vector machines (SVMs model, the self-organizing map (SOM is adopted to group inputs into different clusters in advance of the proposed SOM-SVM model. Two different input methods are proposed for the SVM-based forecasting method, namely, SOM-SVM1 and SOM-SVM2. The methods are applied to an actual reservoir watershed to determine the 1 to 3 h ahead inflow forecasts. For 1, 2, and 3 h ahead forecasts, improvements in mean coefficient of efficiency (MCE due to the clusters obtained from SOM-SVM1 are 21.5%, 18.5%, and 23.0%, respectively. Furthermore, improvement in MCE for SOM-SVM2 is 20.9%, 21.2%, and 35.4%, respectively. Another SOM-SVM2 model increases the SOM-SVM1 model for 1, 2, and 3 h ahead forecasts obtained improvement increases of 0.33%, 2.25%, and 10.08%, respectively. These results show that the performance of the proposed model can provide improved forecasts of hourly inflow, especially in the proposed SOM-SVM2 model. In conclusion, the proposed model, which considers limit and higher related inputs instead of all inputs, can generate better forecasts in different clusters than are generated from the SOM process. The SOM-SVM2 model is recommended as an alternative to the original SVR (Support Vector Regression model because of its accuracy and robustness.
L.R. Iverson; A.M. Prasad; A. Liaw
2004-01-01
More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To thal end, we evaluated three statistical models: Regression Tree Analybib (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Arjmand, N; Ekrami, O; Shirazi-Adl, A; Plamondon, A; Parnianpour, M
2013-05-31
Two artificial neural networks (ANNs) are constructed, trained, and tested to map inputs of a complex trunk finite element (FE) model to its outputs for spinal loads and muscle forces. Five input variables (thorax flexion angle, load magnitude, its anterior and lateral positions, load handling technique, i.e., one- or two-handed static lifting) and four model outputs (L4-L5 and L5-S1 disc compression and anterior-posterior shear forces) for spinal loads and 76 model outputs (forces in individual trunk muscles) are considered. Moreover, full quadratic regression equations mapping input-outputs of the model developed here for muscle forces and previously for spine loads are used to compare the relative accuracy of these two mapping tools (ANN and regression equations). Results indicate that the ANNs are more accurate in mapping input-output relationships of the FE model (RMSE= 20.7 N for spinal loads and RMSE= 4.7 N for muscle forces) as compared to regression equations (RMSE= 120.4 N for spinal loads and RMSE=43.2 N for muscle forces). Quadratic regression equations map up to second order variations of outputs with inputs while ANNs capture higher order variations too. Despite satisfactory achievement in estimating overall muscle forces by the ANN, some inadequacies are noted including assigning force to antagonistic muscles with no activity in the optimization algorithm of the FE model or predicting slightly different forces in bilateral pair muscles in symmetric lifting activities. Using these user-friendly tools spine loads and trunk muscle forces during symmetric and asymmetric static lifts can be easily estimated. Copyright © 2013 Elsevier Ltd. All rights reserved.
Yilmaz, Işık
2009-06-01
The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat—Turkey). Digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural networks models, and they were then compared by means of their validations. The higher accuracies of the susceptibility maps for all three models were obtained from the comparison of the landslide susceptibility maps with the known landslide locations. However, respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that the map obtained from ANN model is more accurate than the other models, accuracies of all models can be evaluated relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in assessment of landslide susceptibility when a sufficient number of data were obtained. Input process, calculations and output process are very simple and can be readily understood in the frequency ratio model, however logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
Ozdemir, Adnan
2011-07-01
SummaryThe purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km 2 (28.99%), 74.271 km 2 (19.906%), 101.203 km 2 (27.14%), and 90.05 km 2 (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model
A comparison of multiple regression and neural network techniques for mapping in situ pCO2 data
International Nuclear Information System (INIS)
Lefevre, Nathalie; Watson, Andrew J.; Watson, Adam R.
2005-01-01
Using about 138,000 measurements of surface pCO 2 in the Atlantic subpolar gyre (50-70 deg N, 60-10 deg W) during 1995-1997, we compare two methods of interpolation in space and time: a monthly distribution of surface pCO 2 constructed using multiple linear regressions on position and temperature, and a self-organizing neural network approach. Both methods confirm characteristics of the region found in previous work, i.e. the subpolar gyre is a sink for atmospheric CO 2 throughout the year, and exhibits a strong seasonal variability with the highest undersaturations occurring in spring and summer due to biological activity. As an annual average the surface pCO 2 is higher than estimates based on available syntheses of surface pCO 2 . This supports earlier suggestions that the sink of CO 2 in the Atlantic subpolar gyre has decreased over the last decade instead of increasing as previously assumed. The neural network is able to capture a more complex distribution than can be well represented by linear regressions, but both techniques agree relatively well on the average values of pCO 2 and derived fluxes. However, when both techniques are used with a subset of the data, the neural network predicts the remaining data to a much better accuracy than the regressions, with a residual standard deviation ranging from 3 to 11 μatm. The subpolar gyre is a net sink of CO 2 of 0.13 Gt-C/yr using the multiple linear regressions and 0.15 Gt-C/yr using the neural network, on average between 1995 and 1997. Both calculations were made with the NCEP monthly wind speeds converted to 10 m height and averaged between 1995 and 1997, and using the gas exchange coefficient of Wanninkhof
Ozdemir, Adnan; Altural, Tolga
2013-03-01
This study evaluated and compared landslide susceptibility maps produced with three different methods, frequency ratio, weights of evidence, and logistic regression, by using validation datasets. The field surveys performed as part of this investigation mapped the locations of 90 landslides that had been identified in the Sultan Mountains of south-western Turkey. The landslide influence parameters used for this study are geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transportation capacity index, distance to drainage, distance to fault, drainage density, fault density, and spring density maps. The relationships between landslide distributions and these parameters were analysed using the three methods, and the results of these methods were then used to calculate the landslide susceptibility of the entire study area. The accuracy of the final landslide susceptibility maps was evaluated based on the landslides observed during the fieldwork, and the accuracy of the models was evaluated by calculating each model's relative operating characteristic curve. The predictive capability of each model was determined from the area under the relative operating characteristic curve and the areas under the curves obtained using the frequency ratio, logistic regression, and weights of evidence methods are 0.976, 0.952, and 0.937, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators of landslide susceptibility in the study area. Specifically, the results of the correlation analysis show a high correlation between the frequency ratio and weights of evidence results, and the frequency ratio and logistic regression methods exhibit correlation coefficients of 0.771 and 0.727, respectively. The frequency ratio model is simple, and its input, calculation and output processes are
Directory of Open Access Journals (Sweden)
Xianyu Yu
2016-05-01
Full Text Available In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%–19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.
Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian
2016-05-11
In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.
Directory of Open Access Journals (Sweden)
JEMMAH A I
2018-01-01
Full Text Available Taounate region is known by a high density of mass movements which cause several human and economic losses. The goal of this paper is to assess the landslide susceptibility of Taounate using the Weight of Evidence method (WofE and the Logistic Regression method (LR. Seven conditioning factors were used in this study: lithology, fault, drainage, slope, elevation, exposure and land use. Over the years, this site and its surroundings have experienced repeated landslides. For this reason, landslide susceptibility mapping is mandatory for risk prevention and land-use management. In this study, we have focused on recent large-scale mass movements. Finally, the ROC curves were established to evaluate the degree of fit of the model and to choose the best landslide susceptibility zonation. A total mass movements location were detected; 50% were randomly selected as input data for the entire process using the Spatial Data Model (SDM and the remaining locations were used for validation purposes. The obtained WofE’s landslide susceptibility map shows that high to very high susceptibility zones contain 62% of the total of inventoried landslides, while the same zones contain only 47% of landslides in the map obtained by the LR method. This landslide susceptibility map obtained is a major contribution to various urban and regional development plans under the Taounate Region National Development Program.
Directory of Open Access Journals (Sweden)
Elvio Giasson
2006-06-01
Full Text Available Soil surveys are necessary sources of information for land use planning, but they are not always available. This study proposes the use of multiple logistic regressions on the prediction of occurrence of soil types based on reference areas. From a digitalized soil map and terrain parameters derived from the digital elevation model in ArcView environment, several sets of multiple logistic regressions were defined using statistical software Minitab, establishing relationship between explanatory terrain variables and soil types, using either the original legend or a simplified legend, and using or not stratification of the study area by drainage classes. Terrain parameters, such as elevation, distance to stream, flow accumulation, and topographic wetness index, were the variables that best explained soil distribution. Stratification by drainage classes did not have significant effect. Simplification of the original legend increased the accuracy of the method on predicting soil distribution.Os levantamentos de solos são fontes de informação necessárias para o planejamento de uso das terras, entretanto eles nem sempre estão disponíveis. Este estudo propõe o uso de regressões logísticas múltiplas na predição de ocorrência de classes de solos a partir de áreas de referência. Baseado no mapa original de solos em formato digital e parâmetros do terreno derivados do modelo numérico do terreno em ambiente ArcView, vários conjuntos de regressões logísticas múltiplas foram definidas usando o programa estatístico Minitab, estabelecendo relações entre as variáveis do terreno independentes e tipos de solos, usando tanto a legenda original como uma legenda simplificada, e usando ou não estratificação da área de estudo por classes de drenagem. Os parâmetros do terreno como elevação, distância dos rios, acúmulo de fluxo e índice de umidade topográfica foram as variáveis que melhor explicaram a distribuição das classes de
Mohebbi, Mohammadreza; Wolfe, Rory; Forbes, Andrew
2014-01-01
This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when overdispersion and residual spatial autocorrelation are present. These considerations lead to nine regression models derived from using three probability distributions for count data: Poisson, generalised Poisson and negative binomial, and three different autocorrelation structures. We employ the framework of Bayesian variable selection and a Gibbs sampling based technique to identify significant cancer risk factors. The framework deals with situations where the number of possible models based on different combinations of candidate explanatory variables is large enough such that calculation of posterior probabilities for all models is difficult or infeasible. The evidence from applying the modelling methodology suggests that modelling strategies based on the use of generalised Poisson and negative binomial with spatial autocorrelation work well and provide a robust basis for inference. PMID:24413702
Will, R. M.; Glenn, N. F.; Benner, S. G.; Pierce, J. L.; Spaete, L.; Li, A.
2015-12-01
Quantifying SOC (Soil Organic Carbon) storage in complex terrain is challenging due to high spatial variability. Generally, the challenge is met by transforming point data to the entire landscape using surrogate, spatially-distributed, variables like elevation or precipitation. In many ecosystems, remotely sensed information on above-ground vegetation (e.g. NDVI) is a good predictor of below-ground carbon stocks. In this project, we are attempting to improve this predictive method by incorporating LiDAR-derived vegetation indices. LiDAR provides a mechanism for improved characterization of aboveground vegetation by providing structural parameters such as vegetation height and biomass. In this study, a random forest model is used to predict SOC using a suite of LiDAR-derived vegetation indices as predictor variables. The Reynolds Creek Experimental Watershed (RCEW) is an ideal location for a study of this type since it encompasses a strong elevation/precipitation gradient that supports lower biomass sagebrush ecosystems at low elevations and forests with more biomass at higher elevations. Sagebrush ecosystems composed of Wyoming, Low and Mountain Sagebrush have SOC values ranging from .4 to 1% (top 30 cm), while higher biomass ecosystems composed of aspen, juniper and fir have SOC values approaching 4% (top 30 cm). Large differences in SOC have been observed between canopy and interspace locations and high resolution vegetation information is likely to explain plot scale variability in SOC. Mapping of the SOC reservoir will help identify underlying controls on SOC distribution and provide insight into which processes are most important in determining SOC in semi-arid mountainous regions. In addition, airborne LiDAR has the potential to characterize vegetation communities at a high resolution and could be a tool for improving estimates of SOC at larger scales.
Directory of Open Access Journals (Sweden)
Dieu Tien Bui
2016-04-01
Full Text Available The Cat Ba National Park area (Vietnam with its tropical forest is recognized as being part of the world biodiversity conservation by the United Nations Educational, Scientific and Cultural Organization (UNESCO and is a well-known destination for tourists, with around 500,000 travelers per year. This area has been the site for many research projects; however, no project has been carried out for forest fire susceptibility assessment. Thus, protection of the forest including fire prevention is one of the main concerns of the local authorities. This work aims to produce a tropical forest fire susceptibility map for the Cat Ba National Park area, which may be helpful for the local authorities in forest fire protection management. To obtain this purpose, first, historical forest fires and related factors were collected from various sources to construct a GIS database. Then, a forest fire susceptibility model was developed using Kernel logistic regression. The quality of the model was assessed using the Receiver Operating Characteristic (ROC curve, area under the ROC curve (AUC, and five statistical evaluation measures. The usability of the resulting model is further compared with a benchmark model, the support vector machine (SVM. The results show that the Kernel logistic regression model has a high level of performance in both the training and validation dataset, with a prediction capability of 92.2%. Since the Kernel logistic regression model outperforms the benchmark model, we conclude that the proposed model is a promising alternative tool that should also be considered for forest fire susceptibility mapping in other areas. The results of this study are useful for the local authorities in forest planning and management.
Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng
2014-10-01
The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) in wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a higher extent than those of C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly proving that most of tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be a nondestructive effective method for characterizing structural component of DOM fractions and monitoring organic matter removal in wastewater treatment process. Copyright © 2014 Elsevier Ltd. All rights reserved.
Forkuor, Gerald; Hounkpatin, Ozias K L; Welp, Gerhard; Thiel, Michael
2017-01-01
Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties-sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen-in a 580 km2 agricultural watershed in south-western Burkina Faso. Four statistical prediction models-multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM), stochastic gradient boosting (SGB)-were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices of redness
Directory of Open Access Journals (Sweden)
Gerald Forkuor
Full Text Available Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat, terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties-sand, silt, clay, cation exchange capacity (CEC, soil organic carbon (SOC and nitrogen-in a 580 km2 agricultural watershed in south-western Burkina Faso. Four statistical prediction models-multiple linear regression (MLR, random forest regression (RFR, support vector machine (SVM, stochastic gradient boosting (SGB-were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Mägi, Reedik; Horikoshi, Momoko; Sofer, Tamar; Mahajan, Anubha; Kitajima, Hidetoshi; Franceschini, Nora; McCarthy, Mark I; Morris, Andrew P
2017-09-15
Trans-ethnic meta-analysis of genome-wide association studies (GWAS) across diverse populations can increase power to detect complex trait loci when the underlying causal variants are shared between ancestry groups. However, heterogeneity in allelic effects between GWAS at these loci can occur that is correlated with ancestry. Here, a novel approach is presented to detect SNP association and quantify the extent of heterogeneity in allelic effects that is correlated with ancestry. We employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS, and implemented in the MR-MEGA software. Through detailed simulations, we demonstrate increased power to detect association for MR-MEGA over fixed- and random-effects meta-analysis across a range of scenarios of heterogeneity in allelic effects between ethnic groups. We also demonstrate improved fine-mapping resolution, in loci containing a single causal variant, compared to these meta-analysis approaches and PAINTOR, and equivalent performance to MANTRA at reduced computational cost. Application of MR-MEGA to trans-ethnic GWAS of kidney function in 71,461 individuals indicates stronger signals of association than fixed-effects meta-analysis when heterogeneity in allelic effects is correlated with ancestry. Application of MR-MEGA to fine-mapping four type 2 diabetes susceptibility loci in 22,086 cases and 42,539 controls highlights: (i) strong evidence for heterogeneity in allelic effects that is correlated with ancestry only at the index SNP for the association signal at the CDKAL1 locus; and (ii) 99% credible sets with six or fewer variants for five distinct association signals. © The Author 2017. Published by Oxford University Press.
Directory of Open Access Journals (Sweden)
Seyed Ali Akbar Afjeh
2014-05-01
Full Text Available Market segmentation plays essential role on understanding the behavior of people’s interests in purchasing various products and services through various channels. This paper presents an empirical investigation to shed light on consumer’s purchasing attitude as well as gathering information in multi-channel environment. The proposed study of this paper designed a questionnaire and distributed it among 800 people who were at least 18 years of age and had some experiences on purchasing goods and services on internet, catalog or regular shopping centers. Self-organizing map, SOM, clustering technique was performed based on consumer’s interest in gathering information as well as purchasing products through internet, catalog and shopping centers and determined four segments. There were two types of questions for the proposed study of this paper. The first group considered participants’ personal characteristics such as age, gender, income, etc. The second group of questions was associated with participants’ psychographic characteristics including price consciousness, quality consciousness, time pressure, etc. Using multinominal logistic regression technique, the study determines consumers’ behaviors in each four segments.
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects......interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects...
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Kato, S; Ishii, A; Nishi, A; Kuriki, S; Koide, T
2014-01-01
Recent genetic studies have shown that genetic loci with significant effects in whole-genome quantitative trait loci (QTL) analyses were lost or weakened in congenic strains. Characterisation of the genetic basis of this attenuated QTL effect is important to our understanding of the genetic mechanisms of complex traits. We previously found that a consomic strain, B6-Chr6CMSM, which carries chromosome 6 of a wild-derived strain MSM/Ms on the genetic background of C57BL/6J, exhibited lower home-cage activity than C57BL/6J. In the present study, we conducted a composite interval QTL analysis using the F2 mice derived from a cross between C57BL/6J and B6-Chr6CMSM. We found one QTL peak that spans 17.6 Mbp of chromosome 6. A subconsomic strain that covers the entire QTL region also showed lower home-cage activity at the same level as the consomic strain. We developed 15 congenic strains, each of which carries a shorter MSM/Ms-derived chromosomal segment from the subconsomic strain. Given that the results of home-cage activity tests on the congenic strains cannot be explained by a simple single-gene model, we applied regression analysis to segregate the multiple genetic loci. The results revealed three loci (loci 1–3) that have the effect of reducing home-cage activity and one locus (locus 4) that increases activity. We also found that the combination of loci 3 and 4 cancels out the effects of the congenic strains, which indicates the existence of a genetic mechanism related to the loss of QTLs. PMID:24781804
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
International Nuclear Information System (INIS)
Boyd, John P.; Rangan, C.; Bucksbaum, P.H.
2003-01-01
The Fourier-sine-with-mapping pseudospectral algorithm of Fattal et al. [Phys. Rev. E 53 (1996) 1217] has been applied in several quantum physics problems. Here, we compare it with pseudospectral methods using Laguerre functions and rational Chebyshev functions. We show that Laguerre and Chebyshev expansions are better suited for solving problems in the interval r in R set of [0,∞] (for example, the Coulomb-Schroedinger equation), than the Fourier-sine-mapping scheme. All three methods give similar accuracy for the hydrogen atom when the scaling parameter L is optimum, but the Laguerre and Chebyshev methods are less sensitive to variations in L. We introduce a new variant of rational Chebyshev functions which has a more uniform spacing of grid points for large r, and gives somewhat better results than the rational Chebyshev functions of Boyd [J. Comp. Phys. 70 (1987) 63
Ruette, Sylvie
2017-01-01
The aim of this book is to survey the relations between the various kinds of chaos and related notions for continuous interval maps from a topological point of view. The papers on this topic are numerous and widely scattered in the literature; some of them are little known, difficult to find, or originally published in Russian, Ukrainian, or Chinese. Dynamical systems given by the iteration of a continuous map on an interval have been broadly studied because they are simple but nevertheless exhibit complex behaviors. They also allow numerical simulations, which enabled the discovery of some chaotic phenomena. Moreover, the "most interesting" part of some higher-dimensional systems can be of lower dimension, which allows, in some cases, boiling it down to systems in dimension one. Some of the more recent developments such as distributional chaos, the relation between entropy and Li-Yorke chaos, sequence entropy, and maps with infinitely many branches are presented in book form for the first time. The author gi...
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
Full Text Available This paper proposes an application of concepts about the maximum likelihood estimation of the binomial logistic regression model to the separation phenomena. It generates bias in the estimation and provides different interpretations of the estimates on the different statistical tests (Wald, Likelihood Ratio and Score and provides different estimates on the different iterative methods (Newton-Raphson and Fisher Score. It also presents an example that demonstrates the direct implications for the validation of the model and validation of variables, the implications for estimates of odds ratios and confidence intervals, generated from the Wald statistics. Furthermore, we present, briefly, the Firth correction to circumvent the phenomena of separation.
International Nuclear Information System (INIS)
Al-Hadeethi, Farqad; Al-Nimr, Moh'd; Al-Safadi, Mohammad
2015-01-01
The performance of PEM (proton exchange membrane) fuel cell was experimentally investigated at three temperatures (30, 50 and 70 °C), four flow rates (5, 10, 15 and 20 ml/min) and two flow patterns (co-current and counter current) in order to generate two correlations using multiple regression analysis with respect to ANOVA. Results revealed that increasing the temperature for co-current and counter current flow patterns will increase both hydrogen and oxygen diffusivities, water management and membrane conductivity. The derived mathematical correlations and three dimensional mapping (i.e. surface response) for the co-current and countercurrent flow patterns showed that there is a clear interaction among the various variables (temperatures and flow rates). - Highlights: • Generating mathematical correlations using multiple regression analysis with respect to ANOVA for the performance of the PEM fuel cell. • Using the 3D mapping to diagnose the optimum performance of the PEM fuel cell at the given operating conditions. • Results revealed that increasing the flow rate had direct influence on the consumption of oxygen. • Results assured that increasing the temperature in co-current and counter current flow patterns increases the performance of PEM fuel cell.
Directory of Open Access Journals (Sweden)
Fabyano Fonseca Silva
2011-01-01
Full Text Available Nowadays, an important and interesting alternative in the control of tick-infestation in cattle is to select resistant animals, and identify the respective quantitative trait loci (QTLs and DNA markers, for posterior use in breeding programs. The number of ticks/animal is characterized as a discrete-counting trait, which could potentially follow Poisson distribution. However, in the case of an excess of zeros, due to the occurrence of several noninfected animals, zero-inflated Poisson and generalized zero-inflated distribution (GZIP may provide a better description of the data. Thus, the objective here was to compare through simulation, Poisson and ZIP models (simple and generalized with classical approaches, for QTL mapping with counting phenotypes under different scenarios, and to apply these approaches to a QTL study of tick resistance in an F2 cattle (Gyr x Holstein population. It was concluded that, when working with zero-inflated data, it is recommendable to use the generalized and simple ZIP model for analysis. On the other hand, when working with data with zeros, but not zero-inflated, the Poisson model or a data-transformation-approach, such as square-root or Box-Cox transformation, are applicable.
Silva, Fabyano Fonseca; Tunin, Karen P.; Rosa, Guilherme J.M.; da Silva, Marcos V.B.; Azevedo, Ana Luisa Souza; da Silva Verneque, Rui; Machado, Marco Antonio; Packer, Irineu Umberto
2011-01-01
Now a days, an important and interesting alternative in the control of tick-infestation in cattle is to select resistant animals, and identify the respective quantitative trait loci (QTLs) and DNA markers, for posterior use in breeding programs. The number of ticks/animal is characterized as a discrete-counting trait, which could potentially follow Poisson distribution. However, in the case of an excess of zeros, due to the occurrence of several noninfected animals, zero-inflated Poisson and generalized zero-inflated distribution (GZIP) may provide a better description of the data. Thus, the objective here was to compare through simulation, Poisson and ZIP models (simple and generalized) with classical approaches, for QTL mapping with counting phenotypes under different scenarios, and to apply these approaches to a QTL study of tick resistance in an F2 cattle (Gyr × Holstein) population. It was concluded that, when working with zero-inflated data, it is recommendable to use the generalized and simple ZIP model for analysis. On the other hand, when working with data with zeros, but not zero-inflated, the Poisson model or a data-transformation-approach, such as square-root or Box-Cox transformation, are applicable. PMID:22215960
Demeure, Olivier; Duclos, Michel J; Bacciu, Nicola; Le Mignon, Guillaume; Filangi, Olivier; Pitel, Frédérique; Boland, Anne; Lagarrigue, Sandrine; Cogburn, Larry A; Simon, Jean; Le Roy, Pascale; Le Bihan-Duval, Elisabeth
2013-09-30
For decades, genetic improvement based on measuring growth and body composition traits has been successfully applied in the production of meat-type chickens. However, this conventional approach is hindered by antagonistic genetic correlations between some traits and the high cost of measuring body composition traits. Marker-assisted selection should overcome these problems by selecting loci that have effects on either one trait only or on more than one trait but with a favorable genetic correlation. In the present study, identification of such loci was done by genotyping an F2 intercross between fat and lean lines divergently selected for abdominal fatness genotyped with a medium-density genetic map (120 microsatellites and 1302 single nucleotide polymorphisms). Genome scan linkage analyses were performed for growth (body weight at 1, 3, 5, and 7 weeks, and shank length and diameter at 9 weeks), body composition at 9 weeks (abdominal fat weight and percentage, breast muscle weight and percentage, and thigh weight and percentage), and for several physiological measurements at 7 weeks in the fasting state, i.e. body temperature and plasma levels of IGF-I, NEFA and glucose. Interval mapping analyses were performed with the QTLMap software, including single-trait analyses with single and multiple QTL on the same chromosome. Sixty-seven QTL were detected, most of which had never been described before. Of these 67 QTL, 47 were detected by single-QTL analyses and 20 by multiple-QTL analyses, which underlines the importance of using different statistical models. Close analysis of the genes located in the defined intervals identified several relevant functional candidates, such as ACACA for abdominal fatness, GHSR and GAS1 for breast muscle weight, DCRX and ASPSCR1 for plasma glucose content, and ChEBP for shank diameter. The medium-density genetic map enabled us to genotype new regions of the chicken genome (including micro-chromosomes) that influenced the traits
Inflation, Forecast Intervals and Long Memory Regression Models
C.S. Bos (Charles); Ph.H.B.F. Franses (Philip Hans); M. Ooms (Marius)
2001-01-01
textabstractWe examine recursive out-of-sample forecasting of monthly postwar U.S. core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading
Inflation, Forecast Intervals and Long Memory Regression Models
Ooms, M.; Bos, C.S.; Franses, P.H.
2003-01-01
We examine recursive out-of-sample forecasting of monthly postwar US core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading indicators
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Interval selection with machine-dependent intervals
Bohmova K.; Disser Y.; Mihalak M.; Widmayer P.
2013-01-01
We study an offline interval scheduling problem where every job has exactly one associated interval on every machine. To schedule a set of jobs, exactly one of the intervals associated with each job must be selected, and the intervals selected on the same machine must not intersect.We show that deciding whether all jobs can be scheduled is NP-complete already in various simple cases. In particular, by showing the NP-completeness for the case when all the intervals associated with the same job...
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Khan, Iftekhar; Morris, Stephen
2014-11-12
The performance of the Beta Binomial (BB) model is compared with several existing models for mapping the EORTC QLQ-C30 (QLQ-C30) on to the EQ-5D-3L using data from lung cancer trials. Data from 2 separate non small cell lung cancer clinical trials (TOPICAL and SOCCAR) are used to develop and validate the BB model. Comparisons with Linear, TOBIT, Quantile, Quadratic and CLAD models are carried out. The mean prediction error, R(2), proportion predicted outside the valid range, clinical interpretation of coefficients, model fit and estimation of Quality Adjusted Life Years (QALY) are reported and compared. Monte-Carlo simulation is also used. The Beta-Binomial regression model performed 'best' among all models. For TOPICAL and SOCCAR trials, respectively, residual mean square error (RMSE) was 0.09 and 0.11; R(2) was 0.75 and 0.71; observed vs. predicted means were 0.612 vs. 0.608 and 0.750 vs. 0.749. Mean difference in QALY's (observed vs. predicted) were 0.051 vs. 0.053 and 0.164 vs. 0.162 for TOPICAL and SOCCAR respectively. Models tested on independent data show simulated 95% confidence from the BB model containing the observed mean more often (77% and 59% for TOPICAL and SOCCAR respectively) compared to the other models. All algorithms over-predict at poorer health states but the BB model was relatively better, particularly for the SOCCAR data. The BB model may offer superior predictive properties amongst mapping algorithms considered and may be more useful when predicting EQ-5D-3L at poorer health states. We recommend the algorithm derived from the TOPICAL data due to better predictive properties and less uncertainty.
Alparslan-Gok, S.Z.; Brânzei, R.; Tijs, S.H.
2008-01-01
In this paper, convex interval games are introduced and some characterizations are given. Some economic situations leading to convex interval games are discussed. The Weber set and the Shapley value are defined for a suitable class of interval games and their relations with the interval core for
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Spatial vulnerability assessments by regression kriging
Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor
2016-04-01
information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of its accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
Regression algorithm for emotion detection
Berthelon , Franck; Sander , Peter
2013-01-01
International audience; We present here two components of a computational system for emotion detection. PEMs (Personalized Emotion Maps) store links between bodily expressions and emotion values, and are individually calibrated to capture each person's emotion profile. They are an implementation based on aspects of Scherer's theoretical complex system model of emotion~\\cite{scherer00, scherer09}. We also present a regression algorithm that determines a person's emotional feeling from sensor m...
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Generalized Confidence Intervals and Fiducial Intervals for Some Epidemiological Measures
Directory of Open Access Journals (Sweden)
Ionut Bebu
2016-06-01
Full Text Available For binary outcome data from epidemiological studies, this article investigates the interval estimation of several measures of interest in the absence or presence of categorical covariates. When covariates are present, the logistic regression model as well as the log-binomial model are investigated. The measures considered include the common odds ratio (OR from several studies, the number needed to treat (NNT, and the prevalence ratio. For each parameter, confidence intervals are constructed using the concepts of generalized pivotal quantities and fiducial quantities. Numerical results show that the confidence intervals so obtained exhibit satisfactory performance in terms of maintaining the coverage probabilities even when the sample sizes are not large. An appealing feature of the proposed solutions is that they are not based on maximization of the likelihood, and hence are free from convergence issues associated with the numerical calculation of the maximum likelihood estimators, especially in the context of the log-binomial model. The results are illustrated with a number of examples. The overall conclusion is that the proposed methodologies based on generalized pivotal quantities and fiducial quantities provide an accurate and unified approach for the interval estimation of the various epidemiological measures in the context of binary outcome data with or without covariates.
Matsakis, Nicholas D.; Gross, Thomas R.
Intervals are a new, higher-level primitive for parallel programming with which programmers directly construct the program schedule. Programs using intervals can be statically analyzed to ensure that they do not deadlock or contain data races. In this paper, we demonstrate the flexibility of intervals by showing how to use them to emulate common parallel control-flow constructs like barriers and signals, as well as higher-level patterns such as bounded-buffer producer-consumer. We have implemented intervals as a publicly available library for Java and Scala.
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Understanding logistic regression analysis
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Meta-Modeling by Symbolic Regression and Pareto Simulated Annealing
Stinstra, E.; Rennen, G.; Teeuwen, G.J.A.
2006-01-01
The subject of this paper is a new approach to Symbolic Regression.Other publications on Symbolic Regression use Genetic Programming.This paper describes an alternative method based on Pareto Simulated Annealing.Our method is based on linear regression for the estimation of constants.Interval
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Overconfidence in Interval Estimates
Soll, Jack B.; Klayman, Joshua
2004-01-01
Judges were asked to make numerical estimates (e.g., "In what year was the first flight of a hot air balloon?"). Judges provided high and low estimates such that they were X% sure that the correct answer lay between them. They exhibited substantial overconfidence: The correct answer fell inside their intervals much less than X% of the time. This…
Socioeconomic position and the primary care interval
DEFF Research Database (Denmark)
Vedsted, Anders
2018-01-01
to the easiness to interpret the symptoms of the underlying cancer. Methods. We conducted a population-based cohort study using survey data on time intervals linked at an individually level to routine collected data on demographics from Danish registries. Using logistic regression we estimated the odds......Introduction. Diagnostic delays affect cancer survival negatively. Thus, the time interval from symptomatic presentation to a GP until referral to secondary care (i.e. primary care interval (PCI)), should be as short as possible. Lower socioeconomic position seems associated with poorer cancer...... younger than 45 years of age and older than 54 years of age had longer primary care interval than patients aged ‘45-54’ years. No other associations for SEP characteristics were observed. The findings may imply that GPs are referring patients regardless of SEP, although some room for improvement prevails...
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Applications of interval computations
Kreinovich, Vladik
1996-01-01
Primary Audience for the Book • Specialists in numerical computations who are interested in algorithms with automatic result verification. • Engineers, scientists, and practitioners who desire results with automatic verification and who would therefore benefit from the experience of suc cessful applications. • Students in applied mathematics and computer science who want to learn these methods. Goal Of the Book This book contains surveys of applications of interval computations, i. e. , appli cations of numerical methods with automatic result verification, that were pre sented at an international workshop on the subject in EI Paso, Texas, February 23-25, 1995. The purpose of this book is to disseminate detailed and surveyed information about existing and potential applications of this new growing field. Brief Description of the Papers At the most fundamental level, interval arithmetic operations work with sets: The result of a single arithmetic operation is the set of all possible results as the o...
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Multicollinearity and Regression Analysis
Daoud, Jamal I.
2017-12-01
In regression analysis it is obvious to have a correlation between the response and predictor(s), but having correlation among predictors is something undesired. The number of predictors included in the regression model depends on many factors among which, historical data, experience, etc. At the end selection of most important predictors is something objective due to the researcher. Multicollinearity is a phenomena when two or more predictors are correlated, if this happens, the standard error of the coefficients will increase [8]. Increased standard errors means that the coefficients for some or all independent variables may be found to be significantly different from In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on the multicollinearity, reasons and consequences on the reliability of the regression model.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...... with the proposed explicit noise-model extension....
and Multinomial Logistic Regression
African Journals Online (AJOL)
This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
QTL mapping for yield components and agronomic traits in a Brazilian soybean population
Directory of Open Access Journals (Sweden)
Josiane Isabela da Silva Rodrigues
2016-11-01
Full Text Available The objective of this work was to map QTL for agronomic traits in a Brazilian soybean population. For this, 207 F2:3 progenies from the cross CS3035PTA276-1-5-2x UFVS2012 were genotyped and cultivated in Viçosa-MG, using randomized block design with three replications. QTL detection was carried out by linear regression and composite interval mapping. Thirty molecular markers linked to QTL were detected by linear regression for the total of nine agronomic traits. QTL for SWP (seed weight per plant, W100S (weight of 100 seeds, NPP (number of pods per plant, and NSP (number of seeds per plant were detected by composite interval mapping. Four QTL with additive effect are promising for marker-assisted selection (MAS. Particularly, the markers Satt155 and Satt300 could be useful in simultaneous selection for greater SWP, NPP, and NSP.
Magnetic Resonance Fingerprinting with short relaxation intervals.
Amthor, Thomas; Doneva, Mariya; Koken, Peter; Sommer, Karsten; Meineke, Jakob; Börnert, Peter
2017-09-01
The aim of this study was to investigate a technique for improving the performance of Magnetic Resonance Fingerprinting (MRF) in repetitive sampling schemes, in particular for 3D MRF acquisition, by shortening relaxation intervals between MRF pulse train repetitions. A calculation method for MRF dictionaries adapted to short relaxation intervals and non-relaxed initial spin states is presented, based on the concept of stationary fingerprints. The method is applicable to many different k-space sampling schemes in 2D and 3D. For accuracy analysis, T 1 and T 2 values of a phantom are determined by single-slice Cartesian MRF for different relaxation intervals and are compared with quantitative reference measurements. The relevance of slice profile effects is also investigated in this case. To further illustrate the capabilities of the method, an application to in-vivo spiral 3D MRF measurements is demonstrated. The proposed computation method enables accurate parameter estimation even for the shortest relaxation intervals, as investigated for different sampling patterns in 2D and 3D. In 2D Cartesian measurements, we achieved a scan acceleration of more than a factor of two, while maintaining acceptable accuracy: The largest T 1 values of a sample set deviated from their reference values by 0.3% (longest relaxation interval) and 2.4% (shortest relaxation interval). The largest T 2 values showed systematic deviations of up to 10% for all relaxation intervals, which is discussed. The influence of slice profile effects for multislice acquisition is shown to become increasingly relevant for short relaxation intervals. In 3D spiral measurements, a scan time reduction of 36% was achieved, maintaining the quality of in-vivo T1 and T2 maps. Reducing the relaxation interval between MRF sequence repetitions using stationary fingerprint dictionaries is a feasible method to improve the scan efficiency of MRF sequences. The method enables fast implementations of 3D spatially
Mapping earthworm communities in Europe
Rutgers, M.; Orgiazzi, A.; Gardi, C.; Römbke, J.; Jansch, S.; Keith, A.; Neilson, R.; Boag, B.; Schmidt, O.; Murchie, A.K.; Blackshaw, R.P.; Pérès, G.; Cluzeau, D.; Guernion, M.; Briones, M.J.I.; Rodeiro, J.; Pineiro, R.; Diaz Cosin, D.J.; Sousa, J.P.; Suhadolc, M.; Kos, I.; Krogh, P.H.; Faber, J.H.; Mulder, C.; Bogte, J.J.; Wijnen, van H.J.; Schouten, A.J.; Zwart, de D.
2016-01-01
Existing data sets on earthworm communities in Europe were collected, harmonized, collated, modelled and depicted on a soil biodiversity map. Digital Soil Mapping was applied using multiple regressions relating relatively low density earthworm community data to soil characteristics, land use,
Surveillance test interval optimization
International Nuclear Information System (INIS)
Cepin, M.; Mavko, B.
1995-01-01
Technical specifications have been developed on the bases of deterministic analyses, engineering judgment, and expert opinion. This paper introduces our risk-based approach to surveillance test interval (STI) optimization. This approach consists of three main levels. The first level is the component level, which serves as a rough estimation of the optimal STI and can be calculated analytically by a differentiating equation for mean unavailability. The second and third levels give more representative results. They take into account the results of probabilistic risk assessment (PRA) calculated by a personal computer (PC) based code and are based on system unavailability at the system level and on core damage frequency at the plant level
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Interval methods: An introduction
DEFF Research Database (Denmark)
Achenie, L.E.K.; Kreinovich, V.; Madsen, Kaj
2006-01-01
This chapter contains selected papers presented at the Minisymposium on Interval Methods of the PARA'04 Workshop '' State-of-the-Art in Scientific Computing ''. The emphasis of the workshop was on high-performance computing (HPC). The ongoing development of ever more advanced computers provides...... the potential for solving increasingly difficult computational problems. However, given the complexity of modern computer architectures, the task of realizing this potential needs careful attention. A main concern of HPC is the development of software that optimizes the performance of a given computer....... An important characteristic of the computer performance in scientific computing is the accuracy of the Computation results. Often, we can estimate this accuracy by using traditional statistical techniques. However, in many practical situations, we do not know the probability distributions of different...
International Nuclear Information System (INIS)
Turko, B.T.
1983-10-01
A CAMAC based modular multichannel interval timer is described. The timer comprises twelve high resolution time digitizers with a common start enabling twelve independent stop inputs. Ten time ranges from 2.5 μs to 1.3 μs can be preset. Time can be read out in twelve 24-bit words either via CAMAC Crate Controller or an external FIFO register. LSB time calibration is 78.125 ps. An additional word reads out the operational status of twelve stop channels. The system consists of two modules. The analog module contains a reference clock and 13 analog time stretchers. The digital module contains counters, logic and interface circuits. The timer has an excellent differential linearity, thermal stability and crosstalk free performance
Experimenting with musical intervals
Lo Presto, Michael C.
2003-07-01
When two tuning forks of different frequency are sounded simultaneously the result is a complex wave with a repetition frequency that is the fundamental of the harmonic series to which both frequencies belong. The ear perceives this 'musical interval' as a single musical pitch with a sound quality produced by the harmonic spectrum responsible for the waveform. This waveform can be captured and displayed with data collection hardware and software. The fundamental frequency can then be calculated and compared with what would be expected from the frequencies of the tuning forks. Also, graphing software can be used to determine equations for the waveforms and predict their shapes. This experiment could be used in an introductory physics or musical acoustics course as a practical lesson in superposition of waves, basic Fourier series and the relationship between some of the ear's subjective perceptions of sound and the physical properties of the waves that cause them.
Speckmann, B.; Verbeek, K.A.B.
2015-01-01
Necklace maps visualize quantitative data associated with regions by placing scaled symbols, usually disks, without overlap on a closed curve (the necklace) surrounding the map regions. Each region is projected onto an interval on the necklace that contains its symbol. In this paper we address the
Ergodicity of polygonal slap maps
International Nuclear Information System (INIS)
Del Magno, Gianluigi; Pedro Gaivão, José; Lopes Dias, João; Duarte, Pedro
2014-01-01
Polygonal slap maps are piecewise affine expanding maps of the interval obtained by projecting the sides of a polygon along their normals onto the perimeter of the polygon. These maps arise in the study of polygonal billiards with non-specular reflection laws. We study the absolutely continuous invariant probabilities (acips) of the slap maps for several polygons, including regular polygons and triangles. We also present a general method for constructing polygons with slap maps with more than one ergodic acip. (paper)
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy....... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes.......This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy...
Directory of Open Access Journals (Sweden)
Ferjan Ormeling
2008-09-01
Full Text Available Discussing the requirements for map data quality, map users and their library/archives environment, the paper focuses on the metadata the user would need for a correct and efficient interpretation of the map data. For such a correct interpretation, knowledge of the rules and guidelines according to which the topographers/cartographers work (such as the kind of data categories to be collected, and the degree to which these rules and guidelines were indeed followed are essential. This is not only valid for the old maps stored in our libraries and archives, but perhaps even more so for the new digital files as the format in which we now have to access our geospatial data. As this would be too much to ask from map librarians/curators, some sort of web 2.0 environment is sought where comments about data quality, completeness and up-to-dateness from knowledgeable map users regarding the specific maps or map series studied can be collected and tagged to scanned versions of these maps on the web. In order not to be subject to the same disadvantages as Wikipedia, where the ‘communis opinio’ rather than scholarship, seems to be decisive, some checking by map curators of this tagged map use information would still be needed. Cooperation between map curators and the International Cartographic Association ( ICA map and spatial data use commission to this end is suggested.
Nonlinear Forecasting With Many Predictors Using Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter; Groenen, Patrick J.F.; Heij, Christiaan
This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predi...
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Constraint-based Attribute and Interval Planning
Jonsson, Ari; Frank, Jeremy
2013-01-01
In this paper we describe Constraint-based Attribute and Interval Planning (CAIP), a paradigm for representing and reasoning about plans. The paradigm enables the description of planning domains with time, resources, concurrent activities, mutual exclusions among sets of activities, disjunctive preconditions and conditional effects. We provide a theoretical foundation for the paradigm, based on temporal intervals and attributes. We then show how the plans are naturally expressed by networks of constraints, and show that the process of planning maps directly to dynamic constraint reasoning. In addition, we de ne compatibilities, a compact mechanism for describing planning domains. We describe how this framework can incorporate the use of constraint reasoning technology to improve planning. Finally, we describe EUROPA, an implementation of the CAIP framework.
Directory of Open Access Journals (Sweden)
Qiutong Jin
2016-06-01
Full Text Available Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK and geographically weighted regression Kriging (GWRK methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM, normalized difference vegetation index (NDVI, solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary, and worse may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on, the parametric model, absence of extreme observations, homoscedasticity, and independency of the errors, remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects focusing on the normality assumption is often unnecessary, does not guarantee valid results, and worse may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.
On piecewise affine interval maps with countably many laps
Czech Academy of Sciences Publication Activity Database
Bobok, J.; Soukenka, Martin
2011-01-01
Roč. 31, č. 3 (2011), s. 753-762 ISSN 1078-0947 Institutional research plan: CEZ:AV0Z20760514 Keywords : dynamical systems * entropy Subject RIV: BA - General Mathematics Impact factor: 0.913, year: 2011 http://www.aimsciences.org/journals/displayArticlesnew.jsp?paperID=6407
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events (SPEs). The total dose received from a major SPE in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be mitigated with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train and provides predictions as accurate as neural network models previously used. (authors)
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside Earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events. The total dose received from a major solar particle event in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be reduced with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train, and provides predictions as accurate as the neural network models that were used previously. (authors)
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsitivity, but often entails both. This paper critically reviews the phenomena of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients along with respective p-values will be obtained as well as the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Interval stability for complex systems
Klinshov, Vladimir V.; Kirillov, Sergey; Kurths, Jürgen; Nekorkin, Vladimir I.
2018-04-01
Stability of dynamical systems against strong perturbations is an important problem of nonlinear dynamics relevant to many applications in various areas. Here, we develop a novel concept of interval stability, referring to the behavior of the perturbed system during a finite time interval. Based on this concept, we suggest new measures of stability, namely interval basin stability (IBS) and interval stability threshold (IST). IBS characterizes the likelihood that the perturbed system returns to the stable regime (attractor) in a given time. IST provides the minimal magnitude of the perturbation capable to disrupt the stable regime for a given interval of time. The suggested measures provide important information about the system susceptibility to external perturbations which may be useful for practical applications. Moreover, from a theoretical viewpoint the interval stability measures are shown to bridge the gap between linear and asymptotic stability. We also suggest numerical algorithms for quantification of the interval stability characteristics and demonstrate their potential for several dynamical systems of various nature, such as power grids and neural networks.
A gentle introduction to quantile regression for ecologists
Cade, B.S.; Noon, B.R.
2003-01-01
Quantile regression is a way to estimate the conditional quantiles of a response variable distribution in the linear model that provides a more complete view of possible causal relationships between variables in ecological processes. Typically, all the factors that affect ecological processes are not measured and included in the statistical models used to investigate relationships between variables associated with those processes. As a consequence, there may be a weak or no predictive relationship between the mean of the response variable (y) distribution and the measured predictive factors (X). Yet there may be stronger, useful predictive relationships with other parts of the response variable distribution. This primer relates quantile regression estimates to prediction intervals in parametric error distribution regression models (eg least squares), and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of the estimates for homogeneous and heterogeneous regression models.
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power......An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest haveÂ increased for the spatial modeling and mapping of continuousÂ variables. Random forest is a non-parametric ensembleÂ approach, and unlike traditional regression approaches thereÂ is no direct quantification of prediction error. UnderstandingÂ prediction uncertainty is important when using model-basedÂ continuous maps as...
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
A Practical pedestrian approach to parsimonious regression with inaccurate inputs
Directory of Open Access Journals (Sweden)
Seppo Karrila
2014-04-01
Full Text Available A measurement result often dictates an interval containing the correct value. Interval data is also created by roundoff, truncation, and binning. We focus on such common interval uncertainty in data. Inaccuracy in model inputs is typically ignored on model fitting. We provide a practical approach for regression with inaccurate data: the mathematics is easy, and the linear programming formulations simple to use even in a spreadsheet. This self-contained elementary presentation introduces interval linear systems and requires only basic knowledge of algebra. Feature selection is automatic; but can be controlled to find only a few most relevant inputs; and joint feature selection is enabled for multiple modeled outputs. With more features than cases, a novel connection to compressed sensing emerges: robustness against interval errors-in-variables implies model parsimony, and the input inaccuracies determine the regularization term. A small numerical example highlights counterintuitive results and a dramatic difference to total least squares.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
International Nuclear Information System (INIS)
Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei
2007-01-01
Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
Coronel, Ruben; de Bakker, Jacques M. T.; Wilms-Schopman, Francien J. G.; Opthof, Tobias; Linnenbank, André C.; Belterman, Charly N.; Janse, Michiel J.
2006-01-01
BACKGROUND: Activation recovery intervals (ARIs) and monophasic action potential (MAP) duration are used as measures of action potential duration in beating hearts. However, controversies exist concerning the correct way to record MAPs or calculate ARIs. We have addressed these issues
Confidence Intervals for Assessing Heterogeneity in Generalized Linear Mixed Models
Wagler, Amy E.
2014-01-01
Generalized linear mixed models are frequently applied to data with clustered categorical outcomes. The effect of clustering on the response is often difficult to practically assess partly because it is reported on a scale on which comparisons with regression parameters are difficult to make. This article proposes confidence intervals for…
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated funds examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ 1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
A comparison of regression algorithms for wind speed forecasting at Alexander Bay
CSIR Research Space (South Africa)
Botha, Nicolene
2016-12-01
Full Text Available to forecast 1 to 24 hours ahead, in hourly intervals. Predictions are performed on a wind speed time series with three machine learning regression algorithms, namely support vector regression, ordinary least squares and Bayesian ridge regression. The resulting...
Haemostatic reference intervals in pregnancy
DEFF Research Database (Denmark)
Szecsi, Pal Bela; Jørgensen, Maja; Klajnbard, Anna
2010-01-01
largely unchanged during pregnancy, delivery, and postpartum and were within non-pregnant reference intervals. However, levels of fibrinogen, D-dimer, and coagulation factors VII, VIII, and IX increased markedly. Protein S activity decreased substantially, while free protein S decreased slightly and total......Haemostatic reference intervals are generally based on samples from non-pregnant women. Thus, they may not be relevant to pregnant women, a problem that may hinder accurate diagnosis and treatment of haemostatic disorders during pregnancy. In this study, we establish gestational age......-20, 21-28, 29-34, 35-42, at active labor, and on postpartum days 1 and 2. Reference intervals for each gestational period using only the uncomplicated pregnancies were calculated in all 391 women for activated partial thromboplastin time (aPTT), fibrinogen, fibrin D-dimer, antithrombin, free protein S...
Research on Driver Behavior in Yellow Interval at Signalized Intersections
Directory of Open Access Journals (Sweden)
Zhaosheng Yang
2014-01-01
Full Text Available Vehicles are often caught in dilemma zone when they approach signalized intersections in yellow interval. The existence of dilemma zone which is significantly influenced by driver behavior seriously affects the efficiency and safety of intersections. This paper proposes the driver behavior models in yellow interval by logistic regression and fuzzy decision tree modeling, respectively, based on camera image data. Vehicle’s speed and distance to stop line are considered in logistic regression model, which also brings in a dummy variable to describe installation of countdown timer display. Fuzzy decision tree model is generated by FID3 algorithm whose heuristic information is fuzzy information entropy based on membership functions. This paper concludes that fuzzy decision tree is more accurate to describe driver behavior at signalized intersection than logistic regression model.
Symplectic maps for accelerator lattices
International Nuclear Information System (INIS)
Warnock, R.L.; Ruth, R.; Gabella, W.
1988-05-01
We describe a method for numerical construction of a symplectic map for particle propagation in a general accelerator lattice. The generating function of the map is obtained by integrating the Hamilton-Jacobi equation as an initial-value problem on a finite time interval. Given the generating function, the map is put in explicit form by means of a Fourier inversion technique. We give an example which suggests that the method has promise. 9 refs., 9 figs
Inverse Interval Matrix: A Survey
Czech Academy of Sciences Publication Activity Database
Rohn, Jiří; Farhadsefat, R.
2011-01-01
Roč. 22, - (2011), s. 704-719 E-ISSN 1081-3810 R&D Projects: GA ČR GA201/09/1957; GA ČR GC201/08/J020 Institutional research plan: CEZ:AV0Z10300504 Keywords : interval matrix * inverse interval matrix * NP-hardness * enclosure * unit midpoint * inverse sign stability * nonnegative invertibility * absolute value equation * algorithm Subject RIV: BA - General Mathematics Impact factor: 0.808, year: 2010 http://www.math.technion.ac.il/iic/ ela / ela -articles/articles/vol22_pp704-719.pdf
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression sytem. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2009-01-01
. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Dynamic Properties of QT Intervals
Czech Academy of Sciences Publication Activity Database
Halámek, Josef; Jurák, Pavel; Vondra, Vlastimil; Lipoldová, J.; Leinveber, Pavel; Plachý, M.; Fráňa, P.; Kára, T.
2009-01-01
Roč. 36, - (2009), s. 517-520 ISSN 0276-6574 R&D Projects: GA ČR GA102/08/1129; GA MŠk ME09050 Institutional research plan: CEZ:AV0Z20650511 Keywords : QT Intervals * arrhythmia diagnosis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering http://cinc.mit.edu/archives/2009/pdf/0517.pdf
Haemostatic reference intervals in pregnancy
DEFF Research Database (Denmark)
Szecsi, Pal Bela; Jørgensen, Maja; Klajnbard, Anna
2010-01-01
Haemostatic reference intervals are generally based on samples from non-pregnant women. Thus, they may not be relevant to pregnant women, a problem that may hinder accurate diagnosis and treatment of haemostatic disorders during pregnancy. In this study, we establish gestational age-specific refe......Haemostatic reference intervals are generally based on samples from non-pregnant women. Thus, they may not be relevant to pregnant women, a problem that may hinder accurate diagnosis and treatment of haemostatic disorders during pregnancy. In this study, we establish gestational age......-specific reference intervals for coagulation tests during normal pregnancy. Eight hundred one women with expected normal pregnancies were included in the study. Of these women, 391 had no complications during pregnancy, vaginal delivery, or postpartum period. Plasma samples were obtained at gestational weeks 13......-20, 21-28, 29-34, 35-42, at active labor, and on postpartum days 1 and 2. Reference intervals for each gestational period using only the uncomplicated pregnancies were calculated in all 391 women for activated partial thromboplastin time (aPTT), fibrinogen, fibrin D-dimer, antithrombin, free protein S...
Robust misinterpretation of confidence intervals
Hoekstra, Rink; Morey, Richard; Rouder, Jeffrey N.; Wagenmakers, Eric-Jan
2014-01-01
Null hypothesis significance testing (NHST) is undoubtedly the most common inferential technique used to justify claims in the social sciences. However, even staunch defenders of NHST agree that its outcomes are often misinterpreted. Confidence intervals (CIs) have frequently been proposed as a more
Interval matrices: Regularity generates singularity
Czech Academy of Sciences Publication Activity Database
Rohn, Jiří; Shary, S.P.
2018-01-01
Roč. 540, 1 March (2018), s. 149-159 ISSN 0024-3795 Institutional support: RVO:67985807 Keywords : interval matrix * regularity * singularity * P-matrix * absolute value equation * diagonally singilarizable matrix Subject RIV: BA - General Mathematics Impact factor: 0.973, year: 2016
Chaotic dynamics from interspike intervals
DEFF Research Database (Denmark)
Pavlov, A N; Sosnovtseva, Olga; Mosekilde, Erik
2001-01-01
Considering two different mathematical models describing chaotic spiking phenomena, namely, an integrate-and-fire and a threshold-crossing model, we discuss the problem of extracting dynamics from interspike intervals (ISIs) and show that the possibilities of computing the largest Lyapunov expone...
Mapping earthworm communities in Europe
DEFF Research Database (Denmark)
Rutgers, Michiel; Orgiazzi, Alberto; Gardi, Ciro
Existing data sets on earthworm communities in Europe were collected, harmonized, modelled and depicted on a soil biodiversity map of Europe. Digital Soil Mapping was applied using multiple regressions relating relatively low density earthworm community data to soil characteristics, land use...
Regression-based Multi-View Facial Expression Recognition
Rudovic, Ognjen; Patras, Ioannis; Pantic, Maja
2010-01-01
We present a regression-based scheme for multi-view facial expression recognition based on 2蚠D geometric features. We address the problem by mapping facial points (e.g. mouth corners) from non-frontal to frontal view where further recognition of the expressions can be performed using a
Experimental uncertainty estimation and statistics for data having interval uncertainty.
Energy Technology Data Exchange (ETDEWEB)
Kreinovich, Vladik (Applied Biomathematics, Setauket, New York); Oberkampf, William Louis (Applied Biomathematics, Setauket, New York); Ginzburg, Lev (Applied Biomathematics, Setauket, New York); Ferson, Scott (Applied Biomathematics, Setauket, New York); Hajagos, Janos (Applied Biomathematics, Setauket, New York)
2007-05-01
This report addresses the characterization of measurements that include epistemic uncertainties in the form of intervals. It reviews the application of basic descriptive statistics to data sets which contain intervals rather than exclusively point estimates. It describes algorithms to compute various means, the median and other percentiles, variance, interquartile range, moments, confidence limits, and other important statistics and summarizes the computability of these statistics as a function of sample size and characteristics of the intervals in the data (degree of overlap, size and regularity of widths, etc.). It also reviews the prospects for analyzing such data sets with the methods of inferential statistics such as outlier detection and regressions. The report explores the tradeoff between measurement precision and sample size in statistical results that are sensitive to both. It also argues that an approach based on interval statistics could be a reasonable alternative to current standard methods for evaluating, expressing and propagating measurement uncertainties.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Meaney, Christopher; Moineddin, Rahim
2014-01-24
In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
International Nuclear Information System (INIS)
McParland, C.; Bieser, F.
1984-01-01
The principal component of the Bevalac HISS facility is a large super-conducting 3 Tesla dipole. The facility's need for a large magnetic volume spectrometer resulted in a large gap geometry - a 2 meter pole tip diameter and a 1 meter pole gap. Obviously, the field required detailed mapping for effective use as a spectrometer. The mapping device was designed with several major features in mind. The device would measure field values on a grid which described a closed rectangular solid. The grid would be a regular with the exact measurement intervals adjustable by software. The device would function unattended over the long period of time required to complete a field map. During this time, the progress of the map could be monitored by anyone with access to the HISS VAX computer. Details of the mechanical, electrical, and control design follow
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
Gaussian process regression for tool wear prediction
Kong, Dongdong; Chen, Yongjie; Li, Ning
2018-05-01
To realize and accelerate the pace of intelligent manufacturing, this paper presents a novel tool wear assessment technique based on the integrated radial basis function based kernel principal component analysis (KPCA_IRBF) and Gaussian process regression (GPR) for real-timely and accurately monitoring the in-process tool wear parameters (flank wear width). The KPCA_IRBF is a kind of new nonlinear dimension-increment technique and firstly proposed for feature fusion. The tool wear predictive value and the corresponding confidence interval are both provided by utilizing the GPR model. Besides, GPR performs better than artificial neural networks (ANN) and support vector machines (SVM) in prediction accuracy since the Gaussian noises can be modeled quantitatively in the GPR model. However, the existence of noises will affect the stability of the confidence interval seriously. In this work, the proposed KPCA_IRBF technique helps to remove the noises and weaken its negative effects so as to make the confidence interval compressed greatly and more smoothed, which is conducive for monitoring the tool wear accurately. Moreover, the selection of kernel parameter in KPCA_IRBF can be easily carried out in a much larger selectable region in comparison with the conventional KPCA_RBF technique, which helps to improve the efficiency of model construction. Ten sets of cutting tests are conducted to validate the effectiveness of the presented tool wear assessment technique. The experimental results show that the in-process flank wear width of tool inserts can be monitored accurately by utilizing the presented tool wear assessment technique which is robust under a variety of cutting conditions. This study lays the foundation for tool wear monitoring in real industrial settings.
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available The stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In present study, we presented the Matlab program of stepwise regression.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighed sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure in a computer programme which was used to unfold test spectra obtained by mixing some spectra, from a library of arbitrary chosen spectra, and adding a noise component. (U.K.)
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Statistical intervals a guide for practitioners
Hahn, Gerald J
2011-01-01
Presents a detailed exposition of statistical intervals and emphasizes applications in industry. The discussion differentiates at an elementary level among different kinds of statistical intervals and gives instruction with numerous examples and simple math on how to construct such intervals from sample data. This includes confidence intervals to contain a population percentile, confidence intervals on probability of meeting specified threshold value, and prediction intervals to include observation in a future sample. Also has an appendix containing computer subroutines for nonparametric stati
SPLINE LINEAR REGRESSION USED FOR EVALUATING FINANCIAL ASSETS 1
Directory of Open Access Journals (Sweden)
Liviu GEAMBAŞU
2010-12-01
Full Text Available One of the most important preoccupations of financial markets participants was and still is the problem of determining more precise the trend of financial assets prices. For solving this problem there were written many scientific papers and were developed many mathematical and statistical models in order to better determine the financial assets price trend. If until recently the simple linear models were largely used due to their facile utilization, the financial crises that affected the world economy starting with 2008 highlight the necessity of adapting the mathematical models to variation of economy. A simple to use model but adapted to economic life realities is the spline linear regression. This type of regression keeps the continuity of regression function, but split the studied data in intervals with homogenous characteristics. The characteristics of each interval are highlighted and also the evolution of market over all the intervals, resulting reduced standard errors. The first objective of the article is the theoretical presentation of the spline linear regression, also referring to scientific national and international papers related to this subject. The second objective is applying the theoretical model to data from the Bucharest Stock Exchange
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as function of the small ball probability of the predictor and as function of the entropy of the set on which uniformity is obtained.
Dijets at large rapidity intervals
Pope, B G
2001-01-01
Inclusive diet production at large pseudorapidity intervals ( Delta eta ) between the two jets has been suggested as a regime for observing BFKL dynamics. We have measured the dijet cross section for large Delta eta in pp collisions at square root s = 1800 and 630 GeV using the DOE detector. The partonic cross section increases strongly with the size of Delta eta . The observed growth is even stronger than expected on the basis of BFKL resummation in the leading logarithmic approximation. The growth of the partonic cross section can be accommodated with an effective BFKL intercept of alpha /sub BFKL/(20 GeV) = 1.65 +or- 0.07.
Variational collocation on finite intervals
International Nuclear Information System (INIS)
Amore, Paolo; Cervantes, Mayra; Fernandez, Francisco M
2007-01-01
In this paper, we study a set of functions, defined on an interval of finite width, which are orthogonal and which reduce to the sinc functions when the appropriate limit is taken. We show that these functions can be used within a variational approach to obtain accurate results for a variety of problems. We have applied them to the interpolation of functions on finite domains and to the solution of the Schroedinger equation, and we have compared the performance of the present approach with others
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Comparing the performance of various digital soil mapping approaches to map physical soil properties
Laborczi, Annamária; Takács, Katalin; Pásztor, László
2015-04-01
Spatial information on physical soil properties is intensely expected, in order to support environmental related and land use management decisions. One of the most widely used properties to characterize soils physically is particle size distribution (PSD), which determines soil water management and cultivability. According to their size, different particles can be categorized as clay, silt, or sand. The size intervals are defined by national or international textural classification systems. The relative percentage of sand, silt, and clay in the soil constitutes textural classes, which are also specified miscellaneously in various national and/or specialty systems. The most commonly used is the classification system of the United States Department of Agriculture (USDA). Soil texture information is essential input data in meteorological, hydrological and agricultural prediction modelling. Although Hungary has a great deal of legacy soil maps and other relevant soil information, it often occurs, that maps do not exist on a certain characteristic with the required thematic and/or spatial representation. The recent developments in digital soil mapping (DSM), however, provide wide opportunities for the elaboration of object specific soil maps (OSSM) with predefined parameters (resolution, accuracy, reliability etc.). Due to the simultaneous richness of available Hungarian legacy soil data, spatial inference methods and auxiliary environmental information, there is a high versatility of possible approaches for the compilation of a given soil map. This suggests the opportunity of optimization. For the creation of an OSSM one might intend to identify the optimum set of soil data, method and auxiliary co-variables optimized for the resources (data costs, computation requirements etc.). We started comprehensive analysis of the effects of the various DSM components on the accuracy of the output maps on pilot areas. The aim of this study is to compare and evaluate different
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data open-quotes obtainedclose quotes from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Background stratified Poisson regression analysis of cohort data
International Nuclear Information System (INIS)
Richardson, David B.; Langholz, Bryan
2012-01-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. (orig.)
Some Characterizations of Convex Interval Games
Brânzei, R.; Tijs, S.H.; Alparslan-Gok, S.Z.
2008-01-01
This paper focuses on new characterizations of convex interval games using the notions of exactness and superadditivity. We also relate big boss interval games with concave interval games and obtain characterizations of big boss interval games in terms of exactness and subadditivity.
Interval-Censored Time-to-Event Data Methods and Applications
Chen, Ding-Geng
2012-01-01
Interval-Censored Time-to-Event Data: Methods and Applications collects the most recent techniques, models, and computational tools for interval-censored time-to-event data. Top biostatisticians from academia, biopharmaceutical industries, and government agencies discuss how these advances are impacting clinical trials and biomedical research. Divided into three parts, the book begins with an overview of interval-censored data modeling, including nonparametric estimation, survival functions, regression analysis, multivariate data analysis, competing risks analysis, and other models for interva
Assessment of deforestation using regression; Hodnotenie odlesnenia s vyuzitim regresie
Energy Technology Data Exchange (ETDEWEB)
Juristova, J. [Univerzita Komenskeho, Prirodovedecka fakulta, Katedra kartografie, geoinformatiky a DPZ, 84215 Bratislava (Slovakia)
2013-04-16
This work is devoted to the evaluation of deforestation using regression methods through software Idrisi Taiga. Deforestation is evaluated by the method of logistic regression. The dependent variable has discrete values '0' and '1', indicating that the deforestation occurred or not. Independent variables have continuous values, expressing the distance from the edge of the deforested areas of forests from urban areas, the river and the road network. The results were also used in predicting the probability of deforestation in subsequent periods. The result is a map showing the output probability of deforestation for the periods 1990/2000 and 200/2006 in accordance with predetermined coefficients (values of independent variables). (authors)
Lilly, P.; Yanai, R. D.; Buckley, H. L.; Case, B. S.; Woollons, R. C.; Holdaway, R. J.; Johnson, J.
2016-12-01
Calculations of forest biomass and elemental content require many measurements and models, each contributing uncertainty to the final estimates. While sampling error is commonly reported, based on replicate plots, error due to uncertainty in the regression used to estimate biomass from tree diameter is usually not quantified. Some published estimates of uncertainty due to the regression models have used the uncertainty in the prediction of individuals, ignoring uncertainty in the mean, while others have propagated uncertainty in the mean while ignoring individual variation. Using the simple case of the calcium concentration of sugar maple leaves, we compare the variation among individuals (the standard deviation) to the uncertainty in the mean (the standard error) and illustrate the declining importance in the prediction of individual concentrations as the number of individuals increases. For allometric models, the analogous statistics are the prediction interval (or the residual variation in the model fit) and the confidence interval (describing the uncertainty in the best fit model). The effect of propagating these two sources of error is illustrated using the mass of sugar maple foliage. The uncertainty in individual tree predictions was large for plots with few trees; for plots with 30 trees or more, the uncertainty in individuals was less important than the uncertainty in the mean. Authors of previously published analyses have reanalyzed their data to show the magnitude of these two sources of uncertainty in scales ranging from experimental plots to entire countries. The most correct analysis will take both sources of uncertainty into account, but for practical purposes, country-level reports of uncertainty in carbon stocks, as required by the IPCC, can ignore the uncertainty in individuals. Ignoring the uncertainty in the mean will lead to exaggerated estimates of confidence in estimates of forest biomass and carbon and nutrient contents.
Rossi, M.; Apuani, T.; Felletti, F.
2009-04-01
The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks croup out. On the eastern part of the studied basin a quaternary cover represented by colluvial and secondarily, by glacial deposits, is dominant. In this study 110 earth flows, mainly located toward NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m2. Both statistical methods require to establish a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value of main scarp central point of the landslide. Based on bibliographic review a total of 15 predisposing factors were utilized. The width of the intervals, in which the maps of the predisposing factors have to be reclassified, has been defined assuming constant intervals to: elevation (100 m), slope (5 °), solar radiation (0.1 MJ/cm2/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters have been used the results of the probability-probability plots analysis and the statistical indexes of landslides site. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷27265), Topographic Wetness Index 0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6,00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9
Spontaneous regression of pulmonary bullae
International Nuclear Information System (INIS)
Satoh, H.; Ishikawa, H.; Ohtsuka, M.; Sekizawa, K.
2002-01-01
The natural history of pulmonary bullae is often characterized by gradual, progressive enlargement. Spontaneous regression of bullae is, however, very rare. We report a case in which complete resolution of pulmonary bullae in the left upper lung occurred spontaneously. The management of pulmonary bullae is occasionally made difficult because of gradual progressive enlargement associated with abnormal pulmonary function. Some patients have multiple bulla in both lungs and/or have a history of pulmonary emphysema. Others have a giant bulla without emphysematous change in the lungs. Our present case had treated lung cancer with no evidence of local recurrence. He had no emphysematous change in lung function test and had no complaints, although the high resolution CT scan shows evidence of underlying minimal changes of emphysema. Ortin and Gurney presented three cases of spontaneous reduction in size of bulla. Interestingly, one of them had a marked decrease in the size of a bulla in association with thickening of the wall of the bulla, which was observed in our patient. This case we describe is of interest, not only because of the rarity with which regression of pulmonary bulla has been reported in the literature, but also because of the spontaneous improvements in the radiological picture in the absence of overt infection or tumor. Copyright (2002) Blackwell Science Pty Ltd
Quantum algorithm for linear regression
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly( log2(N ) ,d ,κ ,1 /ɛ ) , where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ɛ is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
,
2008-01-01
The U.S. Geological Survey (USGS) produced its first topographic map in 1879, the same year it was established. Today, more than 100 years and millions of map copies later, topographic mapping is still a central activity for the USGS. The topographic map remains an indispensable tool for government, science, industry, and leisure. Much has changed since early topographers traveled the unsettled West and carefully plotted the first USGS maps by hand. Advances in survey techniques, instrumentation, and design and printing technologies, as well as the use of aerial photography and satellite data, have dramatically improved mapping coverage, accuracy, and efficiency. Yet cartography, the art and science of mapping, may never before have undergone change more profound than today.
Robust misinterpretation of confidence intervals.
Hoekstra, Rink; Morey, Richard D; Rouder, Jeffrey N; Wagenmakers, Eric-Jan
2014-10-01
Null hypothesis significance testing (NHST) is undoubtedly the most common inferential technique used to justify claims in the social sciences. However, even staunch defenders of NHST agree that its outcomes are often misinterpreted. Confidence intervals (CIs) have frequently been proposed as a more useful alternative to NHST, and their use is strongly encouraged in the APA Manual. Nevertheless, little is known about how researchers interpret CIs. In this study, 120 researchers and 442 students-all in the field of psychology-were asked to assess the truth value of six particular statements involving different interpretations of a CI. Although all six statements were false, both researchers and students endorsed, on average, more than three statements, indicating a gross misunderstanding of CIs. Self-declared experience with statistics was not related to researchers' performance, and, even more surprisingly, researchers hardly outperformed the students, even though the students had not received any education on statistical inference whatsoever. Our findings suggest that many researchers do not know the correct interpretation of a CI. The misunderstandings surrounding p-values and CIs are particularly unfortunate because they constitute the main tools by which psychologists draw conclusions from data.
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Using the confidence interval confidently.
Hazra, Avijit
2017-10-01
Biomedical research is seldom done with entire populations but rather with samples drawn from a population. Although we work with samples, our goal is to describe and draw inferences regarding the underlying population. It is possible to use a sample statistic and estimates of error in the sample to get a fair idea of the population parameter, not as a single value, but as a range of values. This range is the confidence interval (CI) which is estimated on the basis of a desired confidence level. Calculation of the CI of a sample statistic takes the general form: CI = Point estimate ± Margin of error, where the margin of error is given by the product of a critical value (z) derived from the standard normal curve and the standard error of point estimate. Calculation of the standard error varies depending on whether the sample statistic of interest is a mean, proportion, odds ratio (OR), and so on. The factors affecting the width of the CI include the desired confidence level, the sample size and the variability in the sample. Although the 95% CI is most often used in biomedical research, a CI can be calculated for any level of confidence. A 99% CI will be wider than 95% CI for the same sample. Conflict between clinical importance and statistical significance is an important issue in biomedical research. Clinical importance is best inferred by looking at the effect size, that is how much is the actual change or difference. However, statistical significance in terms of P only suggests whether there is any difference in probability terms. Use of the CI supplements the P value by providing an estimate of actual clinical effect. Of late, clinical trials are being designed specifically as superiority, non-inferiority or equivalence studies. The conclusions from these alternative trial designs are based on CI values rather than the P value from intergroup comparison.
Direct Interval Forecasting of Wind Power
DEFF Research Database (Denmark)
Wan, Can; Xu, Zhao; Pinson, Pierre
2013-01-01
This letter proposes a novel approach to directly formulate the prediction intervals of wind power generation based on extreme learning machine and particle swarm optimization, where prediction intervals are generated through direct optimization of both the coverage probability and sharpness...
A note on birth interval distributions
International Nuclear Information System (INIS)
Shrestha, G.
1989-08-01
A considerable amount of work has been done regarding the birth interval analysis in mathematical demography. This paper is prepared with the intention of reviewing some probability models related to interlive birth intervals proposed by different researchers. (author). 14 refs
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2016-12-01
Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.
Optimal Data Interval for Estimating Advertising Response
Gerard J. Tellis; Philip Hans Franses
2006-01-01
The abundance of highly disaggregate data (e.g., at five-second intervals) raises the question of the optimal data interval to estimate advertising carryover. The literature assumes that (1) the optimal data interval is the interpurchase time, (2) too disaggregate data causes a disaggregation bias, and (3) recovery of true parameters requires assumption of the underlying advertising process. In contrast, we show that (1) the optimal data interval is what we call , (2) too disaggregate data do...
An Adequate First Order Logic of Intervals
DEFF Research Database (Denmark)
Chaochen, Zhou; Hansen, Michael Reichhardt
1998-01-01
This paper introduces left and right neighbourhoods as primitive interval modalities to define other unary and binary modalities of intervals in a first order logic with interval length. A complete first order logic for the neighbourhood modalities is presented. It is demonstrated how the logic can...... support formal specification and verification of liveness and fairness, and also of various notions of real analysis....
Consistency and refinement for Interval Markov Chains
DEFF Research Database (Denmark)
Delahaye, Benoit; Larsen, Kim Guldstrand; Legay, Axel
2012-01-01
Interval Markov Chains (IMC), or Markov Chains with probability intervals in the transition matrix, are the base of a classic specification theory for probabilistic systems [18]. The standard semantics of IMCs assigns to a specification the set of all Markov Chains that satisfy its interval...
Multivariate interval-censored survival data
DEFF Research Database (Denmark)
Hougaard, Philip
2014-01-01
Interval censoring means that an event time is only known to lie in an interval (L,R], with L the last examination time before the event, and R the first after. In the univariate case, parametric models are easily fitted, whereas for non-parametric models, the mass is placed on some intervals, de...
Determinants of LSIL Regression in Women from a Colombian Cohort
International Nuclear Information System (INIS)
Molano, Monica; Gonzalez, Mauricio; Gamboa, Oscar; Ortiz, Natasha; Luna, Joaquin; Hernandez, Gustavo; Posso, Hector; Murillo, Raul; Munoz, Nubia
2010-01-01
Objective: To analyze the role of Human Papillomavirus (HPV) and other risk factors in the regression of cervical lesions in women from the Bogota Cohort. Methods: 200 HPV positive women with abnormal cytology were included for regression analysis. The time of lesion regression was modeled using methods for interval censored survival time data. Median duration of total follow-up was 9 years. Results: 80 (40%) women were diagnosed with Atypical Squamous Cells of Undetermined Significance (ASCUS) or Atypical Glandular Cells of Undetermined Significance (AGUS) while 120 (60%) were diagnosed with Low Grade Squamous Intra-epithelial Lesions (LSIL). Globally, 40% of the lesions were still present at first year of follow up, while 1.5% was still present at 5 year check-up. The multivariate model showed similar regression rates for lesions in women with ASCUS/AGUS and women with LSIL (HR= 0.82, 95% CI 0.59-1.12). Women infected with HR HPV types and those with mixed infections had lower regression rates for lesions than did women infected with LR types (HR=0.526, 95% CI 0.33-0.84, for HR types and HR=0.378, 95% CI 0.20-0.69, for mixed infections). Furthermore, women over 30 years had a higher lesion regression rate than did women under 30 years (HR1.53, 95% CI 1.03-2.27). The study showed that the median time for lesion regression was 9 months while the median time for HPV clearance was 12 months. Conclusions: In the studied population, the type of infection and the age of the women are critical factors for the regression of cervical lesions.
Electrocardiographic Abnormalities and QTc Interval in Patients Undergoing Hemodialysis.
Directory of Open Access Journals (Sweden)
Yuxin Nie
abnormalities were found in this group. In multiple regression analyses, serum Ca2+ concentration before HD and LAD were independent variables of QTc interval prolongation. UA, ferritin, and interventricular septum were independent variables of ΔQTc.Prolonged QT interval is very common in HD patients and is associated with several risk factors. An appropriate concentration of dialysate electrolytes should be chosen depending on patients' clinical conditions.
DEFF Research Database (Denmark)
Salovaara-Moring, Inka
2016-01-01
practice. In particular, mapping environmental damage, endangered species, and human-made disasters has become one focal point for environmental knowledge production. This type of digital map has been highlighted as a processual turn in critical cartography, whereas in related computational journalism...... of a geo-visualization within information mapping that enhances embodiment in the experience of the information. InfoAmazonia is defined as a digitally created map-space within which journalistic practice can be seen as dynamic, performative interactions between journalists, ecosystems, space, and species...
Strange distributionally chaotic triangular maps
International Nuclear Information System (INIS)
Paganoni, L.; Smital, J.
2005-01-01
The notion of distributional chaos was introduced by Schweizer, Smital [Measures of chaos and a spectral decompostion of dynamical systems on the interval. Trans. Amer. Math. Soc. 344;1994:737-854] for continuous maps of the interval. For continuous maps of a compact metric space three mutually nonequivalent versions of distributional chaos, DC1-DC3, can be considered. In this paper we study distributional chaos in the class T m of triangular maps of the square which are monotone on the fibres; such maps must have zero topological entropy. The main results: (i) There is an F-bar T m such that F-bar DC2 and F vertical bar Rec(F)-bar DC3. (ii) If no ω-limit set of an F-bar T m contains two minimal subsets then F-bar DC1. This completes recent results obtained by Forti et al. [Dynamics of homeomorphisms on minimal sets generated by triangular mappings. Bull Austral Math Soc 59;1999:1-20], Smital, Stefankova [Distributional chaos for triangular maps, Chaos, Solitons and Fractals 21;2004:1125-8], and Balibrea et al. [The three versions of distributional chaos. Chaos, Solitons and Fractals 23;2005:1581-3]. The paper contributes to the solution of a long-standing open problem by Sharkovsky concerning classification of triangular maps
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Regularized Label Relaxation Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu
2018-04-01
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.
Hypotensive Response Magnitude and Duration in Hypertensives: Continuous and Interval Exercise
Directory of Open Access Journals (Sweden)
Raphael Santos Teodoro de Carvalho
2015-03-01
Full Text Available Background: Although exercise training is known to promote post-exercise hypotension, there is currently no consistent argument about the effects of manipulating its various components (intensity, duration, rest periods, types of exercise, training methods on the magnitude and duration of hypotensive response. Objective: To compare the effect of continuous and interval exercises on hypotensive response magnitude and duration in hypertensive patients by using ambulatory blood pressure monitoring (ABPM. Methods: The sample consisted of 20 elderly hypertensives. Each participant underwent three ABPM sessions: one control ABPM, without exercise; one ABPM after continuous exercise; and one ABPM after interval exercise. Systolic blood pressure (SBP, diastolic blood pressure (DBP, mean arterial pressure (MAP, heart rate (HR and double product (DP were monitored to check post-exercise hypotension and for comparison between each ABPM. Results: ABPM after continuous exercise and after interval exercise showed post-exercise hypotension and a significant reduction (p < 0.05 in SBP, DBP, MAP and DP for 20 hours as compared with control ABPM. Comparing ABPM after continuous and ABPM after interval exercise, a significant reduction (p < 0.05 in SBP, DBP, MAP and DP was observed in the latter. Conclusion: Continuous and interval exercise trainings promote post-exercise hypotension with reduction in SBP, DBP, MAP and DP in the 20 hours following exercise. Interval exercise training causes greater post-exercise hypotension and lower cardiovascular overload as compared with continuous exercise.
Visualizing the Logistic Map with a Microcontroller
Serna, Juan D.; Joshi, Amitabh
2012-01-01
The logistic map is one of the simplest nonlinear dynamical systems that clearly exhibits the route to chaos. In this paper, we explore the evolution of the logistic map using an open-source microcontroller connected to an array of light-emitting diodes (LEDs). We divide the one-dimensional domain interval [0,1] into ten equal parts, an associate…
Technology & Learning, 2005
2005-01-01
Concept maps are graphical ways of working with ideas and presenting information. They reveal patterns and relationships and help students to clarify their thinking, and to process, organize and prioritize. Displaying information visually--in concept maps, word webs, or diagrams--stimulates creativity. Being able to think logically teaches…
Bonellie, Sandra R
2012-10-01
To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there are still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994-2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 Standard deviations lower (95% confidence interval. 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived, (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby as important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
Directory of Open Access Journals (Sweden)
Maira S. Oliveira
2014-05-01
Full Text Available The electrocardiography (ECG QT interval is influenced by fluctuations in heart rate (HR what may lead to misinterpretation of its length. Considering that alterations in QT interval length reflect abnormalities of the ventricular repolarisation which predispose to occurrence of arrhythmias, this variable must be properly evaluated. The aim of this work is to determine which method of correcting the QT interval is the most appropriate for dogs regarding different ranges of normal HR (different breeds. Healthy adult dogs (n=130; German Shepherd, Boxer, Pit Bull Terrier, and Poodle were submitted to ECG examination and QT intervals were determined in triplicates from the bipolar limb II lead and corrected for the effects of HR through the application of three published formulae involving quadratic, cubic or linear regression. The mean corrected QT values (QTc obtained using the diverse formulae were significantly different (ρ<0.05, while those derived according to the equation QTcV = QT + 0.087(1- RR were the most consistent (linear regression. QTcV values were strongly correlated (r=0.83 with the QT interval and showed a coefficient of variation of 8.37% and a 95% confidence interval of 0.22-0.23 s. Owing to its simplicity and reliability, the QTcV was considered the most appropriate to be used for the correction of QT interval in dogs.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Quantifying uncertainty on sediment loads using bootstrap confidence intervals
Slaets, Johanna I. F.; Piepho, Hans-Peter; Schmitter, Petra; Hilger, Thomas; Cadisch, Georg
2017-01-01
Load estimates are more informative than constituent concentrations alone, as they allow quantification of on- and off-site impacts of environmental processes concerning pollutants, nutrients and sediment, such as soil fertility loss, reservoir sedimentation and irrigation channel siltation. While statistical models used to predict constituent concentrations have been developed considerably over the last few years, measures of uncertainty on constituent loads are rarely reported. Loads are the product of two predictions, constituent concentration and discharge, integrated over a time period, which does not make it straightforward to produce a standard error or a confidence interval. In this paper, a linear mixed model is used to estimate sediment concentrations. A bootstrap method is then developed that accounts for the uncertainty in the concentration and discharge predictions, allowing temporal correlation in the constituent data, and can be used when data transformations are required. The method was tested for a small watershed in Northwest Vietnam for the period 2010-2011. The results showed that confidence intervals were asymmetric, with the highest uncertainty in the upper limit, and that a load of 6262 Mg year-1 had a 95 % confidence interval of (4331, 12 267) in 2010 and a load of 5543 Mg an interval of (3593, 8975) in 2011. Additionally, the approach demonstrated that direct estimates from the data were biased downwards compared to bootstrap median estimates. These results imply that constituent loads predicted from regression-type water quality models could frequently be underestimating sediment yields and their environmental impact.
Analyzing hospitalization data: potential limitations of Poisson regression.
Weaver, Colin G; Ravani, Pietro; Oliver, Matthew J; Austin, Peter C; Quinn, Robert R
2015-08-01
Poisson regression is commonly used to analyze hospitalization data when outcomes are expressed as counts (e.g. number of days in hospital). However, data often violate the assumptions on which Poisson regression is based. More appropriate extensions of this model, while available, are rarely used. We compared hospitalization data between 206 patients treated with hemodialysis (HD) and 107 treated with peritoneal dialysis (PD) using Poisson regression and compared results from standard Poisson regression with those obtained using three other approaches for modeling count data: negative binomial (NB) regression, zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression. We examined the appropriateness of each model and compared the results obtained with each approach. During a mean 1.9 years of follow-up, 183 of 313 patients (58%) were never hospitalized (indicating an excess of 'zeros'). The data also displayed overdispersion (variance greater than mean), violating another assumption of the Poisson model. Using four criteria, we determined that the NB and ZINB models performed best. According to these two models, patients treated with HD experienced similar hospitalization rates as those receiving PD {NB rate ratio (RR): 1.04 [bootstrapped 95% confidence interval (CI): 0.49-2.20]; ZINB summary RR: 1.21 (bootstrapped 95% CI 0.60-2.46)}. Poisson and ZIP models fit the data poorly and had much larger point estimates than the NB and ZINB models [Poisson RR: 1.93 (bootstrapped 95% CI 0.88-4.23); ZIP summary RR: 1.84 (bootstrapped 95% CI 0.88-3.84)]. We found substantially different results when modeling hospitalization data, depending on the approach used. Our results argue strongly for a sound model selection process and improved reporting around statistical methods used for modeling count data. © The Author 2015. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Probability Distribution for Flowing Interval Spacing
International Nuclear Information System (INIS)
Kuzio, S.
2001-01-01
The purpose of this analysis is to develop a probability distribution for flowing interval spacing. A flowing interval is defined as a fractured zone that transmits flow in the Saturated Zone (SZ), as identified through borehole flow meter surveys (Figure 1). This analysis uses the term ''flowing interval spacing'' as opposed to fractured spacing, which is typically used in the literature. The term fracture spacing was not used in this analysis because the data used identify a zone (or a flowing interval) that contains fluid-conducting fractures but does not distinguish how many or which fractures comprise the flowing interval. The flowing interval spacing is measured between the midpoints of each flowing interval. Fracture spacing within the SZ is defined as the spacing between fractures, with no regard to which fractures are carrying flow. The Development Plan associated with this analysis is entitled, ''Probability Distribution for Flowing Interval Spacing'', (CRWMS M and O 2000a). The parameter from this analysis may be used in the TSPA SR/LA Saturated Zone Flow and Transport Work Direction and Planning Documents: (1) ''Abstraction of Matrix Diffusion for SZ Flow and Transport Analyses'' (CRWMS M and O 1999a) and (2) ''Incorporation of Heterogeneity in SZ Flow and Transport Analyses'', (CRWMS M and O 1999b). A limitation of this analysis is that the probability distribution of flowing interval spacing may underestimate the effect of incorporating matrix diffusion processes in the SZ transport model because of the possible overestimation of the flowing interval spacing. Larger flowing interval spacing results in a decrease in the matrix diffusion processes. This analysis may overestimate the flowing interval spacing because the number of fractures that contribute to a flowing interval cannot be determined from the data. Because each flowing interval probably has more than one fracture contributing to a flowing interval, the true flowing interval spacing could be
Correct Bayesian and frequentist intervals are similar
International Nuclear Information System (INIS)
Atwood, C.L.
1986-01-01
This paper argues that Bayesians and frequentists will normally reach numerically similar conclusions, when dealing with vague data or sparse data. It is shown that both statistical methodologies can deal reasonably with vague data. With sparse data, in many important practical cases Bayesian interval estimates and frequentist confidence intervals are approximately equal, although with discrete data the frequentist intervals are somewhat longer. This is not to say that the two methodologies are equally easy to use: The construction of a frequentist confidence interval may require new theoretical development. Bayesians methods typically require numerical integration, perhaps over many variables. Also, Bayesian can easily fall into the trap of over-optimism about their amount of prior knowledge. But in cases where both intervals are found correctly, the two intervals are usually not very different. (orig.)
Moss, Donald B
2006-01-01
The author uses the metaphor of mapping to illuminate a structural feature of racist thought, locating the degraded object along vertical and horizontal axes. These axes establish coordinates of hierarchy and of distance. With the coordinates in place, racist thought begins to seem grounded in natural processes. The other's identity becomes consolidated, and parochialism results. The use of this kind of mapping is illustrated via two patient vignettes. The author presents Freud's (1905, 1927) views in relation to such a "mapping" process, as well as Adorno's (1951) and Baldwin's (1965). Finally, the author conceptualizes the crucial status of primitivity in the workings of racist thought.
Directory of Open Access Journals (Sweden)
Anjan Mukherjee
2016-08-01
Full Text Available In this paper we introduce the concept of restricted interval valued neutrosophic sets (RIVNS in short. Some basic operations and properties of RIVNS are discussed. The concept of restricted interval valued neutrosophic topology is also introduced together with restricted interval valued neutrosophic finer and restricted interval valued neutrosophic coarser topology. We also define restricted interval valued neutrosophic interior and closer of a restricted interval valued neutrosophic set. Some theorems and examples are cites. Restricted interval valued neutrosophic subspace topology is also studied.
Fast metabolite identification with Input Output Kernel Regression
Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho
2016-01-01
Motivation: An important problematic of metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated to the output kernel. The second phase is a preimage problem, consisting in mapping back the predicted output feature vectors to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Availability and implementation: Contact: celine.brouard@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307628
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables.Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Conditional prediction intervals of wind power generation
DEFF Research Database (Denmark)
Pinson, Pierre; Kariniotakis, Georges
2010-01-01
A generic method for the providing of prediction intervals of wind power generation is described. Prediction intervals complement the more common wind power point forecasts, by giving a range of potential outcomes for a given probability, their so-called nominal coverage rate. Ideally they inform...... on the characteristics of prediction errors for providing conditional interval forecasts. By simultaneously generating prediction intervals with various nominal coverage rates, one obtains full predictive distributions of wind generation. Adapted resampling is applied here to the case of an onshore Danish wind farm...... to the case of a large number of wind farms in Europe and Australia among others is finally discussed....
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
... greatly advanced genetics research. The improved quality of genetic data has reduced the time required to identify a ... cases, a matter of months or even weeks. Genetic mapping data generated by the HGP's laboratories is freely accessible ...
Neural net generated seismic facies map and attribute facies map
International Nuclear Information System (INIS)
Addy, S.K.; Neri, P.
1998-01-01
The usefulness of 'seismic facies maps' in the analysis of an Upper Wilcox channel system in a 3-D survey shot by CGG in 1995 in Lavaca county in south Texas was discussed. A neural net-generated seismic facies map is a quick hydrocarbon exploration tool that can be applied regionally as well as on a prospect scale. The new technology is used to classify a constant interval parallel to a horizon in a 3-D seismic volume based on the shape of the wiggle traces using a neural network technology. The tool makes it possible to interpret sedimentary features of a petroleum deposit. The same technology can be used in regional mapping by making 'attribute facies maps' in which various forms of amplitude attributes, phase attributes or frequency attributes can be used
An introduction to using Bayesian linear regression with clinical data.
Baldwin, Scott A; Larson, Michael J
2017-11-01
Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Two SPSS programs for interpreting multiple regression results.
Lorenzo-Seva, Urbano; Ferrando, Pere J; Chico, Eliseo
2010-02-01
When multiple regression is used in explanation-oriented designs, it is very important to determine both the usefulness of the predictor variables and their relative importance. Standardized regression coefficients are routinely provided by commercial programs. However, they generally function rather poorly as indicators of relative importance, especially in the presence of substantially correlated predictors. We provide two user-friendly SPSS programs that implement currently recommended techniques and recent developments for assessing the relevance of the predictors. The programs also allow the user to take into account the effects of measurement error. The first program, MIMR-Corr.sps, uses a correlation matrix as input, whereas the second program, MIMR-Raw.sps, uses the raw data and computes bootstrap confidence intervals of different statistics. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from http://brm.psychonomic-journals.org/content/supplemental.
Pullin, A N; Pairis-Garcia, M D; Campbell, B J; Campler, M R; Proudfoot, K L
2017-11-01
When considering methodologies for collecting behavioral data, continuous sampling provides the most complete and accurate data set whereas instantaneous sampling can provide similar results and also increase the efficiency of data collection. However, instantaneous time intervals require validation to ensure accurate estimation of the data. Therefore, the objective of this study was to validate scan sampling intervals for lambs housed in a feedlot environment. Feeding, lying, standing, drinking, locomotion, and oral manipulation were measured on 18 crossbred lambs housed in an indoor feedlot facility for 14 h (0600-2000 h). Data from continuous sampling were compared with data from instantaneous scan sampling intervals of 5, 10, 15, and 20 min using a linear regression analysis. Three criteria determined if a time interval accurately estimated behaviors: 1) ≥ 0.90, 2) slope not statistically different from 1 ( > 0.05), and 3) intercept not statistically different from 0 ( > 0.05). Estimations for lying behavior were accurate up to 20-min intervals, whereas feeding and standing behaviors were accurate only at 5-min intervals (i.e., met all 3 regression criteria). Drinking, locomotion, and oral manipulation demonstrated poor associations () for all tested intervals. The results from this study suggest that a 5-min instantaneous sampling interval will accurately estimate lying, feeding, and standing behaviors for lambs housed in a feedlot, whereas continuous sampling is recommended for the remaining behaviors. This methodology will contribute toward the efficiency, accuracy, and transparency of future behavioral data collection in lamb behavior research.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected...... by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of \\(k\\)-nearest neighbors regression (\\(k\\)-NNR), and more generally, local polynomial kernel regression. Unlike \\(k\\)-NNR, however, SPARROW can adapt the number of regressors to use based...
Spontaneous regression of a congenital melanocytic nevus
Directory of Open Access Journals (Sweden)
Amiya Kumar Nath
2011-01-01
Full Text Available Congenital melanocytic nevus (CMN may rarely regress which may also be associated with a halo or vitiligo. We describe a 10-year-old girl who presented with CMN on the left leg since birth, which recently started to regress spontaneously with associated depigmentation in the lesion and at a distant site. Dermoscopy performed at different sites of the regressing lesion demonstrated loss of epidermal pigments first followed by loss of dermal pigments. Histopathology and Masson-Fontana stain demonstrated lymphocytic infiltration and loss of pigment production in the regressing area. Immunohistochemistry staining (S100 and HMB-45, however, showed that nevus cells were present in the regressing areas.
Predicting restoration of kidney function during CRRT-free intervals
Directory of Open Access Journals (Sweden)
Heise Daniel
2012-01-01
Full Text Available Abstract Background Renal failure is common in critically ill patients and frequently requires continuous renal replacement therapy (CRRT. CRRT is discontinued at regular intervals for routine changes of the disposable equipment or for replacing clogged filter membrane assemblies. The present study was conducted to determine if the necessity to continue CRRT could be predicted during the CRRT-free period. Materials and methods In the period from 2003 to 2006, 605 patients were treated with CRRT in our ICU. A total of 222 patients with 448 CRRT-free intervals had complete data sets and were used for analysis. Of the total CRRT-free periods, 225 served as an evaluation group. Twenty-nine parameters with an assumed influence on kidney function were analyzed with regard to their potential to predict the restoration of kidney function during the CRRT-free interval. Using univariate analysis and logistic regression, a prospective index was developed and validated in the remaining 223 CRRT-free periods to establish its prognostic strength. Results Only three parameters showed an independent influence on the restoration of kidney function during CRRT-free intervals: the number of previous CRRT cycles (medians in the two outcome groups: 1 vs. 2, the "Sequential Organ Failure Assessment"-score (means in the two outcome groups: 8.3 vs. 9.2 and urinary output after the cessation of CRRT (medians in two outcome groups: 66 ml/h vs. 10 ml/h. The prognostic index, which was calculated from these three variables, showed a satisfactory potential to predict the kidney function during the CRRT-free intervals; Receiver operating characteristic (ROC analysis revealed an area under the curve of 0.798. Conclusion Restoration of kidney function during CRRT-free periods can be predicted with an index calculated from three variables. Prospective trials in other hospitals must clarify whether our results are generally transferable to other patient populations.
Energy compensation after sprint- and high-intensity interval training.
Schubert, Matthew M; Palumbo, Elyse; Seay, Rebekah F; Spain, Katie K; Clarke, Holly E
2017-01-01
Many individuals lose less weight than expected in response to exercise interventions when considering the increased energy expenditure of exercise (ExEE). This is due to energy compensation in response to ExEE, which may include increases in energy intake (EI) and decreases in non-exercise physical activity (NEPA). We examined the degree of energy compensation in healthy young men and women in response to interval training. Data were examined from a prior study in which 24 participants (mean age, BMI, & VO2max = 28 yrs, 27.7 kg•m-2, and 32 mL∙kg-1∙min-1) completed either 4 weeks of sprint-interval training or high-intensity interval training. Energy compensation was calculated from changes in body composition (air displacement plethysmography) and exercise energy expenditure was calculated from mean heart rate based on the heart rate-VO2 relationship. Differences between high (≥ 100%) and low (high levels of energy compensation gained fat mass, lost fat-free mass, and had lower change scores for VO2max and NEPA. Linear regression results indicated that lower levels of energy compensation were associated with increases in ΔVO2max (p interval training. In agreement with prior work, increases in ΔVO2max and ΔNEPA were associated with lower energy compensation. Future studies should focus on identifying if a dose-response relationship for energy compensation exists in response to interval training, and what underlying mechanisms and participant traits contribute to the large variation between individuals.
Intermediate and advanced topics in multilevel logistic regression analysis.
Austin, Peter C; Merlo, Juan
2017-09-10
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R 2 measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Interpretation of Confidence Interval Facing the Conflict
Andrade, Luisa; Fernández, Felipe
2016-01-01
As literature has reported, it is usual that university students in statistics courses, and even statistics teachers, interpret the confidence level associated with a confidence interval as the probability that the parameter value will be between the lower and upper interval limits. To confront this misconception, class activities have been…
Interval logic. Proof theory and theorem proving
DEFF Research Database (Denmark)
Rasmussen, Thomas Marthedal
2002-01-01
of a direction of an interval, and present a sound and complete Hilbert proof system for it. Because of its generality, SIL can conveniently act as a general formalism in which other interval logics can be encoded. We develop proof theory for SIL including both a sequent calculus system and a labelled natural...
Risk factors for QTc interval prolongation
Heemskerk, Charlotte P.M.; Pereboom, Marieke; van Stralen, Karlijn; Berger, Florine A.; van den Bemt, Patricia M.L.A.; Kuijper, Aaf F.M.; van der Hoeven, Ruud T M; Mantel-Teeuwisse, Aukje K.; Becker, Matthijs L
2018-01-01
Purpose: Prolongation of the QTc interval may result in Torsade de Pointes, a ventricular arrhythmia. Numerous risk factors for QTc interval prolongation have been described, including the use of certain drugs. In clinical practice, there is much debate about the management of the risks involved. In
Interval Forecast for Smooth Transition Autoregressive Model ...
African Journals Online (AJOL)
In this paper, we propose a simple method for constructing interval forecast for smooth transition autoregressive (STAR) model. This interval forecast is based on bootstrapping the residual error of the estimated STAR model for each forecast horizon and computing various Akaike information criterion (AIC) function. This new ...
Confidence Interval Approximation For Treatment Variance In ...
African Journals Online (AJOL)
In a random effects model with a single factor, variation is partitioned into two as residual error variance and treatment variance. While a confidence interval can be imposed on the residual error variance, it is not possible to construct an exact confidence interval for the treatment variance. This is because the treatment ...
New interval forecast for stationary autoregressive models ...
African Journals Online (AJOL)
In this paper, we proposed a new forecasting interval for stationary Autoregressive, AR(p) models using the Akaike information criterion (AIC) function. Ordinarily, the AIC function is used to determine the order of an AR(p) process. In this study however, AIC forecast interval compared favorably with the theoretical forecast ...
Is Posidonia oceanica regression a general feature in the Mediterranean Sea?
Directory of Open Access Journals (Sweden)
M. BONACORSI
2013-03-01
Full Text Available Over the last few years, a widespread regression of Posidonia oceanica meadows has been noticed in the Mediterranean Sea. However, the magnitude of this decline is still debated. The objectives of this study are (i to assess the spatio-temporal evolution of Posidonia oceanica around Cap Corse (Corsica over time comparing available ancient maps (from 1960 with a new (2011 detailed map realized combining different techniques (aerial photographs, SSS, ROV, scuba diving; (ii evaluate the reliability of ancient maps; (iii discuss observed regression of the meadows in relation to human pressure along the 110 km of coast. Thus, the comparison with previous data shows that, apart from sites clearly identified with the actual evolution, there is a relative stability of the surfaces occupied by the seagrass Posidonia oceanica. The recorded differences seem more related to changes in mapping techniques. These results confirm that in areas characterized by a moderate anthropogenic impact, the Posidonia oceanica meadow has no significant regression and that the changes due to the evolution of mapping techniques are not negligible. However, others facts should be taken into account before extrapolating to the Mediterranean Sea (e.g. actually mapped surfaces and assessing the amplitude of the actual regression.
The number of subjects per variable required in linear regression analyses.
Austin, Peter C; Steyerberg, Ewout W
2015-06-01
To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R(2) of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R(2), although adjusted R(2) estimates behaved well. The bias in estimating the model R(2) statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Hypotensive response magnitude and duration in hypertensives: continuous and interval exercise.
Carvalho, Raphael Santos Teodoro de; Pires, Cássio Mascarenhas Robert; Junqueira, Gustavo Cardoso; Freitas, Dayana; Marchi-Alves, Leila Maria
2015-03-01
Although exercise training is known to promote post-exercise hypotension, there is currently no consistent argument about the effects of manipulating its various components (intensity, duration, rest periods, types of exercise, training methods) on the magnitude and duration of hypotensive response. To compare the effect of continuous and interval exercises on hypotensive response magnitude and duration in hypertensive patients by using ambulatory blood pressure monitoring (ABPM). The sample consisted of 20 elderly hypertensives. Each participant underwent three ABPM sessions: one control ABPM, without exercise; one ABPM after continuous exercise; and one ABPM after interval exercise. Systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), heart rate (HR) and double product (DP) were monitored to check post-exercise hypotension and for comparison between each ABPM. ABPM after continuous exercise and after interval exercise showed post-exercise hypotension and a significant reduction (p ABPM. Comparing ABPM after continuous and ABPM after interval exercise, a significant reduction (p < 0.05) in SBP, DBP, MAP and DP was observed in the latter. Continuous and interval exercise trainings promote post-exercise hypotension with reduction in SBP, DBP, MAP and DP in the 20 hours following exercise. Interval exercise training causes greater post-exercise hypotension and lower cardiovascular overload as compared with continuous exercise.
Seafloor mapping of large areas using multibeam system - Indian experience
Digital Repository Service at National Institute of Oceanography (India)
Kodagali, V.N.; KameshRaju, K.A; Ramprasad, T.
averaged and merged to produce large area maps. Maps were generated in the scale of 1 mil. and 1.5 mil covering area of about 2 mil. sq.km in single map. Also, depth contour interval were generated. A computer program was developed to convert the depth data...
A Comparison of Advanced Regression Algorithms for Quantifying Urban Land Cover
Directory of Open Access Journals (Sweden)
Akpona Okujeni
2014-07-01
Full Text Available Quantitative methods for mapping sub-pixel land cover fractions are gaining increasing attention, particularly with regard to upcoming hyperspectral satellite missions. We evaluated five advanced regression algorithms combined with synthetically mixed training data for quantifying urban land cover from HyMap data at 3.6 and 9 m spatial resolution. Methods included support vector regression (SVR, kernel ridge regression (KRR, artificial neural networks (NN, random forest regression (RFR and partial least squares regression (PLSR. Our experiments demonstrate that both kernel methods SVR and KRR yield high accuracies for mapping complex urban surface types, i.e., rooftops, pavements, grass- and tree-covered areas. SVR and KRR models proved to be stable with regard to the spatial and spectral differences between both images and effectively utilized the higher complexity of the synthetic training mixtures for improving estimates for coarser resolution data. Observed deficiencies mainly relate to known problems arising from spectral similarities or shadowing. The remaining regressors either revealed erratic (NN or limited (RFR and PLSR performances when comprehensively mapping urban land cover. Our findings suggest that the combination of kernel-based regression methods, such as SVR and KRR, with synthetically mixed training data is well suited for quantifying urban land cover from imaging spectrometer data at multiple scales.
Expressing Intervals in Automated Service Negotiation
Clark, Kassidy P.; Warnier, Martijn; van Splunter, Sander; Brazier, Frances M. T.
During automated negotiation of services between autonomous agents, utility functions are used to evaluate the terms of negotiation. These terms often include intervals of values which are prone to misinterpretation. It is often unclear if an interval embodies a continuum of real numbers or a subset of natural numbers. Furthermore, it is often unclear if an agent is expected to choose only one value, multiple values, a sub-interval or even multiple sub-intervals. Additional semantics are needed to clarify these issues. Normally, these semantics are stored in a domain ontology. However, ontologies are typically domain specific and static in nature. For dynamic environments, in which autonomous agents negotiate resources whose attributes and relationships change rapidly, semantics should be made explicit in the service negotiation. This paper identifies issues that are prone to misinterpretation and proposes a notation for expressing intervals. This notation is illustrated using an example in WS-Agreement.
Reviewing interval cancers: Time well spent?
International Nuclear Information System (INIS)
Gower-Thomas, Kate; Fielder, Hilary M.P.; Branston, Lucy; Greening, Sarah; Beer, Helen; Rogers, Cerilan
2002-01-01
OBJECTIVES: To categorize interval cancers, and thus identify false-negatives, following prevalent and incident screens in the Welsh breast screening programme. SETTING: Breast Test Wales (BTW) Llandudno, Cardiff and Swansea breast screening units. METHODS: Five hundred and sixty interval breast cancers identified following negative mammographic screening between 1989 and 1997 were reviewed by eight screening radiologists. The blind review was achieved by mixing the screening films of women who subsequently developed an interval cancer with screen negative films of women who did not develop cancer, in a ratio of 4:1. Another radiologist used patients' symptomatic films to record a reference against which the reviewers' reports of the screening films were compared. Interval cancers were categorized as 'true', 'occult', 'false-negative' or 'unclassified' interval cancers or interval cancers with minimal signs, based on the National Health Service breast screening programme (NHSBSP) guidelines. RESULTS: Of the classifiable interval films, 32% were false-negatives, 55% were true intervals and 12% occult. The proportion of false-negatives following incident screens was half that following prevalent screens (P = 0.004). Forty percent of the seed films were recalled by the panel. CONCLUSIONS: Low false-negative interval cancer rates following incident screens (18%) versus prevalent screens (36%) suggest that lower cancer detection rates at incident screens may have resulted from fewer cancers than expected being present, rather than from a failure to detect tumours. The panel method for categorizing interval cancers has significant flaws as the results vary markedly with different protocol and is no more accurate than other, quicker and more timely methods. Gower-Thomas, K. et al. (2002)
Stochastic development regression on non-linear manifolds
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
We introduce a regression model for data on non-linear manifolds. The model describes the relation between a set of manifold valued observations, such as shapes of anatomical objects, and Euclidean explanatory variables. The approach is based on stochastic development of Euclidean diffusion...... processes to the manifold. Defining the data distribution as the transition distribution of the mapped stochastic process, parameters of the model, the non-linear analogue of design matrix and intercept, are found via maximum likelihood. The model is intrinsically related to the geometry encoded...
Modelling and mapping the topsoil organic carbon content for Tanzania
Kempen, Bas; Kaaya, Abel; Ngonyani Mhaiki, Consolatha; Kiluvia, Shani; Ruiperez-Gonzalez, Maria; Batjes, Niels; Dalsgaard, Soren
2014-05-01
Soil organic carbon (SOC), held in soil organic matter, is a key indicator of soil health and plays an important role in the global carbon cycle. The soil can act as a net source or sink of carbon depending on land use and management. Deforestation and forest degradation lead to the release of vast amounts of carbon from the soil in the form of greenhouse gasses, especially in tropical countries. Tanzania has a high deforestation rate: it is estimated that the country loses 1.1% of its total forested area annually. During 2010-2013 Tanzania has been a pilot country under the UN-REDD programme. This programme has supported Tanzania in its initial efforts towards reducing greenhouse gas emission from forest degradation and deforestation and towards preserving soil carbon stocks. Formulation and implementation of the national REDD strategy requires detailed information on the five carbon pools among these the SOC pool. The spatial distribution of SOC contents and stocks was not available for Tanzania. The initial aim of this research, was therefore to develop high-resolution maps of the SOC content for the country. The mapping exercise was carried out in a collaborative effort with four Tanzanian institutes and data from the Africa Soil Information Service initiative (AfSIS). The mapping exercise was provided with over 3200 field observations on SOC from four sources; this is the most comprehensive soil dataset collected in Tanzania so far. The main source of soil samples was the National Forest Monitoring and Assessment (NAFORMA). The carbon maps were generated by means of digital soil mapping using regression-kriging. Maps at 250 m spatial resolution were developed for four depth layers: 0-10 cm, 10-20 cm, 20-30 cm, and 0-30 cm. A total of 37 environmental GIS data layers were prepared for use as covariates in the regression model. These included vegetation indices, terrain parameters, surface temperature, spectral reflectances, a land cover map and a small
Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression
Yang, Aiyuan; Yan, Chunxia; Zhu, Feng; Zhao, Zhongmeng; Cao, Zhi
2013-01-01
Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds. PMID:23984382
Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression
Directory of Open Access Journals (Sweden)
Xuanping Zhang
2013-01-01
Full Text Available Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR, which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds.
CIMP status of interval colon cancers: another piece to the puzzle.
Arain, Mustafa A; Sawhney, Mandeep; Sheikh, Shehla; Anway, Ruth; Thyagarajan, Bharat; Bond, John H; Shaukat, Aasma
2010-05-01
Colon cancers diagnosed in the interval after a complete colonoscopy may occur due to limitations of colonoscopy or due to the development of new tumors, possibly reflecting molecular and environmental differences in tumorigenesis resulting in rapid tumor growth. In a previous study from our group, interval cancers (colon cancers diagnosed within 5 years of a complete colonoscopy) were almost four times more likely to demonstrate microsatellite instability (MSI) than non-interval cancers. In this study we extended our molecular analysis to compare the CpG island methylator phenotype (CIMP) status of interval and non-interval colorectal cancers and investigate the relationship between the CIMP and MSI pathways in the pathogenesis of interval cancers. We searched our institution's cancer registry for interval cancers, defined as colon cancers that developed within 5 years of a complete colonoscopy. These were frequency matched in a 1:2 ratio by age and sex to patients with non-interval cancers (defined as colon cancers diagnosed on a patient's first recorded colonoscopy). Archived cancer specimens for all subjects were retrieved and tested for CIMP gene markers. The MSI status of subjects identified between 1989 and 2004 was known from our previous study. Tissue specimens of newly identified cases and controls (between 2005 and 2006) were tested for MSI. There were 1,323 cases of colon cancer diagnosed over the 17-year study period, of which 63 were identified as having interval cancer and matched to 131 subjects with non-interval cancer. Study subjects were almost all Caucasian men. CIMP was present in 57% of interval cancers compared to 33% of non-interval cancers (P=0.004). As shown previously, interval cancers were more likely than non-interval cancers to occur in the proximal colon (63% vs. 39%; P=0.002), and have MSI 29% vs. 11%, P=0.004). In multivariable logistic regression model, proximal location (odds ratio (OR) 1.85; 95% confidence interval (CI) 1
Aagten-Murphy, David; Cappagli, Giulia; Burr, David
2014-03-01
Expert musicians are able to time their actions accurately and consistently during a musical performance. We investigated how musical expertise influences the ability to reproduce auditory intervals and how this generalises across different techniques and sensory modalities. We first compared various reproduction strategies and interval length, to examine the effects in general and to optimise experimental conditions for testing the effect of music, and found that the effects were robust and consistent across different paradigms. Focussing on a 'ready-set-go' paradigm subjects reproduced time intervals drawn from distributions varying in total length (176, 352 or 704 ms) or in the number of discrete intervals within the total length (3, 5, 11 or 21 discrete intervals). Overall, Musicians performed more veridical than Non-Musicians, and all subjects reproduced auditory-defined intervals more accurately than visually-defined intervals. However, Non-Musicians, particularly with visual stimuli, consistently exhibited a substantial and systematic regression towards the mean interval. When subjects judged intervals from distributions of longer total length they tended to regress more towards the mean, while the ability to discriminate between discrete intervals within the distribution had little influence on subject error. These results are consistent with a Bayesian model that minimizes reproduction errors by incorporating a central tendency prior weighted by the subject's own temporal precision relative to the current distribution of intervals. Finally a strong correlation was observed between all durations of formal musical training and total reproduction errors in both modalities (accounting for 30% of the variance). Taken together these results demonstrate that formal musical training improves temporal reproduction, and that this improvement transfers from audition to vision. They further demonstrate the flexibility of sensorimotor mechanisms in adapting to
INTERVAL OBSERVER FOR A BIOLOGICAL REACTOR MODEL
Directory of Open Access Journals (Sweden)
T. A. Kharkovskaia
2014-05-01
Full Text Available The method of an interval observer design for nonlinear systems with parametric uncertainties is considered. The interval observer synthesis problem for systems with varying parameters consists in the following. If there is the uncertainty restraint for the state values of the system, limiting the initial conditions of the system and the set of admissible values for the vector of unknown parameters and inputs, the interval existence condition for the estimations of the system state variables, containing the actual state at a given time, needs to be held valid over the whole considered time segment as well. Conditions of the interval observers design for the considered class of systems are shown. They are: limitation of the input and state, the existence of a majorizing function defining the uncertainty vector for the system, Lipschitz continuity or finiteness of this function, the existence of an observer gain with the suitable Lyapunov matrix. The main condition for design of such a device is cooperativity of the interval estimation error dynamics. An individual observer gain matrix selection problem is considered. In order to ensure the property of cooperativity for interval estimation error dynamics, a static transformation of coordinates is proposed. The proposed algorithm is demonstrated by computer modeling of the biological reactor. Possible applications of these interval estimation systems are the spheres of robust control, where the presence of various types of uncertainties in the system dynamics is assumed, biotechnology and environmental systems and processes, mechatronics and robotics, etc.
DEFF Research Database (Denmark)
Dehlholm, Christian; Brockhoff, Per B.; Bredie, Wender Laurentius Petrus
2012-01-01
by the practical testing environment. As a result of the changes, a reasonable assumption would be to question the consequences caused by the variations in method procedures. Here, the aim is to highlight the proven or hypothetic consequences of variations of Projective Mapping. Presented variations will include...... instructions and influence heavily the product placements and the descriptive vocabulary (Dehlholm et.al., 2012b). The type of assessors performing the method influences results with an extra aspect in Projective Mapping compared to more analytical tests, as the given spontaneous perceptions are much dependent......Projective Mapping (Risvik et.al., 1994) and its Napping (Pagès, 2003) variations have become increasingly popular in the sensory field for rapid collection of spontaneous product perceptions. It has been applied in variations which sometimes are caused by the purpose of the analysis and sometimes...
DEFF Research Database (Denmark)
Kirkeby, Carsten Thure; Hisham Beshara Halasa, Tariq; Gussmann, Maya Katrin
2017-01-01
the transmission rate. We use data from the two simulation models and vary the sampling intervals and the size of the population sampled. We devise two new methods to determine transmission rate, and compare these to the frequently used Poisson regression method in both epidemic and endemic situations. For most...... tested scenarios these new methods perform similar or better than Poisson regression, especially in the case of long sampling intervals. We conclude that transmission rate estimates are easily biased, which is important to take into account when using these rates in simulation models....
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Shephard, N.
2004-01-01
This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing...... the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities....
Statistical methods in regression and calibration analysis of chromosome aberration data
International Nuclear Information System (INIS)
Merkle, W.
1983-01-01
The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves methods for calculating confidence intervals on dose from aberration yield are described and compared, and, for the linear quadratic model a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)
International Nuclear Information System (INIS)
Lee, S.K.; Vilela, P.; Willinsky, R.; TerBrugge, K.G.
2002-01-01
Spontaneous regression of cerebral arteriovenous malformation (AVM) is rare and poorly understood. We reviewed the clinical and angiographic findings in patients who had spontaneous regression of cerebral AVMs to determine whether common features were present. The clinical and angiographic findings of four cases from our series and 29 cases from the literature were retrospectively reviewed. The clinical and angiographic features analyzed were: age at diagnosis, initial presentation, venous drainage pattern, number of draining veins, location of the AVM, number of arterial feeders, clinical events during the interval period to thrombosis, and interval period to spontaneous thrombosis. Common clinical and angiographic features of spontaneous regression of cerebral AVMs are: intracranial hemorrhage as an initial presentation, small AVMs, and a single draining vein. Spontaneous regression of cerebral AVMs can not be predicted by clinical or angiographic features, therefore it should not be considered as an option in cerebral AVM management, despite its proven occurrence. (orig.)
Energy compensation after sprint- and high-intensity interval training.
Directory of Open Access Journals (Sweden)
Matthew M Schubert
Full Text Available Many individuals lose less weight than expected in response to exercise interventions when considering the increased energy expenditure of exercise (ExEE. This is due to energy compensation in response to ExEE, which may include increases in energy intake (EI and decreases in non-exercise physical activity (NEPA. We examined the degree of energy compensation in healthy young men and women in response to interval training.Data were examined from a prior study in which 24 participants (mean age, BMI, & VO2max = 28 yrs, 27.7 kg•m-2, and 32 mL∙kg-1∙min-1 completed either 4 weeks of sprint-interval training or high-intensity interval training. Energy compensation was calculated from changes in body composition (air displacement plethysmography and exercise energy expenditure was calculated from mean heart rate based on the heart rate-VO2 relationship. Differences between high (≥ 100% and low (< 100% levels of energy compensation were assessed. Linear regressions were utilized to determine associations between energy compensation and ΔVO2max, ΔEI, ΔNEPA, and Δresting metabolic rate.Very large individual differences in energy compensation were noted. In comparison to individuals with low levels of compensation, individuals with high levels of energy compensation gained fat mass, lost fat-free mass, and had lower change scores for VO2max and NEPA. Linear regression results indicated that lower levels of energy compensation were associated with increases in ΔVO2max (p < 0.001 and ΔNEPA (p < 0.001.Considerable variation exists in response to short-term, low dose interval training. In agreement with prior work, increases in ΔVO2max and ΔNEPA were associated with lower energy compensation. Future studies should focus on identifying if a dose-response relationship for energy compensation exists in response to interval training, and what underlying mechanisms and participant traits contribute to the large variation between individuals.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
DEFF Research Database (Denmark)
Salovaara-Moring, Inka
. In particular, mapping environmental damage, endangered species, and human made disasters has become one of the focal point of affective knowledge production. These ‘more-than-humangeographies’ practices include notions of species, space and territory, and movement towards a new political ecology. This type...... of digital cartographies has been highlighted as the ‘processual turn’ in critical cartography, whereas in related computational journalism it can be seen as an interactive and iterative process of mapping complex and fragile ecological developments. This paper looks at computer-assisted cartography as part...
Advanced Interval Management: A Benefit Analysis
Timer, Sebastian; Peters, Mark
2016-01-01
This document is the final report for the NASA Langley Research Center (LaRC)- sponsored task order 'Possible Benefits for Advanced Interval Management Operations.' Under this research project, Architecture Technology Corporation performed an analysis to determine the maximum potential benefit to be gained if specific Advanced Interval Management (AIM) operations were implemented in the National Airspace System (NAS). The motivation for this research is to guide NASA decision-making on which Interval Management (IM) applications offer the most potential benefit and warrant further research.
Generalized production planning problem under interval uncertainty
Directory of Open Access Journals (Sweden)
Samir A. Abass
2010-06-01
Full Text Available Data in many real life engineering and economical problems suffer from inexactness. Herein we assume that we are given some intervals in which the data can simultaneously and independently perturb. We consider the generalized production planning problem with interval data. The interval data are in both of the objective function and constraints. The existing results concerning the qualitative and quantitative analysis of basic notions in parametric production planning problem. These notions are the set of feasible parameters, the solvability set and the stability set of the first kind.
Reconstruction of dynamical systems from interspike intervals
International Nuclear Information System (INIS)
Sauer, T.
1994-01-01
Attractor reconstruction from interspike interval (ISI) data is described, in rough analogy with Taken's theorem for attractor reconstruction from time series. Assuming a generic integrate-and-fire model coupling the dynamical system to the spike train, there is a one-to-one correspondence between the system states and interspike interval vectors of sufficiently large dimension. The correspondence has an important implication: interspike intervals can be forecast from past history. We show that deterministically driven ISI series can be distinguished from stochastically driven ISI series on the basis of prediction error
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application is described of an autoregression model as the simplest regression model of diagnostic signals in experimental analysis of diagnostic systems, in in-service monitoring of normal and anomalous conditions and their diagnostics. The method of diagnostics is described using a regression type diagnostic data base and regression spectral diagnostics. The diagnostics is described of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
Ole E. Barndorff-Nielsen; Neil Shephard
2002-01-01
This paper analyses multivariate high frequency financial data using realised covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis and covariance. It will be based on a fixed interval of time (e.g. a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions and covariances change through time. In particular w...
DEFF Research Database (Denmark)
van Haelst, Ingrid M M; van Klei, Wilton A; Doodeman, Hieronymus J
2014-01-01
OBJECTIVE: To investigate the association between the use of a selective serotonin reuptake inhibitor (SSRI) and the occurrence of QT interval prolongation in an elderly surgical population. METHOD: A cross-sectional study was conducted among patients (> 60 years) scheduled for outpatient...... preanesthesia evaluation in the period 2007 until 2012. The index group included elderly users of an SSRI. The reference group of nonusers of antidepressants was matched to the index group on sex and year of scheduled surgery (ratio, 1:1). The primary outcome was the occurrence of QT interval prolongation shown...... on electrocardiogram. The QT interval was corrected for heart rate (QTc interval). The secondary outcome was the duration of the QTc interval. The outcomes were adjusted for confounding by using regression techniques. RESULTS: The index and reference groups included 397 users of an SSRI and 397 nonusers, respectively...
Modeling Relationships Between Flight Crew Demographics and Perceptions of Interval Management
Remy, Benjamin; Wilson, Sara R.
2016-01-01
The Interval Management Alternative Clearances (IMAC) human-in-the-loop simulation experiment was conducted to assess interval management system performance and participants' acceptability and workload while performing three interval management clearance types. Twenty-four subject pilots and eight subject controllers flew ten high-density arrival scenarios into Denver International Airport during two weeks of data collection. This analysis examined the possible relationships between subject pilot demographics on reported perceptions of interval management in IMAC. Multiple linear regression models were created with a new software tool to predict subject pilot questionnaire item responses from demographic information. General patterns were noted across models that may indicate flight crew demographics influence perceptions of interval management.
Intact interval timing in circadian CLOCK mutants.
Cordes, Sara; Gallistel, C R
2008-08-28
While progress has been made in determining the molecular basis for the circadian clock, the mechanism by which mammalian brains time intervals measured in seconds to minutes remains a mystery. An obvious question is whether the interval-timing mechanism shares molecular machinery with the circadian timing mechanism. In the current study, we trained circadian CLOCK +/- and -/- mutant male mice in a peak-interval procedure with 10 and 20-s criteria. The mutant mice were more active than their wild-type littermates, but there were no reliable deficits in the accuracy or precision of their timing as compared with wild-type littermates. This suggests that expression of the CLOCK protein is not necessary for normal interval timing.
Nonparametric functional mapping of quantitative trait loci.
Yang, Jie; Wu, Rongling; Casella, George
2009-03-01
Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.
International Nuclear Information System (INIS)
2012-01-01
This report explains the energetic map of Uruguay as well as the different systems that delimits political frontiers in the region. The electrical system importance is due to the electricity, oil and derived , natural gas, potential study, biofuels, wind and solar energy
Speckmann, B.; Verbeek, K.A.B.
2010-01-01
Statistical data associated with geographic regions is nowadays globally available in large amounts and hence automated methods to visually display these data are in high demand. There are several well-established thematic map types for quantitative data on the ratio-scale associated with regions:
DEFF Research Database (Denmark)
Salovaara-Moring, Inka
towards a new political ecology. This type of digital cartographies has been highlighted as the ‘processual turn’ in critical cartography, whereas in related computational journalism it can be seen as an interactive and iterative process of mapping complex and fragile ecological developments. This paper...
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Recurrence interval analysis of trading volumes.
Ren, Fei; Zhou, Wei-Xing
2010-06-01
We study the statistical properties of the recurrence intervals τ between successive trading volumes exceeding a certain threshold q. The recurrence interval analysis is carried out for the 20 liquid Chinese stocks covering a period from January 2000 to May 2009, and two Chinese indices from January 2003 to April 2009. Similar to the recurrence interval distribution of the price returns, the tail of the recurrence interval distribution of the trading volumes follows a power-law scaling, and the results are verified by the goodness-of-fit tests using the Kolmogorov-Smirnov (KS) statistic, the weighted KS statistic and the Cramér-von Mises criterion. The measurements of the conditional probability distribution and the detrended fluctuation function show that both short-term and long-term memory effects exist in the recurrence intervals between trading volumes. We further study the relationship between trading volumes and price returns based on the recurrence interval analysis method. It is found that large trading volumes are more likely to occur following large price returns, and the comovement between trading volumes and price returns is more pronounced for large trading volumes.
Interval Size and Affect: An Ethnomusicological Perspective
Directory of Open Access Journals (Sweden)
Sarha Moore
2013-08-01
Full Text Available This commentary addresses Huron and Davis's question of whether "The Harmonic Minor Provides an Optimum Way of Reducing Average Melodic Interval Size, Consistent with Sad Affect Cues" within any non-Western musical cultures. The harmonic minor scale and other semitone-heavy scales, such as Bhairav raga and Hicaz makam, are featured widely in the musical cultures of North India and the Middle East. Do melodies from these genres also have a preponderance of semitone intervals and low incidence of the augmented second interval, as in Huron and Davis's sample? Does the presence of more semitone intervals in a melody affect its emotional connotations in different cultural settings? Are all semitone intervals equal in their effect? My own ethnographic research within these cultures reveals comparable connotations in melodies that linger on semitone intervals, centered on concepts of tension and metaphors of falling. However, across different musical cultures there may also be neutral or lively interpretations of these same pitch sets, dependent on context, manner of performance, and tradition. Small pitch movement may also be associated with social functions such as prayer or lullabies, and may not be described as "sad." "Sad," moreover may not connote the same affect cross-culturally.
Multifractal distribution of spike intervals for two oscillators coupled by unreliable pulses
International Nuclear Information System (INIS)
Kestler, Johannes; Kinzel, Wolfgang
2006-01-01
Two neurons coupled by unreliable synapses are modelled by leaky integrate-and-fire neurons and stochastic on-off synapses. The dynamics is mapped to an iterated function system. Numerical calculations yield a multifractal distribution of interspike intervals. The covering, information and correlation dimensions are calculated as a function of synaptic strength and transmission probability. (letter to the editor)
Translating QT interval prolongation from conscious dogs to humans.
Dubois, Vincent F S; Smania, Giovanni; Yu, Huixin; Graf, Ramona; Chain, Anne S Y; Danhof, Meindert; Della Pasqua, Oscar
2017-02-01
In spite of screening procedures in early drug development, uncertainty remains about the propensity of new chemical entities (NCEs) to prolong the QT/QTc interval. The evaluation of proarrhythmic activity using a comprehensive in vitro proarrhythmia assay does not fully account for pharmacokinetic-pharmacodynamic (PKPD) differences in vivo. In the present study, we evaluated the correlation between drug-specific parameters describing QT interval prolongation in dogs and in humans. Using estimates of the drug-specific parameter, data on the slopes of the PKPD relationships of nine compounds with varying QT-prolonging effects (cisapride, sotalol, moxifloxacin, carabersat, GSK945237, SB237376 and GSK618334, and two anonymized NCEs) were analysed. Mean slope estimates varied between -0.98 ms μM -1 and 6.1 ms μM -1 in dogs and -10 ms μM -1 and 90 ms μM -1 in humans, indicating a wide range of effects on the QT interval. Linear regression techniques were then applied to characterize the correlation between the parameter estimates across species. For compounds without a mixed ion channel block, a correlation was observed between the drug-specific parameter in dogs and humans (y = -1.709 + 11.6x; R 2 = 0.989). These results show that per unit concentration, the drug effect on the QT interval in humans is 11.6-fold larger than in dogs. Together with information about the expected therapeutic exposure, the evidence of a correlation between the compound-specific parameter in dogs and in humans represents an opportunity for translating preclinical safety data before progression into the clinic. Whereas further investigation is required to establish the generalizability of our findings, this approach can be used with clinical trial simulations to predict the probability of QT prolongation in humans. © 2016 The British Pharmacological Society.
Probability Distribution for Flowing Interval Spacing
International Nuclear Information System (INIS)
S. Kuzio
2004-01-01
Fracture spacing is a key hydrologic parameter in analyses of matrix diffusion. Although the individual fractures that transmit flow in the saturated zone (SZ) cannot be identified directly, it is possible to determine the fractured zones that transmit flow from flow meter survey observations. The fractured zones that transmit flow as identified through borehole flow meter surveys have been defined in this report as flowing intervals. The flowing interval spacing is measured between the midpoints of each flowing interval. The determination of flowing interval spacing is important because the flowing interval spacing parameter is a key hydrologic parameter in SZ transport modeling, which impacts the extent of matrix diffusion in the SZ volcanic matrix. The output of this report is input to the ''Saturated Zone Flow and Transport Model Abstraction'' (BSC 2004 [DIRS 170042]). Specifically, the analysis of data and development of a data distribution reported herein is used to develop the uncertainty distribution for the flowing interval spacing parameter for the SZ transport abstraction model. Figure 1-1 shows the relationship of this report to other model reports that also pertain to flow and transport in the SZ. Figure 1-1 also shows the flow of key information among the SZ reports. It should be noted that Figure 1-1 does not contain a complete representation of the data and parameter inputs and outputs of all SZ reports, nor does it show inputs external to this suite of SZ reports. Use of the developed flowing interval spacing probability distribution is subject to the limitations of the assumptions discussed in Sections 5 and 6 of this analysis report. The number of fractures in a flowing interval is not known. Therefore, the flowing intervals are assumed to be composed of one flowing zone in the transport simulations. This analysis may overestimate the flowing interval spacing because the number of fractures that contribute to a flowing interval cannot be
Anderson, Carl A.; McRae, Allan F.; Visscher, Peter M.
2006-01-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...
Estimating transmitted waves of floating breakwater using support vector regression model
Digital Repository Service at National Institute of Oceanography (India)
Mandal, S.; Hegde, A.V.; Kumar, V.; Patil, S.G.
is first mapped onto an m-dimensional feature space using some fixed (nonlinear) mapping, and then a linear model is constructed in this feature space (Ivanciuc Ovidiu 2007). Using mathematical notation, the linear model in the feature space f(x, w... regressive vector machines, Ocean Engineering Journal, Vol – 36, pp 339 – 347, 2009. 3. Ivanciuc Ovidiu, Applications of support vector machines in chemistry, Review in Computational Chemistry, Eds K. B. Lipkouitz and T. R. Cundari, Vol – 23...
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression soft¬ware (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Stepwise versus Hierarchical Regression: Pros and Cons
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Suppression Situations in Multiple Linear Regression
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Gibrat’s law and quantile regressions
DEFF Research Database (Denmark)
Distante, Roberta; Petrella, Ivan; Santoro, Emiliano
2017-01-01
The nexus between firm growth, size and age in U.S. manufacturing is examined through the lens of quantile regression models. This methodology allows us to overcome serious shortcomings entailed by linear regression models employed by much of the existing literature, unveiling a number of important...
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Repeated Results Analysis for Middleware Regression Benchmarking
Czech Academy of Sciences Publication Activity Database
Bulej, Lubomír; Kalibera, T.; Tůma, P.
2005-01-01
Roč. 60, - (2005), s. 345-358 ISSN 0166-5316 R&D Projects: GA ČR GA102/03/0672 Institutional research plan: CEZ:AV0Z10300504 Keywords : middleware benchmarking * regression benchmarking * regression testing Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.756, year: 2005
Principles of Quantile Regression and an Application
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES
RUSCHENDORF, L; DEVALK, [No Value
We construct a.s. nonlinear regression representations of general stochastic processes (X(n))n is-an-element-of N. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Regression of environmental noise in LIGO data
International Nuclear Information System (INIS)
Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G
2015-01-01
We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, the seismic noise cancellation from the LIGO GW channel has already been performed. In the presented approach the WK method has been extended, incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. Also we present the first results on regression of the bi-coherent noise in the LIGO data. (paper)
Pathological assessment of liver fibrosis regression
Directory of Open Access Journals (Sweden)
WANG Bingqiong
2017-03-01
Full Text Available Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Small sample GEE estimation of regression parameters for longitudinal data.
Paul, Sudhir; Zhang, Xuemao
2014-09-28
Longitudinal (clustered) response data arise in many bio-statistical applications which, in general, cannot be assumed to be independent. Generalized estimating equation (GEE) is a widely used method to estimate marginal regression parameters for correlated responses. The advantage of the GEE is that the estimates of the regression parameters are asymptotically unbiased even if the correlation structure is misspecified, although their small sample properties are not known. In this paper, two bias adjusted GEE estimators of the regression parameters in longitudinal data are obtained when the number of subjects is small. One is based on a bias correction, and the other is based on a bias reduction. Simulations show that the performances of both the bias-corrected methods are similar in terms of bias, efficiency, coverage probability, average coverage length, impact of misspecification of correlation structure, and impact of cluster size on bias correction. Both these methods show superior properties over the GEE estimates for small samples. Further, analysis of data involving a small number of subjects also shows improvement in bias, MSE, standard error, and length of the confidence interval of the estimates by the two bias adjusted methods over the GEE estimates. For small to moderate sample sizes (N ≤50), either of the bias-corrected methods GEEBc and GEEBr can be used. However, the method GEEBc should be preferred over GEEBr, as the former is computationally easier. For large sample sizes, the GEE method can be used. Copyright © 2014 John Wiley & Sons, Ltd.
Interpregnancy interval and risk of autistic disorder.
Gunnes, Nina; Surén, Pål; Bresnahan, Michaeline; Hornig, Mady; Lie, Kari Kveim; Lipkin, W Ian; Magnus, Per; Nilsen, Roy Miodini; Reichborn-Kjennerud, Ted; Schjølberg, Synnve; Susser, Ezra Saul; Øyen, Anne-Siri; Stoltenberg, Camilla
2013-11-01
A recent California study reported increased risk of autistic disorder in children conceived within a year after the birth of a sibling. We assessed the association between interpregnancy interval and risk of autistic disorder using nationwide registry data on pairs of singleton full siblings born in Norway. We defined interpregnancy interval as the time from birth of the first-born child to conception of the second-born child in a sibship. The outcome of interest was autistic disorder in the second-born child. Analyses were restricted to sibships in which the second-born child was born in 1990-2004. Odds ratios (ORs) were estimated by fitting ordinary logistic models and logistic generalized additive models. The study sample included 223,476 singleton full-sibling pairs. In sibships with interpregnancy intervals autistic disorder, compared with 0.13% in the reference category (≥ 36 months). For interpregnancy intervals shorter than 9 months, the adjusted OR of autistic disorder in the second-born child was 2.18 (95% confidence interval 1.42-3.26). The risk of autistic disorder in the second-born child was also increased for interpregnancy intervals of 9-11 months in the adjusted analysis (OR = 1.71 [95% CI = 1.07-2.64]). Consistent with a previous report from California, interpregnancy intervals shorter than 1 year were associated with increased risk of autistic disorder in the second-born child. A possible explanation is depletion of micronutrients in mothers with closely spaced pregnancies.
Inter-Pregnancy Intervals and Maternal Morbidity: New Evidence from Rwanda
Habimana-Kabano, Ignace; Broekhuis, E.J.A.; Hooimeijer, P.
The effects of short and long pregnancy intervals on maternal morbidity have hardly been investigated. This research analyses these effects using logistic regression in two steps. First, data from the Rwanda Demographic and Health Survey 2010 are used to study delivery referrals to District
Change in Breast Cancer Screening Intervals Since the 2009 USPSTF Guideline.
Wernli, Karen J; Arao, Robert F; Hubbard, Rebecca A; Sprague, Brian L; Alford-Teaster, Jennifer; Haas, Jennifer S; Henderson, Louise; Hill, Deidre; Lee, Christoph I; Tosteson, Anna N A; Onega, Tracy
2017-08-01
In 2009, the U.S. Preventive Services Task Force (USPSTF) recommended biennial mammography for women aged 50-74 years and shared decision-making for women aged 40-49 years for breast cancer screening. We evaluated changes in mammography screening interval after the 2009 recommendations. We conducted a prospective cohort study of women aged 40-74 years who received 821,052 screening mammograms between 2006 and 2012 using data from the Breast Cancer Surveillance Consortium. We compared changes in screening intervals and stratified intervals based on whether the mammogram at the end of the interval occurred before or after the 2009 recommendation. Differences in mean interval length by woman-level characteristics were compared using linear regression. The mean interval (in months) minimally decreased after the 2009 USPSTF recommendations. Among women aged 40-49 years, the mean interval decreased from 17.2 months to 17.1 months (difference -0.16%, 95% confidence interval [CI] -0.30 to -0.01). Similar small reductions were seen for most age groups. The largest change in interval length in the post-USPSTF period was declines among women with a first-degree family history of breast cancer (difference -0.68%, 95% CI -0.82 to -0.54) or a 5-year breast cancer risk ≥2.5% (difference -0.58%, 95% CI -0.73 to -0.44). The 2009 USPSTF recommendation did not lengthen the average mammography interval among women routinely participating in mammography screening. Future studies should evaluate whether breast cancer screening intervals lengthen toward biennial intervals following new national 2016 breast cancer screening recommendations, particularly among women less than 50 years of age.
Spectral density regression for bivariate extremes
Castro Camilo, Daniela; de Carvalho, Miguel
2016-01-01
can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts
DEFF Research Database (Denmark)
Thuesen, Christian Langhoff; Koch, Christian
2011-01-01
By adopting a theoretical framework from strategic niche management research (SNM) this paper presents an analysis of the innovation system of the Danish Construction industry. The analysis shows a multifaceted landscape of innovation around an existing regime, built around existing ways of working...... and developed over generations. The regime is challenged from various niches and the socio-technical landscape through trends as globalization. Three niches (Lean Construction, BIM and System Deliveries) are subject to a detailed analysis showing partly incompatible rationales and various degrees of innovation...... potential. The paper further discusses how existing policymaking operates in a number of tensions one being between government and governance. Based on the concepts from SNM the paper introduces an innovation map in order to support the development of meta-governance policymaking. By mapping some...
DEFF Research Database (Denmark)
Gilje, Øystein; Frølunde, Lisbeth; Lindstrand, Fredrik
2010-01-01
This chapter concerns mapping patterns in regards to how young filmmakers (age 15 – 20) in the Scandinavian countries learn about filmmaking. To uncover the patterns, we present portraits of four young filmmakers who participated in the Scandinavian research project Making a filmmaker. The focus ...... is on their learning practices and how they create ‘learning paths’ in relation to resources in diverse learning contexts, whether formal, non-formal and informal contexts.......This chapter concerns mapping patterns in regards to how young filmmakers (age 15 – 20) in the Scandinavian countries learn about filmmaking. To uncover the patterns, we present portraits of four young filmmakers who participated in the Scandinavian research project Making a filmmaker. The focus...
Transmission line sag calculations using interval mathematics
Energy Technology Data Exchange (ETDEWEB)
Shaalan, H. [Institute of Electrical and Electronics Engineers, Washington, DC (United States)]|[US Merchant Marine Academy, Kings Point, NY (United States)
2007-07-01
Electric utilities are facing the need for additional generating capacity, new transmission systems and more efficient use of existing resources. As such, there are several uncertainties associated with utility decisions. These uncertainties include future load growth, construction times and costs, and performance of new resources. Regulatory and economic environments also present uncertainties. Uncertainty can be modeled based on a probabilistic approach where probability distributions for all of the uncertainties are assumed. Another approach to modeling uncertainty is referred to as unknown but bounded. In this approach, the upper and lower bounds on the uncertainties are assumed without probability distributions. Interval mathematics is a tool for the practical use and extension of the unknown but bounded concept. In this study, the calculation of transmission line sag was used as an example to demonstrate the use of interval mathematics. The objective was to determine the change in cable length, based on a fixed span and an interval of cable sag values for a range of temperatures. The resulting change in cable length was an interval corresponding to the interval of cable sag values. It was shown that there is a small change in conductor length due to variation in sag based on the temperature ranges used in this study. 8 refs.
Comparing spatial regression to random forests for large ...
Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputation for good predictive performance when using many records. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. Our primary goal is predicting MMI at over 1.1 million perennial stream reaches across the USA. For spatial regression modeling, we develop two new methods to accommodate large data: (1) a procedure that estimates optimal Box-Cox transformations to linearize covariate relationships; and (2) a computationally efficient covariate selection routine that takes into account spatial autocorrelation. We show that our new methods lead to cross-validated performance similar to random forests, but that there is an advantage for spatial regression when quantifying the uncertainty of the predictions. Simulations are used to clarify advantages for each method. This research investigates different approaches for modeling and mapping national stream condition. We use MMI data from the EPA's National Rivers and Streams Assessment and predictors from StreamCat (Hill et al., 2015). Previous studies have focused on modeling the MMI condition classes (i.e., good, fair, and po
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Variable and subset selection in PLS regression
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion...... is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculusRegression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Wu, Dane W.
2002-01-01
The year 2000 US presidential election between Al Gore and George Bush has been the most intriguing and controversial one in American history. The state of Florida was the trigger for the controversy, mainly, due to the use of the misleading "butterfly ballot". Using prediction (or confidence) intervals for least squares regression lines…
Existence test for asynchronous interval iterations
DEFF Research Database (Denmark)
Madsen, Kaj; Caprani, O.; Stauning, Ole
1997-01-01
In the search for regions that contain fixed points ofa real function of several variables, tests based on interval calculationscan be used to establish existence ornon-existence of fixed points in regions that are examined in the course ofthe search. The search can e.g. be performed...... as a synchronous (sequential) interval iteration:In each iteration step all components of the iterate are calculatedbased on the previous iterate. In this case it is straight forward to base simple interval existence and non-existencetests on the calculations done in each step of the iteration. The search can also...... on thecomponentwise calculations done in the course of the iteration. These componentwisetests are useful for parallel implementation of the search, sincethe tests can then be performed local to each processor and only when a test issuccessful do a processor communicate this result to other processors....
Nationwide Multicenter Reference Interval Study for 28 Common Biochemical Analytes in China.
Xia, Liangyu; Chen, Ming; Liu, Min; Tao, Zhihua; Li, Shijun; Wang, Liang; Cheng, Xinqi; Qin, Xuzhen; Han, Jianhua; Li, Pengchang; Hou, Li'an; Yu, Songlin; Ichihara, Kiyoshi; Qiu, Ling
2016-03-01
A nationwide multicenter study was conducted in the China to explore sources of variation of reference values and establish reference intervals for 28 common biochemical analytes, as a part of the International Federation of Clinical Chemistry and Laboratory Medicine, Committee on Reference Intervals and Decision Limits (IFCC/C-RIDL) global study on reference values. A total of 3148 apparently healthy volunteers were recruited in 6 cities covering a wide area in China. Blood samples were tested in 2 central laboratories using Beckman Coulter AU5800 chemistry analyzers. Certified reference materials and value-assigned serum panel were used for standardization of test results. Multiple regression analysis was performed to explore sources of variation. Need for partition of reference intervals was evaluated based on 3-level nested ANOVA. After secondary exclusion using the latent abnormal values exclusion method, reference intervals were derived by a parametric method using the modified Box-Cox formula. Test results of 20 analytes were made traceable to reference measurement procedures. By the ANOVA, significant sex-related and age-related differences were observed in 12 and 12 analytes, respectively. A small regional difference was observed in the results for albumin, glucose, and sodium. Multiple regression analysis revealed BMI-related changes in results of 9 analytes for man and 6 for woman. Reference intervals of 28 analytes were computed with 17 analytes partitioned by sex and/or age. In conclusion, reference intervals of 28 common chemistry analytes applicable to Chinese Han population were established by use of the latest methodology. Reference intervals of 20 analytes traceable to reference measurement procedures can be used as common reference intervals, whereas others can be used as the assay system-specific reference intervals in China.
Chosen interval methods for solving linear interval systems with special type of matrix
Szyszka, Barbara
2013-10-01
The paper is devoted to chosen direct interval methods for solving linear interval systems with special type of matrix. This kind of matrix: band matrix with a parameter, from finite difference problem is obtained. Such linear systems occur while solving one dimensional wave equation (Partial Differential Equations of hyperbolic type) by using the central difference interval method of the second order. Interval methods are constructed so as the errors of method are enclosed in obtained results, therefore presented linear interval systems contain elements that determining the errors of difference method. The chosen direct algorithms have been applied for solving linear systems because they have no errors of method. All calculations were performed in floating-point interval arithmetic.
Using the classical linear regression model in analysis of the dependences of conveyor belt life
Directory of Open Access Journals (Sweden)
Miriam Andrejiová
2013-12-01
Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.
Poisson regression for modeling count and frequency outcomes in trauma research.
Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T
2008-10-01
The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.
POSTMORTAL CHANGES AND ASSESSMENT OF POSTMORTEM INTERVAL
Directory of Open Access Journals (Sweden)
Edin Šatrović
2013-01-01
Full Text Available This paper describes in a simple way the changes that occur in the body after death.They develop in a specific order, and the speed of their development and their expression are strongly influenced by various endogenous and exogenous factors. The aim of the authors is to indicate the characteristics of the postmortem changes, and their significance in establishing time since death, which can be established precisely within 72 hours. Accurate evaluation of the age of the corpse based on the common changes is not possible with longer postmortem intervals, so the entomological findings become the most significant change on the corpse for determination of the postmortem interval (PMI.
A sequent calculus for signed interval logic
DEFF Research Database (Denmark)
Rasmussen, Thomas Marthedal
2001-01-01
We propose and discuss a complete sequent calculus formulation for Signed Interval Logic (SIL) with the chief purpose of improving proof support for SIL in practice. The main theoretical result is a simple characterization of the limit between decidability and undecidability of quantifier-free SIL....... We present a mechanization of SIL in the generic proof assistant Isabelle and consider techniques for automated reasoning. Many of the results and ideas of this report are also applicable to traditional (non-signed) interval logic and, hence, to Duration Calculus....
Interval Continuous Plant Identification from Value Sets
Directory of Open Access Journals (Sweden)
R. Hernández
2012-01-01
Full Text Available This paper shows how to obtain the values of the numerator and denominator Kharitonov polynomials of an interval plant from its value set at a given frequency. Moreover, it is proven that given a value set, all the assigned polynomials of the vertices can be determined if and only if there is a complete edge or a complete arc lying on a quadrant. This algorithm is nonconservative in the sense that if the value-set boundary of an interval plant is exactly known, and particularly its vertices, then the Kharitonov rectangles are exactly those used to obtain these value sets.
Motor Unit Interpulse Intervals During High Force Contractions.
Stock, Matt S; Thompson, Brennan J
2016-01-01
We examined the means, medians, and variability for motor-unit interpulse intervals (IPIs) during voluntary, high force contractions. Eight men (mean age = 22 years) attempted to perform isometric contractions at 90% of their maximal voluntary contraction force while bipolar surface electromyographic (EMG) signals were detected from the vastus lateralis and vastus medialis muscles. Surface EMG signal decomposition was used to determine the recruitment thresholds and IPIs of motor units that demonstrated accuracy levels ≥ 96.0%. Motor units with high recruitment thresholds demonstrated longer mean IPIs, but the coefficients of variation were similar across all recruitment thresholds. Polynomial regression analyses indicated that for both muscles, the relationship between the means and standard deviations of the IPIs was linear. The majority of IPI histograms were positively skewed. Although low-threshold motor units were associated with shorter IPIs, the variability among motor units with differing recruitment thresholds was comparable.
GENERALISED MODEL BASED CONFIDENCE INTERVALS IN TWO STAGE CLUSTER SAMPLING
Directory of Open Access Journals (Sweden)
Christopher Ouma Onyango
2010-09-01
Full Text Available Chambers and Dorfman (2002 constructed bootstrap confidence intervals in model based estimation for finite population totals assuming that auxiliary values are available throughout a target population and that the auxiliary values are independent. They also assumed that the cluster sizes are known throughout the target population. We now extend to two stage sampling in which the cluster sizes are known only for the sampled clusters, and we therefore predict the unobserved part of the population total. Jan and Elinor (2008 have done similar work, but unlike them, we use a general model, in which the auxiliary values are not necessarily independent. We demonstrate that the asymptotic properties of our proposed estimator and its coverage rates are better than those constructed under the model assisted local polynomial regression model.
On interval and cyclic interval edge colorings of (3,5)-biregular graphs
DEFF Research Database (Denmark)
Casselgren, Carl Johan; Petrosyan, Petros; Toft, Bjarne
2017-01-01
A proper edge coloring f of a graph G with colors 1,2,3,…,t is called an interval coloring if the colors on the edges incident to every vertex of G form an interval of integers. The coloring f is cyclic interval if for every vertex v of G, the colors on the edges incident to v either form an inte...
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Genetics Home Reference: caudal regression syndrome
... umbilical artery: Further support for a caudal regression-sirenomelia spectrum. Am J Med Genet A. 2007 Dec ... AK, Dickinson JE, Bower C. Caudal dysgenesis and sirenomelia-single centre experience suggests common pathogenic basis. Am ...
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
International Nuclear Information System (INIS)
Bogachev, Mikhail I; Bunde, Armin; Kireenkov, Igor S; Nifontov, Eugene M
2009-01-01
We study the statistics of return intervals between large heartbeat intervals (above a certain threshold Q) in 24 h records obtained from healthy subjects. We find that both the linear and the nonlinear long-term memory inherent in the heartbeat intervals lead to power-laws in the probability density function P Q (r) of the return intervals. As a consequence, the probability W Q (t; Δt) that at least one large heartbeat interval will occur within the next Δt heartbeat intervals, with an increasing elapsed number of intervals t after the last large heartbeat interval, follows a power-law. Based on these results, we suggest a method of obtaining a priori information about the occurrence of the next large heartbeat interval, and thus to predict it. We show explicitly that the proposed method, which exploits long-term memory, is superior to the conventional precursory pattern recognition technique, which focuses solely on short-term memory. We believe that our results can be straightforwardly extended to obtain more reliable predictions in other physiological signals like blood pressure, as well as in other complex records exhibiting multifractal behaviour, e.g. turbulent flow, precipitation, river flows and network traffic.
Circadian profile of QT interval and QT interval variability in 172 healthy volunteers
DEFF Research Database (Denmark)
Bonnemeier, Hendrik; Wiegand, Uwe K H; Braasch, Wiebke
2003-01-01
of sleep. QT and R-R intervals revealed a characteristic day-night-pattern. Diurnal profiles of QT interval variability exhibited a significant increase in the morning hours (6-9 AM; P ... lower at day- and nighttime. Aging was associated with an increase of QT interval mainly at daytime and a significant shift of the T wave apex towards the end of the T wave. The circadian profile of ventricular repolarization is strongly related to the mean R-R interval, however, there are significant...
Two Paradoxes in Linear Regression Analysis
FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong
2016-01-01
Summary Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminate representations to make final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets well demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods can be available at http://www.yongxu.org/lunwen.html.
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Computing multiple-output regression quantile regions
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Roč. 56, č. 4 (2012), s. 840-853 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
There is No Quantum Regression Theorem
International Nuclear Information System (INIS)
Ford, G.W.; OConnell, R.F.
1996-01-01
The Onsager regression hypothesis states that the regression of fluctuations is governed by macroscopic equations describing the approach to equilibrium. It is here asserted that this hypothesis fails in the quantum case. This is shown first by explicit calculation for the example of quantum Brownian motion of an oscillator and then in general from the fluctuation-dissipation theorem. It is asserted that the correct generalization of the Onsager hypothesis is the fluctuation-dissipation theorem. copyright 1996 The American Physical Society
Caudal regression syndrome : a case report
International Nuclear Information System (INIS)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun
1998-01-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones,genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging
Caudal regression syndrome : a case report
Energy Technology Data Exchange (ETDEWEB)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun [Chungang Gil Hospital, Incheon (Korea, Republic of)
1998-07-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones,genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Spontaneous regression of metastatic Merkel cell carcinoma.
LENUS (Irish Health Repository)
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
Forecasting exchange rates: a robust regression approach
Preminger, Arie; Franck, Raphael
2005-01-01
The least squares estimation method as well as other ordinary estimation method for regression models can be severely affected by a small number of outliers, thus providing poor out-of-sample forecasts. This paper suggests a robust regression approach, based on the S-estimation method, to construct forecasting models that are less sensitive to data contamination by outliers. A robust linear autoregressive (RAR) and a robust neural network (RNN) models are estimated to study the predictabil...
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Post-processing through linear regression
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Post-processing through linear regression
Directory of Open Access Journals (Sweden)
B. Van Schaeybroeck
2011-03-01
Full Text Available Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS method, a new time-dependent Tikhonov regularization (TDTR method, the total least-square method, a new geometric-mean regression (GM, a recently introduced error-in-variables (EVMOS method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise. At long lead times the regression schemes (EVMOS, TDTR which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...... in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due...... to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
Potentiometric-surface map, 1993, Yucca Mountain and vicinity, Nevada
International Nuclear Information System (INIS)
Tucci, P.; Burkhardt, D.J.
1995-01-01
The revised potentiometric surface map here, using mainly 1993 average water levels, updates earlier maps of this area. Water levels are contoured with 20-m intervals, with additional 0.5-m contours in the small-gradient area SE of Yucca Mountain. Water levels range from 728 m above sea level SE of Yucca to 1,034 m above sea level north of Yucca. Potentiometric levels in the deeper parts of the volcanic rock aquifer range from 730 to 785 m above sea level. The potentiometric surface can be divided into 3 regions: A small gradient area E and SE of Yucca, a moderate-gradient area on the west side of Yucca, and a large-gradient area to the N-NE of Yucca. Water levels from wells at Yucca were examined for yearly trends (1986-93) using linear least-squares regression. Of the 22 wells, three had significant positive trends. The trend in well UE-25 WT-3 may be influenced by monitoring equipment problems. Tends in USW WT-7 and USW WTS-10 are similar; both are located near a fault west of Yucca; however another well near that fault exhibited no significant trend
DEFF Research Database (Denmark)
Carruth, Susan
2015-01-01
by planners when aiming to construct resilient energy plans. It concludes that a graphical language has the potential to be a significant tool, flexibly facilitating cross-disciplinary communication and decision-making, while emphasising that its role is to support imaginative, resilient planning rather than...... the relationship between resilience and energy planning, suggesting that planning in, and with, time is a core necessity in this domain. It then reviews four examples of graphically mapping with time, highlighting some of the key challenges, before tentatively proposing a graphical language to be employed...
Diagnostic interval and mortality in colorectal cancer
DEFF Research Database (Denmark)
Tørring, Marie Louise; Frydenberg, Morten; Hamilton, William
2012-01-01
Objective To test the theory of a U-shaped association between time from the first presentation of symptoms in primary care to the diagnosis (the diagnostic interval) and mortality after diagnosis of colorectal cancer (CRC). Study Design and Setting Three population-based studies in Denmark...
Safety information on QT-interval prolongation
DEFF Research Database (Denmark)
Warnier, Miriam J; Holtkamp, Frank A; Rutten, Frans H
2014-01-01
Prolongation of the QT interval can predispose to fatal ventricular arrhythmias. Differences in QT-labeling language can result in miscommunication and suboptimal risk mitigation. We systematically compared the phraseology used to communicate on QT-prolonging properties of 144 drugs newly approve...
Interval scanning photomicrography of microbial cell populations.
Casida, L. E., Jr.
1972-01-01
A single reproducible area of the preparation in a fixed focal plane is photographically scanned at intervals during incubation. The procedure can be used for evaluating the aerobic or anaerobic growth of many microbial cells simultaneously within a population. In addition, the microscope is not restricted to the viewing of any one microculture preparation, since the slide cultures are incubated separately from the microscope.
Population based reference intervals for common blood ...
African Journals Online (AJOL)
Population based reference intervals for common blood haematological and biochemical parameters in the Akuapem north district. K.A Koram, M.M Addae, J.C Ocran, S Adu-amankwah, W.O Rogers, F.K Nkrumah ...
Changing reference intervals for haemoglobin in Denmark
DEFF Research Database (Denmark)
Ryberg-Nørholt, Judith; Frederiksen, Henrik; Nybo, Mads
2017-01-01
INTRODUCTION: Based on international experiences and altering demography the reference intervals (RI) for haemoglobin (Hb) concentrations in blood were changed in Denmark in 2013 from 113 - 161 g/L to 117 - 153 g/L for women and from 129 - 177 g/L to 134 - 170 g/L for men. The aim of this study w...
The Total Interval of a Graph.
1988-01-01
about them in a mathematical con- text. A thorough treatment of multiple interval representations, including applications, is given by Roberts [21...8217-. -- + .".-)’""- +_ .. ,_ _ CA6 46 operation applied to a member of .4 U 3 T U.) U.3 UU- T T i Figure 11.2.18 I Fieure 11.2.19 ,* This completes the proof
Coefficient Omega Bootstrap Confidence Intervals: Nonnormal Distributions
Padilla, Miguel A.; Divers, Jasmin
2013-01-01
The performance of the normal theory bootstrap (NTB), the percentile bootstrap (PB), and the bias-corrected and accelerated (BCa) bootstrap confidence intervals (CIs) for coefficient omega was assessed through a Monte Carlo simulation under conditions not previously investigated. Of particular interests were nonnormal Likert-type and binary items.…
Quinsy tonsillectomy or interval tonsillectomy - a prospective ...
African Journals Online (AJOL)
Fifty-one patients with peritonsillar abscesses were randomised to undergo either quinsy tonsillectomy (aT) or interval tonsillectomy (IT), and the two groups were compared. The QT group lost fewer (10,3 v. 17,9) working days and less blood during the operation (158,6 ml v. 205,7 ml); haemostasis was easier and the ...
Linear chord diagrams on two intervals
DEFF Research Database (Denmark)
Andersen, Jørgen Ellegaard; Penner, Robert; Reidys, Christian
generating function ${\\bf C}_g(z)=z^{2g}R_g(z)/(1-4z)^{3g-{1\\over 2}}$ for chords attached to a single interval is algebraic, for $g\\geq 1$, where the polynomial $R_g(z)$ with degree at most $g-1$ has integer coefficients and satisfies $R_g(1/4)\
Learned Interval Time Facilitates Associate Memory Retrieval
van de Ven, Vincent; Kochs, Sarah; Smulders, Fren; De Weerd, Peter
2017-01-01
The extent to which time is represented in memory remains underinvestigated. We designed a time paired associate task (TPAT) in which participants implicitly learned cue-time-target associations between cue-target pairs and specific cue-target intervals. During subsequent memory testing, participants showed increased accuracy of identifying…
Interval Appendicectomy and Management of Appendix Mass ...
African Journals Online (AJOL)
A wholly conservative management without interval appendicectomy was instituted for 13 patients diagnosed as having appendix mass between 1998 and 2002 in the University of Benin Teaching Hospital, Benin City, Nigeria. Within three days of admission, one patient developed clinical features of ruptured appendix and ...
HIV intertest interval among MSM in King County, Washington.
Katz, David A; Dombrowski, Julia C; Swanson, Fred; Buskin, Susan E; Golden, Matthew R; Stekler, Joanne D
2013-02-01
The authors examined temporal trends and correlates of HIV testing frequency among men who have sex with men (MSM) in King County, Washington. The authors evaluated data from MSM testing for HIV at the Public Health-Seattle & King County (PHSKC) STD Clinic and Gay City Health Project (GCHP) and testing history data from MSM in PHSKC HIV surveillance. The intertest interval (ITI) was defined as the number of days between the last negative HIV test and the current testing visit or first positive test. Correlates of the log(10)-transformed ITI were determined using generalised estimating equations linear regression. Between 2003 and 2010, the median ITI among MSM seeking HIV testing at the STD Clinic and GCHP were 215 (IQR: 124-409) and 257 (IQR: 148-503) days, respectively. In multivariate analyses, younger age, having only male partners and reporting ≥10 male sex partners in the last year were associated with shorter ITIs at both testing sites (pGCHP attendees, having a regular healthcare provider, seeking a test as part of a regular schedule and inhaled nitrite use in the last year were also associated with shorter ITIs (pGCHP (median 359 vs 255 days, p=0.02). Although MSM in King County appear to be testing at frequent intervals, further efforts are needed to reduce the time that HIV-infected persons are unaware of their status.
Mapping of the genomic regions controlling seed storability in soybean
Indian Academy of Sciences (India)
Composite interval mapping identified a total of three. QTLs on linkage ..... Soybean seeds decline in quality faster than seeds of other crops (Fabrizius et al. 1999). ... harvest and postharvest management practices (Lewis et al. 1998). Cho and ...
Hierarchical tone mapping for high dynamic range image visualization
Qiu, Guoping; Duan, Jiang
2005-07-01
In this paper, we present a computationally efficient, practically easy to use tone mapping techniques for the visualization of high dynamic range (HDR) images in low dynamic range (LDR) reproduction devices. The new method, termed hierarchical nonlinear linear (HNL) tone-mapping operator maps the pixels in two hierarchical steps. The first step allocates appropriate numbers of LDR display levels to different HDR intensity intervals according to the pixel densities of the intervals. The second step linearly maps the HDR intensity intervals to theirs allocated LDR display levels. In the developed HNL scheme, the assignment of LDR display levels to HDR intensity intervals is controlled by a very simple and flexible formula with a single adjustable parameter. We also show that our new operators can be used for the effective enhancement of ordinary images.
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
DEFF Research Database (Denmark)
Christensen, Steen; Moore, C.; Doherty, J.
2006-01-01
accurate and required a few hundred model calls to be computed. (b) The linearized regression-based interval (Cooley, 2004) required just over a hundred model calls and also appeared to be nearly correct. (c) The calibration-constrained Monte-Carlo interval (Doherty, 2003) was found to be narrower than......For a synthetic case we computed three types of individual prediction intervals for the location of the aquifer entry point of a particle that moves through a heterogeneous aquifer and ends up in a pumping well. (a) The nonlinear regression-based interval (Cooley, 2004) was found to be nearly...... the regression-based intervals but required about half a million model calls. It is unclear whether or not this type of prediction interval is accurate....
Relative Importance for Linear Regression in R: The Package relaimpo
Directory of Open Access Journals (Sweden)
Ulrike Gromping
2006-09-01
Full Text Available Relative importance is a topic that has seen a lot of interest in recent years, particularly in applied work. The R package relaimpo implements six different metrics for assessing relative importance of regressors in the linear model, two of which are recommended - averaging over orderings of regressors and a newly proposed metric (Feldman 2005 called pmvd. Apart from delivering the metrics themselves, relaimpo also provides (exploratory bootstrap confidence intervals. This paper offers a brief tutorial introduction to the package. The methods and relaimpo’s functionality are illustrated using the data set swiss that is generally available in R. The paper targets readers who have a basic understanding of multiple linear regression. For the background of more advanced aspects, references are provided.
Energy Technology Data Exchange (ETDEWEB)
Lu, Chun-Mei; Wang, Mei; Greulich-Bode, Karin M.; Weier, Jingly F.; Weier, Heinz-Ulli G.
2008-01-28
Several hybridization-based methods used to delineate single copy or repeated DNA sequences in larger genomic intervals take advantage of the increased resolution and sensitivity of free chromatin, i.e., chromatin released from interphase cell nuclei. Quantitative DNA fiber mapping (QDFM) differs from the majority of these methods in that it applies FISH to purified, clonal DNA molecules which have been bound with at least one end to a solid substrate. The DNA molecules are then stretched by the action of a receding meniscus at the water-air interface resulting in DNA molecules stretched homogeneously to about 2.3 kb/{micro}m. When non-isotopically, multicolor-labeled probes are hybridized to these stretched DNA fibers, their respective binding sites are visualized in the fluorescence microscope, their relative distance can be measured and converted into kilobase pairs (kb). The QDFM technique has found useful applications ranging from the detection and delineation of deletions or overlap between linked clones to the construction of high-resolution physical maps to studies of stalled DNA replication and transcription.
Directory of Open Access Journals (Sweden)
Sayed M. Arafat
2014-06-01
Full Text Available Land cover map of North Sinai was produced based on the FAO-Land Cover Classification System (LCCS of 2004. The standard FAO classification scheme provides a standardized system of classification that can be used to analyze spatial and temporal land cover variability in the study area. This approach also has the advantage of facilitating the integration of Sinai land cover mapping products to be included with the regional and global land cover datasets. The total study area is covering a total area of 20,310.4 km2 (203,104 hectare. The landscape classification was based on SPOT4 data acquired in 2011 using combined multispectral bands of 20 m spatial resolution. Geographic Information System (GIS was used to manipulate the attributed layers of classification in order to reach the maximum possible accuracy. GIS was also used to include all necessary information. The identified vegetative land cover classes of the study area are irrigated herbaceous crops, irrigated tree crops and rain fed tree crops. The non-vegetated land covers in the study area include bare rock, bare soils (stony, very stony and salt crusts, loose and shifting sands and sand dunes. The water bodies were classified as artificial perennial water bodies (fish ponds and irrigated canals and natural perennial water bodies as lakes (standing. The artificial surfaces include linear and non-linear features.
A molecular recombination map of Antirrhinum majus
Directory of Open Access Journals (Sweden)
Hudson Andrew
2010-12-01
Full Text Available Abstract Background Genetic recombination maps provide important frameworks for comparative genomics, identifying gene functions, assembling genome sequences and for breeding. The molecular recombination map currently available for the model eudicot Antirrhinum majus is the result of a cross with Antirrhinum molle, limiting its usefulness within A. majus. Results We created a molecular linkage map of A. majus based on segregation of markers in the F2 population of two inbred lab strains of A. majus. The resulting map consisted of over 300 markers in eight linkage groups, which could be aligned with a classical recombination map and the A. majus karyotype. The distribution of recombination frequencies and distorted transmission of parental alleles differed from those of a previous inter-species hybrid. The differences varied in magnitude and direction between chromosomes, suggesting that they had multiple causes. The map, which covered an estimated of 95% of the genome with an average interval of 2 cM, was used to analyze the distribution of a newly discovered family of MITE transposons and tested for its utility in positioning seven mutations that affect aspects of plant size. Conclusions The current map has an estimated interval of 1.28 Mb between markers. It shows a lower level of transmission ratio distortion and a longer length than the previous inter-species map, making it potentially more useful. The molecular recombination map further indicates that the IDLE MITE transposons are distributed throughout the genome and are relatively stable. The map proved effective in mapping classical morphological mutations of A. majus.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Is past life regression therapy ethical?
Andrade, Gabriel
2017-01-01
Past life regression therapy is used by some physicians in cases with some mental diseases. Anxiety disorders, mood disorders, and gender dysphoria have all been treated using life regression therapy by some doctors on the assumption that they reflect problems in past lives. Although it is not supported by psychiatric associations, few medical associations have actually condemned it as unethical. In this article, I argue that past life regression therapy is unethical for two basic reasons. First, it is not evidence-based. Past life regression is based on the reincarnation hypothesis, but this hypothesis is not supported by evidence, and in fact, it faces some insurmountable conceptual problems. If patients are not fully informed about these problems, they cannot provide an informed consent, and hence, the principle of autonomy is violated. Second, past life regression therapy has the great risk of implanting false memories in patients, and thus, causing significant harm. This is a violation of the principle of non-malfeasance, which is surely the most important principle in medical ethics.
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
Stochastic development regression on non-linear manifolds
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
We introduce a regression model for data on non-linear manifolds. The model describes the relation between a set of manifold valued observations, such as shapes of anatomical objects, and Euclidean explanatory variables. The approach is based on stochastic development of Euclidean diffusion...... processes to the manifold. Defining the data distribution as the transition distribution of the mapped stochastic process, parameters of the model, the non-linear analogue of design matrix and intercept, are found via maximum likelihood. The model is intrinsically related to the geometry encoded...... in the connection of the manifold. We propose an estimation procedure which applies the Laplace approximation of the likelihood function. A simulation study of the performance of the model is performed and the model is applied to a real dataset of Corpus Callosum shapes....
On Solving Lq-Penalized Regressions
Directory of Open Access Journals (Sweden)
Tracy Zhou Wu
2007-01-01
Full Text Available Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
Refractive regression after laser in situ keratomileusis.
Yan, Mabel K; Chang, John Sm; Chan, Tommy Cy
2018-04-26
Uncorrected refractive errors are a leading cause of visual impairment across the world. In today's society, laser in situ keratomileusis (LASIK) has become the most commonly performed surgical procedure to correct refractive errors. However, regression of the initially achieved refractive correction has been a widely observed phenomenon following LASIK since its inception more than two decades ago. Despite technological advances in laser refractive surgery and various proposed management strategies, post-LASIK regression is still frequently observed and has significant implications for the long-term visual performance and quality of life of patients. This review explores the mechanism of refractive regression after both myopic and hyperopic LASIK, predisposing risk factors and its clinical course. In addition, current preventative strategies and therapies are also reviewed. © 2018 Royal Australian and New Zealand College of Ophthalmologists.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretat......On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put...... on the interpretation of the parameters in relation to models for the total sales based on discrete choice models.Key words and phrases. MCI model, discrete choice model, market-shares, price elasitcity, regression model....
Interval Mathematics Applied to Critical Point Transitions
Directory of Open Access Journals (Sweden)
Benito A. Stradi
2012-03-01
Full Text Available The determination of critical points of mixtures is important for both practical and theoretical reasons in the modeling of phase behavior, especially at high pressure. The equations that describe the behavior of complex mixtures near critical points are highly nonlinear and with multiplicity of solutions to the critical point equations. Interval arithmetic can be used to reliably locate all the critical points of a given mixture. The method also verifies the nonexistence of a critical point if a mixture of a given composition does not have one. This study uses an interval Newton/Generalized Bisection algorithm that provides a mathematical and computational guarantee that all mixture critical points are located. The technique is illustrated using several example problems. These problems involve cubic equation of state models; however, the technique is general purpose and can be applied in connection with other nonlinear problems.
Strange distributionally chaotic triangular maps II
International Nuclear Information System (INIS)
Paganoni, L.; Smital, J.
2006-01-01
The notion of distributional chaos was introduced by Schweizer and Smital [Measures of chaos and a spectral decomposition of dynamical systems on the interval, Trans Am Math Soc 1994;344:737-854] for continuous maps of the interval. For continuous maps of a compact metric space three mutually non-equivalent versions of distributional chaos, DC1-DC3, can be considered. In this paper we study distributional chaos in the class T m of triangular maps of the square which are monotone on the fibres. The main results: (i) If F-bar T m has positive topological entropy then F is DC1, and hence, DC2 and DC3. This result is interesting since similar statement is not true for general triangular maps of the square [Smital and Stefankova, Distributional chaos for triangular maps, Chaos, Solitons and Fractals 2004;21:1125-8]. (ii) There are F 1 ,F 2 -bar T m which are not DC3, and such that not every recurrent point of F 1 is uniformly recurrent, while F 2 is Li and Yorke chaotic on the set of uniformly recurrent points. This, along with recent results by Forti et al. [Dynamics of homeomorphisms on minimal sets generated by triangular mappings, Bull Austral Math Soc 1999;59:1-20], among others, make possible to compile complete list of the implications between dynamical properties of maps in T m , solving a long-standing open problem by Sharkovsky
Understanding Confidence Intervals With Visual Representations
Navruz, Bilgin; Delen, Erhan
2014-01-01
In the present paper, we showed how confidence intervals (CIs) are valuable and useful in research studies when they are used in the correct form with correct interpretations. The sixth edition of the APA (2010) Publication Manual strongly recommended reporting CIs in research studies, and it was described as “the best reporting strategy” (p. 34). Misconceptions and correct interpretations of CIs were presented from several textbooks. In addition, limitations of the null hypothesis statistica...
On directional multiple-output quantile regression
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Roč. 102, č. 2 (2011), s. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others:Commision EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords : multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.
Robust median estimator in logisitc regression
Czech Academy of Sciences Publication Activity Database
Hobza, T.; Pardo, L.; Vajda, Igor
2008-01-01
Roč. 138, č. 12 (2008), s. 3822-3840 ISSN 0378-3758 R&D Projects: GA MŠk 1M0572 Grant - others:Instituto Nacional de Estadistica (ES) MPO FI - IM3/136; GA MŠk(CZ) MTM 2006-06872 Institutional research plan: CEZ:AV0Z10750506 Keywords : Logistic regression * Median * Robustness * Consistency and asymptotic normality * Morgenthaler * Bianco and Yohai * Croux and Hasellbroeck Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.679, year: 2008 http://library.utia.cas.cz/separaty/2008/SI/vajda-robust%20median%20estimator%20in%20logistic%20regression.pdf
Early diastolic time intervals during hypertensive pregnancy.
Spinelli, L; Ferro, G; Nappi, C; Farace, M J; Talarico, G; Cinquegrana, G; Condorelli, M
1987-10-01
Early diastolic time intervals have been assessed by means of the echopolycardiographic method in 17 pregnant women who developed hypertension during pregnancy (HP) and in 14 normal pregnant women (N). Systolic time intervals (STI), stroke volume (SV), ejection fraction (EF), and mean velocity of myocardial fiber shortening (VCF) were also evaluated. Recordings were performed in the left lateral decubitus (LLD) and then in the supine decubitus (SD). In LLD, isovolumic relaxation period (IRP) was prolonged in the hypertensive pregnant women compared with normal pregnant women (HP 51 +/- 12.5 ms, N 32.4 +/- 15 ms p less than 0.05), whereas time of the mitral valve maximum opening (DE) was not different in the groups. There was no difference in SV, EF, and mean VCF, whereas STI showed only a significant (p less than 0.05) lengthening of pre-ejection period (PEP) in HP. When the subjects shifted from the left lateral to the supine decubitus position, left ventricular ejection time index (LVETi) and SV decreased significantly (p less than 0.05) in both normotensive hypertensive pregnant women. IRP and PEP lengthened significantly (p less than 0.05) only in normals, whereas they were unchanged in HP. DE time did not vary in either group. In conclusion, hypertension superimposed on pregnancy induces lengthening of IRP, as well as of PEP, and minimizes the effects of the postural changes in preload on the above-mentioned time intervals.
QT interval prolongation associated with sibutramine treatment
Harrison-Woolrych, Mira; Clark, David W J; Hill, Geraldine R; Rees, Mark I; Skinner, Jonathan R
2006-01-01
Aims To investigate a possible association of sibutramine with QT interval prolongation. Methods Post-marketing surveillance using prescription event monitoring in the New Zealand Intensive Medicines Monitoring Programme (IMMP) identified a case of QT prolongation and associated cardiac arrest in a patient taking sibutramine for 25 days. This patient was further investigated, including genotyping for long QT syndrome. Other IMMP case reports suggesting arrhythmias associated with sibutramine were assessed and further reports were obtained from the World Health Organisation (WHO) adverse drug reactions database. Results The index case displayed a novel mutation in a cardiac potassium channel subunit gene, KCNQ1, which is likely to prolong cardiac membrane depolarization and increase susceptibility to long QT intervals. Assessment of further IMMP reports identified five additional patients who experienced palpitations associated with syncope or presyncopal symptoms, one of whom had a QTc at the upper limit of normal. Assessment of reports from the WHO database identified three reports of QT prolongation and one fatal case of torsade de pointes in a patient also taking cisapride. Conclusions This case series suggests that sibutramine may be associated with QT prolongation and related dysrhythmias. Further studies are required, but in the meantime we would recommend that sibutramine should be avoided in patients with long QT syndrome and in patients taking other medicines that may prolong the QT interval. PMID:16542208
Chen, Wei; Li, Hui; Hou, Enke; Wang, Shengquan; Wang, Guirong; Panahi, Mahdi; Li, Tao; Peng, Tao; Guo, Chen; Niu, Chao; Xiao, Lele; Wang, Jiale; Xie, Xiaoshen; Ahmad, Baharin Bin
2018-09-01
The aim of the current study was to produce groundwater spring potential maps using novel ensemble weights-of-evidence (WoE) with logistic regression (LR) and functional tree (FT) models. First, a total of 66 springs were identified by field surveys, out of which 70% of the spring locations were used for training the models and 30% of the spring locations were employed for the validation process. Second, a total of 14 affecting factors including aspect, altitude, slope, plan curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), lithology, normalized difference vegetation index (NDVI), land use, soil, distance to roads, and distance to streams was used to analyze the spatial relationship between these affecting factors and spring occurrences. Multicollinearity analysis and feature selection of the correlation attribute evaluation (CAE) method were employed to optimize the affecting factors. Subsequently, the novel ensembles of the WoE, LR, and FT models were constructed using the training dataset. Finally, the receiver operating characteristic (ROC) curves, standard error, confidence interval (CI) at 95%, and significance level P were employed to validate and compare the performance of three models. Overall, all three models performed well for groundwater spring potential evaluation. The prediction capability of the FT model, with the highest AUC values, the smallest standard errors, the narrowest CIs, and the smallest P values for the training and validation datasets, is better compared to those of other models. The groundwater spring potential maps can be adopted for the management of water resources and land use by planners and engineers. Copyright © 2018 Elsevier B.V. All rights reserved.
Randomness control of vehicular motion through a sequence of traffic signals at irregular intervals
International Nuclear Information System (INIS)
Nagatani, Takashi
2010-01-01
We study the regularization of irregular motion of a vehicle moving through the sequence of traffic signals with a disordered configuration. Each traffic signal is controlled by both cycle time and phase shift. The cycle time is the same for all signals, while the phase shift varies from signal to signal by synchronizing with intervals between a signal and the next signal. The nonlinear dynamic model of the vehicular motion is presented by the stochastic nonlinear map. The vehicle exhibits the very complex behavior with varying both cycle time and strength of irregular intervals. The irregular motion induced by the disordered configuration is regularized by adjusting the phase shift within the regularization regions.
Buffer Overflow Period in a MAP Queue
Directory of Open Access Journals (Sweden)
Andrzej Chydzinski
2007-01-01
Full Text Available The buffer overflow period in a queue with Markovian arrival process (MAP and general service time distribution is investigated. The results include distribution of the overflow period in transient and stationary regimes and the distribution of the number of cells lost during the overflow interval. All theorems are illustrated via numerical calculations.
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics makes it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers changes, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Multispectral colormapping using penalized least square regression
DEFF Research Database (Denmark)
Dissing, Bjørn Skovlund; Carstensen, Jens Michael; Larsen, Rasmus
2010-01-01
The authors propose a novel method to map a multispectral image into the device independent color space CIE-XYZ. This method provides a way to visualize multispectral images by predicting colorvalues from spectral values while maintaining interpretability and is tested on a light emitting diode...... that the interpretability improves significantly but comes at the cost of slightly worse predictability....
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Drzewiecki, Wojciech
2016-12-01
In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
KELEŞ, Taliha; ALTUN, Murat
2016-01-01
Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
Hierarchical Neural Regression Models for Customer Churn Prediction
Directory of Open Access Journals (Sweden)
Golshan Mohammadi
2013-01-01
Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN, self-organizing maps (SOM, alpha-cut fuzzy c-means (α-FCM, and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
Measurement Error in Education and Growth Regressions
Portela, Miguel; Alessie, Rob; Teulings, Coen
2010-01-01
The use of the perpetual inventory method for the construction of education data per country leads to systematic measurement error. This paper analyzes its effect on growth regressions. We suggest a methodology for correcting this error. The standard attenuation bias suggests that using these
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
Regression Discontinuity Designs Based on Population Thresholds
DEFF Research Database (Denmark)
Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica
In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
Piecewise linear regression splines with hyperbolic covariates
International Nuclear Information System (INIS)
Cologne, John B.; Sposto, Richard
1992-09-01
Consider the problem of fitting a curve to data that exhibit a multiphase linear response with smooth transitions between phases. We propose substituting hyperbolas as covariates in piecewise linear regression splines to obtain curves that are smoothly joined. The method provides an intuitive and easy way to extend the two-phase linear hyperbolic response model of Griffiths and Miller and Watts and Bacon to accommodate more than two linear segments. The resulting regression spline with hyperbolic covariates may be fit by nonlinear regression methods to estimate the degree of curvature between adjoining linear segments. The added complexity of fitting nonlinear, as opposed to linear, regression models is not great. The extra effort is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. We can also estimate the join points (the values of the abscissas where the linear segments would intersect if extrapolated) if their number and approximate locations may be presumed known. An example using data on changing age at menarche in a cohort of Japanese women illustrates the use of the method for exploratory data analysis. (author)
Targeting: Logistic Regression, Special Cases and Extensions
Directory of Open Access Journals (Sweden)
Helmut Schaeben
2014-12-01
Full Text Available Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence does not only corrupt the predicted conditional probabilities, but also their rank transform. Logistic regression models, including the interaction terms, can account for the lack of conditional independence, appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models, with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
Functional data analysis of generalized regression quantiles
Guo, Mengmeng
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Regression testing Ajax applications : Coping with dynamism
Roest, D.; Mesbah, A.; Van Deursen, A.
2009-01-01
Note: This paper is a pre-print of: Danny Roest, Ali Mesbah and Arie van Deursen. Regression Testing AJAX Applications: Coping with Dynamism. In Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST’10), Paris, France. IEEE Computer Society, 2010.
Group-wise partial least square regression
Camacho, José; Saccenti, Edoardo
2018-01-01
This paper introduces the group-wise partial least squares (GPLS) regression. GPLS is a new sparse PLS technique where the sparsity structure is defined in terms of groups of correlated variables, similarly to what is done in the related group-wise principal component analysis. These groups are
Functional data analysis of generalized regression quantiles
Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Hä rdle, Wolfgang Karl
2013-01-01
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Finite Algorithms for Robust Linear Regression
DEFF Research Database (Denmark)
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
Function approximation with polynomial regression slines
International Nuclear Information System (INIS)
Urbanski, P.
1996-01-01
Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Yet another look at MIDAS regression
Ph.H.B.F. Franses (Philip Hans)
2016-01-01
textabstractA MIDAS regression involves a dependent variable observed at a low frequency and independent variables observed at a higher frequency. This paper relates a true high frequency data generating process, where also the dependent variable is observed (hypothetically) at the high frequency,
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over one 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Uber Dementia Infantilis, and discuss…
Fast multi-output relevance vector regression
Ha, Youngmin
2017-01-01
This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V
Regression Equations for Birth Weight Estimation using ...
African Journals Online (AJOL)
In this study, Birth Weight has been estimated from anthropometric measurements of hand and foot. Linear regression equations were formed from each of the measured variables. These simple equations can be used to estimate Birth Weight of new born babies, in order to identify those with low birth weight and referred to ...
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
Highway, Suite 1204, Arlington, Va 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188) Washington DC 20503. 1...Navy submariners, reliability engineering, uncertainty quantification, and financial risk management . Superquantile, superquantile regression...Royset Carlos F. Borges Associate Professor of Operations Research Dissertation Supervisor Professor of Applied Mathematics Lyn R. Whitaker Javier
Measurement Error in Education and Growth Regressions
Portela, M.; Teulings, C.N.; Alessie, R.
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
transformation of independent variables in polynomial regression ...
African Journals Online (AJOL)
Ada
preferable when possible to work with a simple functional form in transformed variables rather than with a more complicated form in the original variables. In this paper, it is shown that linear transformations applied to independent variables in polynomial regression models affect the t ratio and hence the statistical ...
Multiple Linear Regression: A Realistic Reflector.
Nutt, A. T.; Batsell, R. R.
Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…
Mapping Hurricane Rita inland storm tide
Berenbrock, Charles; Mason, Jr., Robert R.; Blanchard, Stephen F.; Simonovic, Slobodan P.
2009-01-01
Flood-inundation data are most useful for decision makers when presented in the context of maps of effected communities and (or) areas. But because the data are scarce and rarely cover the full extent of the flooding, interpolation and extrapolation of the information are needed. Many geographic information systems (GIS) provide various interpolation tools, but these tools often ignore the effects of the topographic and hydraulic features that influence flooding. A barrier mapping method was developed to improve maps of storm tide produced by Hurricane Rita. Maps were developed for the maximum storm tide and at 3-hour intervals from midnight (0000 hour) through noon (1200 hour) on September 24, 2005. The improved maps depict storm-tide elevations and the extent of flooding. The extent of storm-tide inundation from the improved maximum storm-tide map was compared to the extent of flood-inundation from a map prepared by the Federal Emergency Management Agency (FEMA). The boundaries from these two maps generally compared quite well especially along the Calcasieu River. Also a cross-section profile that parallels the Louisiana coast was developed from the maximum storm-tide map and included FEMA high-water marks.
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
On the Use of Second-Order Descriptors To Predict Queueing Behavior of MAPs
DEFF Research Database (Denmark)
Andersen, Allan T.; Nielsen, Bo Friis
2002-01-01
The contributions of this paper are the following: We derive a formula for the IDI (Index of Dispersion for Intervals) for the Markovian Arrival Process (MAP). We show that two-state MAPs with identical fundamental rate, IDI and IDC (Index of Dispersion for Counts), define interval stationary poi...