WorldWideScience

Sample records for ratio outlier detection

  1. Detecting isotopic ratio outliers

    Science.gov (United States)

    Bayne, C. K.; Smith, D. H.

    An alternative method is proposed for improving isotopic ratio estimates. This method mathematically models pulse-count data and uses iterative reweighted Poisson regression to estimate model parameters to calculate the isotopic ratios. This computer-oriented approach provides theoretically better methods than conventional techniques to establish error limits and to identify outliers.

  2. Detecting isotopic ratio outliers

    International Nuclear Information System (INIS)

    Bayne, C.K.; Smith, D.H.

    1986-01-01

An alternative method is proposed for improving isotopic ratio estimates. This method mathematically models pulse-count data and uses iterative reweighted Poisson regression to estimate model parameters to calculate the isotopic ratios. This computer-oriented approach provides theoretically better methods than conventional techniques to establish error limits and to identify outliers.

  3. Detecting isotopic ratio outliers

    International Nuclear Information System (INIS)

    Bayne, C.K.; Smith, D.H.

    1985-01-01

An alternative method is proposed for improving isotopic ratio estimates. This method mathematically models pulse-count data and uses iterative reweighted Poisson regression to estimate model parameters to calculate the isotopic ratios. This computer-oriented approach provides theoretically better methods than conventional techniques to establish error limits and to identify outliers. 6 refs., 3 figs., 3 tabs.

  4. Outlier detection using autoencoders

    CERN Document Server

    Lyudchik, Olga

    2016-01-01

Outlier detection is a crucial part of many data analysis applications. The goal of outlier detection is to separate a core of regular observations from polluting ones, called “outliers”. We propose an outlier detection method using a deep autoencoder. In our research, the proposed method was applied to detect outlier points in the MNIST dataset of handwritten digits. The experimental results show that the proposed method has potential for use in anomaly detection.

  5. Selection of tests for outlier detection

    NARCIS (Netherlands)

    Bossers, H.C.M.; Hurink, Johann L.; Smit, Gerardus Johannes Maria

    Integrated circuits are tested thoroughly in order to meet the high demands on quality. As an additional step, outlier detection is used to detect potential unreliable chips such that quality can be improved further. However, it is often unclear to which tests outlier detection should be applied and

  6. Statistical Outlier Detection for Jury Based Grading Systems

    DEFF Research Database (Denmark)

    Thompson, Mary Kathryn; Clemmensen, Line Katrine Harder; Rosas, Harvey

    2013-01-01

This paper presents an algorithm that was developed to identify statistical outliers from the scores of grading jury members in a large project-based first year design course. The background and requirements for the outlier detection system are presented. The outlier detection algorithm … and the follow-up procedures for score validation and appeals are described in detail. Finally, the impact of various elements of the outlier detection algorithm, their interactions, and the sensitivity of their numerical values are investigated. It is shown that the difference in the mean score produced … by a grading jury before and after a suspected outlier is removed from the mean is the single most effective criterion for identifying potential outliers, but that all of the criteria included in the algorithm have an effect on the outlier detection process …
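The leave-one-out mean-difference criterion described in this record can be sketched in a few lines. This is a generic illustration rather than the paper's algorithm, and the threshold value is an arbitrary assumption:

```python
def loo_mean_shifts(scores):
    """For each juror's score, the change in the jury mean when that
    score is removed: |mean(all) - mean(all minus one)|."""
    n = len(scores)
    total = sum(scores)
    full_mean = total / n
    return [abs(full_mean - (total - s) / (n - 1)) for s in scores]

def flag_suspects(scores, threshold=1.0):
    """Flag scores whose removal shifts the jury mean by more than
    `threshold` points (the threshold here is an illustrative choice)."""
    shifts = loo_mean_shifts(scores)
    return [i for i, d in enumerate(shifts) if d > threshold]

jury = [8.0, 7.5, 8.5, 8.0, 2.0]   # one juror far below the rest
print(flag_suspects(jury))          # -> [4]
```

Removing the low score of 2.0 moves the jury mean from 6.8 to 8.0, a shift of 1.2 points, while removing any other score moves it by less than half a point.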

  7. An improved data clustering algorithm for outlier detection

    Directory of Open Access Journals (Sweden)

    Anant Agarwal

    2016-12-01

Data mining is the extraction of hidden predictive information from large databases. It is a technology with the potential to study and analyze useful information present in data. Data objects that do not fit the general behavior of the data are termed outliers. Outlier detection in databases has numerous applications, such as fraud detection, customized marketing, and the search for terrorism. By definition, outliers are rare occurrences and hence represent a small portion of the data. However, using outlier detection for various purposes is not an easy task. This research proposes a modified PAM (Partitioning Around Medoids) algorithm for detecting outliers. The proposed technique has been implemented in Java. The results produced by the proposed technique are found to be better than those of the existing technique in terms of outliers detected and time complexity.

  8. Spatial Outlier Detection of CO2 Monitoring Data Based on Spatial Local Outlier Factor

    Directory of Open Access Journals (Sweden)

    Liu Xin

    2015-12-01

The spatial local outlier factor (SLOF) algorithm was adopted in this study for spatial outlier detection because of the limitations of traditional static threshold detection. Based on the spatial characteristics of CO2 monitoring data obtained in a carbon capture and storage (CCS) project, a K-Nearest Neighbour (KNN) graph was constructed using the latitude and longitude information of the monitoring points to identify the spatial neighbourhood of each monitoring point. SLOF was then adopted to calculate the outlier degrees of the monitoring points, and the 3σ rule was employed to identify the spatial outliers. Finally, the selection of the K value was analysed and the optimal value selected. The results show that, compared with the static threshold method, the proposed algorithm has a higher detection precision. It can overcome the shortcomings of the static threshold method and improve the accuracy and diversity of local outlier detection, which provides a reliable reference for the safety assessment and warning of CCS monitoring.
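As a simplified stand-in for the SLOF-plus-3σ pipeline this record describes (not the authors' implementation), one can score each point by its mean distance to its k nearest neighbours and flag scores beyond mean + 3·stdev:

```python
import math
import statistics

def knn_outlier_degrees(points, k=3):
    """Outlier degree of each point = mean distance to its k nearest
    neighbours (a crude stand-in for the SLOF score)."""
    degrees = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        degrees.append(sum(dists[:k]) / k)
    return degrees

def three_sigma_flags(degrees):
    """Flag points whose degree exceeds mean + 3 * stdev (the 3-sigma rule)."""
    mu = statistics.mean(degrees)
    sigma = statistics.stdev(degrees)
    return [i for i, d in enumerate(degrees) if d > mu + 3 * sigma]

grid = [(x, y) for x in range(5) for y in range(5)]  # 25 regular points
grid.append((50.0, 50.0))                             # one spatial outlier
print(three_sigma_flags(knn_outlier_degrees(grid)))   # -> [25]
```

The true LOF family instead compares each point's local density with its neighbours' densities, which makes the score relative rather than absolute; the flagging step with the 3σ rule is the same in spirit.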

  9. Outlier Detection and Explanation for Domain Experts

    DEFF Research Database (Denmark)

    Micenková, Barbora

In many data exploratory tasks, extraordinary and rarely occurring patterns called outliers are more interesting than the prevalent ones. For example, they could represent frauds in insurance, intrusions in network and system monitoring, or motion in video surveillance. Decades of research have … to poor overall performance. Furthermore, in many applications some labeled examples of outliers are available but not sufficient in number as training data for standard supervised learning methods. As such, this valuable information is typically ignored. We introduce a new paradigm for outlier … detection where supervised and unsupervised information are combined to improve the performance while reducing the sensitivity to parameters of individual outlier detection algorithms. We do this by learning a new representation using the outliers from outputs of unsupervised outlier detectors as input …

  10. OUTLIER DETECTION IN PARTIAL ERRORS-IN-VARIABLES MODEL

    Directory of Open Access Journals (Sweden)

    JUN ZHAO

The weighted total least squares (WTLS) estimate is very sensitive to outliers in the partial EIV model. A new procedure for detecting outliers based on data snooping is presented in this paper. First, a two-step iterative method of computing the WTLS estimates for the partial EIV model based on standard LS theory is proposed. Second, the corresponding w-test statistics are constructed to detect outliers when the observations and coefficient matrix are contaminated with outliers, and a specific algorithm for detecting outliers is suggested. When the variance factor is unknown, it may be estimated by the least median squares (LMS) method. Finally, simulated data and real data concerning a two-dimensional affine transformation are analyzed. The numerical results show that the new test procedure is able to judge whether the outliers are located in the x component, the y component, or both components of the coordinates when the observations and coefficient matrix are contaminated with outliers.

  11. A Modified Approach for Detection of Outliers

    Directory of Open Access Journals (Sweden)

    Iftikhar Hussain Adil

    2015-04-01

Tukey’s boxplot is a very popular tool for the detection of outliers. It reveals the location, spread and skewness of the data. It works well for detecting outliers when the data are symmetric. When the data are skewed, it places the boundary away from the whisker on the compressed side while declaring erroneous outliers on the extended side of the distribution. Hubert and Vandervieren (2008) adjusted Tukey’s technique to overcome this problem. However, another problem arises: the adjusted boxplot constructs an interval of critical values that can even exceed the extremes of the data. In this situation the adjusted boxplot is unable to detect outliers. This paper gives a solution to this problem, and the proposed approach detects outliers properly. The validity of the technique has been checked by constructing fences around the true 95% values of different distributions. A simulation study was performed by drawing samples of different sizes from chi-square, beta and lognormal distributions. Fences constructed by the modified technique are closer to the true 95% values than those of the adjusted boxplot, which demonstrates its superiority over the existing technique.
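For reference, the classic Tukey fences that this line of work modifies can be computed as follows. This is the symmetric baseline rule, not the paper's proposed adjustment:

```python
import statistics

def tukey_fences(data, k=1.5):
    """Classic Tukey fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def boxplot_outliers(data, k=1.5):
    """Values falling outside the Tukey fences."""
    lo, hi = tukey_fences(data, k)
    return [x for x in data if x < lo or x > hi]

sample = [2, 3, 4, 4, 5, 5, 6, 6, 7, 30]
print(boxplot_outliers(sample))  # -> [30]
```

Because the fences extend the same multiple of the IQR on both sides, skewed data produce the asymmetric failure the abstract describes; the adjusted boxplot replaces the fixed factor 1.5 with skewness-dependent factors.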

  12. Detection of Outliers in Regression Model for Medical Data

    Directory of Open Access Journals (Sweden)

    Stephen Raj S

    2017-07-01

In regression analysis, an outlier is an observation whose residual is large in magnitude compared to other observations in the data set. The detection of outliers and influential points is an important step of regression analysis. Outlier detection methods have been used to detect and remove anomalous values from data. In this paper, we detect the presence of outliers in simple linear regression models for a medical data set. Chatterjee and Hadi noted that ordinary residuals are not appropriate for diagnostic purposes; a transformed version of them is preferable. First, we investigate the presence of outliers based on existing procedures using residuals and standardized residuals. Next, we use a new approach of standardized scores to detect outliers without the use of predicted values. The performance of the new approach was verified with real-life data.

  13. Good and Bad Neighborhood Approximations for Outlier Detection Ensembles

    DEFF Research Database (Denmark)

    Kirner, Evelyn; Schubert, Erich; Zimek, Arthur

    2017-01-01

Outlier detection methods have used approximate neighborhoods in filter-refinement approaches. Outlier detection ensembles have used artificially obfuscated neighborhoods to achieve diverse ensemble members. Here we argue that outlier detection models could be based on approximate neighborhoods … in the first place, thus gaining in both efficiency and effectiveness. It depends, however, on the type of approximation, as only some seem beneficial for the task of outlier detection, while no (large) benefit can be seen for others. In particular, we argue that space-filling curves are beneficial …

  14. Stratification-Based Outlier Detection over the Deep Web.

    Science.gov (United States)

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work on outlier detection never considers the context of the deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over the deep web. In the context of the deep web, users must submit queries through a query interface to retrieve the corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is a new data mining method for outlier detection over the deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in the deep web.

  15. Stratification-Based Outlier Detection over the Deep Web

    OpenAIRE

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S.; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work on outlier detection never considers the context of the deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over the deep web. In the context of the deep web, users must submit queries through a query interface to retrieve the corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribu...

  16. Using Person Fit Statistics to Detect Outliers in Survey Research

    Directory of Open Access Journals (Sweden)

    John M. Felt

    2017-05-01

Context: When working with health-related questionnaires, outlier detection is important. However, traditional methods of outlier detection (e.g., boxplots) can miss participants with “atypical” responses to the questions who otherwise have similar total (subscale) scores. In addition to detecting outliers, it can be of clinical importance to determine the reason for the outlier status or “atypical” response. Objective: The aim of the current study was to illustrate how to derive person fit statistics for outlier detection through a statistical method examining person fit with a health-based questionnaire. Design and Participants: Patients treated for Cushing's syndrome (n = 394) were recruited from the Cushing's Support and Research Foundation's (CSRF) listserv and Facebook page. Main Outcome Measure: Patients were directed to an online survey containing the CushingQoL (English version). A two-dimensional graded response model was estimated, and person fit statistics were generated using the Zh statistic. Results: Conventional outlier detection methods revealed no outliers reflecting extreme scores on the subscales of the CushingQoL. However, person fit statistics identified 18 patients with “atypical” response patterns (Zh > |±2.00|), which would otherwise have been missed. Conclusion: While the conventional methods of outlier detection indicated no outliers, person fit statistics identified several patients with “atypical” response patterns who otherwise appeared average. Person fit statistics allow researchers to delve further into the underlying problems experienced by these “atypical” patients treated for Cushing's syndrome. Annotated code is provided to aid other researchers in using this method.

  17. Spatial Outlier Detection of CO2 Monitoring Data Based on Spatial Local Outlier Factor

    OpenAIRE

    Liu Xin; Zhang Shaoliang; Zheng Pulin

    2015-01-01

    Spatial local outlier factor (SLOF) algorithm was adopted in this study for spatial outlier detection because of the limitations of the traditional static threshold detection. Based on the spatial characteristics of CO2 monitoring data obtained in the carbon capture and storage (CCS) project, the K-Nearest Neighbour (KNN) graph was constructed using the latitude and longitude information of the monitoring points to identify the spatial neighbourhood of the monitoring points. Then ...

  18. Outlier Detection Techniques For Wireless Sensor Networks: A Survey

    NARCIS (Netherlands)

    Zhang, Y.; Meratnia, Nirvana; Havinga, Paul J.M.

    2008-01-01

    In the field of wireless sensor networks, measurements that significantly deviate from the normal pattern of sensed data are considered as outliers. The potential sources of outliers include noise and errors, events, and malicious attacks on the network. Traditional outlier detection techniques are

  19. The good, the bad and the outliers: automated detection of errors and outliers from groundwater hydrographs

    Science.gov (United States)

    Peterson, Tim J.; Western, Andrew W.; Cheng, Xiang

    2018-03-01

Suspicious groundwater-level observations are common and can arise for many reasons, ranging from an unforeseen biophysical process to bore failure and data management errors. Unforeseen observations may provide valuable insights that challenge existing expectations and can be deemed outliers, while monitoring and data handling failures can be deemed errors and, if ignored, may compromise trend analysis and groundwater model calibration. Ideally, outliers and errors should be identified, but to date this has been a subjective process that is not reproducible and is inefficient. This paper presents an approach to objectively and efficiently identify multiple types of errors and outliers. The approach requires only the observed groundwater hydrograph, requires no particular consideration of the hydrogeology, the drivers (e.g. pumping) or the monitoring frequency, and is freely available in the HydroSight toolbox. Herein, the algorithms and time-series model are detailed and applied to four observation bores with varying dynamics. The detection of outliers was most reliable when the observation data were acquired quarterly or more frequently. Outlier detection where the groundwater-level variance is nonstationary or the absolute trend increases rapidly was more challenging, with the former likely to result in an underestimation of the number of outliers and the latter an overestimation.

  20. INCREMENTAL PRINCIPAL COMPONENT ANALYSIS BASED OUTLIER DETECTION METHODS FOR SPATIOTEMPORAL DATA STREAMS

    Directory of Open Access Journals (Sweden)

    A. Bhushan

    2015-07-01

In this paper, we address outliers in spatiotemporal data streams obtained from sensors placed across geographically distributed locations. Outliers may appear in such sensor data due to various reasons such as instrumental error and environmental change. Real-time detection of these outliers is essential to prevent the propagation of errors in subsequent analyses and results. Incremental Principal Component Analysis (IPCA) is one possible approach for detecting outliers in such spatiotemporal data streams. IPCA has been widely used in many real-time applications such as credit card fraud detection, pattern recognition, and image analysis. However, the suitability of applying IPCA for outlier detection in spatiotemporal data streams is unknown and needs to be investigated. To fill this research gap, this paper contributes by presenting two new IPCA-based outlier detection methods and performing a comparative analysis with the existing IPCA-based outlier detection methods to assess their suitability for spatiotemporal sensor data streams.
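Full IPCA is beyond a short sketch, but the incremental-update idea that streaming detectors rely on can be illustrated with Welford's online mean/variance and a z-score flag. This is a simplified univariate stand-in, not one of the paper's methods; the class name and thresholds are invented for illustration:

```python
class StreamingDetector:
    """Online outlier flagging via Welford's incremental mean/variance.
    Illustrates the one-pass, bounded-memory updates behind IPCA-style
    streaming methods, not IPCA itself."""

    def __init__(self, z_cut=3.0, warmup=10):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations
        self.z_cut = z_cut
        self.warmup = warmup

    def update(self, x):
        """Return True if x looks anomalous, then fold it into the model."""
        flagged = False
        if self.n >= self.warmup:
            var = self.m2 / (self.n - 1)
            if var > 0:
                flagged = abs(x - self.mean) / var ** 0.5 > self.z_cut
        # Welford's update: O(1) time and memory per observation
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return flagged

det = StreamingDetector()
readings = [20.0, 20.5, 19.8, 20.2, 20.1, 19.9, 20.3, 20.0, 20.4, 19.7,
            20.1, 35.0, 20.2]   # a spike mid-stream
flags = [det.update(r) for r in readings]
print([i for i, f in enumerate(flags) if f])  # -> [11]
```

IPCA extends the same principle to multivariate data: the principal subspace is updated per observation and points with a large residual from that subspace are flagged.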

  1. Detection of outliers in gas centrifuge experimental data

    International Nuclear Information System (INIS)

    Andrade, Monica C.V.; Nascimento, Claudio A.O.

    2005-01-01

Isotope separation in a gas centrifuge is a very complex process. Development and optimization of a gas centrifuge require experimentation. These data contain experimental errors and, like other experimental data, may include some gross errors, also known as outliers. The detection of outliers in gas centrifuge experimental data may be quite complicated because there is not enough repetition for precise statistical determination, and the physical equations may be applied only to the control of the mass flows. Moreover, the concentrations are poorly predicted by phenomenological models. This paper presents the application of a three-layer feed-forward neural network to the detection of outliers in a very extensive experiment for the analysis of the separation performance of a gas centrifuge. (author)

  2. Adjusted functional boxplots for spatio-temporal data visualization and outlier detection

    KAUST Repository

    Sun, Ying

    2011-10-24

This article proposes a simulation-based method to adjust functional boxplots for correlations when visualizing functional and spatio-temporal data, as well as when detecting outliers. We start by investigating the relationship between spatio-temporal dependence and the empirical outlier detection rule of inflating the 50% central region by a factor of 1.5. Then, we propose to simulate observations without outliers on the basis of a robust estimator of the covariance function of the data. We select the constant factor in the functional boxplot to control the probability of correctly detecting no outliers. Finally, we apply the selected factor to the functional boxplot of the original data. As applications, the factor selection procedure and the adjusted functional boxplots are demonstrated on sea surface temperatures, spatio-temporal precipitation and general circulation model (GCM) data. The outlier detection performance is also compared before and after the factor adjustment. © 2011 John Wiley & Sons, Ltd.

  3. Development of a methodology for the detection of hospital financial outliers using information systems.

    Science.gov (United States)

    Okada, Sachiko; Nagase, Keisuke; Ito, Ayako; Ando, Fumihiko; Nakagawa, Yoshiaki; Okamoto, Kazuya; Kume, Naoto; Takemura, Tadamasa; Kuroda, Tomohiro; Yoshihara, Hiroyuki

    2014-01-01

    Comparison of financial indices helps to illustrate differences in operations and efficiency among similar hospitals. Outlier data tend to influence statistical indices, and so detection of outliers is desirable. Development of a methodology for financial outlier detection using information systems will help to reduce the time and effort required, eliminate the subjective elements in detection of outlier data, and improve the efficiency and quality of analysis. The purpose of this research was to develop such a methodology. Financial outliers were defined based on a case model. An outlier-detection method using the distances between cases in multi-dimensional space is proposed. Experiments using three diagnosis groups indicated successful detection of cases for which the profitability and income structure differed from other cases. Therefore, the method proposed here can be used to detect outliers. Copyright © 2013 John Wiley & Sons, Ltd.

  4. A New Outlier Detection Method for Multidimensional Datasets

    KAUST Repository

    Abdel Messih, Mario A.

    2012-07-01

This study develops a novel hybrid method for outlier detection (HMOD) that combines the ideas of distance-based and density-based methods. The proposed method has two main advantages over most other outlier detection methods. The first is that it works well on both dense and sparse datasets. The second is that, unlike most other outlier detection methods, which require careful parameter setting and prior knowledge of the data, HMOD is not very sensitive to small changes in parameter values within certain parameter ranges. The only required parameter is the number of nearest neighbors. In addition, we made a fully parallelized implementation of HMOD, which makes it very efficient in applications. Moreover, we propose a new way of using outlier detection for redundancy reduction in datasets, in which users can specify a confidence level that evaluates how accurately the less redundant dataset represents the original dataset. HMOD is evaluated on synthetic datasets (dense and mixed “dense and sparse”) and on a bioinformatics problem: redundancy reduction of a dataset of position weight matrices (PWMs) of transcription factor binding sites. In addition, in the process of assessing the performance of our redundancy reduction method, we developed a simple tool that can be used to evaluate the confidence level of the reduced dataset representing the original dataset. The evaluation of the results shows that our method can be used in a wide range of problems.

  5. Outlier Detection with Space Transformation and Spectral Analysis

    DEFF Research Database (Denmark)

    Dang, Xuan-Hong; Micenková, Barbora; Assent, Ira

    2013-01-01

Detecting a small number of outliers from a set of data observations is always challenging. In this paper, we present an approach that exploits space transformation and uses spectral analysis in the newly transformed space for outlier detection. Unlike most existing techniques in the literature, which rely on notions of distances or densities, this approach introduces a novel concept based on local quadratic entropy for evaluating the similarity of a data object with its neighbors. This information-theoretic quantity is used to regularize the closeness amongst data instances and subsequently benefits the process of mapping data into a usually lower dimensional space. Outliers are then identified by spectral analysis of the eigenspace spanned by the set of leading eigenvectors derived from the mapping procedure. The proposed technique is purely data-driven and imposes no assumptions regarding …

  6. Detection of outliers in a gas centrifuge experimental data

    Directory of Open Access Journals (Sweden)

    M. C. V. Andrade

    2005-09-01

Isotope separation with a gas centrifuge is a very complex process. Development and optimization of a gas centrifuge require experimentation. These data contain experimental errors and, like other experimental data, may include some gross errors, also known as outliers. The detection of outliers in gas centrifuge experimental data is quite complicated because there is not enough repetition for precise statistical determination, and the physical equations may be applied only to the control of the mass flow. Moreover, the concentrations are poorly predicted by phenomenological models. This paper presents the application of a three-layer feed-forward neural network to the detection of outliers in the analysis of a very extensive experiment.

  7. On the Evaluation of Outlier Detection and One-Class Classification Methods

    DEFF Research Database (Denmark)

    Swersky, Lorne; Marques, Henrique O.; Sander, Jörg

    2016-01-01

It has been shown that unsupervised outlier detection methods can be adapted to the one-class classification problem. In this paper, we focus on the comparison of one-class classification algorithms with such adapted unsupervised outlier detection methods, improving on previous comparison studies …

  8. Detection of additive outliers in seasonal time series

    DEFF Research Database (Denmark)

    Haldrup, Niels; Montañés, Antonio; Sansó, Andreu

The detection and location of additive outliers in integrated variables has attracted much attention recently because such outliers tend to affect unit root inference, among other things. Most of these procedures have been developed for non-seasonal processes. However, the presence of seasonality …) to deal with data sampled at a seasonal frequency, and the size and power properties are discussed. We also show that the presence of periodic heteroscedasticity will inflate the size of the tests and hence will tend to identify an excessive number of outliers. A modified Perron-Rodriguez test which allows … periodically varying variances is suggested, and it is shown to have excellent properties in terms of both power and size …

  9. Ensemble Learning Method for Outlier Detection and its Application to Astronomical Light Curves

    Science.gov (United States)

    Nun, Isadora; Protopapas, Pavlos; Sim, Brandon; Chen, Wesley

    2016-09-01

Outlier detection is necessary for automated data analysis, with specific applications spanning almost every domain, from financial markets to epidemiology to fraud detection. We introduce a novel mixture-of-experts outlier detection model, which uses a dynamically trained, weighted network of five distinct outlier detection methods. After dimensionality reduction, individual outlier detection methods score each data point for “outlierness” in this new feature space. Our model then uses dynamically trained parameters to weigh the scores of each method, producing a final outlier score. We find that the mixture-of-experts model performs, on average, better than any single expert model in identifying both artificially and manually picked outliers. This mixture model is applied to a data set of astronomical light curves, after dimensionality reduction via time series feature extraction. Our model was tested using three fields from the MACHO catalog and generated a list of anomalous candidates. We confirm that the outliers detected using this method belong to rare classes, like Novae, He-burning, and red giant stars; other outlier light curves identified have no available information associated with them. To elucidate their nature, we created a website containing the light-curve data and information about these objects. Users can attempt to classify the light curves, give conjectures about their identities, and sign up for follow-up messages about the progress made on identifying these objects. These user-submitted data can be used to further train our mixture-of-experts model. Our code is publicly available to all who are interested.
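The score-combination step of a mixture-of-experts detector can be sketched generically. This is illustrative only: the detector scores and weights below are made up, and in the paper the weights are dynamically trained rather than fixed:

```python
def normalize(scores):
    """Min-max scale a list of outlier scores to [0, 1] so that
    detectors with different scales become comparable."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    return [(s - lo) / span for s in scores]

def ensemble_scores(score_lists, weights):
    """Weighted average of several detectors' normalized scores."""
    normed = [normalize(s) for s in score_lists]
    total_w = sum(weights)
    n = len(score_lists[0])
    return [sum(w * s[i] for w, s in zip(weights, normed)) / total_w
            for i in range(n)]

# Scores from three hypothetical detectors over five points;
# point 3 looks anomalous to two of them.
detector_a = [0.1, 0.2, 0.1, 0.9, 0.2]
detector_b = [0.3, 0.1, 0.2, 0.8, 0.1]
detector_c = [0.2, 0.9, 0.1, 0.2, 0.1]
combined = ensemble_scores([detector_a, detector_b, detector_c],
                           weights=[1.0, 1.0, 0.5])
print(max(range(5), key=combined.__getitem__))  # -> 3
```

The point the ensemble illustrates: a single disagreeing detector (here `detector_c`) is outvoted, so the combined score is more robust than any individual expert.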

  10. Elimination of some unknown parameters and its effect on outlier detection

    Directory of Open Access Journals (Sweden)

    Serif Hekimoglu

Outliers in an observation set badly affect all the estimated unknown parameters and residuals, which is why outlier detection is of great importance for reliable estimation results. Tests for outliers (e.g. Baarda's and Pope's tests) are frequently used to detect outliers in geodetic applications. In order to reduce the computational time, elimination of some unknown parameters which are not of interest is sometimes performed. In this case, although the estimated unknown parameters and residuals do not change, the cofactor matrix of the residuals and the redundancies of the observations change. In this study, the effects of the elimination of unknown parameters on tests for outliers have been investigated. We have proved that the redundancies in the initial functional model (IFM) are smaller than those in the reduced functional model (RFM), where elimination is performed. To demonstrate this, a horizontal control network was simulated and many experiments were performed. According to the simulation results, tests for outliers in the IFM are more reliable than those in the RFM.

  11. Application of median-equation approach for outlier detection in geodetic networks

    Directory of Open Access Journals (Sweden)

    Serif Hekimoglu

In geodetic measurements, outliers may sometimes occur in data sets for various reasons. There are two main approaches to detecting outliers: tests for outliers (Baarda's and Pope's tests) and robust methods (the Danish method, the Huber method, etc.). These methods use least squares estimation (LSE). Outliers affect the LSE results; in particular, LSE smears the effects of the outliers onto the good observations, and sometimes wrong results may be obtained. To avoid these effects, a method that does not use LSE should be preferred. The median is a high-breakdown-point estimator, and if it is applied for outlier detection, reliable results can be obtained. In this study, a robust method is proposed that uses the median as a threshold value on the median residuals obtained from median equations. If the a priori variance of the observations is known, the reliability of the new approach is greater than in the case where the a priori variance is unknown.
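A median-based robust flagging rule in the spirit of this record can be sketched with the median and the median absolute deviation (MAD). This is a generic median/MAD sketch, not the authors' median-equation method; the cutoff of 3.5 is a common convention:

```python
import statistics

def mad_outliers(values, cutoff=3.5):
    """Flag values far from the median in units of the median absolute
    deviation (MAD), a high-breakdown alternative to mean/stdev rules."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []
    # 0.6745 rescales MAD so the score is comparable to a z-score
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > cutoff]

residuals = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 5.0, -0.3]
print(mad_outliers(residuals))  # -> [6]
```

Unlike a mean/stdev rule, the median and MAD are barely moved by the outlier itself (breakdown point near 50%), which is the property the abstract appeals to.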

  12. On the Evaluation of Outlier Detection: Measures, Datasets, and an Empirical Study Continued

    DEFF Research Database (Denmark)

    Campos, G. O.; Zimek, A.; Sander, J.

    2016-01-01

    The evaluation of unsupervised outlier detection algorithms is a constant challenge in data mining research. Little is known regarding the strengths and weaknesses of different standard outlier detection models, and the impact of parameter choices for these algorithms. The scarcity of appropriate...... are available online in the repository at: http://www.dbs.ifi.lmu.de/research/outlier-evaluation/...

  13. IVS Combination Center at BKG - Robust Outlier Detection and Weighting Strategies

    Science.gov (United States)

    Bachmann, S.; Lösler, M.

    2012-12-01

    Outlier detection plays an important role within the IVS combination. Even if the original data are the same for all contributing Analysis Centers (ACs), the analyzed data show differences due to analysis software characteristics. The treatment of outliers is thus a fine line between preserving data heterogeneity and eliminating real outliers. Robust outlier detection based on the Least Median of Squares (LMS) is used within the IVS combination. This method allows reliable outlier detection with a small number of input parameters. A similar problem arises for the weighting of the individual solutions within the combination process. Variance component estimation (VCE) is used to control the weighting factor for each AC. The Operator-Software-Impact (OSI) method takes into account that the analyzed data are strongly influenced by the software and the responsible operator. It allows the VCE to be made more sensitive to the diverse input data. This method has already been set up within GNSS data analysis as well as the analysis of troposphere data. The benefit of an OSI realization within the VLBI combination and its potential in weighting factor determination has not been investigated before.
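As a rough illustration of why LMS-based detection is robust (a toy 1-D sketch, not the IVS combination software): the LMS location estimate is the midpoint of the shortest half of the sample, so up to half the data can be corrupted without breaking it.

```python
import numpy as np

def lms_location(x):
    """Least Median of Squares location estimate for 1-D data:
    the midpoint of the shortest half of the sorted sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    h = n // 2 + 1                        # size of a "half" of the sample
    widths = x[h - 1:] - x[:n - h + 1]    # width of every contiguous half
    i = np.argmin(widths)                 # shortest half
    return 0.5 * (x[i] + x[i + h - 1])

def lms_outliers(x, k=2.5):
    """Flag points far from the LMS centre, using a robust scale
    derived from the median squared residual."""
    x = np.asarray(x, dtype=float)
    c = lms_location(x)
    s = 1.4826 * np.sqrt(np.median((x - c) ** 2))  # robust scale
    return np.abs(x - c) > k * s

flags = lms_outliers([1.0, 1.1, 0.9, 1.05, 0.95, 10.0])
```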

  14. Detecting Outlier Microarray Arrays by Correlation and Percentage of Outliers Spots

    Directory of Open Access Journals (Sweden)

    Song Yang

    2006-01-01

    Full Text Available We developed a quality assurance (QA) tool, namely the microarray outlier filter (MOF), and have applied it to our microarray datasets for the identification of problematic arrays. Our approach is based on comparing the arrays using the correlation coefficient and the number of outlier spots generated on each array to reveal outlier arrays. For a human universal reference (HUR) dataset, which is used as a technical control in our standard hybridization procedure, 3 outlier arrays were identified out of 35 experiments. For a human blood dataset, 12 outlier arrays were identified from 185 experiments. In general, arrays from human blood samples displayed greater variation in their gene expression profiles than arrays from HUR samples. As a result, MOF identified two distinct patterns in the occurrence of outlier arrays. These results demonstrate that this methodology is a valuable QA practice to identify questionable microarray data prior to downstream analysis.
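The correlation half of the MOF idea can be sketched as follows (the cutoff value is illustrative; the other half of MOF, counting outlier spots per array, is omitted):

```python
import numpy as np

def outlier_arrays(expr, min_corr=0.8):
    """Flag arrays (rows of the expression matrix) whose median Pearson
    correlation with the other arrays falls below a cutoff."""
    C = np.corrcoef(expr)          # arrays in rows, spots in columns
    np.fill_diagonal(C, np.nan)    # ignore self-correlation
    med = np.nanmedian(C, axis=1)
    return med < min_corr

# Four mutually consistent profiles and one inverted (inconsistent) one
base = np.linspace(0.0, 1.0, 50)
arrays = np.vstack([base, 2 * base + 0.1, 0.5 * base, base + 1.0, base[::-1]])
flags = outlier_arrays(arrays)
```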

  15. Distance Based Method for Outlier Detection of Body Sensor Networks

    Directory of Open Access Journals (Sweden)

    Haibin Zhang

    2016-01-01

    Full Text Available We propose a distance-based method for the outlier detection of body sensor networks. Firstly, we use Kernel Density Estimation (KDE) to calculate the probability of the distance to the k nearest neighbors for the diagnosed data. If the probability is less than a threshold, and the distance of this data point to its left and right neighbors is greater than a pre-defined value, the diagnosed data point is declared an outlier. Furthermore, we formalize a sliding-window-based method to improve the outlier detection performance. Finally, to estimate the KDE from training sensor readings with errors, we introduce a Hidden Markov Model (HMM) based method to estimate the most probable ground-truth values, i.e., those with the maximum probability of producing the training data. Simulation results show that the proposed method achieves good detection accuracy with a low false alarm rate.
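The first step of the method can be sketched as follows (a simplified 1-D version with a fixed KDE bandwidth factor and an illustrative density cutoff; the paper's neighbor-distance check, sliding window, and HMM refinements are omitted):

```python
import numpy as np
from scipy.stats import gaussian_kde

def knn_distances(data, k):
    """Mean distance of each reading to its k nearest neighbours."""
    d = np.abs(data[:, None] - data[None, :])  # pairwise distances (1-D readings)
    d.sort(axis=1)
    return d[:, 1:k + 1].mean(axis=1)          # column 0 is the self-distance

def kde_outliers(data, k=3, p_min=0.2, bw=0.1):
    """Flag readings whose k-NN distance lies in a low-density region of
    a KDE fitted to all k-NN distances (bw is a fixed bandwidth factor)."""
    dist = knn_distances(np.asarray(data, dtype=float), k)
    dens = gaussian_kde(dist, bw_method=bw)(dist)
    return dens < p_min * dens.max()

readings = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.97, 8.0]
flags = kde_outliers(readings)
```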

  16. A new approach for assessing the state of environment using isometric log-ratio transformation and outlier detection for computation of mean PCDD/F patterns in biota.

    Science.gov (United States)

    Lehmann, René

    2015-01-01

    To assess the state of the environment, various compartments are examined as part of monitoring programs. Within monitoring, a special focus is on chemical pollution. One of the most toxic substances ever synthesized is the well-known dioxin 2,3,7,8-TCDD (2,3,7,8-tetrachlorodibenzo-p-dioxin). Other PCDD/Fs (polychlorinated dibenzo-p-dioxins and furans) can be toxic as well. They are ubiquitous and persistent in various environmental compartments. Assessing the state of the environment requires knowledge of typical local patterns of PCDD/F for as many compartments as possible. For various species of wild animals and plants (so-called biota), I present the mean local congener profiles of ubiquitous PCDD/F contamination, reflecting typical patterns and levels of environmental burden for various years. Trends in time series of means can indicate the success or failure of a PCDD/F reduction measure. For short time series of mean patterns, it can be hard to detect trends. A new approach regarding proportions of outliers in the corresponding annual cross-sectional data sets in parallel can help detect decreasing or increasing environmental burden and support the analysis of time series. Further, in this article, the true structure of PCDD/F data in biota is revealed, namely their compositional data structure. It prevents direct application of standard statistical procedures to the raw data, rendering the results of such analyses meaningless. Results indicate that the compositional data structure of PCDD/F in biota is of great interest and should be taken into account in future studies. The isometric log-ratio (ilr) transformation is used, yielding data to which standard statistical procedures can be applied. Focusing on the identification of typical PCDD/F patterns in biota, outliers are removed from the annual data since they represent an extraordinary situation in the environment. Identification of outliers yields two advantages. 
First, typical (mean) profiles and levels of PCDD/F contamination
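For illustration, here is a minimal pivot-coordinate version of the ilr transform (one common orthonormal basis; the article does not specify which basis was used):

```python
import numpy as np

def ilr(x):
    """Isometric log-ratio transform of a composition x (all parts > 0).
    Pivot (Helmert-type) basis: the i-th coordinate contrasts the
    geometric mean of the first i parts with part i+1."""
    x = np.asarray(x, dtype=float)
    x = x / x.sum()                          # closure to proportions
    D = len(x)
    z = np.empty(D - 1)
    for i in range(1, D):
        g = np.exp(np.mean(np.log(x[:i])))   # geometric mean of first i parts
        z[i - 1] = np.sqrt(i / (i + 1)) * np.log(g / x[i])
    return z
```

The transformed coordinates live in ordinary Euclidean space, so standard outlier tests and mean estimates become meaningful for compositional congener profiles.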

  17. Algorithms for Speeding up Distance-Based Outlier Detection

    Data.gov (United States)

    National Aeronautics and Space Administration — The problem of distance-based outlier detection is difficult to solve efficiently in very large datasets because of potential quadratic time complexity. We address...

  18. Comparative Study of Outlier Detection Algorithms via Fundamental Analysis Variables: An Application on Firms Listed in Borsa Istanbul

    Directory of Open Access Journals (Sweden)

    Senol Emir

    2016-04-01

    Full Text Available In a data set, an outlier refers to a data point that is considerably different from the others. Detecting outliers provides useful application-specific insights and helps choose the right prediction models. Outlier detection (also known as anomaly detection or novelty detection) has been studied in statistics and machine learning for a long time. It is an essential preprocessing step of the data mining process. In this study, the outlier detection step in the data mining process is applied to identify the top 20 outlier firms. Three outlier detection algorithms are applied to fundamental analysis variables of firms listed in Borsa Istanbul for the 2011-2014 period. The results of each algorithm are presented and compared. Findings show that 15 different firms are identified by the three outlier detection methods. KCHOL and SAHOL have the greatest number of appearances, with 12 observations, among these firms. By investigating the results, it is concluded that each of the three algorithms produces a different outlier firm list due to differences in their approaches to outlier detection.

  19. An Unbiased Distance-based Outlier Detection Approach for High-dimensional Data

    DEFF Research Database (Denmark)

    Nguyen, Hoang Vu; Gopalkrishnan, Vivekanand; Assent, Ira

    2011-01-01

    than a global property. Different from existing approaches, it is not grid-based and dimensionality unbiased. Thus, its performance is impervious to grid resolution as well as the curse of dimensionality. In addition, our approach ranks the outliers, allowing users to select the number of desired...... outliers, thus mitigating the issue of high false alarm rate. Extensive empirical studies on real datasets show that our approach efficiently and effectively detects outliers, even in high-dimensional spaces....

  20. A NOTE ON THE CONVENTIONAL OUTLIER DETECTION TEST PROCEDURES

    Directory of Open Access Journals (Sweden)

    JIANFENG GUO

    Full Text Available Under the assumption that the variance-covariance matrix is fully populated, Baarda's w-test turns out to be completely different from the standardized least-squares residual. Unfortunately, this is not generally recognized. In the limiting case of only one degree of freedom, all three types of test statistics, including the Gaussian normal test, Student's t-test and Pope's Tau-test, are invalid for the identification of outliers: (1) the squares of the Gaussian normal test statistic all coincide with the goodness-of-fit (global) test statistic, even for correlated observations; hence, the failure of the global test implies that all the observations will be flagged as outliers, and thus the Gaussian normal test is inconclusive for localization of outliers; (2) the absolute values of the Tau-test statistic are all exactly equal to one, no matter whether the observations are contaminated, so the Tau-test cannot work for outlier detection in this situation; and (3) Student's t-test statistics are undefined.
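Claim (2) is easy to verify numerically: with redundancy r = n − m = 1 the residual vector is proportional to the single basis vector of the residual space, so Pope's tau statistics all reduce to ±1. A quick sketch, assuming uncorrelated equal-weight observations:

```python
import numpy as np

# Linear model y = A x + e with n = 4 observations and m = 3 parameters,
# so the redundancy (degrees of freedom) is r = n - m = 1.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))
y = rng.normal(size=4)

x_hat = np.linalg.lstsq(A, y, rcond=None)[0]
v = y - A @ x_hat                                   # residuals
Qvv = np.eye(4) - A @ np.linalg.inv(A.T @ A) @ A.T  # cofactor matrix of residuals
sigma0 = np.sqrt(v @ v / 1.0)                       # a posteriori sigma, r = 1
tau = v / (sigma0 * np.sqrt(np.diag(Qvv)))          # Pope's tau statistics
```

With r = 1, Qvv is a rank-one projector u·uᵀ and v = u(uᵀy), so each tau equals u_i(uᵀy) / (|uᵀy||u_i|) = ±1 regardless of contamination.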

  1. A Distributed Algorithm for the Cluster-Based Outlier Detection Using Unsupervised Extreme Learning Machines

    Directory of Open Access Journals (Sweden)

    Xite Wang

    2017-01-01

    Full Text Available Outlier detection is an important data mining task whose target is to find abnormal or atypical objects in a given dataset. Techniques for detecting outliers have many applications, such as credit card fraud detection and environmental monitoring. Our previous work proposed the Cluster-Based (CB) outlier and gave a centralized method using unsupervised extreme learning machines to compute CB outliers. In this paper, we propose a new distributed algorithm for CB outlier detection (DACB). On the master node, we collect a small number of points from the slave nodes to obtain a threshold. On each slave node, we design a new filtering method that uses the threshold to efficiently speed up the computation. Furthermore, we also propose a ranking method to optimize the order of cluster scanning. Finally, the effectiveness and efficiency of the proposed approaches are verified through extensive simulation experiments.

  2. Electricity Price Forecasting Based on AOSVR and Outlier Detection

    Institute of Scientific and Technical Information of China (English)

    Zhou Dianmin; Gao Lin; Gao Feng

    2005-01-01

    Electricity price is the first consideration for all participants in the electric power market, and its characteristics are related to both the market mechanism and variation in the behavior of market participants. It is necessary to build a real-time price forecasting model with adaptive capability, and because there are outliers in the price data, they should be detected and filtered out when training the forecasting model by regression. In view of these points, this paper presents an electricity price forecasting method based on accurate on-line support vector regression (AOSVR) and outlier detection. Numerical testing results show that the method is effective in forecasting electricity prices in the electric power market.

  3. An Improved Semisupervised Outlier Detection Algorithm Based on Adaptive Feature Weighted Clustering

    Directory of Open Access Journals (Sweden)

    Tingquan Deng

    2016-01-01

    Full Text Available Various approaches to outlier detection already exist, among which semisupervised methods achieve encouraging superiority thanks to the introduction of prior knowledge. In this paper, an adaptive feature-weighted clustering-based semisupervised outlier detection strategy is proposed. This method maximizes the membership degree of a labeled normal object to the cluster it belongs to and minimizes the membership degrees of a labeled outlier to all clusters. Considering the distinct significance of features or components of a dataset in determining whether an object is an inlier or outlier, each feature is adaptively assigned a different weight according to the deviation between this feature over all objects and that of a certain cluster prototype. A series of experiments on a synthetic dataset and several real-world datasets verify the effectiveness and efficiency of the proposed approach.

  4. Outlier Detection in GNSS Pseudo-Range/Doppler Measurements for Robust Localization

    Directory of Open Access Journals (Sweden)

    Salim Zair

    2016-04-01

    Full Text Available In urban areas or space-constrained environments with obstacles, vehicle localization using Global Navigation Satellite System (GNSS) data is hindered by Non-Line-Of-Sight (NLOS) and multipath receptions. These phenomena induce faulty data that disrupt the precise localization of the GNSS receiver. In this study, we detect the outliers among the observations, Pseudo-Range (PR) and/or Doppler measurements, and we evaluate how discarding them improves the localization. We specify an a contrario model for GNSS raw data to derive an algorithm that partitions the dataset between inliers and outliers. Then, only the inlier data are considered in the localization process, performed either through a classical Particle Filter (PF) or a Rao-Blackwellization (RB) approach. Both localization algorithms exclusively use GNSS data, but they differ by the way Doppler measurements are processed. An experiment has been performed with a GPS receiver aboard a vehicle. Results show that the proposed algorithms are able to detect the outliers in the raw data while being robust to non-Gaussian noise and to intermittent satellite blockage. We compare the performance achieved when estimating only PR outliers with that achieved when estimating both PR and Doppler outliers. The best localization is achieved using the RB approach coupled with PR-Doppler outlier estimation.

  5. Multivariate Functional Data Visualization and Outlier Detection

    KAUST Repository

    Dai, Wenlin

    2017-03-19

    This article proposes a new graphical tool, the magnitude-shape (MS) plot, for visualizing both the magnitude and shape outlyingness of multivariate functional data. The proposed tool builds on the recent notion of functional directional outlyingness, which measures the centrality of functional data by simultaneously considering the level and the direction of their deviation from the central region. The MS-plot intuitively presents not only levels but also directions of magnitude outlyingness on the horizontal axis or plane, and demonstrates shape outlyingness on the vertical axis. A dividing curve or surface is provided to separate non-outlying data from the outliers. Both the simulated data and the practical examples confirm that the MS-plot is superior to existing tools for visualizing centrality and detecting outliers for functional data.

  6. Multivariate Functional Data Visualization and Outlier Detection

    KAUST Repository

    Dai, Wenlin; Genton, Marc G.

    2017-01-01

    This article proposes a new graphical tool, the magnitude-shape (MS) plot, for visualizing both the magnitude and shape outlyingness of multivariate functional data. The proposed tool builds on the recent notion of functional directional outlyingness, which measures the centrality of functional data by simultaneously considering the level and the direction of their deviation from the central region. The MS-plot intuitively presents not only levels but also directions of magnitude outlyingness on the horizontal axis or plane, and demonstrates shape outlyingness on the vertical axis. A dividing curve or surface is provided to separate non-outlying data from the outliers. Both the simulated data and the practical examples confirm that the MS-plot is superior to existing tools for visualizing centrality and detecting outliers for functional data.

  7. Why General Outlier Detection Techniques Do Not Suffice For Wireless Sensor Networks?

    NARCIS (Netherlands)

    Zhang, Y.; Meratnia, Nirvana; Havinga, Paul J.M.

    2009-01-01

    Raw data collected in wireless sensor networks are often unreliable and inaccurate due to noise, faulty sensors and harsh environmental effects. Sensor data that significantly deviate from normal pattern of sensed data are often called outliers. Outlier detection in wireless sensor networks aims at

  8. Detection of Outliers in Panel Data of Intervention Effects Model Based on Variance of Remainder Disturbance

    Directory of Open Access Journals (Sweden)

    Yanfang Lyu

    2015-01-01

    Full Text Available The presence of outliers can result in seriously biased parameter estimates. In order to detect outliers in panel data models, this paper presents a modeling method to assess the intervention effects based on the variance of remainder disturbance using an arbitrary strictly positive twice continuously differentiable function. This paper also provides a Lagrange Multiplier (LM approach to detect and identify a general type of outlier. Furthermore, fixed effects models and random effects models are discussed to identify outliers and the corresponding LM test statistics are given. The LM test statistics for an individual-based model to detect outliers are given as a particular case. Finally, this paper performs an application using panel data and explains the advantages of the proposed method.

  9. System and Method for Outlier Detection via Estimating Clusters

    Science.gov (United States)

    Iverson, David J. (Inventor)

    2016-01-01

    An efficient method and system for real-time or offline analysis of multivariate sensor data for use in anomaly detection, fault detection, and system health monitoring is provided. Models automatically derived from training data, typically nominal system data acquired from sensors in normally operating conditions or from detailed simulations, are used to identify unusual, out of family data samples (outliers) that indicate possible system failure or degradation. Outliers are determined through analyzing a degree of deviation of current system behavior from the models formed from the nominal system data. The deviation of current system behavior is presented as an easy to interpret numerical score along with a measure of the relative contribution of each system parameter to any off-nominal deviation. The techniques described herein may also be used to "clean" the training data.
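The scoring idea can be sketched in a toy form (distance to the nearest nominal training vector instead of the patented cluster model; names are illustrative):

```python
import numpy as np

def deviation_score(nominal, sample):
    """Toy stand-in for a model-based monitor: the score is the Euclidean
    distance from the sample to the closest nominal training vector, and
    the contributions are the per-parameter components of that distance."""
    diffs = nominal - sample                # (n_train, n_params)
    d = np.linalg.norm(diffs, axis=1)
    i = np.argmin(d)                        # closest nominal behaviour
    contrib = np.abs(diffs[i])              # per-parameter deviation
    return d[i], contrib

nominal = np.array([[0.0, 0.0], [1.0, 1.0]])  # training data, normal operation
score, contrib = deviation_score(nominal, np.array([1.0, 3.0]))
```

A single interpretable score plus per-parameter contributions mirrors the way the described system reports off-nominal deviations.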

  10. Shape based kinetic outlier detection in real-time PCR

    Directory of Open Access Journals (Sweden)

    D'Atri Mario

    2010-04-01

    Full Text Available Abstract Background Real-time PCR has recently become the technique of choice for absolute and relative nucleic acid quantification. The gold standard quantification method in real-time PCR assumes that the compared samples have similar PCR efficiency. However, many factors present in biological samples affect PCR kinetics, confounding quantification analysis. In this work we propose a new strategy to detect outlier samples, called SOD. Results The Richards function was fitted to the fluorescence readings to parameterize the amplification curves. There was no significant correlation between the calculated amplification parameters (plateau, slope and y-coordinate of the inflection point) and the log of input DNA, demonstrating that this approach can be used to obtain a "fingerprint" for each amplification curve. To identify the outlier runs, the calculated parameters of each unknown sample were compared to those of the standard samples. When a significant underestimation of starting DNA molecules was found, due to the presence of biological inhibitors such as tannic acid, IgG or quercetin, SOD efficiently marked these amplification profiles as outliers. SOD was subsequently compared with KOD, the current approach based on PCR efficiency estimation. The data obtained showed that SOD was more sensitive than KOD, whereas SOD and KOD were equally specific. Conclusion Our results demonstrated, for the first time, that outlier detection can be based on amplification shape instead of PCR efficiency. SOD represents an improvement in real-time PCR analysis because it decreases the variance of the data, thus increasing the reliability of quantification.
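The curve-parameterization step can be sketched with SciPy (one common Richards parameterization, fitted to synthetic data; SOD's exact functional form and the statistical comparison against standard samples are not reproduced here):

```python
import numpy as np
from scipy.optimize import curve_fit

def richards(t, plateau, k, t0, d):
    """Richards growth curve; the shape parameter d = 1 gives the
    ordinary logistic curve."""
    return plateau * (1.0 + d * np.exp(-k * (t - t0))) ** (-1.0 / d)

# Synthetic amplification profile (cycle number vs. fluorescence) with mild noise
t = np.arange(40, dtype=float)
rng = np.random.default_rng(1)
y = richards(t, 100.0, 0.4, 20.0, 1.0) + rng.normal(scale=0.5, size=t.size)

popt, _ = curve_fit(richards, t, y, p0=[90.0, 0.3, 18.0, 1.0],
                    bounds=([1.0, 0.01, 0.0, 0.1], [500.0, 5.0, 40.0, 10.0]))
plateau, k, t0, d = popt
y_inflection = plateau * (1.0 + d) ** (-1.0 / d)   # one "fingerprint" coordinate
```

The fitted (plateau, slope, inflection) triple serves as the per-curve fingerprint that is then compared between unknown and standard samples.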

  11. Explaining outliers by subspace separability

    DEFF Research Database (Denmark)

    Micenková, Barbora; Ng, Raymond T.; Dang, Xuan-Hong

    2013-01-01

    Outliers are extraordinary objects in a data collection. Depending on the domain, they may represent errors, fraudulent activities or rare events that are subject of our interest. Existing approaches focus on detection of outliers or degrees of outlierness (ranking), but do not provide a possible...... with any existing outlier detection algorithm and it also includes a heuristic that gives a substantial speedup over the baseline strategy....

  12. Detection of Outliers and Imputing of Missing Values for Water Quality UV-VIS Absorbance Time Series

    OpenAIRE

    Plazas-Nossa, Leonardo; Ávila Angulo, Miguel Antonio; Torres, Andrés

    2017-01-01

    Context: The UV-Vis absorbance collection using online optical sensors for water quality detection may yield outliers and/or missing values. Therefore, pre-processing to correct these anomalies is required to improve the analysis of monitoring data. The aim of this study is to propose a method to detect outliers as well as to fill in the gaps in time series. Method: Outliers are detected using a Winsorising procedure and the application of the Discrete Fourier Transform (DFT) and the Inverse of F...
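The Winsorising step can be sketched as clamping at empirical quantiles (the cut levels here are an assumption; the abstract does not state which were used):

```python
import numpy as np

def winsorize(x, lower=0.05, upper=0.95):
    """Clamp values outside the given quantiles to the quantile bounds,
    a simple way to tame outliers before spectral (DFT) processing."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

w = winsorize(np.arange(101.0))   # extreme 5% at each tail is clamped
```

Unlike trimming, Winsorising keeps the sample size intact, which matters when the series is subsequently transformed with the DFT.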

  13. Adaptive distributed outlier detection for WSNs.

    Science.gov (United States)

    De Paola, Alessandra; Gaglio, Salvatore; Lo Re, Giuseppe; Milazzo, Fabrizio; Ortolani, Marco

    2015-05-01

    The paradigm of pervasive computing is gaining more and more attention nowadays, thanks to the possibility of obtaining precise and continuous monitoring. Ease of deployment and adaptivity are typically implemented by adopting autonomous and cooperative sensory devices; however, for such systems to be of any practical use, reliability and fault tolerance must be guaranteed, for instance by detecting corrupted readings amidst the huge amount of gathered sensory data. This paper proposes an adaptive distributed Bayesian approach for detecting outliers in data collected by a wireless sensor network; our algorithm aims at optimizing classification accuracy, time complexity and communication complexity, and also considering externally imposed constraints on such conflicting goals. The performed experimental evaluation showed that our approach is able to improve the considered metrics for latency and energy consumption, with limited impact on classification accuracy.

  14. Outlier Detection in Structural Time Series Models

    DEFF Research Database (Denmark)

    Marczak, Martyna; Proietti, Tommaso

    investigate via Monte Carlo simulations how this approach performs for detecting additive outliers and level shifts in the analysis of nonstationary seasonal time series. The reference model is the basic structural model, featuring a local linear trend, possibly integrated of order two, stochastic seasonality......Structural change affects the estimation of economic signals, like the underlying growth rate or the seasonally adjusted series. An important issue, which has attracted a great deal of attention also in the seasonal adjustment literature, is its detection by an expert procedure. The general......–to–specific approach to the detection of structural change, currently implemented in Autometrics via indicator saturation, has proven to be both practical and effective in the context of stationary dynamic regression models and unit–root autoregressions. By focusing on impulse– and step–indicator saturation, we...

  15. Learning Outlier Ensembles

    DEFF Research Database (Denmark)

    Micenková, Barbora; McWilliams, Brian; Assent, Ira

    into the existing unsupervised algorithms. In this paper, we show how to use powerful machine learning approaches to combine labeled examples together with arbitrary unsupervised outlier scoring algorithms. We aim to get the best out of the two worlds—supervised and unsupervised. Our approach is also a viable......Years of research in unsupervised outlier detection have produced numerous algorithms to score data according to their exceptionality. However, the nature of outliers heavily depends on the application context, and different algorithms are sensitive to outliers of different nature. This makes it very...... difficult to assess suitability of a particular algorithm without a priori knowledge. On the other hand, in many applications, some examples of outliers exist or can be obtained in addition to the vast amount of unlabeled data. Unfortunately, this extra knowledge cannot be simply incorporated...

  16. Adjusted functional boxplots for spatio-temporal data visualization and outlier detection

    KAUST Repository

    Sun, Ying; Genton, Marc G.

    2011-01-01

    This article proposes a simulation-based method to adjust functional boxplots for correlations when visualizing functional and spatio-temporal data, as well as detecting outliers. We start by investigating the relationship between the spatio

  17. Music Outlier Detection Using Multiple Sequence Alignment and Independent Ensembles

    NARCIS (Netherlands)

    Bountouridis, D.; Koops, Hendrik Vincent; Wiering, F.; Veltkamp, R.C.

    2016-01-01

    The automated retrieval of related music documents, such as cover songs or folk melodies belonging to the same tune, has been an important task in the field of Music Information Retrieval (MIR). Yet outlier detection, the process of identifying those documents that deviate significantly from the

  18. Outlier Detection in Nonlinear Regression Using the Likelihood Displacement Statistical Method

    Directory of Open Access Journals (Sweden)

    Siti Tabi'atul Hasanah

    2012-11-01

    Full Text Available An outlier is an observation that is very different (extreme) from the other observations, or data that do not follow the general pattern of the model. Sometimes outliers provide information that cannot be provided by other data; that is why outliers should not simply be eliminated. Outliers can also be influential observations. There are many methods that can be used to detect outliers. Previous studies addressed outlier detection in linear regression; here, outlier detection in nonlinear regression is developed, devoted specifically to multiplicative nonlinear regression. Detection uses the likelihood displacement (LD) statistical method, which detects outliers by removing the suspected outlier data. The maximum likelihood method is used to estimate the parameters, yielding the maximum likelihood estimates. Using the LD method, the observations suspected of being outliers are obtained. The accuracy of the LD method in detecting outliers is then shown by comparing the MSE of LD with the MSE of the regression in general. The test statistic used is Λ. When the null hypothesis is rejected, the observation is declared an outlier.

  19. Time Series Outlier Detection Based on Sliding Window Prediction

    Directory of Open Access Journals (Sweden)

    Yufeng Yu

    2014-01-01

    Full Text Available In order to detect outliers in hydrological time series data, for improving data quality and the quality of decisions related to the design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that can be used to identify data that deviate from historical patterns. The method first builds a forecasting model on the historical data and then uses it to predict future values. Anomalies are assumed to take place if the observed values fall outside a given prediction confidence interval (PCI), which can be calculated from the predicted value and a confidence coefficient. The use of the PCI as a threshold is based mainly on the fact that it accounts for the uncertainty in the data series parameters of the forecasting model, addressing the problem of suitable threshold selection. The method performs fast, incremental evaluation of data as they become available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different real-world hydrologic time series showed that the proposed method is fast, correctly identifies abnormal data, and can be used for hydrologic time series analysis.
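A minimal version of the sliding-window scheme (using a moving-average forecast and a z·std band as the PCI; the paper's actual forecasting model and confidence coefficient may differ):

```python
import numpy as np

def pci_outliers(series, window=5, z=3.0):
    """Sliding-window outlier test: predict the next value as the mean of
    the previous `window` values and flag observations falling outside the
    prediction confidence interval mean ± z*std. The first `window`
    observations are not tested."""
    x = np.asarray(series, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    for t in range(window, x.size):
        w = x[t - window:t]
        pred, spread = w.mean(), w.std(ddof=1)
        flags[t] = abs(x[t] - pred) > z * spread
    return flags

flags = pci_outliers([1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 9.0, 1.0])
```

Note that an outlier entering the window temporarily inflates the interval for the following points; real implementations typically replace flagged values with the prediction before sliding on.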

  20. [Outlier sample discriminating methods for building calibration model in melons quality detecting using NIR spectra].

    Science.gov (United States)

    Tian, Hai-Qing; Wang, Chun-Guang; Zhang, Hai-Jun; Yu, Zhi-Hong; Li, Jian-Kang

    2012-11-01

    Outlier samples strongly influence the precision of the calibration model in soluble solids content measurement of melons using NIR spectra. According to the possible sources of outlier samples, three methods (predicted concentration residual test; Chauvenet test; leverage and studentized residual test) were used to discriminate these outliers. Nine suspicious outliers were detected from a calibration set of 85 fruit samples. Considering that the 9 suspicious outlier samples may contain some non-outlier samples, they were returned to the model one by one to see whether they influenced the model and prediction precision. In this way, 5 samples which were helpful to the model rejoined the calibration set, and a new model was developed with a correlation coefficient (r) of 0.889 and a root mean square error of calibration (RMSEC) of 0.601 degrees Brix. For 35 unknown samples, the root mean square error of prediction (RMSEP) was 0.854 degrees Brix. The performance of this model was better than that developed when no outliers were eliminated from the calibration set (r = 0.797, RMSEC = 0.849 degrees Brix, RMSEP = 1.19 degrees Brix), and more representative and stable than when all 9 samples were eliminated from the calibration set (r = 0.892, RMSEC = 0.605 degrees Brix, RMSEP = 0.862 degrees Brix).

  1. Iterative Outlier Removal: A Method for Identifying Outliers in Laboratory Recalibration Studies.

    Science.gov (United States)

    Parrinello, Christina M; Grams, Morgan E; Sang, Yingying; Couper, David; Wruck, Lisa M; Li, Danni; Eckfeldt, John H; Selvin, Elizabeth; Coresh, Josef

    2016-07-01

    Extreme values that arise for any reason, including those through nonlaboratory measurement procedure-related processes (inadequate mixing, evaporation, mislabeling), lead to outliers and inflate errors in recalibration studies. We present an approach termed iterative outlier removal (IOR) for identifying such outliers. We previously identified substantial laboratory drift in uric acid measurements in the Atherosclerosis Risk in Communities (ARIC) Study over time. Serum uric acid was originally measured in 1990-1992 on a Coulter DACOS instrument using an uricase-based measurement procedure. To recalibrate previous measured concentrations to a newer enzymatic colorimetric measurement procedure, uric acid was remeasured in 200 participants from stored plasma in 2011-2013 on a Beckman Olympus 480 autoanalyzer. To conduct IOR, we excluded data points >3 SDs from the mean difference. We continued this process using the resulting data until no outliers remained. IOR detected more outliers and yielded greater precision in simulation. The original mean difference (SD) in uric acid was 1.25 (0.62) mg/dL. After 4 iterations, 9 outliers were excluded, and the mean difference (SD) was 1.23 (0.45) mg/dL. Conducting only one round of outlier removal (standard approach) would have excluded 4 outliers [mean difference (SD) = 1.22 (0.51) mg/dL]. Applying the recalibration (derived from Deming regression) from each approach to the original measurements, the prevalence of hyperuricemia (>7 mg/dL) was 28.5% before IOR and 8.5% after IOR. IOR is a useful method for removal of extreme outliers irrelevant to recalibrating laboratory measurements, and identifies more extraneous outliers than the standard approach. © 2016 American Association for Clinical Chemistry.
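The IOR loop described above is straightforward to sketch:

```python
import numpy as np

def iterative_outlier_removal(diffs, k=3.0):
    """Iterative outlier removal (IOR): repeatedly drop points more than
    k standard deviations from the mean of the remaining data, until no
    such points are left. Returns the kept values and an outlier mask."""
    d = np.asarray(diffs, dtype=float)
    keep = np.ones(d.size, dtype=bool)
    while True:
        m, s = d[keep].mean(), d[keep].std(ddof=1)
        new = keep & (np.abs(d - m) <= k * s)
        if new.sum() == keep.sum():        # converged: no point removed
            return d[new], ~new
        keep = new

# 20 plausible measurement differences plus one gross error
d = np.concatenate([np.tile([1.0, 1.1, 1.2, 1.3, 1.4], 4), [15.0]])
kept, outmask = iterative_outlier_removal(d)
```

Because the mean and SD are recomputed after each pass, outliers that were masked by even larger outliers in the first pass can still be caught later, which is why IOR finds more extreme points than a single round of 3-SD trimming.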

  2. Efficient estimation of dynamic density functions with an application to outlier detection

    KAUST Repository

    Qahtan, Abdulhakim Ali Ali; Zhang, Xiangliang; Wang, Suojin

    2012-01-01

    In this paper, we propose a new method to estimate the dynamic density over data streams, named KDE-Track, as it is based on the conventional and widely used Kernel Density Estimation (KDE) method. KDE-Track can efficiently estimate the density with linear complexity by using interpolation on a kernel model, which is incrementally updated upon the arrival of streaming data. Both theoretical analysis and experimental validation show that KDE-Track outperforms traditional KDE and a baseline method, Cluster-Kernels, in estimation accuracy for the complex density structures in data streams, computing time, and memory usage. KDE-Track is also shown to promptly capture the dynamic density of synthetic and real-world data. In addition, KDE-Track is used to accurately detect outliers in sensor data and is compared with two existing methods developed for detecting outliers and cleaning sensor data. © 2012 ACM.
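
As a rough illustration of the underlying idea only (plain batch Gaussian KDE; KDE-Track's interpolation model and incremental updates are not reproduced here), a low estimated density can serve directly as an outlier score:

```python
import numpy as np

def kde_density(train, query, bandwidth=0.5):
    """Plain Gaussian KDE: average kernel mass at each query point."""
    d = (query[:, None] - train[None, :]) / bandwidth
    k = np.exp(-0.5 * d**2) / np.sqrt(2.0 * np.pi)
    return k.mean(axis=1) / bandwidth

rng = np.random.default_rng(1)
stream = rng.normal(0.0, 1.0, 500)       # stand-in for a data stream window
points = np.array([0.0, 8.0])            # 8.0 lies far outside the bulk
dens = kde_density(stream, points)
print(dens[0] > dens[1])
```

Points whose density falls below a chosen quantile of the training densities would be flagged; the bandwidth and data here are illustrative assumptions.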

  3. Outlier detection by robust Mahalanobis distance in geological data obtained by INAA to provenance studies

    Energy Technology Data Exchange (ETDEWEB)

    Santos, Jose O. dos, E-mail: osmansantos@ig.com.br [Instituto Federal de Educacao, Ciencia e Tecnologia de Sergipe (IFS), Lagarto, SE (Brazil); Munita, Casimiro S., E-mail: camunita@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil); Soares, Emilio A.A., E-mail: easoares@ufan.edu.br [Universidade Federal do Amazonas (UFAM), Manaus, AM (Brazil). Dept. de Geociencias

    2013-07-01

    The detection of outliers in geochemical studies is one of the main difficulties in the interpretation of a dataset, because outliers can disturb the statistical methods. The search for outliers in geochemical studies is usually based on the Mahalanobis distance (MD), since points in multivariate space that lie farther than some predetermined distance from the center of the data are considered outliers. However, the MD is very sensitive to the presence of discrepant samples. Many robust estimators of location and covariance have been introduced in the literature, such as the Minimum Covariance Determinant (MCD) estimator. Using MCD estimators to calculate the MD leads to the so-called Robust Mahalanobis Distance (RD). In this context, RD was used in this work to detect outliers in a geological study of samples collected at the confluence of the Negro and Solimoes rivers. The purpose of this study was to examine the contributions of the sediments deposited by the Solimoes and Negro rivers to the filling of the tectonic depressions at Parana do Ariau. For that, 113 samples were analyzed by Instrumental Neutron Activation Analysis (INAA), determining the concentrations of As, Ba, Ce, Co, Cr, Cs, Eu, Fe, Hf, K, La, Lu, Na, Nd, Rb, Sb, Sc, Sm, U, Yb, Ta, Tb, Th and Zn. From the dataset it was possible to construct the tolerance ellipse corresponding to the robust Mahalanobis distance for each group of samples. Samples falling outside the tolerance ellipse were considered outliers. The results showed that the Robust Mahalanobis Distance was more appropriate for the identification of the outliers, since it is a more restrictive method. (author)
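
The robust-distance idea can be sketched with a crude trimmed estimator standing in for a full MCD (a real MCD involves a combinatorial subset search and a consistency correction, both omitted here); the data, trimming fraction, and chi-square cutoff below are illustrative assumptions:

```python
import numpy as np

def mahalanobis_sq(X, mean, cov):
    d = X - mean
    return np.einsum('ij,jk,ik->i', d, np.linalg.inv(cov), d)

def trimmed_estimates(X, keep_frac=0.75, n_iter=10):
    """Crude stand-in for MCD: re-estimate mean/covariance from the
    keep_frac fraction of points with the smallest distances."""
    mean, cov = X.mean(axis=0), np.cov(X.T)
    h = int(keep_frac * len(X))
    for _ in range(n_iter):
        idx = np.argsort(mahalanobis_sq(X, mean, cov))[:h]
        mean, cov = X[idx].mean(axis=0), np.cov(X[idx].T)
    return mean, cov

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, size=(100, 2))
X[:5] += 8.0                              # five discrepant samples
mean, cov = trimmed_estimates(X)
d2 = mahalanobis_sq(X, mean, cov)
flags = d2 > 13.8                         # approx. chi-square(2) 0.999 quantile
print(int(flags[:5].sum()))
```

Because the discrepant samples no longer inflate the covariance estimate, their distances stay large and they fall outside the tolerance ellipse, which is the point the abstract makes.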

  4. Outlier detection by robust Mahalanobis distance in geological data obtained by INAA to provenance studies

    International Nuclear Information System (INIS)

    Santos, Jose O. dos; Munita, Casimiro S.; Soares, Emilio A.A.

    2013-01-01

    The detection of outliers in geochemical studies is one of the main difficulties in the interpretation of a dataset, because outliers can disturb the statistical methods. The search for outliers in geochemical studies is usually based on the Mahalanobis distance (MD), since points in multivariate space that lie farther than some predetermined distance from the center of the data are considered outliers. However, the MD is very sensitive to the presence of discrepant samples. Many robust estimators of location and covariance have been introduced in the literature, such as the Minimum Covariance Determinant (MCD) estimator. Using MCD estimators to calculate the MD leads to the so-called Robust Mahalanobis Distance (RD). In this context, RD was used in this work to detect outliers in a geological study of samples collected at the confluence of the Negro and Solimoes rivers. The purpose of this study was to examine the contributions of the sediments deposited by the Solimoes and Negro rivers to the filling of the tectonic depressions at Parana do Ariau. For that, 113 samples were analyzed by Instrumental Neutron Activation Analysis (INAA), determining the concentrations of As, Ba, Ce, Co, Cr, Cs, Eu, Fe, Hf, K, La, Lu, Na, Nd, Rb, Sb, Sc, Sm, U, Yb, Ta, Tb, Th and Zn. From the dataset it was possible to construct the tolerance ellipse corresponding to the robust Mahalanobis distance for each group of samples. Samples falling outside the tolerance ellipse were considered outliers. The results showed that the Robust Mahalanobis Distance was more appropriate for the identification of the outliers, since it is a more restrictive method. (author)

  5. An MEF-Based Localization Algorithm against Outliers in Wireless Sensor Networks.

    Science.gov (United States)

    Wang, Dandan; Wan, Jiangwen; Wang, Meimei; Zhang, Qiang

    2016-07-07

    Precise localization has attracted considerable interest in Wireless Sensor Network (WSN) localization systems. Due to internal or external disturbances, the existence of outliers, including both distance outliers and anchor outliers, severely decreases the localization accuracy. In order to eliminate both kinds of outliers simultaneously, an outlier detection method is proposed based on the maximum entropy principle and fuzzy set theory. Since not all outliers can be detected in the detection process, the Maximum Entropy Function (MEF) method is utilized to tolerate the remaining errors and calculate the optimal estimated locations of the unknown nodes. Simulation results demonstrate that the proposed localization method remains stable as the outliers vary. Moreover, the localization accuracy is greatly improved by properly rejecting outliers.

  6. Detection of Outliers and Imputing of Missing Values for Water Quality UV-VIS Absorbance Time Series

    Directory of Open Access Journals (Sweden)

    Leonardo Plazas-Nossa

    2017-01-01

    Full Text Available Context: The UV-Vis absorbance time series collected by online optical sensors for water quality monitoring may contain outliers and/or missing values. Data pre-processing is therefore a necessary prerequisite to processing the monitoring data. Thus, the aim of this study is to propose a method that detects and removes outliers and fills gaps in time series. Method: Outliers are detected using a Winsorising procedure, and the Discrete Fourier Transform (DFT) and the Inverse Fast Fourier Transform (IFFT) are applied to complete the time series. Together, these tools were used to analyse a case study comprising three sites in Colombia analysed via UV-Vis (ultraviolet and visible) spectra: (i) Bogotá D.C. Salitre-WWTP (Waste Water Treatment Plant), influent; (ii) Bogotá D.C. Gibraltar Pumping Station (GPS); and (iii) Itagüí, San Fernando-WWTP, influent (Medellín metropolitan area). Results: Outlier detection with the proposed method obtained promising results when window parameter values are small and self-similar, even though the three time series exhibited different sizes and behaviours. The DFT made it possible to process gaps of different lengths. To assess the validity of the proposed method, continuous subsets (sections of the absorbance time series without outliers or missing values) were removed from the original time series, yielding an average 12% error rate across the three test time series. Conclusions: The application of the DFT and the IFFT, using the 10% most important harmonics of the useful values, can be valuable for later use in different applications, specifically for time series of water quality and quantity in urban sewer systems. One potential application would be the analysis of dry weather in relation to rain events, achieved by detecting values that correspond to unusual behaviour in a time series. Additionally, the results hint at the potential of the method for correcting other hydrologic time series.
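
A minimal sketch of the harmonic gap-filling step (keeping the strongest 10% of harmonics and iterating on the missing positions) might look as follows; the toy series and the alternating-projection iteration scheme are assumptions, not the authors' implementation:

```python
import numpy as np

def fill_gaps_fft(series, keep_frac=0.10, n_iter=50):
    """Fill NaN gaps by iteratively reconstructing the series from its
    keep_frac strongest Fourier harmonics."""
    x = np.array(series, dtype=float)
    miss = np.isnan(x)
    x[miss] = np.nanmean(series)          # crude initial fill
    k = max(1, int(keep_frac * len(x)))
    for _ in range(n_iter):
        spec = np.fft.fft(x)
        filt = np.zeros_like(spec)
        top = np.argsort(np.abs(spec))[-k:]
        filt[top] = spec[top]             # keep only the strongest harmonics
        x[miss] = np.fft.ifft(filt).real[miss]
    return x

t = np.arange(200)
truth = 10.0 + 3.0 * np.sin(2 * np.pi * t / 50)
obs = truth.copy()
obs[60:70] = np.nan                       # a 10-sample gap
filled = fill_gaps_fft(obs)
err = float(np.abs(filled[60:70] - truth[60:70]).mean())
print(round(err, 3))
```

Each pass projects the series onto its dominant harmonics and then restores the observed samples, so the gap converges toward the periodic structure of the signal.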

  7. Open-Source Radiation Exposure Extraction Engine (RE3) with Patient-Specific Outlier Detection.

    Science.gov (United States)

    Weisenthal, Samuel J; Folio, Les; Kovacs, William; Seff, Ari; Derderian, Vana; Summers, Ronald M; Yao, Jianhua

    2016-08-01

    We present an open-source, picture archiving and communication system (PACS)-integrated radiation exposure extraction engine (RE3) that provides study-, series-, and slice-specific data for automated monitoring of computed tomography (CT) radiation exposure. RE3 was built using open-source components and seamlessly integrates with the PACS. RE3 calculations of dose length product (DLP) from the Digital Imaging and Communications in Medicine (DICOM) headers showed high agreement (R² = 0.99) with the vendor dose pages. For study-specific outlier detection, RE3 constructs robust, automatically updating multivariable regression models to predict DLP in the context of patient gender and age, scan length, water-equivalent diameter (Dw), and scanned body volume (SBV). As proof of concept, the model was trained on 811 CT chest, abdomen + pelvis (CAP) exams and 29 outliers were detected. The continuous variables used in the outlier detection model were scan length (R² = 0.45), Dw (R² = 0.70), SBV (R² = 0.80), and age (R² = 0.01). The categorical variables were gender (male average 1182.7 ± 26.3 and female 1047.1 ± 26.9 mGy cm) and pediatric status (pediatric average 710.7 ± 73.6 mGy cm and adult 1134.5 ± 19.3 mGy cm).
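
The study-specific outlier detection step can be illustrated with an ordinary least-squares stand-in for RE3's multivariable regression; the covariates, coefficients, and 3-SD residual cutoff below are hypothetical, not RE3's actual model:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
scan_len = rng.uniform(30, 60, n)   # hypothetical scan lengths (cm)
dw = rng.uniform(20, 40, n)         # hypothetical water-equivalent diameters
dlp = 5.0 * scan_len + 20.0 * dw + rng.normal(0, 40, n)
dlp[:3] *= 3.0                      # three simulated protocol-violation exams

# Fit DLP ~ scan length + Dw by least squares, flag large residuals
X = np.column_stack([np.ones(n), scan_len, dw])
beta, *_ = np.linalg.lstsq(X, dlp, rcond=None)
resid = dlp - X @ beta
z = (resid - resid.mean()) / resid.std(ddof=1)
outliers = np.abs(z) > 3.0
print(bool(outliers[:3].all()))
```

The point is that an exam flagged here has a DLP far from what its own scan geometry predicts, rather than simply a high absolute dose.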

  8. Supervised Outlier Detection in Large-Scale Mvs Point Clouds for 3d City Modeling Applications

    Science.gov (United States)

    Stucker, C.; Richard, A.; Wegner, J. D.; Schindler, K.

    2018-05-01

    We propose to use a discriminative classifier for outlier detection in large-scale point clouds of cities generated via multi-view stereo (MVS) from densely acquired images. What makes outlier removal hard are the varying distributions of inliers and outliers across a scene. Heuristic outlier removal using a specific feature that encodes point distribution often delivers unsatisfying results: although most outliers can be identified correctly (high recall), many inliers are erroneously removed (low precision), too. This aggravates 3D object reconstruction due to missing data. We thus propose to discriminatively learn class-specific distributions directly from the data to achieve high precision. We apply a standard Random Forest classifier that infers a binary label (inlier or outlier) for each 3D point in the raw, unfiltered point cloud and test two approaches for training. In the first, non-semantic approach, features are extracted without considering the semantic interpretation of the 3D points. The trained model approximates the average distribution of inliers and outliers across all semantic classes. Second, semantic interpretation is incorporated into the learning process, i.e. we train separate inlier/outlier classifiers per semantic class (building facades, roof, ground, vegetation, fields, and water). The performance of the learned filtering is evaluated on several large SfM point clouds of cities. Our results confirm the underlying assumption that discriminatively learning inlier/outlier distributions does improve precision over global heuristics, by up to ≈ 12 percentage points. Moreover, semantically informed filtering that models class-specific distributions further improves precision by up to ≈ 10 percentage points, being able to remove very isolated building, roof, and water points while preserving inliers on building facades and vegetation.

  9. Baseline Estimation and Outlier Identification for Halocarbons

    Science.gov (United States)

    Wang, D.; Schuck, T.; Engel, A.; Gallman, F.

    2017-12-01

    The aim of this paper is to build a baseline model for halocarbons and to statistically identify outliers under specific conditions. Time series of regional CFC-11 and chloromethane measurements taken over the last 4 years at two locations are discussed: a monitoring station northwest of Frankfurt am Main (Germany) and the Mace Head station (Ireland). In addition to analyzing the time series of CFC-11 and chloromethane, a statistical approach to outlier identification is introduced in order to make a better estimation of the baseline. A second-order polynomial plus harmonics is fitted to the CFC-11 and chloromethane mixing ratio data. Measurements with a large distance to the fitted curve are regarded as outliers and flagged. The routine is applied iteratively, excluding the flagged measurements, until no additional outliers are found. Both the model fitting and the proposed outlier identification method are realized with the help of a programming language, Python. During the period, CFC-11 shows a gradual downward trend, and there is a slight upward trend in the mixing ratios of chloromethane. The concentration of chloromethane also has a strong seasonal variation, mostly due to the seasonal cycle of OH. The use of this statistical method has a considerable effect on the results: it efficiently identifies a series of outliers according to the standard deviation requirements. After removing the outliers, the fitted curves and trend estimates are more reliable.
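
The fit-and-flag routine described above can be sketched as follows; the synthetic mixing-ratio series, the single annual harmonic, and the 3-SD cutoff are illustrative assumptions:

```python
import numpy as np

def fit_baseline(t, y, n_sd=3.0, max_iter=20):
    """Fit a 2nd-order polynomial plus an annual harmonic by least squares;
    iteratively flag points far from the curve and refit without them."""
    w = 2.0 * np.pi                        # t in years -> one cycle per year
    X = np.column_stack([np.ones_like(t), t, t**2,
                         np.sin(w * t), np.cos(w * t)])
    keep = np.ones(len(t), dtype=bool)
    for _ in range(max_iter):
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        resid = y - X @ beta
        new_keep = np.abs(resid) <= n_sd * resid[keep].std(ddof=1)
        if (new_keep == keep).all():
            break
        keep = new_keep
    return beta, ~keep

rng = np.random.default_rng(4)
t = np.linspace(0.0, 4.0, 400)             # four years of data
y = 230.0 - 2.0 * t + 0.1 * t**2 + 1.5 * np.sin(2 * np.pi * t)
y += rng.normal(0.0, 0.3, t.size)
y[[50, 200]] += 6.0                        # two simulated pollution events
beta, flagged = fit_baseline(t, y)
print(bool(flagged[50]), bool(flagged[200]))
```

Excluding flagged points and refitting prevents the pollution events from bending the baseline, which is what makes the trend estimates more reliable.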

  10. Exploring Outliers in Crowdsourced Ranking for QoE

    OpenAIRE

    Xu, Qianqian; Yan, Ming; Huang, Chendi; Xiong, Jiechao; Huang, Qingming; Yao, Yuan

    2017-01-01

    Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years. In this paper, we propose some simple and fast algorithms for outlier detection and robust QoE evaluation based on the nonconvex optimization principle. Several iterative procedures are designed with or without knowing the number of outliers in samples. Theoretical analysis is given to show that such procedures can reach stati...

  11. Detection of outliers by neural network on the gas centrifuge experimental data of isotopic separation process

    International Nuclear Information System (INIS)

    Andrade, Monica de Carvalho Vasconcelos

    2004-01-01

    This work presents and discusses a neural network technique aimed at the detection of outliers in a set of gas centrifuge isotope separation experimental data. In order to evaluate the application of this new technique, the detection result is compared to the result of statistical analysis combined with cluster analysis. This method for the detection of outliers presents considerable potential in the field of data analysis: it is at the same time easier and faster to use, and requires much less knowledge of the physics involved in the process. This work established a procedure for detecting experiments suspected to contain gross errors within a data set where the usual techniques for identifying such errors cannot be applied or would demand excessively long work. (author)

  12. Outlier Ranking via Subspace Analysis in Multiple Views of the Data

    DEFF Research Database (Denmark)

    Muller, Emmanuel; Assent, Ira; Iglesias, Patricia

    2012-01-01

    We propose Outrank, a novel outlier ranking concept. Outrank exploits subspace analysis to determine the degree of outlierness. It considers different subsets of the attributes as individual outlier properties. It compares clustered regions in arbitrary subspaces and derives an outlierness score for each object. Its principled integration of multiple views into an outlierness measure uncovers outliers that are not detectable in the full attribute space. Our experimental evaluation demonstrates that Outrank successfully determines a high quality outlier ranking, and outperforms state-of-the-art outlierness measures.

  13. 42 CFR 412.84 - Payment for extraordinarily high-cost cases (cost outliers).

    Science.gov (United States)

    2010-10-01

    ... obtains accurate data with which to calculate either an operating or capital cost-to-charge ratio (or both... outlier payments will be based on operating and capital cost-to-charge ratios calculated based on a ratio... outliers). 412.84 Section 412.84 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF...

  14. A simple transformation independent method for outlier definition.

    Science.gov (United States)

    Johansen, Martin Berg; Christensen, Peter Astrup

    2018-04-10

    Definition and elimination of outliers is a key element for medical laboratories establishing or verifying reference intervals (RIs), especially as the inclusion of just a few outlying observations may seriously affect the determination of the reference limits. Many methods have been developed for the definition of outliers. Several of these methods assume a normal distribution, and data often require transformation before outlier elimination. We have developed a non-parametric, transformation-independent outlier definition. The new method relies on drawing reproducible histograms, using defined bin sizes above and below the median. The method is compared to the method recommended by CLSI/IFCC, which uses the Box-Cox transformation (BCT) and Tukey's fences for outlier definition. The comparison is done on eight simulated distributions and an indirect clinical dataset. The comparison on simulated distributions shows that, without added outliers, the recommended method generally defines fewer outliers. However, when outliers are added on one side, the proposed method often produces better results; with outliers on both sides, the methods are equally good. Furthermore, it is found that the presence of outliers affects the BCT, and subsequently the limits determined by the currently recommended methods. This is especially seen in skewed distributions. The proposed outlier definition reproduced current RI limits on clinical data containing outliers. We find our simple transformation-independent outlier detection method to be as good as or better than the currently recommended methods.
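
The second step of the CLSI/IFCC-style reference method, Tukey's fences, is easy to sketch (the Box-Cox transformation step is omitted here; k = 3 and the simulated analyte values are assumptions):

```python
import numpy as np

def tukey_fences(x, k=3.0):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 3 gives
    Tukey's 'far out' fences."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(100.0, 10.0, 500), [210.0, 5.0]])
flags = tukey_fences(x)
print(int(flags.sum()))
```

Because quartiles are insensitive to a few extreme values, the fences themselves are not dragged outward by the outliers they are meant to catch; the abstract's concern is that the Box-Cox step preceding this does not share that robustness.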

  15. Mining Outlier Data in Mobile Internet-Based Large Real-Time Databases

    Directory of Open Access Journals (Sweden)

    Xin Liu

    2018-01-01

    Full Text Available Mining outlier data guarantees access security and data scheduling of parallel databases and maintains high-performance operation of real-time databases. Traditional mining methods generate abundant interference data and suffer from reduced accuracy, efficiency, and stability, causing severe deficiencies. This paper proposes a new method for mining outlier data, which is used to analyze real-time data features, obtain magnitude-spectrum models of outlier data, establish a decision-tree information-chain transmission model for outlier data in the mobile Internet, obtain the information flow of internal outlier data in the information chain of a large real-time database, and cluster the data. From the local characteristic time-scale parameters of the information flow, the phase features of the outlier data before filtering are obtained; a decision-tree outlier-classification feature-filtering algorithm is adopted to acquire the signals for analysis and their instantaneous amplitude, and to obtain the phase-frequency characteristics of the outlier data. Wavelet-threshold denoising is combined with signal denoising to analyze the data offset, to correct the resulting detection filter model, and to realize outlier data mining. Simulations suggest that the method detects the characteristic outlier-data feature response distribution; reduces response time, iteration frequency, and mining error rate; and improves mining adaptation and coverage, showing good mining outcomes.

  16. An Efficient Method for Detection of Outliers in Tracer Curves Derived from Dynamic Contrast-Enhanced Imaging

    Directory of Open Access Journals (Sweden)

    Linning Ye

    2018-01-01

    Full Text Available Presence of outliers in tracer concentration-time curves derived from dynamic contrast-enhanced imaging can adversely affect the analysis of the tracer curves by model fitting. A computationally efficient method for detecting outliers in tracer concentration-time curves is presented in this study. The proposed method is based on a piecewise linear model and implemented using a robust clustering algorithm. The method is noniterative and all the parameters are automatically estimated. To compare the proposed method with existing Gaussian-model-based and robust-regression-based methods, simulation studies were performed by simulating tracer concentration-time curves using the generalized Tofts model and kinetic parameters derived from different tissue types. Results show that the proposed method and the robust-regression-based method achieve better detection performance than the Gaussian-model-based method. Compared with the robust-regression-based method, the proposed method achieves similar detection performance with much faster computation.
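
A simplified residual-based sketch of the idea, using local straight-line fits and a robust MAD scale rather than the authors' clustering algorithm, could look like this (the window size, threshold, and toy tracer curve are assumptions):

```python
import numpy as np

def local_linear_outliers(t, y, window=11, thresh=4.0):
    """Flag samples whose residual from a local straight-line fit is large
    relative to a robust (MAD-based) scale estimate."""
    n, half = len(y), window // 2
    resid = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        p = np.polyfit(t[lo:hi], y[lo:hi], 1)
        resid[i] = y[i] - np.polyval(p, t[i])
    mad = np.median(np.abs(resid - np.median(resid)))
    return np.abs(resid) > thresh * 1.4826 * mad

t = np.linspace(0.0, 5.0, 100)
curve = 1.0 - np.exp(-t)                  # smooth tracer-like uptake curve
curve[30] += 0.8                          # a spike artifact
flags = local_linear_outliers(t, curve)
print(bool(flags[30]))
```

The piecewise-linear assumption works because tracer curves are smooth at the sampling scale, so a spike stands out against the local linear trend even where the curve is globally nonlinear.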

  17. Rapid eye movement sleep behavior disorder as an outlier detection problem

    DEFF Research Database (Denmark)

    Kempfner, Jacob; Sørensen, Gertrud Laura; Nikolic, M.

    2014-01-01

    OBJECTIVE: Idiopathic rapid eye movement (REM) sleep behavior disorder is a strong early marker of Parkinson's disease and is characterized by REM sleep without atonia and/or dream enactment. Because these measures are subject to individual interpretation, there is consequently a need for quantitative methods to establish objective criteria. This study proposes a semiautomatic algorithm for the early detection of Parkinson's disease. This is achieved by distinguishing between normal REM sleep and REM sleep without atonia by treating muscle activity as an outlier detection problem. METHODS: Sixteen healthy control subjects, 16 subjects with idiopathic REM sleep behavior disorder, and 16 subjects with periodic limb movement disorder were enrolled. Different combinations of five surface electromyographic channels, including the EOG, were tested. A muscle activity score was automatically

  18. A statistical test for outlier identification in data envelopment analysis

    Directory of Open Access Journals (Sweden)

    Morteza Khodabin

    2010-09-01

    Full Text Available In the use of peer group data to assess individual, typical, or best-practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared-sample method and its capability of detecting outliers in data envelopment analysis. In the presented method, each observation is deleted from the sample in turn and the resulting linear program is solved, leading to a distribution of efficiency estimates. Based on the obtained distribution, a pared test is designed to identify the potential outlier(s). We illustrate the method on a real data set. The method can be used as a first step, as exploratory data analysis, before using any frontier estimation.

  19. A New Methodology Based on Imbalanced Classification for Predicting Outliers in Electricity Demand Time Series

    Directory of Open Access Journals (Sweden)

    Francisco Javier Duque-Pintor

    2016-09-01

    Full Text Available The occurrence of outliers in real-world phenomena is quite common. If these anomalous data are not properly treated, unreliable models can be generated. Many approaches in the literature focus on a posteriori detection of outliers; here, however, a new methodology to a priori predict the occurrence of such data is proposed. Thus, the main goal of this work is to predict the occurrence of outliers in time series by using, for the first time, imbalanced classification techniques. In this sense, the problem of forecasting outlying data has been transformed into a binary classification problem, in which the positive class represents the occurrence of outliers. Given that the number of outliers is much lower than the number of common values, the resulting classification problem is imbalanced. To create the training and test sets, robust statistical methods have been used to detect outliers in both sets. Once the outliers have been detected, the instances of the dataset are labeled accordingly; namely, if any of the samples composing the next instance is detected as an outlier, the label is set to one. As a case study, the methodology has been tested on electricity demand time series from the Spanish electricity market, in which most of the outliers were properly forecast.

  20. Evaluating Outlier Identification Tests: Mahalanobis "D" Squared and Comrey "Dk."

    Science.gov (United States)

    Rasmussen, Jeffrey Lee

    1988-01-01

    A Monte Carlo simulation was used to compare the Mahalanobis "D" Squared and the Comrey "Dk" methods of detecting outliers in data sets. Under the conditions investigated, the "D" Squared technique was preferable as an outlier removal statistic. (SLD)

  1. Outlier Detection in Urban Air Quality Sensor Networks

    NARCIS (Netherlands)

    van Zoest, V.M.; Stein, A.; Hoek, Gerard

    2018-01-01

    Low-cost urban air quality sensor networks are increasingly used to study the spatio-temporal variability in air pollutant concentrations. Recently installed low-cost urban sensors, however, are more prone to result in erroneous data than conventional monitors, e.g., leading to outliers. Commonly

  2. ARIMA Modeling and Outlier Detection of Rainfall Data as an Evaluation of a Millimeter-Wave Radio System

    Directory of Open Access Journals (Sweden)

    Achmad Mauludiyanto

    2009-01-01

    Full Text Available The purpose of this paper is to present the results of Arima modeling and outlier detection for rainfall data in Surabaya. The paper explains the steps in building rainfall models, in particular the Box-Jenkins procedure for Arima modeling and outlier detection. The early stage of Arima modeling is the identification of the stationarity of the data, in both mean and variance. Stationarity in variance can be evaluated with the Box-Cox transformation, while stationarity in mean can be assessed with data plots and the form of the ACF. Identification of the ACF and PACF of the stationary data is used to determine the candidate orders of the Arima model. The next stage is to estimate the parameters and perform diagnostic checks to assess the suitability of the model. The diagnostic check evaluates whether the residuals of the model are white noise and normally distributed. The Ljung-Box test can be used to validate the white-noise condition, while the Kolmogorov-Smirnov test evaluates normality. The residual tests showed that the residuals of the Arima model were not white noise, indicating the existence of outliers in the data. Thus, the next step taken was outlier detection, to eliminate outlier effects and increase the prediction accuracy of the Arima model. The Arima modeling and outlier detection were implemented using the MINITAB package and MATLAB. The research shows that Arima modeling with outlier detection can reduce the prediction error as measured by the Mean Square Error (MSE) criterion. Quantitatively, the decline in the MSE value obtained by incorporating outlier detection is 23.7%, with an average decline of 6.5%.
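
The Ljung-Box statistic used in the white-noise diagnostic can be computed directly from the residual autocorrelations; the lag count, simulated residuals, and the chi-square(10) 5% critical value 18.31 are the only inputs assumed here:

```python
import numpy as np

def ljung_box_q(resid, n_lags=10):
    """Ljung-Box statistic Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    x = resid - resid.mean()
    n = len(x)
    denom = float(np.dot(x, x))
    rho = np.array([np.dot(x[:-k], x[k:]) / denom
                    for k in range(1, n_lags + 1)])
    lags = np.arange(1, n_lags + 1)
    return n * (n + 2) * float(np.sum(rho**2 / (n - lags)))

rng = np.random.default_rng(6)
white = rng.normal(0.0, 1.0, 500)          # white-noise residuals
ar = np.empty(500)                         # autocorrelated residuals
ar[0] = white[0]
for i in range(1, 500):
    ar[i] = 0.8 * ar[i - 1] + white[i]
crit = 18.31                               # chi-square(10) upper 5% point
print(round(ljung_box_q(white), 1), ljung_box_q(ar) > crit)
```

A Q value above the critical point rejects the white-noise hypothesis, which in the abstract's workflow is the signal that outliers (or model misspecification) remain in the residuals.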

  3. ZODET: software for the identification, analysis and visualisation of outlier genes in microarray expression data.

    Directory of Open Access Journals (Sweden)

    Daniel L Roden

    Full Text Available Complex human diseases can show significant heterogeneity between patients with the same phenotypic disorder. An outlier detection strategy was developed to identify variants at the level of gene transcription that are of potential biological and phenotypic importance. Here we describe a graphical software package, z-score outlier detection (ZODET), that enables identification and visualisation of gross abnormalities in gene expression (outliers) in individuals, using whole genome microarray data. The mean and standard deviation of expression in a healthy control cohort are used to detect both over- and under-expressed probes in individual test subjects. We compared the potential of ZODET to detect outlier genes in gene expression datasets with a previously described statistical method, gene tissue index (GTI), using a simulated expression dataset and a publicly available monocyte-derived macrophage microarray dataset. Taken together, these results support ZODET as a novel approach to identify outlier genes of potential pathogenic relevance in complex human diseases. The algorithm is implemented using R packages and Java. The software is freely available from http://www.ucl.ac.uk/medicine/molecular-medicine/publications/microarray-outlier-analysis.
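
The core z-score screen that ZODET's name describes can be sketched as below; the cohort size, probe count, and |z| > 3 cutoff are illustrative assumptions, not ZODET's actual defaults (ZODET itself is R/Java):

```python
import numpy as np

def zscore_outliers(controls, subject, z_thresh=3.0):
    """Per-probe z-scores of one test subject against a control cohort;
    returns indices of probes exceeding the threshold."""
    mu = controls.mean(axis=0)
    sd = controls.std(axis=0, ddof=1)
    z = (subject - mu) / sd
    return np.flatnonzero(np.abs(z) > z_thresh)

rng = np.random.default_rng(7)
controls = rng.normal(8.0, 0.5, size=(40, 1000))   # 40 controls x 1000 probes
patient = rng.normal(8.0, 0.5, 1000)
patient[42] = 13.0                                 # grossly over-expressed probe
hits = zscore_outliers(controls, patient)
print(42 in hits)
```

With many probes tested per subject, a handful of chance hits at |z| > 3 is expected, so in practice the threshold or a multiple-testing correction would be tuned accordingly.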

  4. A Note on optimal estimation in the presence of outliers

    Directory of Open Access Journals (Sweden)

    John N. Haddad

    2017-06-01

    Full Text Available Haddad, J. 2017. A note on optimal estimation in the presence of outliers. Lebanese Science Journal, 18(1): 136-141. The basic estimation problem of the mean and standard deviation of a random normal process in the presence of an outlying observation is considered. The value of the outlier is taken as a constraint imposed on the maximization of the log likelihood. It is shown that an optimal solution of the maximization problem exists, and expressions for the estimates are given. Applications to estimation in the presence of outliers and to outlier detection are discussed and illustrated through a simulation study and an analysis of trade data.

  5. SU-F-T-97: Outlier Identification in Radiation Therapy Knowledge Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Sheng, Y [Duke University, Durham, NC (United States); Ge, Y [University of North Carolina at Charlotte, Charlotte, NC (United States); Yuan, L; Yin, F; Wu, Q [Duke University Medical Center, Durham, NC (United States); Li, T [Thomas Jefferson University, Philadelphia, PA (United States)

    2016-06-15

    Purpose: To investigate the impact of outliers on knowledge modeling in radiation therapy, and to develop a systematic workflow for identifying and analyzing geometric and dosimetric outliers using pelvic cases. Methods: Four groups (G1-G4) of pelvic plans were included: G1 (37 prostate cases), G2 (37 prostate plus lymph node cases), and G3 (37 prostate bed cases) are all clinical IMRT cases. G4 are 10 plans outside G1 re-planned with dynamic-arc to simulate dosimetric outliers. The workflow involves 2 steps: 1. identify geometric outliers, assess impact and clean up; 2. identify dosimetric outliers, assess impact and clean up. 1. A baseline model was trained with all G1 cases. G2/G3 cases were then individually added to the baseline model as geometric outliers. The impact on the model was assessed by comparing the leverage statistic of inliers (G1) and outliers (G2/G3). Receiver-operating-characteristic (ROC) analysis was performed to determine the optimal threshold. 2. A separate baseline model was trained with 32 G1 cases. Each G4 case (dosimetric outlier) was then progressively added to perturb this model. DVH predictions were performed using these perturbed models for the remaining 5 G1 cases. Normal tissue complication probability (NTCP) calculated from the predicted DVHs was used to evaluate the dosimetric outliers' impact. Results: The leverage of inliers and outliers was significantly different. The Area-Under-Curve (AUC) for differentiating G2 from G1 was 0.94 (threshold: 0.22) for bladder, and 0.80 (threshold: 0.10) for rectum. For differentiating G3 from G1, the AUC (threshold) was 0.68 (0.09) for bladder and 0.76 (0.08) for rectum. A significant increase in NTCP started from models with 4 dosimetric outliers for bladder (p<0.05), and with only 1 dosimetric outlier for rectum (p<0.05). Conclusion: We established a systematic workflow for identifying and analyzing geometric and dosimetric outliers, and investigated statistical metrics for detecting them. Results validated the

  6. A Note on the Vogelsang Test for Additive Outliers

    DEFF Research Database (Denmark)

    Haldrup, Niels; Sansó, Andreu

    The role of additive outliers in integrated time series has attracted some attention recently and research shows that outlier detection should be an integral part of unit root testing procedures. Recently, Vogelsang (1999) suggested an iterative procedure for the detection of multiple additive outliers...... in integrated time series. However, the procedure appears to suffer from serious size distortions towards the finding of too many outliers as has been shown by Perron and Rodriguez (2003). In this note we prove the inconsistency of the test in each step of the iterative procedure and hence alternative routes need......

  7. On damage detection in wind turbine gearboxes using outlier analysis

    Science.gov (United States)

    Antoniadou, Ifigeneia; Manson, Graeme; Dervilis, Nikolaos; Staszewski, Wieslaw J.; Worden, Keith

    2012-04-01

    The proportion of worldwide installed wind power in power systems increases over the years as a result of the steadily growing interest in renewable energy sources. Still, the advantages offered by the use of wind power are overshadowed by the high operational and maintenance costs, resulting in the low competitiveness of wind power in the energy market. In order to reduce the costs of corrective maintenance, the application of condition monitoring to gearboxes becomes highly important, since gearboxes are among the wind turbine components with the most frequent failure observations. While condition monitoring of gearboxes in general is common practice, with various methods having been developed over the last few decades, wind turbine gearbox condition monitoring faces a major challenge: the detection of faults under the time-varying load conditions prevailing in wind turbine systems. Classical time and frequency domain methods fail to detect faults under variable load conditions, due to the temporary effect that these faults have on vibration signals. This paper uses the statistical discipline of outlier analysis for the damage detection of gearbox tooth faults. A simplified two-degree-of-freedom gearbox model considering nonlinear backlash, time-periodic mesh stiffness and static transmission error, simulates the vibration signals to be analysed. Local stiffness reduction is used for the simulation of tooth faults and statistical processes determine the existence of intermittencies. The lowest level of fault detection, the threshold value, is considered and the Mahalanobis squared-distance is calculated for the novelty detection problem.
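
    The lowest-level novelty check described here, a threshold on the Mahalanobis squared-distance from a healthy-condition feature set, can be sketched as below. The three-dimensional feature vectors, the simulated fault signature, and the empirical 99th-percentile threshold are illustrative assumptions, not the paper's gearbox model.

```python
import numpy as np

def mahalanobis_sq(X_train, x):
    """Squared Mahalanobis distance of a feature vector x from training data."""
    mu = X_train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))
    d = x - mu
    return float(d @ cov_inv @ d)

rng = np.random.default_rng(1)
healthy = rng.normal(0.0, 1.0, (500, 3))   # features from healthy vibration data
d_train = np.array([mahalanobis_sq(healthy, x) for x in healthy])
threshold = np.percentile(d_train, 99)     # empirical novelty threshold
faulty = np.array([5.0, 5.0, 5.0])         # simulated tooth-fault signature
is_novel = mahalanobis_sq(healthy, faulty) > threshold
```

    Any new signal whose distance exceeds the threshold learned from healthy data is declared novel, which is exactly the fault-detection decision the abstract refers to.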

  8. Modeling of activation data in the BrainMapTM database: Detection of outliers

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup; Hansen, Lars Kai

    2002-01-01

    models is identification of novelty, i.e., low probability database events. We rank the novelty of the outliers and investigate the cause for 21 of the most novel, finding several outliers that are entry and transcription errors or infrequent or non-conforming terminology. We briefly discuss the use...

  9. A generalized Grubbs-Beck test statistic for detecting multiple potentially influential low outliers in flood series

    Science.gov (United States)

    Cohn, T.A.; England, J.F.; Berenbrock, C.E.; Mason, R.R.; Stedinger, J.R.; Lamontagne, J.R.

    2013-01-01

    The Grubbs-Beck test is recommended by the federal guidelines for detection of low outliers in flood flow frequency computation in the United States. This paper presents a generalization of the Grubbs-Beck test for normal data (similar to the Rosner (1983) test; see also Spencer and McCuen (1996)) that can provide a consistent standard for identifying multiple potentially influential low flows. In cases where low outliers have been identified, they can be represented as “less-than” values, and a frequency distribution can be developed using censored-data statistical techniques, such as the Expected Moments Algorithm. This approach can improve the fit of the right-hand tail of a frequency distribution and provide protection from lack-of-fit due to unimportant but potentially influential low flows (PILFs) in a flood series, thus making the flood frequency analysis procedure more robust.
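
    A simplified sweep in the spirit of a multiple-low-outlier test can be sketched as below. Note the actual generalized Grubbs-Beck test uses tabulated significance levels; the fixed k = 2.5 studentized cutoff here is an illustrative stand-in, and the peak-flow values are invented.

```python
import math

def low_outlier_screen(flows, k=2.5):
    """Flag flows whose log values lie more than k standard deviations below
    the mean of the remaining (larger) observations, sweeping upward from the
    smallest value. k is an illustrative stand-in for tabulated K_N values."""
    logs = sorted(math.log10(q) for q in flows)
    flagged = 0
    for i in range(len(logs) - 3):            # keep at least 3 values unflagged
        rest = logs[i + 1:]
        mean = sum(rest) / len(rest)
        sd = math.sqrt(sum((x - mean) ** 2 for x in rest) / (len(rest) - 1))
        if logs[i] < mean - k * sd:
            flagged = i + 1                   # everything up to index i is a PILF
        else:
            break
    return [10 ** x for x in logs[:flagged]]

# Annual peak flows with one potentially influential low flow (PILF)
peaks = [2.0, 480, 510, 530, 560, 600, 640, 700, 760, 820]
pilfs = low_outlier_screen(peaks)             # flags only the 2.0 value
```

    Flagged flows would then be treated as "less-than" (censored) values when fitting the frequency distribution, as the abstract describes.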

  10. Detecting outliers and learning complex structures with large spectroscopic surveys - a case study with APOGEE stars

    Science.gov (United States)

    Reis, Itamar; Poznanski, Dovi; Baron, Dalya; Zasowski, Gail; Shahaf, Sahar

    2018-05-01

    In this work, we apply and expand on a recently introduced outlier detection algorithm that is based on an unsupervised random forest. We use the algorithm to calculate a similarity measure for stellar spectra from the Apache Point Observatory Galactic Evolution Experiment (APOGEE). We show that the similarity measure traces non-trivial physical properties and contains information about complex structures in the data. We use it for visualization and clustering of the data set, and discuss its ability to find groups of highly similar objects, including spectroscopic twins. Using the similarity matrix to search the data set for objects allows us to find objects that are impossible to find using their best-fitting model parameters. This includes extreme objects for which the models fail, and rare objects that are outside the scope of the model. We use the similarity measure to detect outliers in the data set, and find a number of previously unknown Be-type stars, spectroscopic binaries, carbon rich stars, young stars, and a few that we cannot interpret. Our work further demonstrates the potential for scientific discovery when combining machine learning methods with modern survey data.
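
    The unsupervised-random-forest similarity at the core of this approach can be sketched with scikit-learn as below. The two-feature toy data stand in for spectra, shuffled-column synthetic data destroy the real correlations, and the fraction of trees in which two objects share a leaf stands in for the paper's similarity measure; all of this is a hedged illustration, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
# Toy "spectra": two strongly correlated features, plus one object (index 0)
# that breaks the correlation and should therefore be an outlier
real = rng.normal(0, 1, (200, 2))
real[:, 1] = real[:, 0] + 0.1 * rng.normal(0, 1, 200)
real[0] = [3.0, -3.0]

# Synthetic data: shuffle each column independently, destroying correlations
synth = real.copy()
for j in range(synth.shape[1]):
    rng.shuffle(synth[:, j])

X = np.vstack([real, synth])
y = np.r_[np.ones(len(real)), np.zeros(len(synth))]
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Similarity: fraction of trees in which two real objects land in the same leaf
leaves = forest.apply(real)                  # (n_objects, n_trees) leaf indices
sim = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
np.fill_diagonal(sim, 0.0)
outlier_score = 1.0 - sim.mean(axis=1)       # weakly connected = outlier
```

    Objects with low average similarity to everything else are the outlier candidates; the same similarity matrix also supports the clustering and twin-search uses mentioned in the abstract.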

  11. Calculation of climatic reference values and its use for automatic outlier detection in meteorological datasets

    Directory of Open Access Journals (Sweden)

    B. Téllez

    2008-04-01

    Full Text Available The climatic reference values for monthly and annual average air temperature and total precipitation in Catalonia – northeast of Spain – are calculated using a combination of statistical methods and geostatistical techniques of interpolation. In order to estimate the uncertainty of the method, the initial dataset is split into two parts that are, respectively, used for estimation and validation. The resulting maps are then used in the automatic outlier detection in meteorological datasets.

  12. Slowing ash mortality: a potential strategy to slam emerald ash borer in outlier sites

    Science.gov (United States)

    Deborah G. McCullough; Nathan W. Siegert; John Bedford

    2009-01-01

    Several isolated outlier populations of emerald ash borer (Agrilus planipennis Fairmaire) were discovered in 2008 and additional outliers will likely be found as detection surveys and public outreach activities...

  13. Enhanced Isotopic Ratio Outlier Analysis (IROA) Peak Detection and Identification with Ultra-High Resolution GC-Orbitrap/MS: Potential Application for Investigation of Model Organism Metabolomes

    Directory of Open Access Journals (Sweden)

    Yunping Qiu

    2018-01-01

    Full Text Available Identifying non-annotated peaks may have a significant impact on the understanding of biological systems. In silico methodologies have focused on ESI LC/MS/MS for identifying non-annotated MS peaks. In this study, we employed in silico methodology to develop an Isotopic Ratio Outlier Analysis (IROA) workflow using enhanced mass spectrometric data acquired with the ultra-high resolution GC-Orbitrap/MS to determine the identity of non-annotated metabolites. The higher resolution of the GC-Orbitrap/MS, together with its wide dynamic range, resulted in more IROA peak pairs detected, and increased reliability of chemical formulae generation (CFG). IROA uses two different 13C-enriched carbon sources (randomized 95% 12C and 95% 13C) to produce mirror image isotopologue pairs, whose mass difference reveals the carbon chain length (n), which aids in the identification of endogenous metabolites. Accurate m/z, n, and derivatization information are obtained from our GC/MS workflow for unknown metabolite identification, and aids in silico methodologies for identifying isomeric and non-annotated metabolites. We were able to mine more mass spectral information using the same Saccharomyces cerevisiae growth protocol (Qiu et al. Anal. Chem 2016) with the ultra-high resolution GC-Orbitrap/MS, using 10% ammonia in methane as the CI reagent gas. We identified 244 IROA peak pairs, which significantly increased IROA detection capability compared with our previous report (126 IROA peak pairs using a GC-TOF/MS machine). For 55 selected metabolites identified from matched IROA CI and EI spectra, using the GC-Orbitrap/MS vs. GC-TOF/MS, the average mass deviation for GC-Orbitrap/MS was 1.48 ppm, whereas it was 32.2 ppm for the GC-TOF/MS machine. In summary, the higher resolution and wider dynamic range of the GC-Orbitrap/MS enabled more accurate CFG, and the coupling of accurate mass GC/MS IROA methodology with in silico fragmentation has great

  14. Enhanced Isotopic Ratio Outlier Analysis (IROA) Peak Detection and Identification with Ultra-High Resolution GC-Orbitrap/MS: Potential Application for Investigation of Model Organism Metabolomes.

    Science.gov (United States)

    Qiu, Yunping; Moir, Robyn D; Willis, Ian M; Seethapathy, Suresh; Biniakewitz, Robert C; Kurland, Irwin J

    2018-01-18

    Identifying non-annotated peaks may have a significant impact on the understanding of biological systems. In silico methodologies have focused on ESI LC/MS/MS for identifying non-annotated MS peaks. In this study, we employed in silico methodology to develop an Isotopic Ratio Outlier Analysis (IROA) workflow using enhanced mass spectrometric data acquired with the ultra-high resolution GC-Orbitrap/MS to determine the identity of non-annotated metabolites. The higher resolution of the GC-Orbitrap/MS, together with its wide dynamic range, resulted in more IROA peak pairs detected, and increased reliability of chemical formulae generation (CFG). IROA uses two different 13C-enriched carbon sources (randomized 95% 12C and 95% 13C) to produce mirror image isotopologue pairs, whose mass difference reveals the carbon chain length (n), which aids in the identification of endogenous metabolites. Accurate m/z, n, and derivatization information are obtained from our GC/MS workflow for unknown metabolite identification, and aids in silico methodologies for identifying isomeric and non-annotated metabolites. We were able to mine more mass spectral information using the same Saccharomyces cerevisiae growth protocol (Qiu et al. Anal. Chem 2016) with the ultra-high resolution GC-Orbitrap/MS, using 10% ammonia in methane as the CI reagent gas. We identified 244 IROA peak pairs, which significantly increased IROA detection capability compared with our previous report (126 IROA peak pairs using a GC-TOF/MS machine). For 55 selected metabolites identified from matched IROA CI and EI spectra, using the GC-Orbitrap/MS vs. GC-TOF/MS, the average mass deviation for GC-Orbitrap/MS was 1.48 ppm, whereas it was 32.2 ppm for the GC-TOF/MS machine. In summary, the higher resolution and wider dynamic range of the GC-Orbitrap/MS enabled more accurate CFG, and the coupling of accurate mass GC/MS IROA methodology with in silico fragmentation has great potential in

  15. Outlier analysis

    CERN Document Server

    Aggarwal, Charu C

    2013-01-01

    With the increasing advances in hardware technology for data collection, and advances in software technology (databases) for data organization, computer scientists have increasingly participated in the latest advancements of the outlier analysis field. Computer scientists, specifically, approach this field based on their practical experiences in managing large amounts of data, and with far fewer assumptions: the data can be of any type, structured or unstructured, and may be extremely large. Outlier Analysis is a comprehensive exposition, as understood by data mining experts, statisticians and

  16. A Near-linear Time Approximation Algorithm for Angle-based Outlier Detection in High-dimensional Data

    DEFF Research Database (Denmark)

    Pham, Ninh Dang; Pagh, Rasmus

    2012-01-01

    Outlier mining in d-dimensional point sets is a fundamental and well studied data mining task due to its variety of applications. Most such applications arise in high-dimensional domains. A bottleneck of existing approaches is that implicit or explicit assessments on concepts of distance or nearest neighbor are deteriorated in high-dimensional data. Following up on the work of Kriegel et al. (KDD '08), we investigate the use of angle-based outlier factor in mining high-dimensional outliers. While their algorithm runs in cubic time (with a quadratic time heuristic), we propose a novel random projection-based technique that is able to estimate the angle-based outlier factor for all data points in time near-linear in the size of the data. Also, our approach is suitable to be performed in parallel environment to achieve a parallel speedup. We introduce a theoretical analysis of the quality......

  17. Sparsity-weighted outlier FLOODing (OFLOOD) method: Efficient rare event sampling method using sparsity of distribution.

    Science.gov (United States)

    Harada, Ryuhei; Nakamura, Tomotake; Shigeta, Yasuteru

    2016-03-30

    As an extension of the Outlier FLOODing (OFLOOD) method [Harada et al., J. Comput. Chem. 2015, 36, 763], the sparsity of the outliers defined by a hierarchical clustering algorithm, FlexDice, was considered to achieve an efficient conformational search as sparsity-weighted "OFLOOD." In OFLOOD, FlexDice detects areas of sparse distribution as outliers. The outliers are regarded as candidates that have high potential to promote conformational transitions and are employed as initial structures for conformational resampling by restarting molecular dynamics simulations. When detecting outliers, FlexDice defines a rank in the hierarchy for each outlier, which relates to sparsity in the distribution. In this study, we define a lower rank (first ranked), a medium rank (second ranked), and the highest rank (third ranked) outliers, respectively. For instance, the first-ranked outliers are located in a given conformational space away from the clusters (highly sparse distribution), whereas the third-ranked outliers are near the clusters (a moderately sparse distribution). To achieve the conformational search efficiently, resampling from the outliers with a given rank is performed. As demonstrations, this method was applied to several model systems: Alanine dipeptide, Met-enkephalin, Trp-cage, T4 lysozyme, and glutamine binding protein. In each demonstration, the present method successfully reproduced transitions among metastable states. In particular, the first-ranked OFLOOD highly accelerated the exploration of conformational space by expanding the edges. In contrast, the third-ranked OFLOOD reproduced local transitions among neighboring metastable states intensively. For quantitative evaluation of sampled snapshots, free energy calculations were performed with a combination of umbrella sampling, providing rigorous landscapes of the biomolecules. © 2015 Wiley Periodicals, Inc.

  18. Modeling Data Containing Outliers using ARIMA Additive Outlier (ARIMA-AO)

    Science.gov (United States)

    Saleh Ahmar, Ansari; Guritno, Suryo; Abdurakhman; Rahman, Abdul; Awi; Alimuddin; Minggi, Ilham; Arif Tiro, M.; Kasim Aidid, M.; Annas, Suwardi; Utami Sutiksno, Dian; Ahmar, Dewi S.; Ahmar, Kurniawan H.; Abqary Ahmar, A.; Zaki, Ahmad; Abdullah, Dahlan; Rahim, Robbi; Nurdiyanto, Heri; Hidayat, Rahmat; Napitupulu, Darmawan; Simarmata, Janner; Kurniasih, Nuning; Andretti Abdillah, Leon; Pranolo, Andri; Haviluddin; Albra, Wahyudin; Arifin, A. Nurani M.

    2018-01-01

    The aim of this study is to discuss the detection and correction of data containing additive outliers (AO) in the ARIMA (p, d, q) model. Detection and correction of the data use an iterative procedure popularized by Box, Jenkins, and Reinsel (1994). With this method we obtained an ARIMA model fit to the data containing AO; this model adds to the original ARIMA model the coefficients obtained from the iteration process using regression methods. For the simulated data, the initial model fit to the data containing AO was ARIMA (2,0,0) with MSE = 36.780; after detection and correction of the data, the iteration yielded an ARIMA (2,0,0) model with the regression coefficients Z_t = 0.106 + 0.204 Z_{t-1} + 0.401 Z_{t-2} - 329 X_1(t) + 115 X_2(t) + 35.9 X_3(t) and MSE = 19.365. This shows an improvement in the forecasting error rate.

  19. A tandem regression-outlier analysis of a ligand cellular system for key structural modifications around ligand binding.

    Science.gov (United States)

    Lin, Ying-Ting

    2013-04-30

    A tandem technique of hard equipment is often used for the chemical analysis of a single cell to first isolate and then detect the wanted identities. The first part is the separation of wanted chemicals from the bulk of a cell; the second part is the actual detection of the important identities. To identify the key structural modifications around ligand binding, the present study aims to develop a counterpart of tandem technique for cheminformatics. A statistical regression and its outliers act as a computational technique for separation. A PPARγ (peroxisome proliferator-activated receptor gamma) agonist cellular system was subjected to such an investigation. Results show that this tandem regression-outlier analysis, or the prioritization of the context equations tagged with features of the outliers, is an effective regression technique of cheminformatics to detect key structural modifications, as well as their tendency of impact to ligand binding. The key structural modifications around ligand binding are effectively extracted or characterized out of cellular reactions. This is because molecular binding is the paramount factor in such ligand cellular system and key structural modifications around ligand binding are expected to create outliers. Therefore, such outliers can be captured by this tandem regression-outlier analysis.

  20. Methods of Detecting Outliers in A Regression Analysis Model. | Ogu ...

    African Journals Online (AJOL)

    A boilers dataset with dependent variable Y (man-hours) and four independent variables X1 (boiler capacity), X2 (design pressure), X3 (boiler type), X4 (drum type) was used. The analysis of the boilers data revealed an unexpected group of outliers. The results from the findings showed that an observation can be outlying ...

  1. Outlier-resilient complexity analysis of heartbeat dynamics

    Science.gov (United States)

    Lo, Men-Tzung; Chang, Yi-Chung; Lin, Chen; Young, Hsu-Wen Vincent; Lin, Yen-Hung; Ho, Yi-Lwun; Peng, Chung-Kang; Hu, Kun

    2015-03-01

    Complexity in physiological outputs is believed to be a hallmark of healthy physiological control. How to accurately quantify the degree of complexity in physiological signals with outliers remains a major barrier for translating this novel concept of nonlinear dynamic theory to clinical practice. Here we propose a new approach to estimate the complexity in a signal by analyzing the irregularity of the sign time series of its coarse-grained time series at different time scales. Using surrogate data, we show that the method can reliably assess the complexity in noisy data while being highly resilient to outliers. We further apply this method to the analysis of human heartbeat recordings. Without removing any outliers due to ectopic beats, the method is able to detect a degradation of cardiac control in patients with congestive heart failure and an even greater degradation in critically ill patients whose survival relies on an extracorporeal membrane oxygenator (ECMO). Moreover, the derived complexity measures can predict the mortality of ECMO patients. These results indicate that the proposed method may serve as a promising tool for monitoring cardiac function of patients in clinical settings.
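
    The coarse-grain-then-sign idea can be sketched as follows. This is only an illustration of the pipeline: Shannon entropy of short sign "words" is used here in place of the paper's irregularity measure, and the three synthetic signals (noise, a periodic wave, and noise with large spikes mimicking ectopic beats) are invented.

```python
import numpy as np

def sign_irregularity(x, scale, word=4):
    """Shannon entropy (bits) of length-`word` binary words formed from the
    signs of increments of the coarse-grained series at the given scale."""
    n = len(x) // scale
    coarse = x[:n * scale].reshape(n, scale).mean(axis=1)   # coarse-graining
    signs = (np.diff(coarse) > 0).astype(int)               # sign time series
    words = np.array([signs[i:i + word] for i in range(len(signs) - word + 1)])
    _, counts = np.unique(words, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(4)
noisy = rng.normal(0, 1, 4000)               # irregular signal: high entropy
periodic = np.sin(np.arange(4000) / 5.0)     # regular signal: low entropy
spiky = noisy.copy()
spiky[::200] += 50.0                         # large ectopic-beat-like outliers

h_noise = sign_irregularity(noisy, scale=2)
h_periodic = sign_irregularity(periodic, scale=2)
h_spiky = sign_irregularity(spiky, scale=2)  # barely changed by the outliers
```

    Because only the sign of each increment is kept, even very large spikes move at most a few symbols, which is the outlier resilience the abstract claims.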

  2. Examination of pulsed eddy current for inspection of second layer aircraft wing lap-joint structures using outlier detection methods

    Energy Technology Data Exchange (ETDEWEB)

    Butt, D.M., E-mail: Dennis.Butt@forces.gc.ca [Royal Military College of Canada, Dept. of Chemistry and Chemical Engineering, Kingston, Ontario (Canada); Underhill, P.R.; Krause, T.W., E-mail: Thomas.Krause@rmc.ca [Royal Military College of Canada, Dept. of Physics, Kingston, Ontario (Canada)

    2016-09-15

    Ageing aircraft are susceptible to fatigue cracks at bolt hole locations in multi-layer aluminum wing lap-joints due to cyclic loading conditions experienced during typical aircraft operation. Current inspection techniques require removal of fasteners to permit inspection of the second layer from within the bolt hole. Inspection from the top layer without fastener removal is desirable in order to minimize aircraft downtime while reducing the risk of collateral damage. The ability to detect second layer cracks without fastener removal has been demonstrated using a pulsed eddy current (PEC) technique. The technique utilizes a breakdown of the measured signal response into its principal components, each of which is multiplied by a representative factor known as a score. The reduced data set of scores, which represent the measured signal, are examined for outliers using cluster analysis methods in order to detect the presence of defects. However, the cluster analysis methodology is limited by the fact that a number of representative signals, obtained from fasteners where defects are not present, are required in order to perform classification of the data. Alternatively, blind outlier detection can be achieved without having to obtain representative defect-free signals, by using a modified smallest half-volume (MSHV) approach. Results obtained using this approach suggest that self-calibrating blind detection of cyclic fatigue cracks in second layer wing structures in the presence of ferrous fasteners is possible without prior knowledge of the sample under test and without the use of costly calibration standards. (author)
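
    The scores-then-blind-outlier-detection pipeline can be sketched as below. The simulated decay signals and defect bump are invented, and scikit-learn's MinCovDet (a robust "densest half-sample" covariance estimator) is used here as an analogue of the modified smallest half-volume idea, not as the authors' MSHV algorithm.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 100)
# Simulated PEC responses: smooth decays plus noise; a few carry a defect bump
signals = np.array([np.exp(-t / rng.uniform(0.20, 0.25)) +
                    0.01 * rng.normal(size=t.size) for _ in range(60)])
defect = [3, 17, 44]
signals[defect] += 0.2 * np.exp(-((t - 0.5) ** 2) / 0.01)

scores = PCA(n_components=5).fit_transform(signals)  # principal-component scores
mcd = MinCovDet(random_state=0).fit(scores)          # robust half-sample covariance
d2 = mcd.mahalanobis(scores)                         # robust squared distances
flagged = set(np.argsort(d2)[-3:].tolist())          # largest distances = defects
```

    Because the covariance is estimated from the most concentrated half of the score cloud, no defect-free reference signals are needed, which is the "self-calibrating blind detection" property the abstract highlights.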

  3. Examination of pulsed eddy current for inspection of second layer aircraft wing lap-joint structures using outlier detection methods

    International Nuclear Information System (INIS)

    Butt, D.M.; Underhill, P.R.; Krause, T.W.

    2016-01-01

    Ageing aircraft are susceptible to fatigue cracks at bolt hole locations in multi-layer aluminum wing lap-joints due to cyclic loading conditions experienced during typical aircraft operation. Current inspection techniques require removal of fasteners to permit inspection of the second layer from within the bolt hole. Inspection from the top layer without fastener removal is desirable in order to minimize aircraft downtime while reducing the risk of collateral damage. The ability to detect second layer cracks without fastener removal has been demonstrated using a pulsed eddy current (PEC) technique. The technique utilizes a breakdown of the measured signal response into its principal components, each of which is multiplied by a representative factor known as a score. The reduced data set of scores, which represent the measured signal, are examined for outliers using cluster analysis methods in order to detect the presence of defects. However, the cluster analysis methodology is limited by the fact that a number of representative signals, obtained from fasteners where defects are not present, are required in order to perform classification of the data. Alternatively, blind outlier detection can be achieved without having to obtain representative defect-free signals, by using a modified smallest half-volume (MSHV) approach. Results obtained using this approach suggest that self-calibrating blind detection of cyclic fatigue cracks in second layer wing structures in the presence of ferrous fasteners is possible without prior knowledge of the sample under test and without the use of costly calibration standards. (author)

  4. An optimized outlier detection algorithm for jury-based grading of engineering design projects

    DEFF Research Database (Denmark)

    Thompson, Mary Kathryn; Espensen, Christina; Clemmensen, Line Katrine Harder

    2016-01-01

    This work characterizes and optimizes an outlier detection algorithm to identify potentially invalid scores produced by jury members while grading engineering design projects. The paper describes the original algorithm and the associated adjudication process in detail. The impact of the various...... (the base rule and the three additional conditions) play a role in the algorithm's performance and should be included in the algorithm. Because there is significant interaction between the base rule and the additional conditions, many acceptable combinations that balance the FPR and FNR can be found......, but no true optimum seems to exist. The performance of the best optimizations and the original algorithm are similar. Therefore, it should be possible to choose new coefficient values for jury populations in other cultures and contexts logically and empirically without a full optimization as long...

  5. A Global Photoionization Response to Prompt Emission and Outliers: Different Origin of Long Gamma-ray Bursts?

    Science.gov (United States)

    Wang, J.; Xin, L. P.; Qiu, Y. L.; Xu, D. W.; Wei, J. Y.

    2018-03-01

    By using the line ratio C IV λ1549/C II λ1335 as a tracer of the ionization ratio of the interstellar medium (ISM) illuminated by a long gamma-ray burst (LGRB), we identify a global photoionization response of the ionization ratio to the photon luminosity of the prompt emission, assessed by either L_iso/E_peak or L_iso/E_peak^2. The ionization ratio increases with both L_iso/E_peak and L_iso/E_peak^2 for a majority of the LGRBs in our sample, although there are a few outliers. The identified dependence of C IV/C II on L_iso/E_peak^2 suggests that the scatter of the widely accepted Amati relation is related to the ionization ratio in the ISM. The outliers tend to have relatively high C IV/C II values as well as relatively high C IV λ1549/Si IV λ1403 ratios, which suggests the existence of Wolf-Rayet stars in the environment of these LGRBs. We finally argue that the outliers and the LGRBs following the identified C IV/C II versus L_iso/E_peak (L_iso/E_peak^2) correlation might come from different progenitors with different local environments.

  6. Ranking Fragment Ions Based on Outlier Detection for Improved Label-Free Quantification in Data-Independent Acquisition LC-MS/MS

    Science.gov (United States)

    Bilbao, Aivett; Zhang, Ying; Varesio, Emmanuel; Luban, Jeremy; Strambio-De-Castillia, Caterina; Lisacek, Frédérique; Hopfgartner, Gérard

    2016-01-01

    Data-independent acquisition LC-MS/MS techniques complement supervised methods for peptide quantification. However, due to the wide precursor isolation windows, these techniques are prone to interference at the fragment ion level, which in turn is detrimental for accurate quantification. The “non-outlier fragment ion” (NOFI) ranking algorithm has been developed to assign low priority to fragment ions affected by interference. By using the optimal subset of high priority fragment ions these interfered fragment ions are effectively excluded from quantification. NOFI represents each fragment ion as a vector of four dimensions related to chromatographic and MS fragmentation attributes and applies multivariate outlier detection techniques. Benchmarking conducted on a well-defined quantitative dataset (i.e., the SWATH Gold Standard) indicates that NOFI on average is able to accurately quantify 11-25% more peptides than the commonly used Top-N library intensity ranking method. The sum of the area of the Top3-5 NOFIs produces similar coefficients of variation as compared to the library intensity method but with more accurate quantification results. On a biologically relevant human dendritic cell digest dataset, NOFI properly assigns low priority ranks to 85% of annotated interferences, resulting in sensitivity values between 0.92 and 0.80 against 0.76 for the Spectronaut interference detection algorithm. PMID:26412574

  7. Principal component analysis applied to Fourier transform infrared spectroscopy for the design of calibration sets for glycerol prediction models in wine and for the detection and classification of outlier samples.

    Science.gov (United States)

    Nieuwoudt, Helene H; Prior, Bernard A; Pretorius, Isak S; Manley, Marena; Bauer, Florian F

    2004-06-16

    Principal component analysis (PCA) was used to identify the main sources of variation in the Fourier transform infrared (FT-IR) spectra of 329 wines of various styles. The FT-IR spectra were gathered using a specialized WineScan instrument. The main sources of variation included the reducing sugar and alcohol content of the samples, as well as the stage of fermentation and the maturation period of the wines. The implications of the variation between the different wine styles for the design of calibration models with accurate predictive abilities were investigated using glycerol calibration in wine as a model system. PCA enabled the identification and interpretation of samples that were poorly predicted by the calibration models, as well as the detection of individual samples in the sample set that had atypical spectra (i.e., outlier samples). The Soft Independent Modeling of Class Analogy (SIMCA) approach was used to establish a model for the classification of the outlier samples. A glycerol calibration for wine was developed (reducing sugar content 8% v/v) with satisfactory predictive ability (SEP = 0.40 g/L). The RPD value (ratio of the standard deviation of the data to the standard error of prediction) was 5.6, indicating that the calibration is suitable for quantification purposes. A calibration for glycerol in special late harvest and noble late harvest wines (RS 31-147 g/L, alcohol > 11.6% v/v) with a prediction error SECV = 0.65 g/L, was also established. This study yielded an analytical strategy that combined the careful design of calibration sets with measures that facilitated the early detection and interpretation of poorly predicted samples and outlier samples in a sample set. The strategy provided a powerful means of quality control, which is necessary for the generation of accurate prediction data and therefore for the successful implementation of FT-IR in the routine analytical laboratory.
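
    The PCA-based screening of atypical samples can be sketched as follows. The toy spectra, the two-factor structure, and the Hotelling-T2-style score distance are illustrative assumptions for the demo, not the WineScan calibration setup or the SIMCA classifier used in the study.

```python
import numpy as np

rng = np.random.default_rng(6)
# Toy "spectra": 100 samples x 50 variables generated from two latent factors
loadings = rng.normal(0.0, 1.0, (2, 50))
latent = rng.normal(0.0, 1.0, (100, 2)) * np.array([3.0, 1.5])
spectra = latent @ loadings + 0.1 * rng.normal(0.0, 1.0, (100, 50))
spectra[7] += 15.0 * loadings[0]        # one sample with an extreme factor score

Xc = spectra - spectra.mean(axis=0)     # mean-center before PCA
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
T = U[:, :k] * S[:k]                    # PCA scores on the first two PCs
t2 = ((T / T.std(axis=0)) ** 2).sum(axis=1)   # Hotelling-T2-style distance
suspects = set(np.argsort(t2)[-3:].tolist())  # most atypical (outlier) samples
```

    Samples with extreme score distances would be inspected (and possibly excluded or modeled separately) before building the glycerol calibration, mirroring the outlier-screening role PCA plays in the abstract.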

  8. Analyzing contentious relationships and outlier genes in phylogenomics.

    Science.gov (United States)

    Walker, Joseph F; Brown, Joseph W; Smith, Stephen A

    2018-06-08

    Recent studies have demonstrated that conflict is common among gene trees in phylogenomic studies, and that less than one percent of genes may ultimately drive species tree inference in supermatrix analyses. Here, we examined two datasets where supermatrix and coalescent-based species trees conflict. We identified two highly influential "outlier" genes in each dataset. When removed from each dataset, the inferred supermatrix trees matched the topologies obtained from coalescent analyses. We also demonstrate that, while the outlier genes in the vertebrate dataset have been shown in a previous study to be the result of errors in orthology detection, the outlier genes from a plant dataset did not exhibit any obvious systematic error and therefore may be the result of some biological process yet to be determined. While topological comparisons among a small set of alternate topologies can be helpful in discovering outlier genes, they can be limited in several ways, such as assuming all genes share the same topology. Coalescent species tree methods relax this assumption but do not explicitly facilitate the examination of specific edges. Coalescent methods often also assume that conflict is the result of incomplete lineage sorting (ILS). Here we explored a framework that allows for quickly examining alternative edges and support for large phylogenomic datasets that does not assume a single topology for all genes. For both datasets, these analyses provided detailed results confirming the support for coalescent-based topologies. This framework suggests that we can improve our understanding of the underlying signal in phylogenomic datasets by asking more targeted edge-based questions.

  9. Outlier identification and visualization for Pb concentrations in urban soils and its implications for identification of potential contaminated land

    International Nuclear Information System (INIS)

    Zhang Chaosheng; Tang Ya; Luo Lin; Xu Weilin

    2009-01-01

    Outliers in urban soil geochemical databases may imply potential contaminated land. Different methodologies that can be easily implemented for the identification of global and spatial outliers were applied to Pb concentrations in urban soils of Galway City, Ireland. Due to the strongly skewed distribution of the data, a Box-Cox transformation was performed prior to further analyses. The graphic methods of histogram and box-and-whisker plot were effective in identifying global outliers at the original scale of the dataset. Spatial outliers could be identified by a local indicator of spatial association (local Moran's I), cross-validation of kriging, and geographically weighted regression. The spatial locations of outliers were visualised using a geographical information system. The different methods showed generally consistent results, but differences existed. It is suggested that outliers identified by statistical methods should be confirmed and justified using scientific knowledge before they are properly dealt with. - Outliers in urban geochemical databases can be detected to provide guidance for identification of potential contaminated land.
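
    The global-outlier step described in this record (a Box-Cox transformation followed by a box-and-whisker screen) can be sketched in a few lines of Python. This is a generic illustration on simulated skewed concentrations, not the authors' code; the Tukey factor k = 1.5 is the conventional choice:

```python
import numpy as np
from scipy import stats

def global_outliers(x, k=1.5):
    """Flag global outliers: Box-Cox transform (requires positive data),
    then Tukey fences at k * IQR beyond the quartiles."""
    xt, _ = stats.boxcox(np.asarray(x, dtype=float))
    q1, q3 = np.percentile(xt, [25, 75])
    iqr = q3 - q1
    return (xt < q1 - k * iqr) | (xt > q3 + k * iqr)

# Example: skewed Pb-like concentrations with one extreme value appended
rng = np.random.default_rng(0)
pb = np.concatenate([rng.lognormal(3.0, 0.5, 99), [5000.0]])
flags = global_outliers(pb)
```

    On the transformed scale the bulk of the data is roughly symmetric, so the fences flag only the genuinely extreme concentrations.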

  10. The outlier sample effects on multivariate statistical data processing geochemical stream sediment survey (Moghangegh region, North West of Iran)

    International Nuclear Information System (INIS)

    Ghanbari, Y.; Habibnia, A.; Memar, A.

    2009-01-01

    In a geochemical stream sediment survey of the Moghangegh region in north-west Iran (1:50,000 sheet), 152 samples were collected. After analysis and processing of the data, it was found that the Yb, Sc, Ni, Li, Eu, Cd, Co and As contents of one sample were far higher than those of the other samples. After flagging this sample as an outlier, its effect on multivariate statistical data processing was investigated in order to assess the destructive influence of outlier samples in geochemical exploration. Pearson and Spearman correlation coefficients and cluster analysis were used for the multivariate studies, and scatter plots of selected element pairs together with the regression lines are given for the cases of 152 and 151 samples, and the results are compared. The comparison showed that the presence of an outlier sample may produce the following apparent relations between elements: a true relation between two elements, neither of which has an outlying value in the outlier sample; a false relation between two elements, one of which has an outlying value in the outlier sample; and a completely false relation between two elements, both of which have outlying values in the outlier sample.

  11. Quality assurance using outlier detection on an automatic segmentation method for the cerebellar peduncles

    Science.gov (United States)

    Li, Ke; Ye, Chuyang; Yang, Zhen; Carass, Aaron; Ying, Sarah H.; Prince, Jerry L.

    2016-03-01

    Cerebellar peduncles (CPs) are white matter tracts connecting the cerebellum to other brain regions. Automatic segmentation methods for the CPs have been proposed for studying their structure and function. Usually the performance of these methods is evaluated by comparing segmentation results with manual delineations (ground truth). However, when a segmentation method is run on new data (for which no ground truth exists) it is highly desirable to efficiently detect and assess algorithm failures so that these cases can be excluded from scientific analysis. In this work, two outlier detection methods aimed at assessing the performance of an automatic CP segmentation algorithm are presented. The first is a univariate non-parametric method using a box-whisker plot. We first categorize automatic segmentation results of a dataset of diffusion tensor imaging (DTI) scans from 48 subjects as either successes or failures. We then design three groups of features from the image data of the nine categorized failures for failure detection. Results show that most of these features can efficiently detect the true failures. The second method, supervised classification, was employed on a larger DTI dataset of 249 manually categorized subjects. Four classifiers (linear discriminant analysis (LDA), logistic regression (LR), support vector machine (SVM), and random forest classification (RFC)) were trained using the designed features and evaluated using leave-one-out cross-validation. Results show that LR performs worst among the four classifiers and the other three perform comparably, which demonstrates the feasibility of automatically detecting segmentation failures using classification methods.
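
    The leave-one-out evaluation described above can be sketched as follows. For brevity a nearest-centroid classifier stands in for the paper's LDA/LR/SVM/RFC, and the QA features are simulated; both are illustrative assumptions:

```python
import numpy as np

def loocv_accuracy(X, y):
    """Leave-one-out cross-validation of a nearest-centroid classifier."""
    X, y = np.asarray(X, float), np.asarray(y)
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xtr, ytr = X[mask], y[mask]
        centroids = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
        pred = min(centroids, key=lambda c: np.linalg.norm(X[i] - centroids[c]))
        correct += int(pred == y[i])
    return correct / len(y)

# Synthetic QA features: successes cluster near 0, failures near 3
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (40, 3)), rng.normal(3, 1, (10, 3))])
y = np.array([0] * 40 + [1] * 10)
acc = loocv_accuracy(X, y)
```

    Any of the paper's four classifiers would slot into the same loop in place of the centroid rule.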

  12. Improving Electronic Sensor Reliability by Robust Outlier Screening

    Directory of Open Access Journals (Sweden)

    Federico Cuesta

    2013-10-01

    Full Text Available Electronic sensors are widely used in different application areas, and in some of them, such as automotive or medical equipment, they must perform with an extremely low defect rate. Increasing reliability is paramount. Outlier detection algorithms are a key component in screening latent defects and decreasing the number of customer quality incidents (CQIs). This paper focuses on new spatial algorithms, Good Die in a Bad Cluster with Statistical Bins (GDBC SB) and Bad Bin in a Bad Cluster (BBBC), and an advanced outlier screening method, called Robust Dynamic Part Averaging Testing (RDPAT), as well as two practical improvements, which significantly enhance existing algorithms. These methods have been used in production in Freescale® Semiconductor probe factories around the world for several years. Moreover, a study was conducted with production data of 289,080 dice with 26 CQIs to determine and compare the efficiency and effectiveness of all these algorithms in identifying CQIs.

  13. Estimating the number of components and detecting outliers using Angle Distribution of Loading Subspaces (ADLS) in PCA analysis.

    Science.gov (United States)

    Liu, Y J; Tran, T; Postma, G; Buydens, L M C; Jansen, J

    2018-08-22

    Principal Component Analysis (PCA) is widely used in analytical chemistry, to reduce the dimensionality of a multivariate data set in a few Principal Components (PCs) that summarize the predominant patterns in the data. An accurate estimate of the number of PCs is indispensable to provide meaningful interpretations and extract useful information. We show how existing estimates for the number of PCs may fall short for datasets with considerable coherence, noise or outlier presence. We present here how Angle Distribution of the Loading Subspaces (ADLS) can be used to estimate the number of PCs based on the variability of loading subspace across bootstrap resamples. Based on comprehensive comparisons with other well-known methods applied on simulated dataset, we show that ADLS (1) may quantify the stability of a PCA model with several numbers of PCs simultaneously; (2) better estimate the appropriate number of PCs when compared with the cross-validation and scree plot methods, specifically for coherent data, and (3) facilitate integrated outlier detection, which we introduce in this manuscript. We, in addition, demonstrate how the analysis of different types of real-life spectroscopic datasets may benefit from these advantages of ADLS. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
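
    The core ADLS quantity, the variability of the loading subspace across bootstrap resamples, might be sketched like this. The two-component simulated data and the use of the largest principal angle against the full-data loadings are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np
from scipy.linalg import subspace_angles

def loading_subspace_angle(X, n_pcs, n_boot=50, seed=0):
    """Mean of the largest principal angle (radians) between the full-data
    loading subspace and bootstrap loading subspaces, for n_pcs PCs."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    V_full = np.linalg.svd(X, full_matrices=False)[2][:n_pcs].T
    angles = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))
        Xb = X[idx] - X[idx].mean(axis=0)
        Vb = np.linalg.svd(Xb, full_matrices=False)[2][:n_pcs].T
        angles.append(subspace_angles(V_full, Vb).max())
    return float(np.mean(angles))

# Two strong components plus noise: the 2-PC subspace should be stable,
# while a 3-PC subspace picks up a noise direction and wobbles
rng = np.random.default_rng(2)
scores = rng.normal(size=(200, 2)) * [10.0, 5.0]
X = scores @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))
stable = loading_subspace_angle(X, 2)
unstable = loading_subspace_angle(X, 3)
```

    A sharp rise in the angle as a PC is added suggests the extra component is noise, which is the intuition behind using subspace stability to choose the number of PCs.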

  14. Analysis and detection of functional outliers in water quality parameters from different automated monitoring stations in the Nalón river basin (Northern Spain).

    Science.gov (United States)

    Piñeiro Di Blasi, J I; Martínez Torres, J; García Nieto, P J; Alonso Fernández, J R; Díaz Muñiz, C; Taboada, J

    2015-01-01

    The purposes and intent of the authorities in establishing water quality standards are to provide enhancement of water quality and prevention of pollution to protect the public health or welfare in accordance with the public interest for drinking water supplies, conservation of fish, wildlife and other beneficial aquatic life, and agricultural, industrial, recreational, and other reasonable and necessary uses, as well as to maintain and improve the biological integrity of the waters. In this way, water quality controls involve a large number of variables and observations, often subject to some outliers. An outlier is an observation that is numerically distant from the rest of the data or that appears to deviate markedly from other members of the sample in which it occurs. An interesting analysis is to find those observations that produce measurements that differ from the pattern established in the sample. Identification of atypical observations is therefore an important concern in water quality monitoring and a difficult task because of the multivariate nature of water quality data. Our study provides a new method for detecting outliers in water quality monitoring parameters, using turbidity, conductivity and ammonium ion as indicator variables. Until now, methods were based on considering the different parameters as a vector whose components were their concentration values. The innovation of this approach lies in considering water quality monitoring over time as continuous curves instead of discrete points; that is to say, the dataset is treated as a time-dependent function rather than as a set of discrete values at different time instants. This new methodology, which is based on the concept of functional depth, was successfully applied to the detection of outliers in water quality monitoring samples in the Nalón river basin. Results of this study are discussed here in terms of origin, causes, etc. 
Finally, the conclusions as well as advantages of
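
    The curve-based view described in this record can be illustrated with a modified band depth (J = 2), one common functional-depth notion: outlying curves receive low depth. The synthetic daily profiles below are an illustrative assumption:

```python
import numpy as np

def modified_band_depth(curves):
    """Modified band depth (J = 2): for each curve, the average fraction
    of time points at which it lies inside the band spanned by a pair of
    curves from the sample."""
    X = np.asarray(curves, float)              # shape (n_curves, n_times)
    n = len(X)
    depth = np.zeros(n)
    for j in range(n):
        for k in range(j + 1, n):
            lo = np.minimum(X[j], X[k])
            hi = np.maximum(X[j], X[k])
            depth += ((X >= lo) & (X <= hi)).mean(axis=1)
    return depth / (n * (n - 1) / 2)

# 19 noisy sinusoidal profiles plus one curve shifted upward
t = np.linspace(0, 1, 48)
rng = np.random.default_rng(3)
normal = [np.sin(2 * np.pi * t) + rng.normal(0, 0.1, t.size) for _ in range(19)]
outlier = np.sin(2 * np.pi * t) + 2.5
d = modified_band_depth(normal + [outlier])
```

    The shifted curve receives the minimum depth because it lies outside the band of every pair of regular curves; curves with the lowest depths are the functional outlier candidates.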

  15. Outlier identification and visualization for Pb concentrations in urban soils and its implications for identification of potential contaminated land

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Chaosheng, E-mail: chaosheng.zhang@nuigalway.i [School of Geography and Archaeology, National University of Ireland, Galway (Ireland); Tang Ya [Department of Environmental Sciences, Sichuan University, Chengdu, Sichuan 610065 (China); Luo Lin; Xu Weilin [State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu, Sichuan 610065 (China)

    2009-11-15

    Outliers in urban soil geochemical databases may imply potential contaminated land. Different methodologies which can be easily implemented for the identification of global and spatial outliers were applied for Pb concentrations in urban soils of Galway City in Ireland. Due to its strongly skewed probability feature, a Box-Cox transformation was performed prior to further analyses. The graphic methods of histogram and box-and-whisker plot were effective in identification of global outliers at the original scale of the dataset. Spatial outliers could be identified by a local indicator of spatial association of local Moran's I, cross-validation of kriging, and a geographically weighted regression. The spatial locations of outliers were visualised using a geographical information system. Different methods showed generally consistent results, but differences existed. It is suggested that outliers identified by statistical methods should be confirmed and justified using scientific knowledge before they are properly dealt with. - Outliers in urban geochemical databases can be detected to provide guidance for identification of potential contaminated land.

  16. Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Keita Mori

    2013-01-01

    Full Text Available Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.
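
    A minimal version of the posterior-probability idea, a two-component normal mixture fitted by EM to the standardized expression values of one gene, with the outlier call read off from the responsibilities, might look like this. The simulated data, the initialization and the plain per-gene EM loop are illustrative assumptions (the paper's model shares information across genes and samples):

```python
import numpy as np

def outlier_posterior(z, n_iter=200):
    """EM for a two-component normal mixture on standardized expression
    values z: a background component and an outlier component. Returns
    the per-sample posterior probability of the outlier component."""
    z = np.asarray(z, float)
    pi, mu0, mu1 = 0.1, np.median(z), z.max()   # anchor outlier comp. high
    s0 = s1 = z.std() + 1e-9
    for _ in range(n_iter):
        def dens(m, s):
            return np.exp(-0.5 * ((z - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
        num = pi * dens(mu1, s1)
        w = num / (num + (1 - pi) * dens(mu0, s0))   # responsibilities
        pi = w.mean()
        mu0 = np.sum((1 - w) * z) / np.sum(1 - w)
        mu1 = np.sum(w * z) / np.sum(w)
        s0 = np.sqrt(np.sum((1 - w) * (z - mu0) ** 2) / np.sum(1 - w)) + 1e-9
        s1 = np.sqrt(np.sum(w * (z - mu1) ** 2) / np.sum(w)) + 1e-9
    return w

# 45 background samples near 0 and 5 over-expressed samples near 4
rng = np.random.default_rng(4)
z = np.concatenate([rng.normal(0, 1, 45), rng.normal(4, 0.5, 5)])
post = outlier_posterior(z)
```

    Samples with high posterior probability of the outlier component are the cancer-outlier candidates for that gene.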

  17. Optimum outlier model for potential improvement of environmental cleaning and disinfection.

    Science.gov (United States)

    Rupp, Mark E; Huerta, Tomas; Cavalieri, R J; Lyden, Elizabeth; Van Schooneveld, Trevor; Carling, Philip; Smith, Philip W

    2014-06-01

    The effectiveness and efficiency of 17 housekeepers in the terminal cleaning of 292 hospital rooms were evaluated through adenosine triphosphate detection. A subgroup of housekeepers was identified who were significantly more effective and efficient than their coworkers. These optimum outliers may be used in performance improvement efforts to optimize environmental cleaning.

  18. Nonlinear Optimization-Based Device-Free Localization with Outlier Link Rejection

    Directory of Open Access Journals (Sweden)

    Wendong Xiao

    2015-04-01

    Full Text Available Device-free localization (DFL) is an emerging wireless technique for estimating the location of a target that does not carry any electronic device. It has found extensive use in Smart City applications such as healthcare at home and in hospitals, location-based services in smart spaces, city emergency response and infrastructure security. In DFL, wireless devices are used as sensors that can sense the target by transmitting and receiving wireless signals collaboratively. Many DFL systems are implemented based on received signal strength (RSS) measurements, and the location of the target is estimated by detecting changes in the RSS measurements of the wireless links. Due to the uncertainty of the wireless channel, certain links may be seriously polluted and result in erroneous detection. In this paper, we propose a novel nonlinear optimization approach with outlier link rejection (NOOLR) for RSS-based DFL. It consists of three key strategies: (1) affected link identification by differential RSS detection; (2) outlier link rejection via the geometrical positional relationship among links; (3) target location estimation by formulating and solving a nonlinear optimization problem. Experimental results demonstrate that NOOLR is robust to fluctuations of the wireless signal, with superior localization accuracy compared with the existing Radio Tomographic Imaging (RTI) approach.
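
    The first strategy, affected-link identification by differential RSS detection, can be sketched as a per-link change test against a vacant-area baseline. The link count, noise level and 3-sigma threshold below are illustrative assumptions:

```python
import numpy as np

def affected_links(rss_baseline, rss_current, k=3.0):
    """Flag links whose current RSS deviates from the vacant-area baseline
    mean by more than k baseline standard deviations."""
    base = np.asarray(rss_baseline, float)   # shape (n_links, n_samples)
    cur = np.asarray(rss_current, float)     # shape (n_links,)
    mu, sd = base.mean(axis=1), base.std(axis=1) + 1e-9
    return np.abs(cur - mu) > k * sd

# 10 links with 0.5 dB baseline noise; a target shadows links 2 and 7 by 8 dB
rng = np.random.default_rng(8)
baseline = -50 + rng.normal(0, 0.5, (10, 200))
current = baseline.mean(axis=1) + rng.normal(0, 0.5, 10)
current[2] -= 8.0
current[7] -= 8.0
flags = affected_links(baseline, current)
```

    The flagged set would then be passed to the geometrical outlier-rejection step before the final location estimate.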

  19. Construction of composite indices in presence of outliers

    OpenAIRE

    Mishra, SK

    2008-01-01

    Effects of outliers on mean, standard deviation and Pearson’s correlation coefficient are well known. The Principal Components analysis uses Pearson’s product moment correlation coefficients to construct composite indices from indicator variables and hence may be very sensitive to effects of outliers in data. Median, mean deviation and Bradley’s coefficient of absolute correlation are less susceptible to effects of outliers. This paper proposes a method to obtain composite indices by maximiza...

  20. Detecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points

    Directory of Open Access Journals (Sweden)

    Ettore Marubini

    2014-01-01

    Full Text Available This paper presents a robust two-stage procedure for the identification of outlying observations in regression analysis. The exploratory stage identifies leverage points and vertical outliers through a robust distance estimator based on the Minimum Covariance Determinant (MCD). After deletion of these points, the confirmatory stage carries out an Ordinary Least Squares (OLS) analysis on the remaining subset of data and investigates the effect of adding back in the previously deleted observations. Cut-off points pertinent to different diagnostics are generated by bootstrapping, and the cases are definitively labelled as good-leverage, bad-leverage, vertical outliers and typical cases. The procedure is applied to four examples.
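
    The exploratory stage's robust distances can be sketched with a simplified MCD: draw random h-subsets, refine each with concentration steps, and keep the subset whose covariance has the smallest determinant. This is a didactic sketch on simulated data (FAST-MCD and the paper's bootstrap cut-off generation are more involved):

```python
import numpy as np

def robust_distances(X, n_trials=100, seed=0):
    """Simplified MCD-style robust Mahalanobis distances."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    n, p = X.shape
    h = (n + p + 1) // 2
    best = (np.inf, None, None)
    for _ in range(n_trials):
        idx = rng.choice(n, h, replace=False)
        for _ in range(5):                       # concentration (C-)steps
            mu = X[idx].mean(axis=0)
            Sinv = np.linalg.inv(np.cov(X[idx].T))
            d2 = np.einsum('ij,jk,ik->i', X - mu, Sinv, X - mu)
            idx = np.argsort(d2)[:h]
        det = np.linalg.det(np.cov(X[idx].T))
        if det < best[0]:
            best = (det, X[idx].mean(axis=0), np.cov(X[idx].T))
    _, mu, S = best
    d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(S), X - mu)
    return np.sqrt(d2)

# Bulk of points near the origin plus five leverage points far away
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(8, 0.5, (5, 2))])
d = robust_distances(X)
```

    Because the scatter estimate ignores the contaminated points, the leverage points stand out with much larger robust distances than any regular observation.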

  1. GTI: a novel algorithm for identifying outlier gene expression profiles from integrated microarray datasets.

    Directory of Open Access Journals (Sweden)

    John Patrick Mpindi

    Full Text Available BACKGROUND: Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type ('outlier genes'), a hallmark of potential oncogenes. METHODOLOGY: A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. We compared the potential of the GTI to detect outlier genes in meta-datasets with four previously defined statistical methods, COPA, the OS statistic, the t-test and ORT, using simulated data. We demonstrated that the GTI performed equally well to existing methods in a single study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. According to the results, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target. CONCLUSIONS/SIGNIFICANCE: Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. 
The algorithm is

  2. A computational study on outliers in world music.

    Science.gov (United States)

    Panteli, Maria; Benetos, Emmanouil; Dixon, Simon

    2017-01-01

    The comparative analysis of world music cultures has been the focus of several ethnomusicological studies in the last century. With the advances of Music Information Retrieval and the increased accessibility of sound archives, large-scale analysis of world music with computational tools is today feasible. We investigate music similarity in a corpus of 8200 recordings of folk and traditional music from 137 countries around the world. In particular, we aim to identify music recordings that are most distinct compared to the rest of our corpus. We refer to these recordings as 'outliers'. We use signal processing tools to extract music information from audio recordings, data mining to quantify similarity and detect outliers, and spatial statistics to account for geographical correlation. Our findings suggest that Botswana is the country with the most distinct recordings in the corpus and China is the country with the most distinct recordings when considering spatial correlation. Our analysis includes a comparison of musical attributes and styles that contribute to the 'uniqueness' of the music of each country.

  3. Poland’s Trade with East Asia: An Outlier Approach

    Directory of Open Access Journals (Sweden)

    Tseng Shoiw-Mei

    2015-12-01

    Full Text Available Poland achieved an excellent reputation for economic transformation during the recent global recession. The European debt crisis, however, quickly forced the reorientation of Poland’s trade outside of the European Union (EU, especially toward the dynamic region of East Asia. This study analyzes time series data from 1999 to 2013 to detect outliers in order to determine the bilateral trade paths between Poland and each East Asian country during the events of Poland’s accession to the EU in 2004, the global financial crisis from 2008 to 2009, and the European debt crisis from 2010 to 2013. From the Polish standpoint, the results showed significantly clustering outliers in the above periods and in the general trade paths from dependence through distancing and improvement to the chance of approaching East Asian partners. This study also shows that not only China but also several other countries present an excellent opportunity for boosting bilateral trade, especially with regard to Poland’s exports.

  4. Portraying the Expression Landscapes of B-Cell Lymphoma: Intuitive Detection of Outlier Samples and of Molecular Subtypes

    Directory of Open Access Journals (Sweden)

    Lydia Hopp

    2013-12-01

    Full Text Available We present an analytic framework based on Self-Organizing Map (SOM) machine learning to study large-scale patient data sets. The potency of the approach is demonstrated in a case study using gene expression data of more than 200 mature aggressive B-cell lymphoma patients. The method portrays each sample with individual resolution, characterizes the subtypes, disentangles the expression patterns into distinct modules, extracts their functional context using enrichment techniques and enables investigation of the similarity relations between the samples. The method also allows detection and correction of outliers caused by contamination. Based on our analysis, we propose a refined classification of B-cell lymphoma into four molecular subtypes, which are characterized by differential functional and clinical characteristics.

  5. Comparison of tests for spatial heterogeneity on data with global clustering patterns and outliers

    Directory of Open Access Journals (Sweden)

    Hachey Mark

    2009-10-01

    Full Text Available Background: The ability to evaluate geographic heterogeneity of cancer incidence and mortality is important in cancer surveillance. Many statistical methods for evaluating global clustering and local cluster patterns have been developed and examined in simulation studies. However, the performance of these methods in two extreme cases (global clustering evaluation and local anomaly (outlier) detection) has not been thoroughly investigated. Methods: We compare methods for global clustering evaluation, including Tango's Index, Moran's I, and Oden's I*pop, and cluster detection methods such as local Moran's I and the SaTScan elliptic version, on simulated count data that mimic global clustering patterns and outliers for cancer cases in the continental United States. We examine the power and precision of the selected methods in purely spatial analysis. We illustrate Tango's MEET and the SaTScan elliptic version on 1987-2004 HIV and 1950-1969 lung cancer mortality data in the United States. Results: For simulated data with outlier patterns, Tango's MEET, Moran's I and I*pop had powers less than 0.2, while SaTScan had powers around 0.97. For simulated data with global clustering patterns, Tango's MEET and I*pop (with 50% of the total population as the maximum search window) had powers close to 1. SaTScan had powers around 0.7-0.8 and Moran's I had powers around 0.2-0.3. In the real data example, Tango's MEET indicated the existence of global clustering patterns in both the HIV and lung cancer mortality data. SaTScan found a large cluster for HIV mortality rates, which is consistent with the finding from Tango's MEET. SaTScan also found clusters and outliers in the lung cancer mortality data. Conclusion: The SaTScan elliptic version is more efficient for outlier detection compared with the other methods evaluated in this article. Tango's MEET and Oden's I*pop perform best in global clustering scenarios among the selected methods. 
The use of SaTScan for
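
    Global Moran's I, one of the clustering indices compared above, has a compact definition: I = (n / sum(W)) * (z' W z) / (z' z) for mean-centred values z and spatial weights W. A sketch on a small rook-adjacency grid (the grid size and binary weights are illustrative):

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I: I = (n / sum(W)) * (z' W z) / (z' z)."""
    x = np.asarray(values, float)
    z = x - x.mean()
    W = np.asarray(W, float)
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

# Rook (4-neighbour) binary weights on a 5x5 grid
n = 5
coords = [(i, j) for i in range(n) for j in range(n)]
W = np.zeros((n * n, n * n))
for a, (i, j) in enumerate(coords):
    for b, (k, l) in enumerate(coords):
        if abs(i - k) + abs(j - l) == 1:
            W[a, b] = 1.0

# A smooth north-south trend (clustered) versus white noise
trend = np.array([float(i) for i, j in coords])
rng = np.random.default_rng(6)
noise = rng.normal(size=n * n)
i_trend = morans_i(trend, W)
i_noise = morans_i(noise, W)
```

    The smooth trend yields a strongly positive I (0.75 on this grid), while white noise stays near the null expectation of -1/(n-1), which is why Moran's I responds to global clustering but, as the study found, has little power against isolated outliers.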

  6. Quartile and Outlier Detection on Heterogeneous Clusters Using Distributed Radix Sort

    International Nuclear Information System (INIS)

    Meredith, Jeremy S.; Vetter, Jeffrey S.

    2011-01-01

    In the past few years, performance improvements in CPUs and memory technologies have outpaced those of storage systems. When extrapolated to the exascale, this trend places strict limits on the amount of data that can be written to disk for full analysis, resulting in an increased reliance on characterizing in-memory data. Many of these characterizations are simple, but require sorted data. This paper explores an example of this type of characterization - the identification of quartiles and statistical outliers - and presents a performance analysis of a distributed heterogeneous radix sort as well as an assessment of current architectural bottlenecks.
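
    The characterization in question, quartiles and Tukey-fence outliers over sorted data, can be shown in miniature. A single-threaded Python LSD radix sort stands in for the paper's distributed heterogeneous implementation, and the simple s[n//4] quartile convention is an illustrative assumption:

```python
def radix_sort(a):
    """LSD radix sort for non-negative integers, one byte per pass."""
    a = list(a)
    if not a:
        return a
    shift = 0
    while max(a) >> shift:
        buckets = [[] for _ in range(256)]
        for v in a:
            buckets[(v >> shift) & 255].append(v)
        a = [v for b in buckets for v in b]
        shift += 8
    return a

def quartiles_and_outliers(values, k=1.5):
    """Quartiles from the sorted data, then Tukey-fence outliers."""
    s = radix_sort(values)
    n = len(s)
    q1, q2, q3 = s[n // 4], s[n // 2], s[3 * n // 4]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return (q1, q2, q3), [v for v in s if v < lo or v > hi]
```

    Radix sort's fixed number of counting passes is what makes it attractive for the distributed, accelerator-friendly setting the paper analyses.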

  7. Outlier Detection in Regression Using an Iterated One-Step Approximation to the Huber-Skip Estimator

    DEFF Research Database (Denmark)

    Johansen, Søren; Nielsen, Bent

    2013-01-01

    In regression we can delete outliers based upon a preliminary estimator and re-estimate the parameters by least squares based upon the retained observations. We study the properties of an iteratively defined sequence of estimators based on this idea. We relate the sequence to the Huber-skip estimator. We show that the normalized estimation errors are tight and are close to a linear function of the kernel, thus providing a stochastic expansion of the estimators, which is the same as for the Huber-skip. This implies that the iterated estimator is a close approximation of the Huber-skip estimator.
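
    The iteration analysed in this record, delete observations flagged by a preliminary fit, re-estimate by least squares on the retained set, and repeat, can be sketched as follows. The MAD-based scale, the cut-off c = 2.5 and the simulated contaminated regression are illustrative assumptions:

```python
import numpy as np

def iterated_huber_skip(x, y, c=2.5, n_iter=10):
    """Iterated one-step Huber-skip sketch: fit OLS on the retained
    observations, delete points whose residuals exceed c robust-scale
    units (median/MAD), and repeat."""
    X = np.column_stack([np.ones(len(y)), np.asarray(x, float)])
    y = np.asarray(y, float)
    keep = np.ones(len(y), bool)
    for _ in range(n_iter):
        beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        r = y - X @ beta
        med = np.median(r)
        scale = 1.4826 * np.median(np.abs(r - med))   # MAD scale estimate
        keep = np.abs(r - med) <= c * scale
    beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    return beta, ~keep

# y = 1 + 2x with small noise, plus three gross outliers
rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 50)
y = 1 + 2 * x + rng.normal(0, 0.2, 50)
y[:3] += 25.0
beta, flagged = iterated_huber_skip(x, y)
```

    After a few iterations the retained set stabilises, the gross outliers stay deleted, and the fit is essentially the clean-data OLS fit.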

  8. Transaction Outlier Detection Using OLAP Visualization in a Private University Data Warehouse

    Directory of Open Access Journals (Sweden)

    Gusti Ngurah Mega Nata

    2016-07-01

    Full Text Available Detecting outliers in a data warehouse is important. Data in a data warehouse have already been aggregated and follow a multidimensional model. Aggregation is performed because the data warehouse is used by top-level management to analyze data quickly, while the multidimensional model is used to view the data across several dimensions of the business objects. Detecting outliers in a data warehouse therefore requires a technique that can find outliers in aggregated data and can examine them from several business dimensions, which makes it a new challenge. On-Line Analytical Processing (OLAP) visualization, in turn, is an important task for presenting trend information (reports) from a data warehouse in visual form. In this study, OLAP visualization is used to detect transaction outliers. The OLAP operation used is drill-down, and the visualization types used are one-dimensional, two-dimensional and multidimensional views built with the Weave Desktop tool. The data warehouse was built bottom-up. A case study was carried out at a private university, detecting outliers in student tuition-payment transactions in each semester. Outlier detection on visualizations based on a single dimension table was easier to analyze than detection on visualizations involving two or more dimension tables; in other words, the more dimension tables involved, the harder the outlier detection analysis becomes. Keywords: outlier detection, OLAP visualization, data warehouse

  9. Treatment of Outliers via Interpolation Method with Neural Network Forecast Performances

    Science.gov (United States)

    Wahir, N. A.; Nor, M. E.; Rusiman, M. S.; Gopal, K.

    2018-04-01

    Outliers often lurk in datasets, especially real-world data. Such anomalous data can negatively affect statistical analyses, primarily with respect to normality, variance, and estimation. Hence, handling the occurrence of outliers requires special attention, and it is important to determine suitable ways of treating outliers so as to ensure that the quality of the analyzed data is high. This paper discusses an alternative method of treating outliers via linear interpolation. Treating an outlier as a missing value in the dataset allows the interpolation method to be applied to fill it in, enabling a comparison of forecast accuracy before and after outlier treatment. The monthly time series of Malaysian tourist arrivals from January 1998 until December 2015 was used to construct the interpolated series. The results indicated that the series treated by linear interpolation yielded better forecasting results than the original time series under both Box-Jenkins and neural network approaches.
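
    The treatment itself is compact: flag outliers with a robust z-score, treat them as missing, and fill them by linear interpolation from the retained neighbours. The deterministic toy series below is an illustrative assumption (the paper used monthly tourist arrivals):

```python
import numpy as np

def interpolate_outliers(series, k=3.0):
    """Flag robust-z outliers (median/MAD), treat them as missing values,
    and fill them by linear interpolation from the retained points."""
    y = np.asarray(series, float)
    med = np.median(y)
    mad = 1.4826 * np.median(np.abs(y - med))
    bad = np.abs(y - med) > k * mad
    out = y.copy()
    t = np.arange(len(y))
    out[bad] = np.interp(t[bad], t[~bad], y[~bad])
    return out, bad

# A smooth seasonal profile with one spike at position 30
y = 100 + 10 * np.sin(np.linspace(0, 6, 60))
y[30] += 300
filled, bad = interpolate_outliers(y)
```

    The spike is replaced by the straight-line value between its clean neighbours, after which the series can be fed to the forecasting model.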

  10. A Geometrical-Statistical Approach to Outlier Removal for TDOA Measurements

    Science.gov (United States)

    Compagnoni, Marco; Pini, Alessia; Canclini, Antonio; Bestagini, Paolo; Antonacci, Fabio; Tubaro, Stefano; Sarti, Augusto

    2017-08-01

    The curse of outlier measurements in estimation problems is a well-known issue in a variety of fields. Therefore, outlier removal procedures, which enable the identification of spurious measurements within a set, have been developed for many different scenarios and applications. In this paper, we propose a statistically motivated outlier removal algorithm for time differences of arrival (TDOAs), or equivalently range differences (RDs), acquired at sensor arrays. The method exploits the TDOA-space formalism and requires knowledge only of the relative sensor positions. As the proposed method is completely independent of the application for which the measurements are used, it can be reliably employed to identify outliers within a set of TDOA/RD measurements in different fields (e.g. acoustic source localization, sensor synchronization, radar, remote sensing, etc.). The proposed outlier removal algorithm is validated by means of synthetic simulations and real experiments.

  11. Segmentation by Large Scale Hypothesis Testing - Segmentation as Outlier Detection

    DEFF Research Database (Denmark)

    Darkner, Sune; Dahl, Anders Lindbjerg; Larsen, Rasmus

    2010-01-01

    We propose a novel and efficient way of performing local image segmentation. For many applications a threshold of pixel intensities is sufficient, but determining the appropriate threshold value can be difficult. In cases with large global intensity variation the threshold value has to be adapted locally. We propose a method based on large scale hypothesis testing with a consistent method for selecting an appropriate threshold for the given data. By estimating the background distribution we characterize the segment of interest as a set of outliers with a certain probability based on the estimated background. We demonstrate the method on particles observed under a microscope and show how it can handle transparent particles with a significant glare point. The method generalizes to other problems. This is illustrated by applying the method to camera calibration images and to MRI of the midsagittal plane for gray and white matter separation and segmentation.
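
    The segmentation-as-outlier-detection idea can be sketched per pixel: estimate the background distribution robustly and flag pixels whose upper-tail p-value under that background falls below a significance level. The Gaussian background model and the choice of alpha are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

def segment_by_outliers(img, alpha=1e-3):
    """Segment foreground as statistical outliers: estimate a background
    Gaussian robustly (median/MAD) and flag pixels whose upper-tail
    p-value under that background falls below alpha."""
    x = np.asarray(img, float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med)) + 1e-12
    pvals = norm.sf((x - med) / mad)       # upper-tail p-value per pixel
    return pvals < alpha

# Dark noisy background with a bright square; bright pixels are outliers
rng = np.random.default_rng(9)
img = rng.normal(10, 2, (64, 64))
img[20:30, 20:30] += 25
mask = segment_by_outliers(img)
```

    The fraction of falsely flagged background pixels is controlled by alpha, which plays the role of the consistent threshold selection in the multiple-testing formulation.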

  12. The Space-Time Variation of Global Crop Yields, Detecting Simultaneous Outliers and Identifying the Teleconnections with Climatic Patterns

    Science.gov (United States)

    Najafi, E.; Devineni, N.; Pal, I.; Khanbilvardi, R.

    2017-12-01

    An understanding of the climate factors that influence the space-time variability of crop yields is important for food security and can help us predict global food availability. In this study, we address how the crop yield trends of countries around the world were related to each other during the last several decades, and which climatic variables triggered simultaneously high or low crop yields across the world. Robust Principal Component Analysis (rPCA) is used to identify the primary modes of variation in wheat, maize, sorghum, rice, soybeans, and barley yields. Relations between these modes of variability and important climatic variables, especially anomalous sea surface temperature (SSTa), are examined from 1964 to 2010. rPCA is also used to identify simultaneous outliers in each year, i.e. systematically high or low crop yields across the globe. The results reveal spatiotemporal patterns in these crop yields and the climate-related events that caused them, as well as the connection of the outliers with weather extremes. We find that among climatic variables, SST has had the most impact in creating simultaneous crop yield variability and yield outliers in many countries. An understanding of this phenomenon can benefit global crop trade networks.
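As a rough stand-in for the rPCA step, one can flag "simultaneous outlier" years from extreme principal-component scores using a robust (median/MAD) cut-off. The plain scikit-learn PCA used here, the threshold, and the synthetic yield matrix are all illustrative assumptions, not the study's robust PCA.

```python
import numpy as np
from sklearn.decomposition import PCA

def score_outlier_years(yields, n_components=2, thresh=3.5):
    """Flag years whose leading principal-component scores are extreme,
    using a robust z-score (median / MAD) per component. An ordinary
    PCA stands in here for the paper's robust PCA (rPCA)."""
    scores = PCA(n_components=n_components).fit_transform(yields)
    med = np.median(scores, axis=0)
    mad = np.median(np.abs(scores - med), axis=0) * 1.4826
    rz = np.abs(scores - med) / mad
    return (rz > thresh).any(axis=1)

rng = np.random.default_rng(1)
Y = rng.normal(3.0, 0.2, size=(40, 12))   # 40 years x 12 countries, t/ha
Y[25] += 1.5                               # one globally anomalous year
flags = score_outlier_years(Y)
print(np.flatnonzero(flags))
```

A year that is anomalous in many countries at once loads heavily on the leading component, which is exactly the "simultaneous outlier" signature the abstract describes.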

  13. Detection of outliers by neural network on the gas centrifuge experimental data of isotopic separation process; Aplicacao de redes neurais para deteccao de erros grosseiros em dados de processo de separacao de isotopos de uranio por ultracentrifugacao

    Energy Technology Data Exchange (ETDEWEB)

    Andrade, Monica de Carvalho Vasconcelos

    2004-07-01

    This work presents and discusses a neural network technique aimed at the detection of outliers in a set of gas centrifuge isotope separation experimental data. To evaluate this new technique, the detection results are compared with those of a statistical analysis combined with cluster analysis. This method for the detection of outliers shows considerable potential in the field of data analysis: it is easier and faster to use, and it requires much less knowledge of the physics involved in the process. This work established a procedure for detecting experiments suspected of containing gross errors within a data set where the usual techniques for identifying such errors cannot be applied, or where their use would demand excessively lengthy work. (author)
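A reconstruction-error detector in the spirit of this abstract can be sketched with a small bottleneck network trained to reproduce its input: rows the network reconstructs poorly are candidate gross errors. The scikit-learn MLPRegressor stand-in, the network size, and the rank-based flagging below are assumptions, not the thesis's architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))      # synthetic experiments x process variables
X[:5] += 8.0                       # five gross-error "experiments"

Xs = StandardScaler().fit_transform(X)
# a small bottleneck network trained to reproduce its own input,
# playing the role of an autoencoder
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="tanh",
                  max_iter=3000, random_state=0)
ae.fit(Xs, Xs)
err = ((ae.predict(Xs) - Xs) ** 2).mean(axis=1)
print(err[:5].mean() > err[5:].mean())
```

Experiments whose reconstruction error stands far above the bulk would then be inspected for gross errors, with no process-physics model required.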

  14. Reduction of ZTD outliers through improved GNSS data processing and screening strategies

    Science.gov (United States)

    Stepniak, Katarzyna; Bock, Olivier; Wielgosz, Pawel

    2018-03-01

    Though Global Navigation Satellite System (GNSS) data processing has been significantly improved over the years, it is still commonly observed that zenith tropospheric delay (ZTD) estimates contain many outliers which are detrimental to meteorological and climatological applications. In this paper, we show that ZTD outliers in double-difference processing are mostly caused by sub-daily data gaps at reference stations, which cause disconnections of clusters of stations from the reference network and common mode biases due to the strong correlation between stations in short baselines. They can reach a few centimetres in ZTD and usually coincide with a jump in formal errors. The magnitude and sign of these biases are impossible to predict because they depend on different errors in the observations and on the geometry of the baselines. We elaborate and test a new baseline strategy which solves this problem and significantly reduces the number of outliers compared to the standard strategy commonly used for positioning (e.g. determination of national reference frame) in which the pre-defined network is composed of a skeleton of reference stations to which secondary stations are connected in a star-like structure. The new strategy is also shown to perform better than the widely used strategy maximizing the number of observations available in many GNSS programs. The reason is that observations are maximized before processing, whereas the final number of used observations can be dramatically lower because of data rejection (screening) during the processing. The study relies on the analysis of 1 year of GPS (Global Positioning System) data from a regional network of 136 GNSS stations processed using Bernese GNSS Software v.5.2. A post-processing screening procedure is also proposed to detect and remove a few outliers which may still remain due to short data gaps. It is based on a combination of range checks and outlier checks of ZTD and formal errors. The accuracy of the
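The post-processing screening described above (range checks combined with outlier checks on ZTD and formal errors) might look roughly like this. The numeric thresholds below are placeholders for illustration, not the values used in the study.

```python
import numpy as np

def screen_ztd(ztd, sigma, ztd_range=(1.0, 3.0), max_sigma=0.02, k=5.0):
    """Two-stage screening: a range check on ZTD values (metres) and
    formal errors, then an outlier check flagging points more than k
    robust standard deviations (MAD-based) from the series median.
    All thresholds are illustrative placeholders."""
    ztd = np.asarray(ztd, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ok = (ztd >= ztd_range[0]) & (ztd <= ztd_range[1]) & (sigma <= max_sigma)
    med = np.median(ztd[ok])
    mad = 1.4826 * np.median(np.abs(ztd[ok] - med))
    ok &= np.abs(ztd - med) <= k * mad
    return ok

ztd = np.array([2.31, 2.33, 2.30, 2.62, 2.32, 0.40])   # metres
sig = np.array([0.004, 0.005, 0.004, 0.005, 0.004, 0.004])
print(screen_ztd(ztd, sig))
```

The range check removes physically implausible values, while the robust outlier check catches the few remaining spikes caused by short data gaps without being distorted by them.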

  15. Robust data reconciliation and outlier detection with swarm intelligence in a thermal reactor power calculation

    Energy Technology Data Exchange (ETDEWEB)

    Valdetaro, Eduardo Damianik, E-mail: valdtar@eletronuclear.gov.br [ELETRONUCLEAR - ELETROBRAS, Angra dos Reis, RJ (Brazil). Angra 2 Operating Dept.; Coordenacao dos Programas de Pos-Graduacao de Engenharia (PEN/COPPE/UFRJ), RJ (Brazil). Programa de Engenharia Nuclear; Schirru, Roberto, E-mail: schirru@lmp.ufrj.br [Coordenacao dos Programas de Pos-Graduacao de Engenharia (PEN/COPPE/UFRJ), RJ (Brazil). Programa de Engenharia Nuclear

    2011-07-01

    In nuclear power plants, Data Reconciliation (DR) and Gross Error Detection (GED) are techniques of increasing interest, used primarily to take mass and energy balances into account, which brings direct and indirect financial benefits. Data reconciliation is formulated as a constrained minimization problem, where the constraints correspond to the energy and mass balance model. Statistical methods are combined with the minimization of a quadratic error form. Solving the nonlinear optimization problem with conventional methods can be troublesome, because a multimodal function with differentiated solutions introduces difficulties into the search for an optimal solution. Many techniques have been developed for data reconciliation and outlier detection; some use, for example, quadratic programming, Lagrange multipliers, or mixed-integer nonlinear programming, while others use evolutionary algorithms such as Genetic Algorithms (GA). Recently, Particle Swarm Optimization (PSO) has shown potential as a global optimization algorithm when applied to data reconciliation. Robust statistics is also of increasing interest and is used when measured data are contaminated by random errors and the errors cannot be assumed to be normally distributed, a situation that reflects real problems. The aim of this work is to present a brief comparison between the classical data reconciliation technique and robust data reconciliation and gross error detection with a swarm intelligence procedure, calculating the thermal reactor power for a simplified heat circuit diagram of a steam turbine plant using real data obtained from the Angra 2 nuclear power plant. The main objective is to test the potential of the robust DR and GED method in an integrated framework using swarm intelligence and the three-part redescending estimator of Hampel when applied to a real process condition. The results evaluate the potential use of the robust technique in
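The three-part redescending estimator of Hampel mentioned above can be written down directly. The IRLS location fit below is a minimal stand-in for full data reconciliation, and the tuning constants a, b, c are conventional textbook choices, not the values used in the paper.

```python
import numpy as np

def hampel_psi(r, a=1.7, b=3.4, c=8.5):
    """Hampel's three-part redescending psi function: linear up to a,
    constant between a and b, descending linearly to zero at c, and
    zero beyond c, so gross errors get zero influence."""
    s = np.sign(r)
    x = np.abs(r)
    out = np.where(x <= a, x,
          np.where(x <= b, a,
          np.where(x <= c, a * (c - x) / (c - b), 0.0)))
    return s * out

def robust_mean(y, scale, tol=1e-8, max_iter=100):
    """IRLS location estimate with Hampel weights w = psi(r)/r --
    a minimal stand-in for robust data reconciliation of a single
    redundant measurement."""
    mu = np.median(y)
    for _ in range(max_iter):
        r = (y - mu) / scale
        w = np.where(r == 0, 1.0, hampel_psi(r) / np.where(r == 0, 1.0, r))
        mu_new = np.sum(w * y) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

y = np.array([10.1, 9.9, 10.0, 10.2, 9.8, 25.0])   # one gross error
print(robust_mean(y, scale=0.2))
```

Because psi redescends to zero, the measurement at 25.0 receives zero weight and the reconciled value settles on the consensus of the remaining instruments, which is exactly why redescending estimators suit GED.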

  16. Robust data reconciliation and outlier detection with swarm intelligence in a thermal reactor power calculation

    International Nuclear Information System (INIS)

    Valdetaro, Eduardo Damianik; Coordenacao dos Programas de Pos-Graduacao de Engenharia; Schirru, Roberto

    2011-01-01

    In nuclear power plants, Data Reconciliation (DR) and Gross Error Detection (GED) are techniques of increasing interest, used primarily to take mass and energy balances into account, which brings direct and indirect financial benefits. Data reconciliation is formulated as a constrained minimization problem, where the constraints correspond to the energy and mass balance model. Statistical methods are combined with the minimization of a quadratic error form. Solving the nonlinear optimization problem with conventional methods can be troublesome, because a multimodal function with differentiated solutions introduces difficulties into the search for an optimal solution. Many techniques have been developed for data reconciliation and outlier detection; some use, for example, quadratic programming, Lagrange multipliers, or mixed-integer nonlinear programming, while others use evolutionary algorithms such as Genetic Algorithms (GA). Recently, Particle Swarm Optimization (PSO) has shown potential as a global optimization algorithm when applied to data reconciliation. Robust statistics is also of increasing interest and is used when measured data are contaminated by random errors and the errors cannot be assumed to be normally distributed, a situation that reflects real problems. The aim of this work is to present a brief comparison between the classical data reconciliation technique and robust data reconciliation and gross error detection with a swarm intelligence procedure, calculating the thermal reactor power for a simplified heat circuit diagram of a steam turbine plant using real data obtained from the Angra 2 nuclear power plant. The main objective is to test the potential of the robust DR and GED method in an integrated framework using swarm intelligence and the three-part redescending estimator of Hampel when applied to a real process condition. The results evaluate the potential use of the robust technique in

  17. A Pareto scale-inflated outlier model and its Bayesian analysis

    OpenAIRE

    Scollnik, David P. M.

    2016-01-01

    This paper develops a Pareto scale-inflated outlier model. This model is intended for use when data from some standard Pareto distribution of interest is suspected to have been contaminated with a relatively small number of outliers from a Pareto distribution with the same shape parameter but with an inflated scale parameter. The Bayesian analysis of this Pareto scale-inflated outlier model is considered and its implementation using the Gibbs sampler is discussed. The paper contains three wor...

  18. Displaying an Outlier in Multivariate Data | Gordor | Journal of ...

    African Journals Online (AJOL)

    ... a multivariate data set is proposed. The technique involves the projection of the multidimensional data onto a single dimension called the outlier displaying component. When the observations are plotted on this component the outlier is appreciably revealed. Journal of Applied Science and Technology (JAST), Vol. 4, Nos.

  19. Latent Clustering Models for Outlier Identification in Telecom Data

    Directory of Open Access Journals (Sweden)

    Ye Ouyang

    2016-01-01

    Collected telecom data traffic has boomed in recent years, due to the development of 4G mobile devices and other similar high-speed machines. The ability to quickly identify unexpected traffic in this stream is critical for mobile carriers, as anomalies can be caused by either fraudulent intrusion or technical problems. Clustering models can help to identify issues by showing patterns in network data, which can quickly catch anomalies and highlight previously unseen outliers. In this article, we develop and compare clustering models for telecom data, focusing on those that incorporate time-stamp information. Two main models are introduced, solved in detail, and analyzed: Gaussian Probabilistic Latent Semantic Analysis (GPLSA) and time-dependent Gaussian Mixture Models (time-GMM). These models are then compared with other clustering models, such as the Gaussian model and GMM, which do not use time-stamp information. We perform computations on both sample and telecom traffic data to show that the efficiency and robustness of GPLSA make it the superior method for detecting outliers, providing results automatically with little parameter tuning or expertise required.
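A time-dependent mixture model in the spirit of time-GMM can be sketched by fitting a separate Gaussian model per time bin and flagging the lowest-likelihood records within each bin. The bin granularity, component count, and percentile cut-off are assumptions for illustration; this is not GPLSA.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def flag_low_likelihood(X, hours, n_components=1, q=1.0):
    """Fit a Gaussian mixture per time bin (e.g. hour of day) and flag
    samples in the lowest q percent of log-likelihood within their own
    bin -- a rough sketch of a time-dependent GMM."""
    flags = np.zeros(len(X), dtype=bool)
    for h in np.unique(hours):
        idx = np.where(hours == h)[0]
        gm = GaussianMixture(n_components=n_components, random_state=0)
        ll = gm.fit(X[idx]).score_samples(X[idx])
        flags[idx] = ll < np.percentile(ll, q)
    return flags

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 3))          # synthetic traffic features
hours = np.repeat([0, 1], 200)          # two time bins
X[0] = [12.0, -12.0, 12.0]              # one anomalous traffic record
flags = flag_low_likelihood(X, hours)
print(flags[0], flags.sum())
```

Conditioning on the time bin keeps a record that is normal at 3 a.m. but anomalous at peak hour (or vice versa) from being hidden by a single global density fit.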

  20. The high cost of low-acuity ICU outliers.

    Science.gov (United States)

    Dahl, Deborah; Wojtal, Greg G; Breslow, Michael J; Holl, Randy; Huguez, Debra; Stone, David; Korpi, Gloria

    2012-01-01

    Direct variable costs were determined on each hospital day for all patients with an intensive care unit (ICU) stay in four Phoenix-area hospital ICUs. Average daily direct variable cost in the four ICUs ranged from $1,436 to $1,759 and represented 69.4 percent and 45.7 percent of total hospital stay cost for medical and surgical patients, respectively. Daily ICU cost and length of stay (LOS) were higher in patients with higher ICU admission acuity of illness as measured by the APACHE risk prediction methodology; 16.2 percent of patients had an ICU stay in excess of six days, and these LOS outliers accounted for 56.7 percent of total ICU cost. While higher-acuity patients were more likely to be ICU LOS outliers, 11.1 percent of low-risk patients were outliers. The low-risk group included 69.4 percent of the ICU population and accounted for 47 percent of all LOS outliers. Low-risk LOS outliers accounted for 25.3 percent of ICU cost and incurred fivefold higher hospital stay costs and mortality rates. These data suggest that severity of illness is an important determinant of daily resource consumption and LOS, regardless of whether the patient arrives in the ICU with high acuity or develops complications that increase acuity. The finding that a substantial number of long-stay patients come into the ICU with low acuity and deteriorate after ICU admission is not widely recognized and represents an important opportunity to improve patient outcomes and lower costs. ICUs should consider adding low-risk LOS data to their quality and financial performance reports.

  1. The masking breakdown point of multivariate outlier identification rules

    OpenAIRE

    Becker, Claudia; Gather, Ursula

    1997-01-01

    In this paper, we consider one-step outlier identification rules for multivariate data, generalizing the concept of so-called alpha outlier identifiers, as presented in Davies and Gather (1993) for the case of univariate samples. We investigate how the finite-sample breakdown points of the estimators used in these identification rules influence the masking behaviour of the rules.

  2. Ground-based remote sensing of HDO/H2O ratio profiles: introduction and validation of an innovative retrieval approach

    Science.gov (United States)

    Schneider, M.; Hase, F.; Blumenstock, T.

    2006-10-01

    We propose an innovative approach for analysing ground-based FTIR spectra which allows us to detect variabilities of lower and middle/upper tropospheric HDO/H2O ratios. We show that the proposed method is superior to common approaches. We estimate that lower tropospheric HDO/H2O ratios can be detected with a noise to signal ratio of 15% and middle/upper tropospheric ratios with a noise to signal ratio of 50%. The method requires the inversion to be performed on a logarithmic scale and to introduce an inter-species constraint. While common methods calculate the isotope ratio posterior to an independent, optimal estimation of the HDO and H2O profile, the proposed approach is an optimal estimator for the ratio itself. We apply the innovative approach to spectra measured continuously during 15 months and present, for the first time, an annual cycle of tropospheric HDO/H2O ratio profiles as detected by ground-based measurements. Outliers in the detected middle/upper tropospheric ratios are interpreted by backward trajectories.

  3. The variance of length of stay and the optimal DRG outlier payments.

    Science.gov (United States)

    Felder, Stefan

    2009-09-01

    Prospective payment schemes in health care often include supply-side insurance for cost outliers. In hospital reimbursement, prospective payments for patient discharges, based on their classification into diagnosis related groups (DRGs), are complemented by outlier payments for long-stay patients. The outlier scheme fixes the length of stay (LOS) threshold, constraining the profit risk of the hospitals. In most DRG systems, this threshold increases with the standard deviation of the LOS distribution. The present paper addresses the adequacy of this DRG outlier threshold rule for risk-averse hospitals with preferences depending on the expected value and the variance of profits. It first shows that the optimal threshold solves the hospital's tradeoff between higher profit risk and lower premium loading payments. It then demonstrates for normally distributed truncated LOS that the optimal outlier threshold in fact decreases with an increase in the standard deviation.

  4. A method for separating seismo-ionospheric TEC outliers from heliogeomagnetic disturbances by using nu-SVR

    Energy Technology Data Exchange (ETDEWEB)

    Pattisahusiwa, Asis [Bandung Institute of Technology (Indonesia); Liong, The Houw; Purqon, Acep [Earth physics and complex systems research group, Bandung Institute of Technology (Indonesia)

    2015-09-30

    Seismo-ionospheric studies investigate ionosphere disturbances associated with seismic activity. Previous research has shown that heliogeomagnetic activity or strong earthquakes can cause disturbances in the ionosphere; however, it is difficult to attribute such disturbances to their sources. In this research, we propose a method to separate these disturbances/outliers by source using nu-SVR with world-wide GPS data. TEC data related to the 26 December 2004 Sumatra and the 11 March 2011 Honshu earthquakes were analyzed. After analyzing TEC data at several locations around the earthquake epicenters and comparing them with geomagnetic data, the method shows good results on average in detecting the source of these outliers. The method is promising for future research.
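The basic use of nu-SVR here — model the smooth TEC background, then flag large residuals as disturbances — can be sketched as follows. The kernel settings, the MAD-based threshold, and the synthetic TEC series are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(3)
t = np.linspace(0, 10, 200)[:, None]               # time axis (arbitrary units)
tec = 10 + 3 * np.sin(t.ravel()) + rng.normal(0, 0.3, 200)  # quiet-time TEC
tec[50] += 5.0                                      # one localized TEC anomaly

# nu-SVR fits the smooth background; its epsilon-insensitive loss makes
# the fit itself fairly insensitive to the isolated spike
model = NuSVR(nu=0.5, C=10.0, gamma="scale").fit(t, tec)
resid = tec - model.predict(t)
mad = 1.4826 * np.median(np.abs(resid - np.median(resid)))
outliers = np.flatnonzero(np.abs(resid) > 5 * mad)
print(outliers)
```

Attribution to a source (heliogeomagnetic vs. seismic) would then come from comparing flagged epochs against geomagnetic indices, as the abstract describes.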

  5. What 'outliers' tell us about missed opportunities for tuberculosis control: a cross-sectional study of patients in Mumbai, India

    Directory of Open Access Journals (Sweden)

    Porter John DH

    2010-05-01

    Background India's Revised National Tuberculosis Control Programme (RNTCP) is deemed highly successful in terms of detection and cure rates. However, some patients experience delays in accessing diagnosis and treatment. Patients falling between the 96th and 100th percentiles for these access indicators are often ignored as atypical 'outliers' when assessing programme performance. They may, however, provide clues to understanding why some patients never reach the programme. This paper examines the underlying vulnerabilities of patients with extreme values for delays in accessing the RNTCP in Mumbai city, India. Methods We conducted a cross-sectional study of 266 new sputum-positive patients registered with the RNTCP in Mumbai. Patients were classified as 'outliers' if patient, provider or system delays were beyond the 95th percentile for the respective variable. Case profiles of 'outliers' for patient, provider and system delays were examined and compared with the rest of the sample to identify the key factors responsible for delays. Results Forty-two patients were 'outliers' on one or more of the delay variables. All 'outliers' had a significantly lower per capita income than the remaining sample. The lack of economic resources was compounded by social, structural and environmental vulnerabilities. Longer patient delays were related to patients' perception of symptoms as non-serious. Provider delays were incurred as a result of private providers' failure to respond to tuberculosis in a timely manner. Diagnostic and treatment delays were minimal; however, analysis of the 'outliers' revealed the importance of social support in enabling access to the programme. Conclusion A proxy for those who fail to reach the programme, these case profiles highlight unique vulnerabilities that need innovative approaches by the RNTCP. The focus on 'outliers' provides a less resource- and time-intensive alternative to community-based studies for
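The study's percentile-based definition of a delay 'outlier' is straightforward to express; the synthetic lognormal delays below are an illustrative assumption.

```python
import numpy as np

def delay_outliers(delays, pct=95):
    """Classify patients as 'outliers' when their delay exceeds the
    95th percentile of the sample, mirroring the study's definition."""
    cut = np.percentile(delays, pct)
    return np.asarray(delays) > cut

rng = np.random.default_rng(4)
patient_delay = rng.lognormal(mean=2.0, sigma=0.8, size=266)  # days, skewed
flags = delay_outliers(patient_delay)
print(flags.sum())   # roughly 5% of the 266 patients
```

Applying the same rule separately to patient, provider, and system delays, then taking the union, reproduces the "outlier on one or more delay variables" classification used in the paper.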

  6. Identificación de outliers en muestras multivariantes

    OpenAIRE

    Pérez Díez de los Ríos, José Luis

    1987-01-01

    This dissertation analyzes the problem of outlier observations in multivariate samples, describing the techniques currently available for identifying outliers in multidimensional samples and showing that most of them are generalizations of ideas developed for the univariate case, or techniques based on graphical representations. It then addresses the so-called masking effect, which can arise when...

  7. Robust volcano plot: identification of differential metabolites in the presence of outliers.

    Science.gov (United States)

    Kumar, Nishith; Hoque, Md Aminul; Sugimoto, Masahiro

    2018-04-11

    The identification of differential metabolites in metabolomics remains a major challenge and plays a prominent role in metabolomics data analysis. Metabolomics datasets often contain outliers because of analytical, experimental, and biological ambiguity, but the currently available differential metabolite identification techniques are sensitive to outliers. We propose a kernel-weight-based outlier-robust volcano plot for identifying differential metabolites from noisy metabolomics datasets. Two numerical experiments are used to evaluate the performance of the proposed technique against nine existing techniques, including the t-test and the Kruskal-Wallis test. Artificially generated data with outliers reveal that the proposed method results in a lower misclassification error rate and a greater area under the receiver operating characteristic curve compared with existing methods. An experimentally measured breast cancer dataset to which outliers were artificially added reveals that our proposed method produces only two non-overlapping differential metabolites, whereas the other nine methods produce between seven and 57 non-overlapping differential metabolites. Our data analyses show that the performance of the proposed differential metabolite identification technique is better than that of existing methods. Thus, the proposed method can contribute to the analysis of metabolomics data with outliers. The R package and user manual of the proposed method are available at https://github.com/nishithkumarpaul/Rvolcano .

  8. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study.

    Science.gov (United States)

    Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P

    2014-06-26

    To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
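A modified ("robust") Poisson fit for a binary outcome can be sketched from scratch: a Poisson GLM solved by Newton/IRLS with a sandwich (robust) variance estimate, which is what makes the standard errors valid despite the Poisson variance being misspecified for 0/1 data. This is a generic sketch, not the simulation code of the paper.

```python
import numpy as np

def robust_poisson_rr(y, x):
    """Modified Poisson regression for a binary outcome: a Poisson GLM
    fit by Newton/IRLS, with a sandwich variance estimate. Returns the
    estimated relative risk for a binary covariate x and the robust
    standard error of its log."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta = np.zeros(2)
    for _ in range(50):                       # Newton iterations
        mu = np.exp(X @ beta)
        beta = beta + np.linalg.solve((X.T * mu) @ X, X.T @ (y - mu))
    mu = np.exp(X @ beta)
    bread = np.linalg.inv((X.T * mu) @ X)     # inverse Fisher information
    meat = (X.T * (y - mu) ** 2) @ X          # empirical score variance
    se = np.sqrt(np.diag(bread @ meat @ bread))
    return np.exp(beta[1]), se[1]

rng = np.random.default_rng(5)
x = rng.integers(0, 2, 5000)                  # binary exposure
p = np.where(x == 1, 0.4, 0.2)                # true relative risk = 2.0
y = rng.binomial(1, p)
rr, se = robust_poisson_rr(y, x)
print(round(rr, 2))
```

Unlike the log-binomial model, this fit cannot fail to converge from fitted probabilities exceeding 1, which is one practical reason the modified Poisson approach is popular for common outcomes.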

  9. Outlier identification in urban soils and its implications for identification of potential contaminated land

    Science.gov (United States)

    Zhang, Chaosheng

    2010-05-01

    Outliers in urban soil geochemical databases may imply potential contaminated land. Different methodologies which can be easily implemented for the identification of global and spatial outliers were applied for Pb concentrations in urban soils of Galway City in Ireland. Due to its strongly skewed probability feature, a Box-Cox transformation was performed prior to further analyses. The graphic methods of histogram and box-and-whisker plot were effective in identification of global outliers at the original scale of the dataset. Spatial outliers could be identified by a local indicator of spatial association of local Moran's I, cross-validation of kriging, and a geographically weighted regression. The spatial locations of outliers were visualised using a geographical information system. Different methods showed generally consistent results, but differences existed. It is suggested that outliers identified by statistical methods should be confirmed and justified using scientific knowledge before they are properly dealt with.
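The global-outlier step described above — a Box-Cox transform of the skewed concentrations followed by the box-and-whisker (Tukey) rule — can be sketched as follows. The Tukey fence factor and the synthetic Pb concentrations are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def global_outliers(conc, k=1.5):
    """Box-Cox transform a positively skewed concentration variable,
    then apply the box-and-whisker (Tukey) rule on the transformed
    scale: values beyond k * IQR outside the quartiles are flagged."""
    z, _lmbda = stats.boxcox(np.asarray(conc, dtype=float))
    q1, q3 = np.percentile(z, [25, 75])
    iqr = q3 - q1
    return (z < q1 - k * iqr) | (z > q3 + k * iqr)

rng = np.random.default_rng(6)
pb = rng.lognormal(mean=3.5, sigma=0.4, size=200)   # mg/kg, right-skewed
pb[:2] = [900.0, 1200.0]                            # suspected contaminated sites
flags = global_outliers(pb)
print(np.flatnonzero(flags))
```

Transforming first matters: on the raw skewed scale the Tukey rule would flag many legitimate high values, whereas on the near-normal Box-Cox scale it isolates the genuinely anomalous sites, which can then be cross-checked spatially (e.g. with local Moran's I) as the abstract suggests.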

  10. Noise-robust unsupervised spike sorting based on discriminative subspace learning with outlier handling.

    Science.gov (United States)

    Keshtkaran, Mohammad Reza; Yang, Zhi

    2017-06-01

    Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, clusters that appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and maximally discriminative features from the spike waveforms and performs clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of a greater number of individual neurons, with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain-machine interface studies.
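The core loop — iterative LDA subspace selection alternating with GMM clustering, plus flagging low-likelihood spikes as outliers — can be caricatured in a few lines. The cluster count, iteration count, and likelihood percentile are assumptions, and this sketch omits the paper's statistical test for the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

def lda_gmm_sort(waveforms, n_units=2, n_iter=5, ll_pct=2.0):
    """Sketch of iterative discriminative-subspace spike sorting:
    alternate an LDA projection (using current labels) with GMM
    clustering in that subspace, then flag the lowest-likelihood
    spikes as outliers. A simplification of the published method."""
    labels = KMeans(n_clusters=n_units, n_init=10,
                    random_state=0).fit_predict(waveforms)
    for _ in range(n_iter):
        z = LinearDiscriminantAnalysis(
            n_components=n_units - 1).fit_transform(waveforms, labels)
        gm = GaussianMixture(n_components=n_units, random_state=0).fit(z)
        labels = gm.predict(z)
    outliers = gm.score_samples(z) < np.percentile(gm.score_samples(z), ll_pct)
    return labels, outliers

rng = np.random.default_rng(7)
unit_a = rng.normal(0.0, 0.5, size=(150, 20))   # waveforms of two units
unit_b = rng.normal(2.0, 0.5, size=(150, 20))
spikes = np.vstack([unit_a, unit_b])
labels, outliers = lda_gmm_sort(spikes)
print(len(set(labels.tolist())), outliers.sum())
```

Re-estimating the LDA projection from the updated labels is what pulls the subspace toward directions that actually separate the units, rather than directions of maximal variance as PCA would.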

  11. Noise-robust unsupervised spike sorting based on discriminative subspace learning with outlier handling

    Science.gov (United States)

    Keshtkaran, Mohammad Reza; Yang, Zhi

    2017-06-01

    Objective. Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, clusters that appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. Approach. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and maximally discriminative features from the spike waveforms and performs clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Main results. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. Significance. By providing more accurate information about the activity of a greater number of individual neurons, with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain-machine interface studies.

  12. Stoicism, the physician, and care of medical outliers

    Directory of Open Access Journals (Sweden)

    Papadimos Thomas J

    2004-12-01

    Background Medical outliers present a medical, psychological, social, and economic challenge to the physicians who care for them. The determinism of Stoic thought is explored as an intellectual basis for the pursuit of a correct mental attitude that will provide aid and comfort to physicians who care for medical outliers, thus fostering continued physician engagement in their care. Discussion The Stoic topics of the good, the preferable, the morally indifferent, living consistently, and appropriate actions are reviewed. Furthermore, Zeno's cardinal virtues of Justice, Temperance, Bravery, and Wisdom are addressed, as are the Stoic passions of fear, lust, mental pain, and mental pleasure. These concepts must be understood by physicians if they are to comprehend and accept the Stoic view as it relates to having the proper attitude when caring for those with long-term and/or costly illnesses. Summary Practicing physicians, especially those who are hospital based, and most assuredly those practicing critical care medicine, will be emotionally challenged by the medical outlier. A Stoic approach to such a social and psychological burden may be of benefit.

  13. Pathway-based outlier method reveals heterogeneous genomic structure of autism in blood transcriptome.

    Science.gov (United States)

    Campbell, Malcolm G; Kohane, Isaac S; Kong, Sek Won

    2013-09-24

    Decades of research strongly suggest that the genetic etiology of autism spectrum disorders (ASDs) is heterogeneous. However, most published studies focus on group differences between cases and controls. In contrast, we hypothesized that the heterogeneity of the disorder could be characterized by identifying pathways for which individuals are outliers rather than pathways representative of shared group differences of the ASD diagnosis. Two previously published blood gene expression data sets--the Translational Genetics Research Institute (TGen) dataset (70 cases and 60 unrelated controls) and the Simons Simplex Consortium (Simons) dataset (221 probands and 191 unaffected family members)--were analyzed. All individuals of each dataset were projected to biological pathways, and each sample's Mahalanobis distance from a pooled centroid was calculated to compare the number of case and control outliers for each pathway. Analysis of a set of blood gene expression profiles from 70 ASD and 60 unrelated controls revealed three pathways whose outliers were significantly overrepresented in the ASD cases: neuron development including axonogenesis and neurite development (29% of ASD, 3% of control), nitric oxide signaling (29%, 3%), and skeletal development (27%, 3%). Overall, 50% of cases and 8% of controls were outliers in one of these three pathways, which could not be identified using group comparison or gene-level outlier methods. In an independently collected data set consisting of 221 ASD and 191 unaffected family members, outliers in the neurogenesis pathway were heavily biased towards cases (20.8% of ASD, 12.0% of control). Interestingly, neurogenesis outliers were more common among unaffected family members (Simons) than unrelated controls (TGen), but the statistical significance of this effect was marginal (Chi squared P < 0.09). 
Unlike group difference approaches, our analysis identified the samples within the case and control groups that manifested each expression
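The per-pathway scoring described above, each sample's Mahalanobis distance from a pooled centroid over the pathway's genes, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the function name, the pseudo-inverse (guarding against a singular gene-gene covariance) and the empirical-quantile cutoff are all assumptions.

```python
import numpy as np

def pathway_outliers(X, q=0.95):
    """Flag outlier samples for one pathway.

    X : (samples x genes) expression matrix restricted to the
        pathway's genes. Returns a boolean outlier mask.
    """
    mu = X.mean(axis=0)                                # pooled centroid
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))  # robust to singular cov
    diff = X - mu
    # squared Mahalanobis distance of each sample from the centroid
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    # flag samples beyond the empirical q-quantile of distances
    return d2 > np.quantile(d2, q)
```

Case and control outlier counts per pathway could then be compared, for example with a chi-squared test, in the spirit of the study.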

  14. Quality of Care at Hospitals Identified as Outliers in Publicly Reported Mortality Statistics for Percutaneous Coronary Intervention.

    Science.gov (United States)

    Waldo, Stephen W; McCabe, James M; Kennedy, Kevin F; Zigler, Corwin M; Pinto, Duane S; Yeh, Robert W

    2017-05-16

Public reporting of percutaneous coronary intervention (PCI) outcomes may create disincentives for physicians to provide care for critically ill patients, particularly at institutions with worse clinical outcomes. We thus sought to evaluate the procedural management and in-hospital outcomes of patients treated for acute myocardial infarction before and after a hospital had been publicly identified as a negative outlier. Using state reports, we identified hospitals that were recognized as negative PCI outliers in 2 states (Massachusetts and New York) from 2002 to 2012. State hospitalization files were used to identify all patients with an acute myocardial infarction within these states. Procedural management and in-hospital outcomes were compared among patients treated at outlier hospitals before and after public report of outlier status. Patients at nonoutlier institutions were used to control for temporal trends. Among 86 hospitals, 31 were reported as outliers for excess mortality. Outlier facilities were larger, treating more patients with acute myocardial infarction and performing more PCIs than nonoutlier hospitals. Rates of revascularization changed in a similar fashion (interaction P = 0.50) after public report of outlier status. The likelihood of in-hospital mortality decreased at outlier institutions (RR, 0.83; 95% CI, 0.81-0.85) after public report, and to a lesser degree at nonoutlier institutions (RR, 0.90; 95% CI, 0.87-0.92; interaction P < 0.001). Among patients who underwent PCI, in-hospital mortality decreased at outlier institutions after public recognition of outlier status compared with before (RR, 0.72; 95% CI, 0.66-0.79), a decline that exceeded the reduction at nonoutlier institutions (RR, 0.87; 95% CI, 0.80-0.96; interaction P < 0.001). Large hospitals with higher clinical volume are more likely to be designated as negative outliers. The rates of percutaneous revascularization increased similarly at outlier and nonoutlier institutions after report of outlier status. After outlier

  15. Outliers and Extremes: Dragon-Kings or Dragon-Fools?

    Science.gov (United States)

    Schertzer, D. J.; Tchiguirinskaia, I.; Lovejoy, S.

    2012-12-01

Geophysics seems full of monsters like Victor Hugo's Court of Miracles, and monstrous extremes have been statistically treated as outliers with respect to more normal events. However, a characteristic magnitude separating abnormal events from normal ones would be at odds with the generic scaling behaviour of nonlinear systems, contrary to "fat-tailed" probability distributions and self-organized criticality. More precisely, it can be shown [1] how the apparent monsters could be mere manifestations of a singular measure mishandled as a regular measure. Monstrous fluctuations are the rule, not outliers, and they are more frequent than usually thought, to the point that (theoretical) statistical moments can easily be infinite. The empirical estimates of the latter are erratic and diverge with sample size. The corresponding physics is that intense small-scale events cannot be smoothed out by upscaling. However, based on a few examples, it has also been argued [2] that one should consider "genuine" outliers of fat-tailed distributions so monstrous that they can be called "dragon-kings". We critically analyse these arguments, e.g. finite sample size and statistical estimates of the largest events, and multifractal phase transitions vs. more classical phase transitions. We emphasize the fact that dragon-kings are not needed for the largest events to become predictable. This is rather reminiscent of the Feast of Fools picturesquely described by Victor Hugo. [1] D. Schertzer, I. Tchiguirinskaia, S. Lovejoy and P. Hubert (2010): No monsters, no miracles: in nonlinear sciences hydrology is not an outlier! Hydrological Sciences Journal, 55 (6), 965-979. [2] D. Sornette (2009): Dragon-Kings, Black Swans and the Prediction of Crises. International Journal of Terraspace Science and Engineering, 1(3), 1-17.

  16. Raman fiber-optical method for colon cancer detection: Cross-validation and outlier identification approach

    Science.gov (United States)

    Petersen, D.; Naveed, P.; Ragheb, A.; Niedieker, D.; El-Mashtoly, S. F.; Brechmann, T.; Kötting, C.; Schmiegel, W. H.; Freier, E.; Pox, C.; Gerwert, K.

    2017-06-01

Endoscopy plays a major role in the early recognition of cancers that are not externally accessible, and thereby in increasing the survival rate. Raman spectroscopic fiber-optical approaches can help to decrease the impact on the patient, increase objectivity in tissue characterization, reduce expenses and provide a significant time advantage in endoscopy. In gastroenterology, early recognition of malignant and precursor lesions is essential. Instantaneous and precise differentiation between adenomas (precursor lesions for cancer) and hyperplastic polyps on the one hand, and between high- and low-risk alterations on the other, is important. Raman fiber-optical measurements of colon biopsy samples taken during colonoscopy were carried out during a clinical study, and samples of adenocarcinoma (22), tubular adenomas (141), hyperplastic polyps (79) and normal tissue (101) from 151 patients were analyzed. This allows us to focus on the bioinformatic analysis and to set the stage for Raman endoscopic measurements. Since spectral differences between normal and cancerous biopsy samples are small, special care has to be taken in data analysis. Using a leave-one-patient-out cross-validation scheme, three different outlier identification methods were investigated to decrease the influence of systematic errors, such as a residual risk of misplacement of the sample and spectral dilution of marker bands (especially in cancerous tissue), and thereby optimize the experimental design. Furthermore, other validation schemes, namely leave-one-sample-out and leave-one-spectrum-out cross-validation, were compared with leave-one-patient-out cross-validation. High-risk lesions were differentiated from low-risk lesions with a sensitivity of 79%, specificity of 74% and an accuracy of 77%; cancer and normal tissue with a sensitivity of 79%, specificity of 83% and an accuracy of 81%. The additionally applied outlier identification enabled us to improve the recognition of neoplastic biopsy samples.

  17. Raman fiber-optical method for colon cancer detection: Cross-validation and outlier identification approach.

    Science.gov (United States)

    Petersen, D; Naveed, P; Ragheb, A; Niedieker, D; El-Mashtoly, S F; Brechmann, T; Kötting, C; Schmiegel, W H; Freier, E; Pox, C; Gerwert, K

    2017-06-15

Endoscopy plays a major role in the early recognition of cancers that are not externally accessible, and thereby in increasing the survival rate. Raman spectroscopic fiber-optical approaches can help to decrease the impact on the patient, increase objectivity in tissue characterization, reduce expenses and provide a significant time advantage in endoscopy. In gastroenterology, early recognition of malignant and precursor lesions is essential. Instantaneous and precise differentiation between adenomas (precursor lesions for cancer) and hyperplastic polyps on the one hand, and between high- and low-risk alterations on the other, is important. Raman fiber-optical measurements of colon biopsy samples taken during colonoscopy were carried out during a clinical study, and samples of adenocarcinoma (22), tubular adenomas (141), hyperplastic polyps (79) and normal tissue (101) from 151 patients were analyzed. This allows us to focus on the bioinformatic analysis and to set the stage for Raman endoscopic measurements. Since spectral differences between normal and cancerous biopsy samples are small, special care has to be taken in data analysis. Using a leave-one-patient-out cross-validation scheme, three different outlier identification methods were investigated to decrease the influence of systematic errors, such as a residual risk of misplacement of the sample and spectral dilution of marker bands (especially in cancerous tissue), and thereby optimize the experimental design. Furthermore, other validation schemes, namely leave-one-sample-out and leave-one-spectrum-out cross-validation, were compared with leave-one-patient-out cross-validation. High-risk lesions were differentiated from low-risk lesions with a sensitivity of 79%, specificity of 74% and an accuracy of 77%; cancer and normal tissue with a sensitivity of 79%, specificity of 83% and an accuracy of 81%. The additionally applied outlier identification enabled us to improve the recognition of neoplastic biopsy samples.

  18. Moving standard deviation and moving sum of outliers as quality tools for monitoring analytical precision.

    Science.gov (United States)

    Liu, Jiakai; Tan, Chin Hon; Badrick, Tony; Loh, Tze Ping

    2018-02-01

An increase in analytical imprecision (expressed as CVa) can introduce additional variability (i.e. noise) into patient results, which poses a challenge to the optimal management of patients. Relatively little work has been done to address the need for continuous monitoring of analytical imprecision. Through numerical simulations, we describe the use of the moving standard deviation (movSD) and a recently described moving sum of outliers (movSO) of patient results as means of detecting increased analytical imprecision, and compare their performance against internal quality control (QC) and average of normals (AoN) approaches. The power to detect an increase in CVa is suboptimal under routine internal QC procedures. The AoN technique almost always had the highest average number of patient results affected before error detection (ANPed), indicating that it generally had the worst capability for detecting an increased CVa. On the other hand, the movSD and movSO approaches were able to detect an increased CVa at significantly lower ANPed, particularly for measurands that displayed a relatively small ratio of biological variation to CVa. CONCLUSION: The movSD and movSO approaches are effective in detecting an increase in CVa for high-risk measurands with small biological variation. Their performance is relatively poor when the biological variation is large. However, the clinical risks of an increase in analytical imprecision are attenuated for these measurands, as the increased analytical imprecision adds only marginally to the total variation and is less likely to impact clinical care. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
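A minimal sketch of the movSD idea: a standard deviation computed over a sliding window of consecutive patient results, where a sustained rise above the baseline level points to increased analytical imprecision. The window size and any control limit are hypothetical tuning choices, not values from the paper.

```python
import statistics
from collections import deque

def moving_sd(results, window=50):
    """Return the standard deviation of each full sliding window
    over a stream of patient results (movSD)."""
    buf = deque(maxlen=window)   # holds the most recent `window` results
    out = []
    for x in results:
        buf.append(x)
        if len(buf) == window:
            out.append(statistics.stdev(buf))
    return out
```

In practice the window would be tuned per measurand, with an alarm raised when movSD exceeds a limit derived from the baseline imprecision.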

  19. 42 CFR 484.240 - Methodology used for the calculation of the outlier payment.

    Science.gov (United States)

    2010-10-01

    ... for each case-mix group. (b) The outlier threshold for each case-mix group is the episode payment... the same for all case-mix groups. (c) The outlier payment is a proportion of the amount of estimated...

  20. Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research

    Science.gov (United States)

    Bakker, Marjan; Wicherts, Jelte M.

    2014-01-01

    Background The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses. Methods and Findings We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample sizes, or prevalence of all reporting errors, large reporting errors, and reporting errors that concerned the statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests common failure to report data exclusions (or missingness) in psychological articles. Conclusions We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. Results therefore highlight the importance of more transparent reporting of statistical analyses. PMID:25072606

  1. Patterns of Care for Biologic-Dosing Outliers and Nonoutliers in Biologic-Naive Patients with Rheumatoid Arthritis.

    Science.gov (United States)

    Delate, Thomas; Meyer, Roxanne; Jenkins, Daniel

    2017-08-01

Although most biologic medications for patients with rheumatoid arthritis (RA) have recommended fixed dosing, actual biologic dosing may vary among real-world patients, since some patients can receive higher (high-dose outliers) or lower (low-dose outliers) doses than recommended in medication package inserts. To describe the patterns of care for biologic-dosing outliers and nonoutliers in biologic-naive patients with RA. This was a retrospective, longitudinal cohort study of patients with RA who were not pregnant and were aged ≥ 18 and 110% of the approved dose in the package insert at any time during the study period. Baseline patient profiles, treatment exposures, and outcomes were collected during the 180 days before and up to 2 years after biologic initiation and compared across index biologic outlier groups. Patients were followed for at least 1 year, with a subanalysis of those patients who remained as members for 2 years. This study included 434 RA patients with 1 year of follow-up and 372 RA patients with 2 years of follow-up. Overall, the vast majority of patients were female (≈75%) and had similar baseline characteristics. Approximately 10% of patients were outliers in both follow-up cohorts. ETN patients were least likely to become outliers, and ADA patients were most likely to become outliers. Of all outliers during the 1-year follow-up, patients were more likely to be a high-dose outlier (55%) than a low-dose outlier (45%). Median 1- and 2-year adjusted total biologic costs (based on wholesale acquisition costs) were higher for ADA and ETN nonoutliers than for IFX nonoutliers. Biologic persistence was highest for IFX patients. Charlson Comorbidity Index score, ETN and IFX index biologic, and treatment with a nonbiologic disease-modifying antirheumatic drug (DMARD) before biologic initiation were associated with becoming high- or low-dose outliers (c-statistic = 0.79). Approximately 1 in 10 study patients with RA was identified as a

  2. Outlier identification procedures for contingency tables using maximum likelihood and $L_1$ estimates

    NARCIS (Netherlands)

    Kuhnt, S.

    2004-01-01

    Observed cell counts in contingency tables are perceived as outliers if they have low probability under an anticipated loglinear Poisson model. New procedures for the identification of such outliers are derived using the classical maximum likelihood estimator and an estimator based on the L1 norm.

  3. RE-EXAMINING HIGH ABUNDANCE SLOAN DIGITAL SKY SURVEY MASS-METALLICITY OUTLIERS: HIGH N/O, EVOLVED WOLF-RAYET GALAXIES?

    International Nuclear Information System (INIS)

    Berg, Danielle A.; Skillman, Evan D.; Marble, Andrew R.

    2011-01-01

We present new MMT spectroscopic observations of four dwarf galaxies representative of a larger sample observed by the Sloan Digital Sky Survey and identified by Peeples et al. as low-mass, high oxygen abundance outliers from the mass-metallicity relation. Peeples showed that these four objects (with metallicity estimates of 8.5 ≤ 12 + log(O/H) ≤ 8.8) have oxygen abundance offsets of 0.4-0.6 dex from the M_B luminosity-metallicity relation. Our new observations extend the wavelength coverage to include the [O II] λλ3726, 3729 doublet, which adds leverage in oxygen abundance estimates and allows measurements of N/O ratios. All four spectra are low excitation, with relatively high N/O ratios (N/O ≳ 0.10), each of which tend to bias estimates based on strong emission lines toward high oxygen abundances. These spectra all fall in a regime where the 'standard' strong-line methods for metallicity determinations are not well calibrated either empirically or by photoionization modeling. By comparing our spectra directly to photoionization models, we estimate oxygen abundances in the range of 7.9 ≤ 12 + log(O/H) ≤ 8.4, consistent with the scatter of the mass-metallicity relation. We discuss the physical nature of these galaxies that leads to their unusual spectra (and previous classification as outliers), finding their low excitation, elevated N/O, and strong Balmer absorption are consistent with the properties expected from galaxies evolving past the 'Wolf-Rayet galaxy' phase. We compare our results to the 'main' sample of Peeples and conclude that they are outliers primarily due to enrichment of nitrogen relative to oxygen and not due to unusually high oxygen abundances for their masses or luminosities.

  4. On the identification of Dragon Kings among extreme-valued outliers

    Science.gov (United States)

    Riva, M.; Neuman, S. P.; Guadagnini, A.

    2013-07-01

    Extreme values of earth, environmental, ecological, physical, biological, financial and other variables often form outliers to heavy tails of empirical frequency distributions. Quite commonly such tails are approximated by stretched exponential, log-normal or power functions. Recently there has been an interest in distinguishing between extreme-valued outliers that belong to the parent population of most data in a sample and those that do not. The first type, called Gray Swans by Nassim Nicholas Taleb (often confused in the literature with Taleb's totally unknowable Black Swans), is drawn from a known distribution of the tails which can thus be extrapolated beyond the range of sampled values. However, the magnitudes and/or space-time locations of unsampled Gray Swans cannot be foretold. The second type of extreme-valued outliers, termed Dragon Kings by Didier Sornette, may in his view be sometimes predicted based on how other data in the sample behave. This intriguing prospect has recently motivated some authors to propose statistical tests capable of identifying Dragon Kings in a given random sample. Here we apply three such tests to log air permeability data measured on the faces of a Berea sandstone block and to synthetic data generated in a manner statistically consistent with these measurements. We interpret the measurements to be, and generate synthetic data that are, samples from α-stable sub-Gaussian random fields subordinated to truncated fractional Gaussian noise (tfGn). All these data have frequency distributions characterized by power-law tails with extreme-valued outliers about the tail edges.

  5. On the identification of Dragon Kings among extreme-valued outliers

    Directory of Open Access Journals (Sweden)

    M. Riva

    2013-07-01

Full Text Available Extreme values of earth, environmental, ecological, physical, biological, financial and other variables often form outliers to heavy tails of empirical frequency distributions. Quite commonly such tails are approximated by stretched exponential, log-normal or power functions. Recently there has been an interest in distinguishing between extreme-valued outliers that belong to the parent population of most data in a sample and those that do not. The first type, called Gray Swans by Nassim Nicholas Taleb (often confused in the literature with Taleb's totally unknowable Black Swans), is drawn from a known distribution of the tails which can thus be extrapolated beyond the range of sampled values. However, the magnitudes and/or space-time locations of unsampled Gray Swans cannot be foretold. The second type of extreme-valued outliers, termed Dragon Kings by Didier Sornette, may in his view be sometimes predicted based on how other data in the sample behave. This intriguing prospect has recently motivated some authors to propose statistical tests capable of identifying Dragon Kings in a given random sample. Here we apply three such tests to log air permeability data measured on the faces of a Berea sandstone block and to synthetic data generated in a manner statistically consistent with these measurements. We interpret the measurements to be, and generate synthetic data that are, samples from α-stable sub-Gaussian random fields subordinated to truncated fractional Gaussian noise (tfGn). All these data have frequency distributions characterized by power-law tails with extreme-valued outliers about the tail edges.

  6. Factor-based forecasting in the presence of outliers

    DEFF Research Database (Denmark)

    Kristensen, Johannes Tang

    2014-01-01

    Macroeconomic forecasting using factor models estimated by principal components has become a popular research topic with many theoretical and applied contributions in the literature. In this paper we attempt to address an often neglected issue in these models: The problem of outliers...... in the data. Most papers take an ad-hoc approach to this problem and simply screen datasets prior to estimation and remove anomalous observations. We investigate whether forecasting performance can be improved by using the original unscreened dataset and replacing principal components with a robust...... apply the estimator in a simulated real-time forecasting exercise to test its merits. We use a newly compiled dataset of US macroeconomic series spanning the period 1971:2–2012:10. Our findings suggest that the chosen treatment of outliers does affect forecasting performance and that in many cases

  7. Outlier identification in colorectal surgery should separate elective and nonelective service components.

    Science.gov (United States)

    Byrne, Ben E; Mamidanna, Ravikrishna; Vincent, Charles A; Faiz, Omar D

    2014-09-01

The identification of health care institutions with outlying outcomes is of great importance for reporting health care results and for quality improvement. Historically, elective surgical outcomes have received greater attention than nonelective results, although some studies have examined both. Differences in outlier identification between these patient groups have not been adequately explored. The aim of this study was to compare the identification of institutional outliers for mortality after elective and nonelective colorectal resection in England. This was a cohort study using routine administrative data. Ninety-day mortality was determined by using statutory records of death. Adjusted Trust-level mortality rates were calculated by using multiple logistic regression. High and low mortality outliers were identified and compared across funnel plots for elective and nonelective surgery. All English National Health Service Trusts providing colorectal surgery to an unrestricted patient population were studied. Adults admitted for colorectal surgery between April 2006 and March 2012 were included. Segmental colonic or rectal resection was performed. The primary outcome measured was 90-day mortality. Included were 195,118 patients, treated at 147 Trusts. Ninety-day mortality rates after elective and nonelective surgery were 4% and 18%, respectively. No unit with high outlying mortality for elective surgery was a high outlier for nonelective mortality, and vice versa. Trust-level observed-to-expected mortality for elective and nonelective surgery was moderately correlated (Spearman ρ = 0.50). Status as an institutional mortality outlier after elective and nonelective colorectal surgery was not closely related. Therefore, mortality rates should be reported for both patient cohorts separately. This would provide a broad picture of the state of colorectal services and help direct research and quality improvement activities.

  8. Outlier robustness for wind turbine extrapolated extreme loads

    DEFF Research Database (Denmark)

    Natarajan, Anand; Verelst, David Robert

    2012-01-01

    . Stochastic identification of numerical artifacts in simulated loads is demonstrated using the method of principal component analysis. The extrapolation methodology is made robust to outliers through a weighted loads approach, whereby the eigenvalues of the correlation matrix obtained using the loads with its...

  9. Fuzzy Treatment of Candidate Outliers in Measurements

    Directory of Open Access Journals (Sweden)

    Giampaolo E. D'Errico

    2012-01-01

    Full Text Available Robustness against the possible occurrence of outlying observations is critical to the performance of a measurement process. Open questions relevant to statistical testing for candidate outliers are reviewed. A novel fuzzy logic approach is developed and exemplified in a metrology context. A simulation procedure is presented and discussed by comparing fuzzy versus probabilistic models.

  10. The comparison between several robust ridge regression estimators in the presence of multicollinearity and multiple outliers

    Science.gov (United States)

    Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said

    2014-09-01

In the presence of multicollinearity and multiple outliers, statistical inference for the linear regression model using ordinary least squares (OLS) estimators is severely affected and produces misleading results. To overcome this, many approaches have been investigated, including robust methods, which are reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. To mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined under the simultaneous presence of multicollinearity and multiple outliers in multiple linear regression. This study aimed to examine the performance of several well-known robust estimators (M, MM, RIDGE) and robust ridge regression estimators, namely the Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM) and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both the x- and y-directions), RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (the y-direction), the WRMM estimator was the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.

  11. Universal Linear Fit Identification: A Method Independent of Data, Outliers and Noise Distribution Model and Free of Missing or Removed Data Imputation.

    Science.gov (United States)

    Adikaram, K K L B; Hussein, M A; Effenberger, M; Becker, T

    2015-01-01

Data processing requires a robust linear fit identification method. In this paper, we introduce a non-parametric robust linear fit identification method for time series. The method uses an indicator 2/n to identify linear fit, where n is the number of terms in a series. The ratios Rmax = (amax - amin)/(Sn - amin*n) and Rmin = (amax - amin)/(amax*n - Sn) are always equal to 2/n, where amax is the maximum element, amin is the minimum element and Sn is the sum of all elements. If any series expected to follow y = c consists of data that do not agree with the y = c form, Rmax > 2/n and Rmin > 2/n imply that the maximum and minimum elements, respectively, do not agree with the linear fit. We define threshold values for outlier and noise detection as 2/n * (1 + k1) and 2/n * (1 + k2), respectively, where k1 > k2 and 0 ≤ k1 ≤ n/2 - 1. Given this relation and a transformation technique, which transforms data into the form y = c, we show that removing all data that do not agree with the linear fit is possible. Furthermore, the method is independent of the number of data points, missing data, removed data points and the nature of the distribution (Gaussian or non-Gaussian) of the outliers, noise and clean data. These are major advantages over existing linear fit methods. Since having a perfect linear relation between two variables in the real world is impossible, we used artificial data sets with extreme conditions to verify the method. The method detects the correct linear fit even when the percentage of data agreeing with the linear fit is less than 50% and the deviation of the data that do not agree with the linear fit is very small, of the order of ±10^-4%. The method results in incorrect detections only when numerical accuracy is insufficient in the calculation process.
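The 2/n indicator above is simple to compute directly: for an exact arithmetic (linear) series both ratios equal 2/n, while an inflated maximum (or deflated minimum) pushes Rmax (or Rmin) above it. A rough sketch, assuming a non-constant series (a constant series would make both denominators zero); the function name is illustrative, not from the paper:

```python
def linear_fit_indicators(series):
    """Return (Rmax, Rmin, 2/n) for a numeric series.

    Rmax = (amax - amin) / (Sn - amin*n)
    Rmin = (amax - amin) / (amax*n - Sn)
    Both equal 2/n exactly when the series is a perfect
    arithmetic (linear) progression.
    """
    n = len(series)
    s, amax, amin = sum(series), max(series), min(series)
    rmax = (amax - amin) / (s - amin * n)
    rmin = (amax - amin) / (amax * n - s)
    return rmax, rmin, 2.0 / n
```

For [1, 2, 3, 4, 5] all three values are 0.4; replacing 5 with 50 raises Rmax well above 2/n, flagging the maximum element as disagreeing with the linear fit.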

  12. Universal Linear Fit Identification: A Method Independent of Data, Outliers and Noise Distribution Model and Free of Missing or Removed Data Imputation.

    Directory of Open Access Journals (Sweden)

    K K L B Adikaram

Full Text Available Data processing requires a robust linear fit identification method. In this paper, we introduce a non-parametric robust linear fit identification method for time series. The method uses an indicator 2/n to identify linear fit, where n is the number of terms in a series. The ratios Rmax = (amax - amin)/(Sn - amin*n) and Rmin = (amax - amin)/(amax*n - Sn) are always equal to 2/n, where amax is the maximum element, amin is the minimum element and Sn is the sum of all elements. If any series expected to follow y = c consists of data that do not agree with the y = c form, Rmax > 2/n and Rmin > 2/n imply that the maximum and minimum elements, respectively, do not agree with the linear fit. We define threshold values for outlier and noise detection as 2/n * (1 + k1) and 2/n * (1 + k2), respectively, where k1 > k2 and 0 ≤ k1 ≤ n/2 - 1. Given this relation and a transformation technique, which transforms data into the form y = c, we show that removing all data that do not agree with the linear fit is possible. Furthermore, the method is independent of the number of data points, missing data, removed data points and the nature of the distribution (Gaussian or non-Gaussian) of the outliers, noise and clean data. These are major advantages over existing linear fit methods. Since having a perfect linear relation between two variables in the real world is impossible, we used artificial data sets with extreme conditions to verify the method. The method detects the correct linear fit even when the percentage of data agreeing with the linear fit is less than 50% and the deviation of the data that do not agree with the linear fit is very small, of the order of ±10^-4%. The method results in incorrect detections only when numerical accuracy is insufficient in the calculation process.

  13. A study of outliers in statistical distributions of mechanical properties of structural steels

    International Nuclear Information System (INIS)

    Oefverbeck, P.; Oestberg, G.

    1977-01-01

The safety against failure of pressure vessels can be assessed by statistical methods, so-called probabilistic fracture mechanics. The database for such estimations is admittedly rather meagre, making it necessary to assume certain conventional statistical distributions. Since the failure rates arrived at are low, for nuclear vessels of the order of 10 - to 10 - per year, the extremes of the variables involved, among other things the mechanical properties of the steel used, are of particular interest. A question sometimes raised is whether outliers, or values exceeding the extremes in the assumed distributions, might occur. In order to explore this possibility, a study has been made of strength values of three qualities of structural steels, available in samples of up to about 12,000. Statistical evaluation of these samples with respect to outliers, using standard methods for this purpose, revealed the presence of such outliers in most cases, with a frequency of occurrence of, typically, a few values per thousand, estimated by the methods described. Obviously, statistical analysis alone cannot be expected to shed any light on the causes of outliers. Thus, the interpretation of these results with respect to their implication for the probabilistic estimation of the integrity of pressure vessels must await further studies of a similar nature in which the test specimens corresponding to outliers can be recovered and examined metallographically. For the moment the results should be regarded only as a factor to be considered in discussions of the safety of pressure vessels. (author)

  14. Prospective casemix-based funding, analysis and financial impact of cost outliers in all-patient refined diagnosis related groups in three Belgian general hospitals.

    Science.gov (United States)

    Pirson, Magali; Martins, Dimitri; Jackson, Terri; Dramaix, Michèle; Leclercq, Pol

    2006-03-01

This study examined the impact of cost outliers in terms of hospital resource consumption, the financial impact of outliers under the Belgian casemix-based system, and the validity of two "proxies" for costs: length of stay and charges. The costs of all hospital stays at three Belgian general hospitals were calculated for the year 2001. High resource use outliers were selected according to the following rule: 75th percentile + 1.5 × inter-quartile range. The frequency of cost outliers varied from 7% to 8% across hospitals. Explanatory factors were: major or extreme severity of illness, longer length of stay, and intensive care unit stay. Cost outliers account for 22-30% of hospital costs. One-third of length-of-stay outliers are not cost outliers, and nearly one-quarter of charges outliers are not cost outliers. The current funding system in Belgium does not penalize hospitals having a high percentage of outliers: the billing generated by these patients largely compensates for the costs generated. Length of stay and charges are not good proxies for selecting cost outliers.
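The selection rule quoted above is the standard Tukey upper fence. A minimal sketch (the cost figures are hypothetical; numpy's default linear-interpolation percentiles are assumed):

```python
import numpy as np

def flag_cost_outliers(costs):
    """Flag stays whose cost exceeds the fence used in the study:
    75th percentile + 1.5 * inter-quartile range."""
    costs = np.asarray(costs, dtype=float)
    q1, q3 = np.percentile(costs, [25, 75])
    return costs > q3 + 1.5 * (q3 - q1)
```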

  15. Cross-visit tumor sub-segmentation and registration with outlier rejection for dynamic contrast-enhanced MRI time series data.

    Science.gov (United States)

    Buonaccorsi, G A; Rose, C J; O'Connor, J P B; Roberts, C; Watson, Y; Jackson, A; Jayson, G C; Parker, G J M

    2010-01-01

    Clinical trials of anti-angiogenic and vascular-disrupting agents often use biomarkers derived from DCE-MRI, typically reporting whole-tumor summary statistics and so overlooking spatial parameter variations caused by tissue heterogeneity. We present a data-driven segmentation method comprising tracer-kinetic model-driven registration for motion correction, conversion from MR signal intensity to contrast agent concentration for cross-visit normalization, iterative principal components analysis for imputation of missing data and dimensionality reduction, and statistical outlier detection using the minimum covariance determinant to obtain a robust Mahalanobis distance. After applying these techniques we cluster in the principal components space using k-means. We present results from a clinical trial of a VEGF inhibitor, using time-series data selected because of problems due to motion and outlier time series. We obtained spatially-contiguous clusters that map to regions with distinct microvascular characteristics. This methodology has the potential to uncover localized effects in trials using DCE-MRI-based biomarkers.
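The robust-Mahalanobis step of the pipeline can be sketched as follows. This is a deliberate simplification: the study estimates location and scatter robustly with the minimum covariance determinant (MCD), whereas plain sample moments stand in here to keep the sketch dependency-free, and the feature matrix is synthetic rather than voxel-wise kinetic data:

```python
import numpy as np

def mahalanobis_outliers(X, q=0.975):
    """Flag points with large squared Mahalanobis distance to the centroid.

    Simplified stand-in for the MCD-based robust distance used in the
    study: classical mean/covariance replace the robust estimates.
    """
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)
    return d2, d2 > np.quantile(d2, q)
```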

  16. A computational study on outliers in world music

    Science.gov (United States)

    Benetos, Emmanouil; Dixon, Simon

    2017-01-01

    The comparative analysis of world music cultures has been the focus of several ethnomusicological studies in the last century. With the advances of Music Information Retrieval and the increased accessibility of sound archives, large-scale analysis of world music with computational tools is today feasible. We investigate music similarity in a corpus of 8200 recordings of folk and traditional music from 137 countries around the world. In particular, we aim to identify music recordings that are most distinct compared to the rest of our corpus. We refer to these recordings as ‘outliers’. We use signal processing tools to extract music information from audio recordings, data mining to quantify similarity and detect outliers, and spatial statistics to account for geographical correlation. Our findings suggest that Botswana is the country with the most distinct recordings in the corpus and China is the country with the most distinct recordings when considering spatial correlation. Our analysis includes a comparison of musical attributes and styles that contribute to the ‘uniqueness’ of the music of each country. PMID:29253027

  17. Accounting for regional background and population size in the detection of spatial clusters and outliers using geostatistical filtering and spatial neutral models: the case of lung cancer in Long Island, New York

    Directory of Open Access Journals (Sweden)

    Goovaerts Pierre

    2004-07-01

Full Text Available Abstract Background Complete Spatial Randomness (CSR) is the null hypothesis employed by many statistical tests for spatial pattern, such as local cluster or boundary analysis. CSR is, however, not a relevant null hypothesis for highly complex and organized systems such as those encountered in the environmental and health sciences, in which underlying spatial pattern is present. This paper presents a geostatistical approach to filter the noise caused by spatially varying population size and to generate spatially correlated neutral models that account for regional background obtained by geostatistical smoothing of observed mortality rates. These neutral models were used in conjunction with the local Moran statistic to identify spatial clusters and outliers in the geographical distribution of male and female lung cancer in Nassau, Queens, and Suffolk counties, New York, USA. Results We developed a typology of neutral models that progressively relaxes the assumptions of null hypotheses, allowing for the presence of spatial autocorrelation, non-uniform risk, and incorporation of spatially heterogeneous population sizes. Incorporation of spatial autocorrelation led to fewer significant ZIP codes than found in previous studies, confirming earlier claims that CSR can lead to over-identification of the number of significant spatial clusters or outliers. Accounting for population size through geostatistical filtering increased the size of clusters while removing most of the spatial outliers. Integration of regional background into the neutral models yielded substantially different spatial clusters and outliers, leading to the identification of ZIP codes where SMR values significantly depart from their regional background. Conclusion The approach presented in this paper enables researchers to assess geographic relationships using appropriate null hypotheses that account for the background variation extant in real-world systems. In particular, this new

  18. New Vehicle Detection Method with Aspect Ratio Estimation for Hypothesized Windows

    Directory of Open Access Journals (Sweden)

    Jisu Kim

    2015-12-01

Full Text Available All kinds of vehicles have different ratios of width to height, called aspect ratios. Most previous works, however, use a fixed aspect ratio for vehicle detection (VD). The use of a fixed vehicle aspect ratio for VD degrades performance. Thus, the estimation of a vehicle's aspect ratio is an important part of robust VD. Taking this idea into account, a new on-road vehicle detection system is proposed in this paper. The proposed method estimates the aspect ratio of the hypothesized windows to improve VD performance. Our proposed method uses an Aggregate Channel Feature (ACF) and a support vector machine (SVM) to verify the hypothesized windows with the estimated aspect ratio. The contribution of this paper is threefold. First, the estimation of the vehicle aspect ratio is inserted between the HG (hypothesis generation) and the HV (hypothesis verification) stages. Second, a simple HG method named the signed horizontal edge map is proposed to speed up VD. Third, a new measure is proposed to represent the overlapping ratio between the ground truth and the detection results. This new measure is used to show that the proposed method is better than previous works in terms of robust VD. Finally, the Pittsburgh dataset is used to verify the performance of the proposed method.

  19. Comparative study of methods on outlying data detection in experimental results

    International Nuclear Information System (INIS)

    Oliveira, P.M.S.; Munita, C.S.; Hazenfratz, R.

    2009-01-01

The interpretation of experimental results through multivariate statistical methods might reveal the existence of outliers, which is rarely taken into account by analysts. However, their presence can influence the interpretation of results, generating false conclusions. This paper shows the importance of outlier determination for a database of 89 samples of ceramic fragments analyzed by neutron activation analysis. The results were submitted to five procedures to detect outliers: Mahalanobis distance, cluster analysis, principal component analysis, factor analysis, and standardized residuals. The results showed that although cluster analysis is one of the procedures most used to identify outliers, it can fail by not showing samples that are easily identified as outliers by other methods. In general, the statistical procedures for the identification of outliers are little known to analysts. (author)
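The finding that procedures can disagree is easy to reproduce: a multivariate outlier that breaks the correlation structure may look unremarkable to element-wise screening. A sketch contrasting two of the procedures named above (univariate z-scores vs. Mahalanobis distance); the data and cutoffs are illustrative, not from the study:

```python
import numpy as np

def zscore_flags(X, cut=3.0):
    """Univariate screen: flag samples with any |z| > cut."""
    z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return (np.abs(z) > cut).any(axis=1)

def mahalanobis_flags(X, q=0.975):
    """Multivariate screen: large Mahalanobis distance to the centroid."""
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)
    return d2 > np.quantile(d2, q)
```

A point such as (2, -2) in a strongly positively correlated cloud is inside each variable's range, so the univariate screen misses it, while the Mahalanobis screen flags it.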

  20. An application of robust ridge regression model in the presence of outliers to real data problem

    Science.gov (United States)

    Shariff, N. S. Md.; Ferdaos, N. A.

    2017-09-01

Multicollinearity and outliers often lead to inconsistent and unreliable parameter estimates in regression analysis. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method. This method, however, is believed to be affected by the presence of outliers. The combination of GM-estimation and the ridge parameter, robust to both problems, is of interest in this study. As such, both techniques are employed to investigate the relationship between stock market price and macroeconomic variables in Malaysia, since the data set involves both multicollinearity and outlier problems. Four macroeconomic factors are selected for this study: the Consumer Price Index (CPI), Gross Domestic Product (GDP), Base Lending Rate (BLR) and Money Supply (M1). The results demonstrate that the proposed procedure produces reliable results in the presence of multicollinearity and outliers in the real data.
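The idea can be sketched as a ridge penalty (for multicollinearity) combined with robust down-weighting of large residuals (for outliers). Here Huber weights inside an IRLS loop stand in for the GM-estimation used in the study; the data, tuning constant, and penalty value in the sketch are illustrative:

```python
import numpy as np

def huber_ridge(X, y, lam=1.0, delta=1.345, iters=50):
    """Ridge regression with Huber-type IRLS reweighting.

    The ridge penalty `lam` addresses multicollinearity; the Huber
    weights down-weight observations with large residuals. A simplified
    stand-in for the GM-estimation + ridge combination in the study.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    w = np.ones(n)
    for _ in range(iters):
        XtW = X.T * w                                  # X.T @ diag(w)
        beta = np.linalg.solve(XtW @ X + lam * np.eye(p), XtW @ y)
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12      # robust scale (MAD)
        u = np.abs(r) / s
        w = np.where(u <= delta, 1.0, delta / u)       # Huber weights
    return beta
```

On nearly collinear regressors contaminated with high-leverage outliers, the reweighted fit stays near the true coefficients while plain ridge is pulled away.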

  1. New approach for the identification of implausible values and outliers in longitudinal childhood anthropometric data.

    Science.gov (United States)

    Shi, Joy; Korsiak, Jill; Roth, Daniel E

    2018-03-01

    We aimed to demonstrate the use of jackknife residuals to take advantage of the longitudinal nature of available growth data in assessing potential biologically implausible values and outliers. Artificial errors were induced in 5% of length, weight, and head circumference measurements, measured on 1211 participants from the Maternal Vitamin D for Infant Growth (MDIG) trial from birth to 24 months of age. Each child's sex- and age-standardized z-score or raw measurements were regressed as a function of age in child-specific models. Each error responsible for a biologically implausible decrease between a consecutive pair of measurements was identified based on the higher of the two absolute values of jackknife residuals in each pair. In further analyses, outliers were identified as those values beyond fixed cutoffs of the jackknife residuals (e.g., greater than +5 or less than -5 in primary analyses). Kappa, sensitivity, and specificity were calculated over 1000 simulations to assess the ability of the jackknife residual method to detect induced errors and to compare these methods with the use of conditional growth percentiles and conventional cross-sectional methods. Among the induced errors that resulted in a biologically implausible decrease in measurement between two consecutive values, the jackknife residual method identified the correct value in 84.3%-91.5% of these instances when applied to the sex- and age-standardized z-scores, with kappa values ranging from 0.685 to 0.795. Sensitivity and specificity of the jackknife method were higher than those of the conditional growth percentile method, but specificity was lower than for conventional cross-sectional methods. Using jackknife residuals provides a simple method to identify biologically implausible values and outliers in longitudinal child growth data sets in which each child contributes at least 4 serial measurements. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
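The jackknife-residual step can be sketched per child as an externally studentized residual from the child-specific regression of the z-score on age, using the standard leave-one-out identities (the data and the ±5 cutoff below follow the description in the abstract; the function names are ours):

```python
import numpy as np

def jackknife_residuals(age, z):
    """Externally studentized (jackknife) residuals of z regressed on age
    for one child's series of measurements."""
    age = np.asarray(age, dtype=float)
    z = np.asarray(z, dtype=float)
    X = np.column_stack([np.ones_like(age), age])
    n, p = X.shape
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)     # leverages
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    r = z - X @ beta
    ssr = r @ r
    s2_loo = (ssr - r**2 / (1 - h)) / (n - p - 1)     # leave-one-out variance
    return r / np.sqrt(s2_loo * (1 - h))

def flag_implausible(age, z, cut=5.0):
    """Flag measurements whose jackknife residual exceeds the cutoff."""
    return np.abs(jackknife_residuals(age, z)) > cut
```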

  2. Unsupervised Condition Change Detection In Large Diesel Engines

    DEFF Research Database (Denmark)

    Pontoppidan, Niels Henrik; Larsen, Jan

    2003-01-01

This paper presents a new method for unsupervised change detection which combines independent component modeling and probabilistic outlier detection. The method further provides a compact data representation, which is amenable to interpretation, i.e., the detected condition changes can be investigated further. The method is successfully applied to unsupervised condition change detection in large diesel engines from acoustical emission sensor signals and compared to more classical techniques based on principal component analysis and Gaussian mixture models.

  3. A random sampling approach for robust estimation of tissue-to-plasma ratio from extremely sparse data.

    Science.gov (United States)

    Chu, Hui-May; Ette, Ene I

    2005-09-02

This study was performed to develop a new nonparametric approach for the estimation of a robust tissue-to-plasma ratio from extremely sparsely sampled paired data (i.e., one sample each from plasma and tissue per subject). The tissue-to-plasma ratio was estimated from paired/unpaired experimental data using the independent time points approach, area under the curve (AUC) values calculated with the naïve data averaging approach, and AUC values calculated using sampling-based approaches (e.g., the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by the different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to two concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naïve data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.
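A simplified sketch of the resampling idea (not the paper's exact 2-phase algorithm): each bootstrap replicate draws one subject per time point, keeps the tissue/plasma pairing, and forms the ratio of trapezoidal AUCs. The function name and the array layout are our assumptions:

```python
import numpy as np

def bootstrap_tp_ratio(times, tissue, plasma, n_boot=2000, seed=0):
    """Bootstrap the tissue-to-plasma AUC ratio from sparse paired data.

    `tissue` and `plasma` are (n_times, n_subjects) arrays of paired
    concentrations; each replicate samples one subject per time point.
    """
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    tissue = np.asarray(tissue, dtype=float)
    plasma = np.asarray(plasma, dtype=float)
    n_t, n_s = tissue.shape
    dt = np.diff(times)

    def auc(c):
        # trapezoidal rule over the sampled concentration-time curve
        return np.sum(dt * (c[1:] + c[:-1]) / 2.0)

    rows = np.arange(n_t)
    ratios = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_s, size=n_t)  # one subject per time point
        ratios[b] = auc(tissue[rows, idx]) / auc(plasma[rows, idx])
    return ratios.mean(), ratios.std(ddof=1)
```

Because the pairing is preserved, a tissue curve that is an exact multiple of the plasma curve yields that ratio in every replicate with zero spread.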

  4. A comparative study of outlier detection for large-scale traffic data by one-class SVM and kernel density estimation

    Science.gov (United States)

    Ngan, Henry Y. T.; Yung, Nelson H. C.; Yeh, Anthony G. O.

    2015-02-01

This paper presents a comparative study of outlier detection (OD) for large-scale traffic data. Traffic data nowadays are massive in scale and collected every second throughout any modern city. In this research, the traffic flow dynamic was collected from one of the busiest 4-armed junctions in Hong Kong over a 31-day sampling period (764,027 vehicles in total). The traffic flow dynamic is expressed in a high-dimension spatial-temporal (ST) signal format (i.e. 80 cycles) which has a high degree of similarity within the same signal and across different signals in one direction. A total of 19 traffic directions were identified in this junction, and 874 ST signals were collected over the 31-day period. To reduce dimensionality, the ST signals first underwent principal component analysis (PCA) to be represented as (x,y)-coordinates. These PCA (x,y)-coordinates are then assumed to be Gaussian distributed. With this assumption, the data points were evaluated by (a) a correlation study with three variant coefficients, (b) a one-class support vector machine (SVM) and (c) kernel density estimation (KDE). The correlation study could not give any explicit OD result, while the one-class SVM and KDE achieved average detection success rates (DSRs) of 59.61% and 95.20%, respectively.
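The KDE arm of such a comparison can be sketched with a Gaussian kernel over the 2-D PCA coordinates, flagging the lowest-density points as outliers. The bandwidth, quantile, and data below are illustrative, not the paper's settings:

```python
import numpy as np

def kde_outliers(X, bandwidth=0.5, quantile=0.05):
    """Flag the lowest-density points under a Gaussian kernel density
    estimate over 2-D coordinates (e.g. PCA-projected signals)."""
    X = np.asarray(X, dtype=float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    density = np.exp(-d2 / (2.0 * bandwidth**2)).sum(axis=1)  # unnormalized
    return density < np.quantile(density, quantile)
```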

  5. Identification of Outlier Loci Responding to Anthropogenic and Natural Selection Pressure in Stream Insects Based on a Self-Organizing Map

    Directory of Open Access Journals (Sweden)

    Bin Li

    2016-05-01

Full Text Available Water quality maintenance should be considered from an ecological perspective since water is a substrate ingredient in the biogeochemical cycle and is closely linked with ecosystem functioning and services. Addressing the status of live organisms in aquatic ecosystems is a critical issue for appropriate prediction and water quality management. Recently, genetic changes in biological organisms have garnered more attention due to their in-depth expression of environmental stress on aquatic ecosystems in an integrative manner. We demonstrate in this study that genetic diversity adaptively responds to environmental constraints. We applied a self-organizing map (SOM) to characterize complex Amplified Fragment Length Polymorphism (AFLP) data of aquatic insects in six streams in Japan with natural and anthropogenic variability. After SOM training, the loci compositions of aquatic insects effectively responded to environmental selection pressure. To measure how important the role of loci compositions was in the population division, we altered the AFLP data by flipping the existence of given loci, individual by individual. Subsequently we recognized the cluster change of the individuals with altered data using the trained SOM. Based on SOM recognition of these altered data, we determined the outlier loci (over the 90th percentile) that showed drastic changes in the clusters they belonged to (D). Subsequently, environmental responsiveness (Ek') was also calculated to address relationships with outliers in different species. Outlier loci were sensitive to slightly polluted conditions including Chl-a, NH4-N, NOX-N, PO4-P, and SS, and the food material, epilithon. Natural environmental factors such as altitude and sediment additionally showed relationships with outliers at somewhat lower levels. Poly-loci-like responsiveness was detected in adapting to environmental constraints. SOM training followed by recognition shed light on developing algorithms de novo to

  6. Universal Linear Fit Identification: A Method Independent of Data, Outliers and Noise Distribution Model and Free of Missing or Removed Data Imputation

    Science.gov (United States)

    Adikaram, K. K. L. B.; Becker, T.

    2015-01-01

Data processing requires a robust linear fit identification method. In this paper, we introduce a non-parametric robust linear fit identification method for time series. The method uses an indicator 2/n to identify linear fit, where n is the number of terms in a series. The ratio Rmax of amax − amin and Sn − amin*n and that of Rmin of amax − amin and amax*n − Sn are always equal to 2/n, where amax is the maximum element, amin is the minimum element and Sn is the sum of all elements. If any series expected to follow y = c consists of data that do not agree with y = c form, Rmax > 2/n and Rmin > 2/n imply that the maximum and minimum elements, respectively, do not agree with linear fit. We define threshold values for outliers and noise detection as 2/n * (1 + k1) and 2/n * (1 + k2), respectively, where k1 > k2 and 0 ≤ k1 ≤ n/2 − 1. Given this relation and transformation technique, which transforms data into the form y = c, we show that removing all data that do not agree with linear fit is possible. Furthermore, the method is independent of the number of data points, missing data, removed data points and nature of distribution (Gaussian or non-Gaussian) of outliers, noise and clean data. These are major advantages over the existing linear fit methods. Since having a perfect linear relation between two variables in the real world is impossible, we used artificial data sets with extreme conditions to verify the method. The method detects the correct linear fit when the percentage of data agreeing with linear fit is less than 50%, and the deviation of data that do not agree with linear fit is very small, of the order of ±10^−4%. The method results in incorrect detections only when numerical accuracy is insufficient in the calculation process. PMID:26571035

  7. Semi-supervised Learning Technique for Outlier Detection [Técnica de aprendizado semissupervisionado para detecção de outliers]

    OpenAIRE

    Fabio Willian Zamoner

    2014-01-01

Outlier detection plays an important role in knowledge discovery in large databases. The study is motivated by numerous real-world applications such as credit card fraud, fault detection in industrial components, intrusion in computer networks, loan approval, and monitoring of medical conditions. An outlier is defined as an observation that deviates from the other observations with respect to some measure and exerts considerable influence on data analysis...

  8. Detection ratios on winter surveys of Rocky Mountain Trumpeter Swans Cygnus buccinator

    Science.gov (United States)

    Bart, J.; Mitchell, C.D.; Fisher, M.N.; Dubovsky, J.A.

    2007-01-01

We estimated the detection ratio for Rocky Mountain Trumpeter Swans Cygnus buccinator that were counted during aerial surveys made in winter. The standard survey involved counting white or grey birds on snow and ice and thus might be expected to have had low detection ratios. On the other hand, observers were permitted to circle areas where the birds were concentrated multiple times to obtain accurate counts. Actual numbers present were estimated by conducting additional intensive aerial counts either immediately before or immediately after the standard count. Surveyors continued the intensive surveys at each area until consecutive counts were identical. The surveys were made at 10 locations in 2006 and at 19 locations in 2007. A total of 2,452 swans were counted on the intensive surveys. Detection ratios did not vary detectably with year, observer, which survey was conducted first, age of the swans, or the number of swans present. The overall detection ratio was 0.93 (90% confidence interval 0.82-1.04), indicating that the counts were quite accurate. Results are used to depict changes in population size for Rocky Mountain Trumpeter Swans from 1974-2007. © Wildfowl & Wetlands Trust.
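The detection ratio itself is a pooled ratio of standard counts to intensive ("true") counts. A sketch with a percentile bootstrap confidence interval over survey locations (the counts below are made up; the study's own interval estimation may differ):

```python
import numpy as np

def detection_ratio(standard, intensive, n_boot=2000, alpha=0.10, seed=0):
    """Pooled detection ratio with a percentile bootstrap CI over locations."""
    standard = np.asarray(standard, dtype=float)
    intensive = np.asarray(intensive, dtype=float)
    ratio = standard.sum() / intensive.sum()
    rng = np.random.default_rng(seed)
    n = len(standard)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample locations with replacement
        boots[b] = standard[idx].sum() / intensive[idx].sum()
    lo, hi = np.quantile(boots, [alpha / 2.0, 1.0 - alpha / 2.0])
    return ratio, lo, hi
```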

  9. Anomalous human behavior detection: An Adaptive approach

    NARCIS (Netherlands)

    Leeuwen, C. van; Halma, A.; Schutte, K.

    2013-01-01

Detection of anomalies (outliers or abnormal instances) is an important element in a range of applications such as fault, fraud, and suspicious behavior detection and knowledge discovery. In this article we propose a new method for anomaly detection and tested its ability to detect anomalous

  10. Identification of outliers and positive deviants for healthcare improvement: looking for high performers in hypoglycemia safety in patients with diabetes

    Directory of Open Access Journals (Sweden)

    Brigid Wilson

    2017-11-01

Full Text Available Abstract Background The study objectives were to determine: (1) how statistical outliers exhibiting low rates of diabetes overtreatment performed on a reciprocal measure – rates of diabetes undertreatment; and (2) the impact of different criteria on high performing outlier status. Methods The design was serial cross-sectional, using yearly Veterans Health Administration (VHA) administrative data (2009–2013). Our primary outcome measure was facility rate of HbA1c overtreatment of diabetes in patients at risk for hypoglycemia. Outlier status was assessed by using two approaches: calculating a facility outlier value within year, comparator group, and A1c threshold while incorporating at-risk population sizes; and examining standardized model residuals across year and A1c threshold. Facilities with outlier values in the lowest decile for all years of data using more than one threshold and comparator, or with time-averaged model residuals in the lowest decile for all A1c thresholds, were considered high performing outliers. Results Using outlier values, three of the 27 high performers from 2009 were also identified in 2010–2013 and considered outliers. There was only modest overlap between facilities identified as top performers based on three thresholds: A1c  9% than VA average in the population of patients at high risk for hypoglycemia. Conclusions Statistical identification of positive deviants for diabetes overtreatment was dependent upon the specific measures and approaches used. Moreover, because two facilities may arrive at the same results via very different pathways, it is important to consider that a "best" practice may actually reflect a separate "worst" practice.

  11. Detecting Outliers in Marathon Data by Means of the Andrews Plot

    Science.gov (United States)

    Stehlík, Milan; Wald, Helmut; Bielik, Viktor; Petrovič, Juraj

    2011-09-01

For an optimal race performance, it is important that the runner keeps a steady pace during most of the competition. First-time runners or athletes without much competition experience often suffer a "blow out" after a few kilometers of the race. This can happen because of strong emotional experiences or poor control of running intensity. The half-marathon competition pace of middle-level recreational athletes is approximately 10 sec quicker than their training pace. If an athlete runs the first third of the race (7 km) at a pace that is 20 sec quicker than his capacity (trainability) allows, he will experience a "blow out" in the last third of the race. This is reflected in reduced running intensity and an inability to keep a steady pace over the last kilometers of the race, and in the final time as well. In sports science, there are many diagnostic methods ([3], [2], [6]) used to predict optimal race pace and final time. However, there is little practical evidence for these diagnostic methods and their use in the field (competition, race). One condition that needs to be met is that athletes not only have similar final times; it is also important that they keep as constant a pace as possible during the whole race. For this reason it is very important to find outliers. Our experimental group consisted of 20 recreationally trained athletes (mean age 32.6 ± 8.9 years). Before the race the athletes were instructed to run on the basis of their subjective feeling and previous experience. The data (running pace for each kilometer, average and maximal heart rate for each kilometer) were collected by the GPS-enabled personal trainer Forerunner 305.
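The Andrews plot maps each runner's vector of per-kilometer paces to a curve f_x(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + ..., so a runner who blows up traces a curve that separates from the bundle. A numpy sketch (the synthetic paces and the deviation-from-median flag are our illustration, not the paper's procedure):

```python
import numpy as np

def andrews_curve(x, t):
    """Andrews transform of a feature vector x, evaluated at points t:
    f_x(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + ..."""
    x = np.asarray(x, dtype=float)
    vals = np.full_like(t, x[0] / np.sqrt(2.0))
    for i, xi in enumerate(x[1:], start=1):
        k = (i + 1) // 2
        vals += xi * (np.sin(k * t) if i % 2 == 1 else np.cos(k * t))
    return vals
```

Plotting these curves (or simply measuring each curve's deviation from the pointwise median curve) exposes the runner with an uneven pace profile.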

  12. Impact of outlier status on critical care patient outcomes: Does boarding medical intensive care unit patients make a difference?

    Science.gov (United States)

    Ahmad, Danish; Moeller, Katherine; Chowdhury, Jared; Patel, Vishal; Yoo, Erika J

    2018-04-01

To evaluate the impact of outlier status, or the practice of boarding ICU patients in distant critical care units, on clinical and utilization outcomes. Retrospective observational study of all consecutive admissions to the MICU service between April 1, 2014 and January 3, 2016, at an urban university hospital. Of 1931 patients, 117 were outliers (6.1%) for the entire duration of their ICU stay. In adjusted analyses, there was no association between outlier status and hospital (OR 1.21, 95% CI 0.72-2.05, p=0.47) or ICU mortality (OR 1.20, 95% CI 0.64-2.25, p=0.57). Outliers had shorter hospital and ICU lengths of stay (LOS) in addition to fewer ventilator days. Crossover patients who had variable outlier exposure also had no increase in hospital (OR 1.61; 95% CI 0.80-3.23; p=0.18) or ICU mortality (OR 1.05; 95% CI 0.43-2.54; p=0.92) after risk-adjustment. Boarding of MICU patients in distant units during times of bed nonavailability does not negatively influence patient mortality or LOS. Increased hospital and ventilator utilization observed among non-outliers in the home unit may be attributable, at least in part, to differences in patient characteristics. Prospective investigation into the practice of ICU boarding will provide further confirmation of its safety. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Robust Regression Procedures for Predictor Variable Outliers.

    Science.gov (United States)

    1982-03-01

space of probability distributions. Then the influence function of the estimator is defined to be the derivative of the functional evaluated at the underlying distribution: it is a measure of the impact of an outlier x0 on the estimator. The influence function is defined to be IF(x0) = lim_{e->0} [T(F_e) - T(F)] / e, where F_e = (1-e)F + e*delta_{x0}, considered in both positive and negative directions. An empirical influence function can be defined in a similar fashion simply by replacing F with F_n in eqn. (3.4).
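For a concrete reading of the definition, the empirical influence of adding mass at a point x0 can be approximated by appending x0 to the sample once (contamination e = 1/(n+1)). For the sample mean this reproduces the well-known IF(x0) = x0 - mean, while for the median it stays bounded, which is why the median resists predictor outliers (the function name is ours):

```python
import numpy as np

def empirical_influence(estimator, x, x0):
    """Finite-difference empirical influence of adding mass at x0:
    IF(x0) ~= [T(F_e) - T(F_n)] / e with e = 1/(n+1), implemented by
    appending x0 to the sample once."""
    x = np.asarray(x, dtype=float)
    base = estimator(x)
    contaminated = estimator(np.append(x, x0))
    return (contaminated - base) * (len(x) + 1)
```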

  14. Outlier Loci Detect Intraspecific Biodiversity amongst Spring and Autumn Spawning Herring across Local Scales.

    Directory of Open Access Journals (Sweden)

    Dorte Bekkevold

Full Text Available Herring, Clupea harengus, is one of the most ecologically and commercially important species in European northern seas, where two distinct ecotypes have been described based on spawning time: spring and autumn. To date, it is unknown whether these spring and autumn spawning herring constitute genetically distinct units. We assessed levels of genetic divergence between spring and autumn spawning herring in the Baltic Sea using two types of DNA markers, microsatellites and Single Nucleotide Polymorphisms, and compared the results with data for autumn spawning North Sea herring. Temporally replicated analyses reveal clear genetic differences between ecotypes and hence support reproductive isolation. Loci showing non-neutral behaviour, so-called outlier loci, show convergence between autumn spawning herring from demographically disjoint populations, potentially reflecting selective processes associated with autumn spawning ecotypes. The abundance and exploitation of the two ecotypes have varied strongly over space and time in the Baltic Sea, where autumn spawners have faced strong depression for decades. The results therefore have practical implications by highlighting the need for specific management of these co-occurring ecotypes to meet requirements for sustainable exploitation and ensure optimal livelihood for coastal communities.

  15. TrigDB for improving the reliability of the epicenter locations by considering the neighborhood station's trigger and cutting out of outliers in operation of Earthquake Early Warning System.

    Science.gov (United States)

    Chi, H. C.; Park, J. H.; Lim, I. S.; Seong, Y. J.

    2016-12-01

    TrigDB was initially developed to discriminate teleseismic-origin false alarms in cases where unreasonably associated triggers produce mis-located epicenters. We have applied TrigDB to the current EEWS (Earthquake Early Warning System) since 2014. During the early testing stage of the EEWS, beginning in 2011, we adapted ElarmS from UC Berkeley (BSL) to the Korean seismic network and ran it for more than five years. Real-time testing of the EEWS in Korea showed that all events inside the seismic network with magnitude greater than 3.0 were well detected. However, two events located offshore produced false locations with magnitude over 4.0, owing to long-period, relatively high-amplitude signals from teleseismic waves or deep regional sources. These teleseismic-related false events were caused by spurious correlation during the association procedure, and the geometric distribution of the associated stations was crescent-shaped. Because seismic stations are not deployed uniformly, the expected bias ratio varies with the evaluated epicentral location. This ratio is calculated in advance and stored in a database, called TrigDB, for the discrimination of teleseismic-origin false alarms. We upgraded this method with `TrigDB back filling', which updates the location by additionally associating stations, comparing trigger times between sandwiched stations that were not previously associated, based on predefined criteria such as travel time. We have also tested a module that rejects outlier trigger times using a statistical criterion that compares each pick against the spread (sigma) of the triggered times. The outlier-rejection criterion acts somewhat slowly until more than 8 stations are available, but the resulting locations are much improved.
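
    A minimal sketch of a sigma-based trigger-time rejection rule of the kind described above (the threshold k, the station-count cut-off, and the pick times are illustrative assumptions, not the actual EEWS implementation):

```python
from statistics import mean, stdev

def reject_trigger_outliers(trigger_times, k=2.0, min_stations=8):
    """Drop picks more than k sample standard deviations from the mean.
    With too few stations, sigma is unstable, so all picks are kept."""
    if len(trigger_times) <= min_stations:
        return list(trigger_times)
    mu, sigma = mean(trigger_times), stdev(trigger_times)
    return [t for t in trigger_times if abs(t - mu) <= k * sigma]

# Nine consistent P-wave picks (seconds) plus one teleseismic-contaminated pick:
picks = [12.1, 12.2, 12.3, 12.15, 12.25, 12.35, 12.2, 12.3, 12.1, 40.0]
print(reject_trigger_outliers(picks))  # the 40.0 s pick is rejected
```

    Note that a single extreme pick inflates both the mean and sigma, which is why such rules only become reliable once enough stations contribute, matching the abstract's observation about networks of more than 8 stations.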

  16. The influence of outliers on a model for the estimation of ...

    African Journals Online (AJOL)

    Veekunde

    problems that violate these assumptions is the problem of outliers. .... A normal probability plot of the ordered residuals on the normal order statistics, which are the ... observations from the normal distribution with zero mean and unit variance.

  17. Outliers, Cheese, and Rhizomes: Variations on a Theme of Limitation

    Science.gov (United States)

    Stone, Lynda

    2011-01-01

    All research has limitations, for example, from paradigm, concept, theory, tradition, and discipline. In this article Lynda Stone describes three exemplars that are variations on limitation and are "extraordinary" in that they change what constitutes future research in each domain. Malcolm Gladwell's present-day study of outliers makes a…

  18. Adaptive Outlier-tolerant Exponential Smoothing Prediction Algorithms with Applications to Predict the Temperature in Spacecraft

    OpenAIRE

    Hu Shaolin; Zhang Wei; Li Ye; Fan Shunxi

    2011-01-01

    The exponential smoothing prediction algorithm is widely used in spaceflight control and process monitoring as well as in economic forecasting. Two key problems remain open: one is the rule for selecting the smoothing parameter, and the other is how to mitigate the bad influence of outliers on the prediction. In this paper a new practical outlier-tolerant algorithm is built to adaptively select a proper parameter, and the exponential smoothing pr...
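
    For reference, basic exponential smoothing in its simplest form (the paper's adaptive, outlier-tolerant parameter selection is not reproduced here; the temperature series is invented):

```python
def exp_smooth(series, alpha):
    """Basic exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    s = series[0]
    smoothed = [s]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

temps = [20.0, 20.5, 19.8, 35.0, 20.2]  # 35.0 acts like an outlier
print(exp_smooth(temps, 0.3))
```

    A small alpha damps the outlier's effect but also slows tracking of genuine changes, which is exactly the parameter-selection tension the paper addresses.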

  19. Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis.

    Science.gov (United States)

    Westerholt, Rene; Steiger, Enrico; Resch, Bernd; Zipf, Alexander

    2016-01-01

    Twitter and related social media feeds have become valuable data sources to many fields of research. Numerous researchers have thereby used social media posts for spatial analysis, since many of them contain explicit geographic locations. However, despite its widespread use within applied research, a thorough understanding of the underlying spatial characteristics of these data is still lacking. In this paper, we investigate how topological outliers influence the outcomes of spatial analyses of social media data. These outliers appear when different users contribute heterogeneous information about different phenomena simultaneously from similar locations. As a consequence, various messages representing different spatial phenomena are captured closely to each other, and are at risk to be falsely related in a spatial analysis. Our results reveal indications for corresponding spurious effects when analyzing Twitter data. Further, we show how the outliers distort the range of outcomes of spatial analysis methods. This has significant influence on the power of spatial inferential techniques, and, more generally, on the validity and interpretability of spatial analysis results. We further investigate how the issues caused by topological outliers are composed in detail. We unveil that multiple disturbing effects are acting simultaneously and that these are related to the geographic scales of the involved overlapping patterns. Our results show that at some scale configurations, the disturbances added through overlap are more severe than at others. Further, their behavior turns into a volatile and almost chaotic fluctuation when the scales of the involved patterns become too different. Overall, our results highlight the critical importance of thoroughly considering the specific characteristics of social media data when analyzing them spatially.

  20. Ultrasonic detection of solid phase mass flow ratio of pneumatic conveying fly ash

    Science.gov (United States)

    Duan, Guang Bin; Pan, Hong Li; Wang, Yong; Liu, Zong Ming

    2014-04-01

    In this paper, ultrasonic attenuation measurements and a weigh balance are used to evaluate the solid mass flow ratio. Fly ash is transported on an up-extraction fluidization pneumatic conveying workbench. In the ultrasonic test, the McClements model and the Bouguer-Lambert-Beer law were applied to describe the ultrasonic attenuation of the gas-solid flow, from which the solid mass ratio can be derived. In the weigh-balance method, the averaged mass addition per second gives the solid mass flow ratio. A comparison of these two detection methods shows that the relative error between them is small.
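
    A minimal sketch of the Bouguer-Lambert-Beer inversion that underlies the ultrasonic method (the calibration constants alpha and L below are made-up values, not those of the workbench):

```python
import math

def concentration_from_attenuation(i0, i, alpha, path_length):
    """Invert the Bouguer-Lambert-Beer law I = I0 * exp(-alpha * c * L)
    for the particle concentration c."""
    return math.log(i0 / i) / (alpha * path_length)

# Round trip with assumed calibration: alpha = 5.0, L = 0.05 m.
c_true = 0.12
received = 1.0 * math.exp(-5.0 * c_true * 0.05)
print(concentration_from_attenuation(1.0, received, 5.0, 0.05))  # ~0.12
```

    In practice alpha must be calibrated against a reference method, which is the role the weigh balance plays in the abstract above.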

  1. The obligation of physicians to medical outliers: a Kantian and Hegelian synthesis.

    Science.gov (United States)

    Papadimos, Thomas J; Marco, Alan P

    2004-06-03

    Patients who present to medical practices without health insurance or with serious co-morbidities can become fiscal disasters to those who care for them. Their consumption of scarce resources has caused consternation among providers and institutions, especially as it concerns the amount and type of care they should receive. In fact, some providers may try to avoid caring for them altogether, or at least try to limit their institutional or practice exposure to them. We present a philosophical discourse, with emphasis on the writings of Immanuel Kant and G.W.F. Hegel, as to why physicians have the moral imperative to give such "outliers" considerate and thoughtful care. Outliers are defined and the ideals of morality, responsibility, good will, duty, and principle are applied to the care of patients whose financial means are meager and to those whose care is physiologically futile. Actions of moral worth, unconditional good will, and doing what is right are examined. Outliers are a legitimate economic concern to individual practitioners and institutions; however, this should not lead to an evasion of care. These patients should be identified early in their course of care, but such identification should be preceded by a well-planned recognition of this burden, and appropriate staffing and funding should be secured. A thoughtful team approach by medical practices and their institutions, involving both clinicians and non-clinicians, should be pursued.

  2. A Student’s t Mixture Probability Hypothesis Density Filter for Multi-Target Tracking with Outliers

    Science.gov (United States)

    Liu, Zhuowei; Chen, Shuxin; Wu, Hao; He, Renke; Hao, Lin

    2018-01-01

    In multi-target tracking, the outliers-corrupted process and measurement noises can reduce the performance of the probability hypothesis density (PHD) filter severely. To solve the problem, this paper proposed a novel PHD filter, called Student’s t mixture PHD (STM-PHD) filter. The proposed filter models the heavy-tailed process noise and measurement noise as a Student’s t distribution as well as approximates the multi-target intensity as a mixture of Student’s t components to be propagated in time. Then, a closed PHD recursion is obtained based on Student’s t approximation. Our approach can make full use of the heavy-tailed characteristic of a Student’s t distribution to handle the situations with heavy-tailed process and the measurement noises. The simulation results verify that the proposed filter can overcome the negative effect generated by outliers and maintain a good tracking accuracy in the simultaneous presence of process and measurement outliers. PMID:29617348

  3. SQL injection detection system

    OpenAIRE

    Vargonas, Vytautas

    2017-01-01

    SQL injection detection system Programmers do not always ensure the security of the systems they develop, which is why it is important to look for solutions that do not rely on the developers themselves. In this work an SQL injection detection system is proposed. The system analyzes HTTP request parameters and detects intrusions. It is based on unsupervised machine learning. Trained on regular request data, the system detects outlier user parameters. Since training is not reliant on previous knowledge of SQL injections, t...

  4. Identification of Outliers in Grace Data for Indo-Gangetic Plain Using Various Methods (Z-Score, Modified Z-score and Adjusted Boxplot) and Its Removal

    Science.gov (United States)

    Srivastava, S.

    2015-12-01

    Gravity Recovery and Climate Experiment (GRACE) data are widely used for the hydrological studies for large scale basins (≥100,000 sq km). GRACE data (Stokes Coefficients or Equivalent Water Height) used for hydrological studies are not direct observations but result from high level processing of raw data from the GRACE mission. Different partner agencies like CSR, GFZ and JPL implement their own methodology and their processing methods are independent from each other. The primary source of errors in GRACE data are due to measurement and modeling errors and the processing strategy of these agencies. Because of different processing methods, the final data from all the partner agencies are inconsistent with each other at some epoch. GRACE data provide spatio-temporal variations in Earth's gravity which is mainly attributed to the seasonal fluctuations in water level on Earth surfaces and subsurface. During the quantification of error/uncertainties, several high positive and negative peaks were observed which do not correspond to any hydrological processes but may emanate from a combination of primary error sources, or some other geophysical processes (e.g. Earthquakes, landslide, etc.) resulting in redistribution of earth's mass. Such peaks can be considered as outliers for hydrological studies. In this work, an algorithm has been designed to extract outliers from the GRACE data for Indo-Gangetic plain, which considers the seasonal variations and the trend in data. Different outlier detection methods have been used such as Z-score, modified Z-score and adjusted boxplot. For verification, assimilated hydrological (GLDAS) and hydro-meteorological data are used as the reference. The results have shown that the consistency amongst all data sets improved significantly after the removal of outliers.
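
    Two of the detectors named in the title can be sketched compactly (the thresholds are the conventional choices, not necessarily those used in the study, and the series is invented). Note how the classical Z-score is masked by the very peak it should flag, because the peak inflates the standard deviation, while the MAD-based modified Z-score of Iglewicz and Hoaglin is not:

```python
from statistics import mean, stdev, median

def zscore_outliers(x, threshold=3.0):
    """Classical Z-score: flag values more than `threshold` sigmas from the mean."""
    mu, sigma = mean(x), stdev(x)
    return [v for v in x if abs(v - mu) / sigma > threshold]

def modified_zscore_outliers(x, threshold=3.5):
    """Modified Z-score: robust version using the median and the MAD."""
    med = median(x)
    mad = median(abs(v - med) for v in x)
    return [v for v in x if abs(0.6745 * (v - med) / mad) > threshold]

anomaly = [1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 9.0]  # one spurious peak
print(zscore_outliers(anomaly))           # []: sigma is inflated by the peak
print(modified_zscore_outliers(anomaly))  # [9.0]: the MAD resists the peak
```

    The adjusted boxplot mentioned in the title additionally skews the whisker lengths using the medcouple statistic, which is beyond this short sketch.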

  5. Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statistic.

    Science.gov (United States)

    Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B; Chen, Li; Wang, Yue; Clarke, Robert

    2012-08-01

    Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive 'noise' in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. xuan@vt.edu Supplementary data are available at Bioinformatics online.
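
    A minimal sketch of an outlier-sum-style statistic of the kind the paper aggregates (with crude quantile indexing and MAD standardization; the Gibbs sampler used to estimate its marginal distribution is not reproduced, and the expression values are invented):

```python
from statistics import median

def outlier_sum(values):
    """Sum of MAD-standardized values lying above q3 + 1*IQR
    (quantiles taken by simple index, adequate for a sketch)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    z = sorted((v - med) / mad for v in values)
    n = len(z)
    q1, q3 = z[n // 4], z[(3 * n) // 4]
    return sum(v for v in z if v > q3 + (q3 - q1))

inliers = [-0.3, -0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.3]
print(outlier_sum(inliers))               # no aggregated evidence: 0
print(outlier_sum(inliers + [5.0, 6.0]))  # large positive statistic
```

    A large statistic indicates that a subset of samples is extreme in the same direction, the kind of aggregated evidence for a regulation event described above.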

  6. ROBUST: an interactive FORTRAN-77 package for exploratory data analysis using parametric, ROBUST and nonparametric location and scale estimates, data transformations, normality tests, and outlier assessment

    Science.gov (United States)

    Rock, N. M. S.

    ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.) (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality. Distributions are tested against the null hypothesis that they are normal, using the 3rd (√b1) and 4th (b2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated and all estimates recalculated iteratively as desired. The following data transformations also can be applied: linear, log10, generalized Box-Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and to handle missing data. Cumulative S-plots (for assessing normality graphically) also can be generated. The mutual consistency or inconsistency of all these measures
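
    A few of the estimators ROBUST tabulates, sketched in Python rather than FORTRAN-77 (the trimming fraction and the data are illustrative, not from the package):

```python
from statistics import mean, median

def trimmed_mean(x, frac=0.1):
    """Arithmetic mean after dropping the lowest and highest frac of values."""
    s = sorted(x)
    k = int(len(s) * frac)
    return mean(s[k:len(s) - k]) if k else mean(s)

def mad_from_median(x):
    """Median absolute deviation from the median, a robust scale estimate."""
    med = median(x)
    return median(abs(v - med) for v in x)

data = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 50.0]
print(mean(data))                # dragged upward by the outlier
print(trimmed_mean(data, 0.15))  # close to 10
print(mad_from_median(data))     # unaffected by the 50.0
```

    The trimmed mean and the MAD illustrate, respectively, the robust location and scale families that the package's 23 location and 6 scale estimators span.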

  7. A robust ridge regression approach in the presence of both multicollinearity and outliers in the data

    Science.gov (United States)

    Shariff, Nurul Sima Mohamad; Ferdaos, Nur Aqilah

    2017-08-01

    Multicollinearity often leads to inconsistent and unreliable parameter estimates in regression analysis. The situation is more severe in the presence of outliers, which cause fatter tails in the error distributions than the normal distribution. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method. This method, however, is expected to be affected by the presence of outliers due to some assumptions imposed in the modeling procedure. Thus, a robust version of the existing ridge method, with some modification in the inverse matrix and the estimated response value, is introduced. The performance of the proposed method is discussed and comparisons are made with several existing estimators, namely Ordinary Least Squares (OLS), ridge regression, and robust ridge regression based on GM-estimates. The findings of this study show that the proposed method is able to produce reliable parameter estimates in the presence of both multicollinearity and outliers in the data.
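
    The ridge idea the abstract builds on can be shown in the single-predictor case (the paper's robust GM-based weighting is not reproduced here, and the data are made up):

```python
def ridge_slope(x, y, lam):
    """Slope of centered y ~ x with ridge penalty lam (lam = 0 gives OLS)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / (sxx + lam)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.9]  # roughly y = 2x
print(ridge_slope(x, y, 0.0))  # OLS slope, about 1.97
print(ridge_slope(x, y, 5.0))  # shrunk toward zero
```

    The penalty lam stabilizes the denominator, which is what counteracts multicollinearity in the multi-predictor case; robustness to outliers then requires an additional reweighting of the residuals, as the paper proposes.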

  8. Tailor-made Surgical Guide Reduces Incidence of Outliers of Cup Placement.

    Science.gov (United States)

    Hananouchi, Takehito; Saito, Masanobu; Koyama, Tsuyoshi; Sugano, Nobuhiko; Yoshikawa, Hideki

    2010-04-01

    Malalignment of the cup in total hip arthroplasty (THA) increases the risks of postoperative complications such as neck cup impingement, dislocation, and wear. We asked whether a tailor-made surgical guide based on CT images would reduce the incidence of outliers beyond 10 degrees from preoperatively planned alignment of the cup compared with those without the surgical guide. We prospectively followed 38 patients (38 hips, Group 1) having primary THA with the conventional technique and 31 patients (31 hips, Group 2) using the surgical guide. We designed the guide for Group 2 based on CT images and fixed it to the acetabular edge with a Kirschner wire to indicate the planned cup direction. Postoperative CT images showed the guide reduced the number of outliers compared with the conventional method (Group 1, 23.7%; Group 2, 0%). The surgical guide provided more reliable cup insertion compared with conventional techniques. Level II, therapeutic study. See the Guidelines for Authors for a complete description of levels of evidence.

  9. Intelligent Agent-Based Intrusion Detection System Using Enhanced Multiclass SVM

    Science.gov (United States)

    Ganapathy, S.; Yogesh, P.; Kannan, A.

    2012-01-01

    Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect intruders only with a high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm, are proposed for detecting intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with a low false alarm rate and a high detection rate when tested with the KDD Cup 99 data set. PMID:23056036

  10. The obligation of physicians to medical outliers: a Kantian and Hegelian synthesis

    Directory of Open Access Journals (Sweden)

    Marco Alan P

    2004-06-01

    Full Text Available Abstract Background Patients who present to medical practices without health insurance or with serious co-morbidities can become fiscal disasters to those who care for them. Their consumption of scarce resources has caused consternation among providers and institutions, especially as it concerns the amount and type of care they should receive. In fact, some providers may try to avoid caring for them altogether, or at least try to limit their institutional or practice exposure to them. Discussion We present a philosophical discourse, with emphasis on the writings of Immanuel Kant and G.W.F. Hegel, as to why physicians have the moral imperative to give such "outliers" considerate and thoughtful care. Outliers are defined and the ideals of morality, responsibility, good will, duty, and principle are applied to the care of patients whose financial means are meager and to those whose care is physiologically futile. Actions of moral worth, unconditional good will, and doing what is right are examined. Summary Outliers are a legitimate economic concern to individual practitioners and institutions; however, this should not lead to an evasion of care. These patients should be identified early in their course of care, but such identification should be preceded by a well-planned recognition of this burden, and appropriate staffing and funding should be secured. A thoughtful team approach by medical practices and their institutions, involving both clinicians and non-clinicians, should be pursued.

  11. Outlier removal, sum scores, and the inflation of the Type I error rate in independent samples t tests: the power of alternatives and recommendations.

    Science.gov (United States)

    Bakker, Marjan; Wicherts, Jelte M

    2014-09-01

    In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal sum scores. After reviewing common practice, we present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. We found Type I error rates of above 20% after removing outliers with a threshold value of Z = 2 in a short and difficult test. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold values of Z after having seen the effects thereof on outcomes. We recommend the use of nonparametric Mann-Whitney-Wilcoxon tests or robust Yuen-Welch tests without removing outliers. These alternatives to independent samples t tests are found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present. PsycINFO Database Record (c) 2014 APA, all rights reserved.
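
    The Mann-Whitney-Wilcoxon statistic recommended above can be sketched directly (p-values, via the normal approximation or exact tables, are omitted; the scores are invented):

```python
def mann_whitney_u(a, b):
    """U statistic: count of pairs (x from a, y from b) with x > y;
    ties contribute 1/2. Maximum value is len(a) * len(b)."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

scores_a = [1.2, 3.4, 2.2, 5.0]
scores_b = [0.8, 1.1, 2.0, 1.9]
print(mann_whitney_u(scores_a, scores_b))  # 14.0 out of a maximum of 16
```

    Because U depends only on ranks, an extreme outlier shifts it by at most the number of pairs it participates in, which is why no outlier-removal step (and hence no Type I inflation) is needed.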

  12. The Relevance of Employee-Related Ratios for Early Detection of Corporate Crises

    Directory of Open Access Journals (Sweden)

    Mario Situm

    2014-12-01

    Full Text Available The purpose of this study was to analyse whether employee-related ratios derived from accounts have incremental predictive power for the early detection of corporate crises and bankruptcies. Based on the literature reviewed, it can be seen that not much attention has been drawn to this task, indicating that further research is justified. For empirical research purposes, a database of Austrian companies was used for the time period 2003 to 2005 in order to develop multivariate linear discriminant functions for the classification of companies into the two states; bankrupt and non-bankrupt, and to detect the contribution of employee-related ratios in explaining why firms fail. Several ratios from prior research were used as potential predictors. In addition, other separate ratios were analysed, including employee-related figures. The results of the study show that while employee-related ratios cannot contribute to an improvement in the classification performance of prediction models, signs of these ratios within the discriminant functions did show the expected directions. Efficient usage of employees seems to play an important role in decreasing the probability of insolvency. Additionally, two employee-related ratios were found which can be used as proxies for the size of the firm. This had not been identified in prior studies for this factor.

  13. Locally adaptive decision in detection of clustered microcalcifications in mammograms

    Science.gov (United States)

    Sainz de Cea, María V.; Nishikawa, Robert M.; Yang, Yongyi

    2018-02-01

    In computer-aided detection or diagnosis of clustered microcalcifications (MCs) in mammograms, the performance often suffers not only from the presence of false positives (FPs) among the detected individual MCs but also from large variability in detection accuracy among different cases. To address this issue, we investigate a locally adaptive decision scheme in MC detection by exploiting the noise characteristics in a lesion area. Instead of developing a new MC detector, we propose a decision scheme on how to best decide whether a detected object is an MC or not in the detector output. We formulate the individual MCs as statistical outliers compared to the many noisy detections in a lesion area so as to account for the local image characteristics. To identify the MCs, we first consider a parametric method for outlier detection, the Mahalanobis distance detector, which is based on a multi-dimensional Gaussian distribution on the noisy detections. We also consider a non-parametric method which is based on a stochastic neighbor graph model of the detected objects. We demonstrated the proposed decision approach with two existing MC detectors on a set of 188 full-field digital mammograms (95 cases). The results, evaluated using free response operating characteristic (FROC) analysis, showed a significant improvement in detection accuracy by the proposed outlier decision approach over traditional thresholding (the partial area under the FROC curve increased from 3.95 to 4.25, a statistically significant difference), along with fewer FPs at a given sensitivity level. The proposed adaptive decision approach could not only reduce the number of FPs in detected MCs but also improve case-to-case consistency in detection.
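
    A sketch of the parametric (Mahalanobis distance) detector described above, written out for two-dimensional feature vectors (the feature values are invented for illustration):

```python
def mahalanobis_sq(point, cloud):
    """Squared Mahalanobis distance of a 2-D point from a cloud of 2-D points,
    using the sample mean and covariance of the cloud."""
    n = len(cloud)
    mx = sum(p[0] for p in cloud) / n
    my = sum(p[1] for p in cloud) / n
    sxx = sum((p[0] - mx) ** 2 for p in cloud) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cloud) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cloud) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # [dx dy] * inverse(covariance) * [dx dy]^T, expanded for the 2x2 case
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

noise = [(1.0, 0.9), (1.1, 1.0), (0.9, 1.1), (1.0, 1.1), (1.1, 0.9), (0.9, 1.0)]
print(mahalanobis_sq((1.0, 1.0), noise))  # near the cloud centre: small
print(mahalanobis_sq((3.0, 3.0), noise))  # a candidate MC: large
```

    Objects whose distance exceeds a chi-squared cutoff would be declared MCs; the paper's non-parametric neighbor-graph alternative avoids the Gaussian assumption entirely.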

  14. Algorithms for Anomaly Detection - Lecture 1

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    The concept of statistical anomalies, or outliers, has fascinated experimentalists since the earliest attempts to interpret data. We want to know why some data points don’t seem to belong with the others: perhaps we want to eliminate spurious or unrepresentative data from our model. Or, the anomalies themselves may be what we are interested in: an outlier could represent the symptom of a disease, an attack on a computer network, a scientific discovery, or even an unfaithful partner. We start with some general considerations, such as the relationship between clustering and anomaly detection, the choice between supervised and unsupervised methods, and the difference between global and local anomalies. Then we will survey the most representative anomaly detection algorithms, highlighting what kind of data each approach is best suited to, and discussing their limitations. We will finish with a discussion of the difficulties of anomaly detection in high-dimensional data and some new directions for anomaly detec...

  15. Algorithms for Anomaly Detection - Lecture 2

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    The concept of statistical anomalies, or outliers, has fascinated experimentalists since the earliest attempts to interpret data. We want to know why some data points don’t seem to belong with the others: perhaps we want to eliminate spurious or unrepresentative data from our model. Or, the anomalies themselves may be what we are interested in: an outlier could represent the symptom of a disease, an attack on a computer network, a scientific discovery, or even an unfaithful partner. We start with some general considerations, such as the relationship between clustering and anomaly detection, the choice between supervised and unsupervised methods, and the difference between global and local anomalies. Then we will survey the most representative anomaly detection algorithms, highlighting what kind of data each approach is best suited to, and discussing their limitations. We will finish with a discussion of the difficulties of anomaly detection in high-dimensional data and some new directions for anomaly detec...

  16. The effects of additive outliers on tests for unit roots and cointegration

    NARCIS (Netherlands)

    Ph.H.B.F. Franses (Philip Hans); N. Haldrup (Niels)

    1994-01-01

    textabstractThe properties of the univariate Dickey-Fuller test and the Johansen test for the cointegrating rank when there exist additive outlying observations in the time series are examined. The analysis provides analytical as well as numerical evidence that additive outliers may produce spurious

  17. Identifying multiple outliers in linear regression: robust fit and clustering approach

    International Nuclear Information System (INIS)

    Robiah Adnan; Mohd Nor Mohamad; Halim Setan

    2001-01-01

    This research provides a clustering-based approach for determining potential candidates for outliers. It is a modification of the method proposed by Serbert et al. (1988). It is based on using the single linkage clustering algorithm to group the standardized predicted and residual values of a data set fitted by least trimmed squares (LTS). (Author)

  18. Computer-controlled detection system for high-precision isotope ratio measurements

    International Nuclear Information System (INIS)

    McCord, B.R.; Taylor, J.W.

    1986-01-01

    In this paper the authors describe a detection system for high-precision isotope ratio measurements. In this new system, the requirement for a ratioing digital voltmeter has been eliminated, and a standard digital voltmeter interfaced to a computer is employed. Instead of measuring the ratio of the two steadily increasing output voltages simultaneously, the digital voltmeter alternately samples the outputs at a precise rate over a certain period of time. The data are sent to the computer which calculates the rate of charge of each amplifier and divides the two rates to obtain the isotopic ratio. These results simulate a coincident measurement of the output of both integrators. The charge rate is calculated by using a linear regression method, and the standard error of the slope gives a measure of the stability of the system at the time the measurement was taken
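
    The slope-ratio calculation described above can be sketched as follows (the sampling times and integrator voltages are invented illustrative values, not measurements from the instrument):

```python
def slope(t, v):
    """Least-squares slope of sampled voltage v against sample time t."""
    n = len(t)
    mt = sum(t) / n
    mv = sum(v) / n
    return (sum((a - mt) * (b - mv) for a, b in zip(t, v))
            / sum((a - mt) ** 2 for a in t))

times = [0.0, 1.0, 2.0, 3.0, 4.0]       # sampling instants (s)
major = [0.00, 2.00, 4.01, 5.99, 8.00]  # integrator 1: charging at ~2 V/s
minor = [0.00, 0.20, 0.40, 0.60, 0.80]  # integrator 2: charging at ~0.2 V/s
print(slope(times, major) / slope(times, minor))  # isotopic ratio, ~10
```

    Fitting a line to each integrator's ramp and dividing the slopes emulates a coincident measurement of both outputs, as the abstract notes; the standard error of each slope then quantifies the stability of the measurement.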

  19. Outlier Loci and Selection Signatures of Simple Sequence Repeats (SSRs) in Flax (Linum usitatissimum L.).

    Science.gov (United States)

    Soto-Cerda, Braulio J; Cloutier, Sylvie

    2013-01-01

    Genomic microsatellites (gSSRs) and expressed sequence tag-derived SSRs (EST-SSRs) have gained wide application for elucidating genetic diversity and population structure in plants. Both marker systems are assumed to be selectively neutral when making demographic inferences, but this assumption is rarely tested. In this study, three neutrality tests were assessed for identifying outlier loci among 150 SSRs (85 gSSRs and 65 EST-SSRs) that likely influence estimates of population structure in three differentiated flax sub-populations (F_ST = 0.19). Moreover, the utility of gSSRs, EST-SSRs, and the combined sets of SSRs was also evaluated in assessing genetic diversity and population structure in flax. Six outlier loci were identified by at least two neutrality tests showing footprints of balancing selection. After removing the outlier loci, the STRUCTURE analysis and the dendrogram topology of EST-SSRs improved. Conversely, gSSRs and combined SSRs results did not change significantly, possibly as a consequence of the higher number of neutral loci assessed. Taken together, the genetic structure analyses established the superiority of gSSRs to determine the genetic relationships among flax accessions, although the combined SSRs produced the best results. Genetic diversity parameters did not differ statistically (P > 0.05) between gSSRs and EST-SSRs, an observation partially explained by the similar number of repeat motifs. Our study provides new insights into the ability of gSSRs and EST-SSRs to measure genetic diversity and structure in flax and confirms the importance of testing for the occurrence of outlier loci to properly assess natural and breeding populations, particularly in studies considering only few loci.

  20. Robust Wavelet Estimation to Eliminate Simultaneously the Effects of Boundary Problems, Outliers, and Correlated Noise

    Directory of Open Access Journals (Sweden)

    Alsaidi M. Altaher

    2012-01-01

    Full Text Available Classical wavelet thresholding methods suffer from boundary problems caused by the application of the wavelet transformations to a finite signal. As a result, large bias at the edges and artificial wiggles occur when the classical boundary assumptions are not satisfied. Although polynomial wavelet regression and local polynomial wavelet regression effectively reduce the risk of this problem, the estimates from these two methods can be easily affected by the presence of correlated noise and outliers, giving inaccurate estimates. This paper introduces two robust methods in which the effects of boundary problems, outliers, and correlated noise are simultaneously taken into account. The proposed methods combine a thresholding estimator with either a local polynomial model or a polynomial model using the generalized least squares method instead of the ordinary one. A primary step that involves removing the outlying observations through a statistical function is considered as well. The practical performance of the proposed methods has been evaluated through simulation experiments and real data examples. The results are strong evidence that the proposed methods are extremely effective in correcting the boundary bias and eliminating the effects of outliers and correlated noise.
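
    The thresholding step at the core of wavelet estimators like these can be illustrated with a plain soft-threshold rule applied to detail coefficients (a generic sketch, not the robust boundary-corrected procedure proposed in the paper):

```python
def soft_threshold(coeffs, t):
    """Shrink wavelet detail coefficients toward zero by t; zero out small ones."""
    return [0.0 if abs(c) <= t else (c - t if c > 0 else c + t) for c in coeffs]

# Hypothetical detail coefficients: mostly noise, two signal spikes
details = [0.1, -0.05, 3.2, 0.02, -2.7, 0.08]
den = soft_threshold(details, 0.5)
```

    The noise-level coefficients are killed outright, while the large (signal-bearing) coefficients are kept but shrunk by the threshold.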

  1. Hot spots, cluster detection and spatial outlier analysis of teen birth rates in the U.S., 2003-2012.

    Science.gov (United States)

    Khan, Diba; Rossen, Lauren M; Hamilton, Brady E; He, Yulei; Wei, Rong; Dienes, Erin

    2017-06-01

    Teen birth rates have evidenced a significant decline in the United States over the past few decades. Most of the states in the US have mirrored this national decline, though some reports have illustrated substantial variation in the magnitude of these decreases across the U.S. Importantly, geographic variation at the county level has largely not been explored. We used National Vital Statistics Births data and Hierarchical Bayesian space-time interaction models to produce smoothed estimates of teen birth rates at the county level from 2003-2012. Results indicate that teen birth rates show evidence of clustering, where hot and cold spots occur, and identify spatial outliers. Findings from this analysis may help inform prevention efforts by illustrating how geographic patterns of teen birth rates have changed over the past decade and where clusters of high or low teen birth rates are evident. Published by Elsevier Ltd.

  2. Combined CT-based and image-free navigation systems in TKA reduces postoperative outliers of rotational alignment of the tibial component.

    Science.gov (United States)

    Mitsuhashi, Shota; Akamatsu, Yasushi; Kobayashi, Hideo; Kusayama, Yoshihiro; Kumagai, Ken; Saito, Tomoyuki

    2018-02-01

    Rotational malpositioning of the tibial component can lead to poor functional outcome in TKA. Although various surgical techniques have been proposed, precise rotational placement of the tibial component was difficult to accomplish even with the use of a navigation system. The purpose of this study is to assess whether combined CT-based and image-free navigation systems accurately replicate the rotational alignment of the tibial component that was preoperatively planned on CT, compared with the conventional method. We compared the number of outliers for rotational alignment of the tibial component using combined CT-based and image-free navigation systems (navigated group) with those of the conventional method (conventional group). Seventy-two TKAs were performed between May 2012 and December 2014. In the navigated group, the anteroposterior axis was prepared using a CT-based navigation system and the tibial component was positioned under control of the navigation. In the conventional group, the tibial component was placed with reference to the Akagi line that was determined visually. Fisher's exact probability test was performed to evaluate the results. There was a significant difference between the two groups with regard to the number of outliers: 3 outliers in the navigated group compared with 12 outliers in the conventional group (P < 0.05). Combined CT-based and image-free navigation systems decreased the number of rotational outliers of the tibial component, and were helpful for replicating the accurate rotational alignment of the tibial component that was preoperatively planned.

  3. The Super‑efficiency Model and its Use for Ranking and Identification of Outliers

    Directory of Open Access Journals (Sweden)

    Kristína Kočišová

    2017-01-01

    Full Text Available This paper employs a non‑radial and non‑oriented super‑efficiency SBM model under the assumption of a variable return to scale to analyse the performance of twenty‑two Czech and Slovak domestic commercial banks in 2015. The banks were ranked according to asset‑oriented and profit‑oriented intermediation approaches. We pooled the cross‑country data and used them to define a common best‑practice efficiency frontier. This allowed us to focus on determining relative differences in efficiency across banks. The average efficiency was evaluated separately on the “national” and “international” level. The results show that the level of super‑efficiency in the Slovak banking sector was lower than that of the Czech banks. The number of super‑efficient banks was also lower in the case of Slovakia under both approaches. Boxplot analysis was used to determine the outliers in the dataset. The results suggest that the exclusion of outliers led to better statistical characteristics of the estimated efficiency.
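
    The boxplot screening mentioned above amounts to the Tukey 1.5×IQR fence; a minimal sketch with hypothetical efficiency scores:

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Flag points outside the Tukey fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Hypothetical super-efficiency scores; 1.42 is a super-efficient outlier
scores = [0.61, 0.55, 0.58, 0.63, 0.59, 0.60, 0.57, 1.42]
outliers = boxplot_outliers(scores)
```

    Excluding the flagged value and recomputing summary statistics mirrors the paper's observation that removing outliers improves the statistical characteristics of the estimated efficiency.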

  4. Detection strategies for extreme mass ratio inspirals

    International Nuclear Information System (INIS)

    Cornish, Neil J

    2011-01-01

    The capture of compact stellar remnants by galactic black holes provides a unique laboratory for exploring the near-horizon geometry of the Kerr spacetime, or possible departures from general relativity if the central cores prove not to be black holes. The gravitational radiation produced by these extreme mass ratio inspirals (EMRIs) encodes a detailed map of the black hole geometry, and the detection and characterization of these signals is a major scientific goal for the LISA mission. The waveforms produced are very complex, and the signals need to be coherently tracked for tens of thousands of cycles to produce a detection, making EMRI signals one of the most challenging data analysis problems in all of gravitational wave astronomy. Estimates for the number of templates required to perform an exhaustive grid-based matched-filter search for these signals are astronomically large, and far out of reach of current computational resources. Here I describe an alternative approach that employs a hybrid between genetic algorithms and Markov chain Monte Carlo techniques, along with several time-saving techniques for computing the likelihood function. This approach has proven effective at the blind extraction of relatively weak EMRI signals from simulated LISA data sets.

  5. Hot spots, cluster detection and spatial outlier analysis of teen birth rates in the U.S., 2003–2012

    Science.gov (United States)

    Khan, Diba; Rossen, Lauren M.; Hamilton, Brady E.; He, Yulei; Wei, Rong; Dienes, Erin

    2017-01-01

    Teen birth rates have evidenced a significant decline in the United States over the past few decades. Most of the states in the US have mirrored this national decline, though some reports have illustrated substantial variation in the magnitude of these decreases across the U.S. Importantly, geographic variation at the county level has largely not been explored. We used National Vital Statistics Births data and Hierarchical Bayesian space-time interaction models to produce smoothed estimates of teen birth rates at the county level from 2003–2012. Results indicate that teen birth rates show evidence of clustering, where hot and cold spots occur, and identify spatial outliers. Findings from this analysis may help inform prevention efforts by illustrating how geographic patterns of teen birth rates have changed over the past decade and where clusters of high or low teen birth rates are evident. PMID:28552189

  6. Identification of unusual events in multichannel bridge monitoring data using wavelet transform and outlier analysis

    Science.gov (United States)

    Omenzetter, Piotr; Brownjohn, James M. W.; Moyo, Pilate

    2003-08-01

    Continuously operating instrumented structural health monitoring (SHM) systems are becoming a practical alternative to replace visual inspection for assessment of condition and soundness of civil infrastructure. However, converting the large amounts of data from an SHM system into usable information is a great challenge to which special signal processing techniques must be applied. This study is devoted to identification of abrupt, anomalous and potentially onerous events in the time histories of static, hourly sampled strains recorded by a multi-sensor SHM system installed in a major bridge structure in Singapore and operating continuously for a long time. Such events may result, among other causes, from sudden settlement of foundation, ground movement, excessive traffic load or failure of post-tensioning cables. A method of outlier detection in multivariate data has been applied to the problem of finding and localizing sudden events in the strain data. For sharp discrimination of abrupt strain changes from slowly varying ones, the wavelet transform has been used. The proposed method has been successfully tested using known events recorded during construction of the bridge, and later effectively used for detection of anomalous post-construction events.
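
    The multivariate outlier step can be sketched with a Mahalanobis distance computed on simulated multi-sensor readings (a generic illustration; the bridge data and the paper's exact discordancy test are not reproduced here):

```python
import numpy as np

def mahalanobis_d2(X):
    """Squared Mahalanobis distance of each row from the sample mean."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    inv = np.linalg.inv(cov)
    diff = X - mu
    # d2_i = (x_i - mu)^T  cov^{-1}  (x_i - mu), computed row by row
    return np.einsum('ij,jk,ik->i', diff, inv, diff)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 3))   # hourly readings from 3 strain sensors
X[50] = [8.0, -8.0, 8.0]                  # one abrupt anomalous event
d2 = mahalanobis_d2(X)
worst = int(np.argmax(d2))                # index of the most outlying observation
```

    Thresholding d2 against a chi-square quantile (3 degrees of freedom here) then turns the distance into a detect/no-detect decision.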

  7. Probabilistic Neural Networks for Chemical Sensor Array Pattern Recognition: Comparison Studies, Improvements and Automated Outlier Rejection

    National Research Council Canada - National Science Library

    Shaffer, Ronald E

    1998-01-01

    For application to chemical sensor arrays, the ideal pattern recognition is accurate, fast, simple to train, robust to outliers, has low memory requirements, and has the ability to produce a measure...

  8. A Space Object Detection Algorithm using Fourier Domain Likelihood Ratio Test

    Science.gov (United States)

    Becker, D.; Cain, S.

    Space object detection is of great importance in the highly dependent yet competitive and congested space domain. Detection algorithms employed play a crucial role in fulfilling the detection component in the situational awareness mission to detect, track, characterize and catalog unknown space objects. Many current space detection algorithms use a matched filter or a spatial correlator to make a detection decision at a single pixel point of a spatial image based on the assumption that the data follows a Gaussian distribution. This paper explores the potential for detection performance advantages when operating in the Fourier domain of long exposure images of small and/or dim space objects from ground based telescopes. A binary hypothesis test is developed based on the joint probability distribution function of the image under the hypothesis that an object is present and under the hypothesis that the image only contains background noise. The detection algorithm tests each pixel point of the Fourier transformed images to make the determination if an object is present based on the criteria threshold found in the likelihood ratio test. Using simulated data, the performance of the Fourier domain detection algorithm is compared to the current algorithm used in space situational awareness applications to evaluate its value.
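
    The flavor of Fourier-domain detection can be illustrated with a matched filter evaluated via the FFT: correlate the image against a point-spread template, then threshold the correlation peak. This is a generic sketch under Gaussian-noise assumptions, not the paper's likelihood ratio test; the scene, template, and numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Crude 4x4 point-spread template, unit-normalized
psf = np.zeros((32, 32))
psf[14:18, 14:18] = 1.0
psf /= np.linalg.norm(psf)

# Background noise plus one dim object at rows 8..11, cols 20..23
scene = rng.normal(0.0, 0.1, size=(32, 32))
scene[8:12, 20:24] += 0.8

# Matched filtering done in the Fourier domain:
# circular cross-correlation = IFFT( FFT(scene) * conj(FFT(psf)) )
corr = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(psf))))
peak = np.unravel_index(np.argmax(corr), corr.shape)
# The peak lands at the circular shift between the object and template positions
```

    A detection is declared when the peak exceeds a threshold set by the desired false-alarm rate; performing the multiplication in the Fourier domain replaces an expensive spatial correlation with two FFTs.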

  9. Anomaly Detection using the "Isolation Forest" algorithm

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    Anomaly detection can provide clues about an outlying minority class in your data: hackers in a set of network events, fraudsters in a set of credit card transactions, or exotic particles in a set of high-energy collisions. In this talk, we analyze a real dataset of breast tissue biopsies, with malignant results forming the minority class. The "Isolation Forest" algorithm finds anomalies by deliberately “overfitting” models that memorize each data point. Since outliers have more empty space around them, they take fewer steps to memorize. Intuitively, a house in the country can be identified simply as “that house out by the farm”, while a house in the city needs a longer description like “that house in Brooklyn, near Prospect Park, on Union Street, between the firehouse and the library, not far from the French restaurant”. We first use anomaly detection to find outliers in the biopsy data, then apply traditional predictive modeling to discover rules that separate anomalies from normal data...
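
    The isolation idea can be demonstrated in a few lines: repeatedly pick random split points and count how many are needed before a point sits alone. A toy one-dimensional sketch (not the full Isolation Forest algorithm, which builds trees over random feature subsets):

```python
import random

def path_length(x, data, limit=10, depth=0):
    # Randomly partition until x is isolated or the depth limit is reached
    if len(data) <= 1 or depth >= limit:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)
    # Keep only the points on the same side of the split as x
    side = [v for v in data if (v < split) == (x < split)]
    return path_length(x, side, limit, depth + 1)

def anomaly_score(x, data, trees=200):
    # Shorter average path -> easier to isolate -> more anomalous
    return sum(path_length(x, data) for _ in range(trees)) / trees

random.seed(0)
data = [1.0, 1.1, 0.9, 1.05, 0.95, 1.2, 0.8, 10.0]   # 10.0 is the outlier
```

    The outlying value usually sits alone after a single random split, while a point inside the cluster needs several, which is exactly the "shorter description" intuition from the talk.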

  10. Augmented kludge waveforms for detecting extreme-mass-ratio inspirals

    Science.gov (United States)

    Chua, Alvin J. K.; Moore, Christopher J.; Gair, Jonathan R.

    2017-08-01

    The extreme-mass-ratio inspirals (EMRIs) of stellar-mass compact objects into massive black holes are an important class of source for the future space-based gravitational-wave detector LISA. Detecting signals from EMRIs will require waveform models that are both accurate and computationally efficient. In this paper, we present the latest implementation of an augmented analytic kludge (AAK) model, publicly available at https://github.com/alvincjk/EMRI_Kludge_Suite as part of an EMRI waveform software suite. This version of the AAK model has improved accuracy compared to its predecessors, with two-month waveform overlaps against a more accurate fiducial model exceeding 0.97 for a generic range of sources; it also generates waveforms 5-15 times faster than the fiducial model. The AAK model is well suited for scoping out data analysis issues in the upcoming round of mock LISA data challenges. A simple analytic argument shows that it might even be viable for detecting EMRIs with LISA through a semicoherent template bank method, while the use of the original analytic kludge in the same approach will result in around 90% fewer detections.

  11. A Positive Deviance Approach to Early Childhood Obesity: Cross-Sectional Characterization of Positive Outliers

    OpenAIRE

    Foster, Byron Alexander; Farragher, Jill; Parker, Paige; Hale, Daniel E.

    2015-01-01

    Objective: Positive deviance methodology has been applied in the developing world to address childhood malnutrition and has potential for application to childhood obesity in the United States. We hypothesized that among children at high-risk for obesity, evaluating normal weight children will enable identification of positive outlier behaviors and practices.

  12. Outlier treatment for improving parameter estimation of group contribution based models for upper flammability limit

    DEFF Research Database (Denmark)

    Frutiger, Jerome; Abildskov, Jens; Sin, Gürkan

    2015-01-01

    Flammability data is needed to assess the risk of fire and explosions. This study presents a new group contribution (GC) model to predict the upper flammability limit UFL oforganic chemicals. Furthermore, it provides a systematic method for outlier treatment inorder to improve the parameter...

  13. Slip Ratio Estimation and Regenerative Brake Control for Decelerating Electric Vehicles without Detection of Vehicle Velocity and Acceleration

    Science.gov (United States)

    Suzuki, Toru; Fujimoto, Hiroshi

    In slip ratio control systems, it is necessary to detect the vehicle velocity in order to obtain the slip ratio. However, it is very difficult to measure this velocity directly. We have previously proposed slip ratio estimation and control methods that do not require the vehicle velocity but use the acceleration. In this paper, slip ratio estimation and control methods are proposed that detect neither the vehicle velocity nor the acceleration when the vehicle is decelerating. We carried out simulations and experiments using an electric vehicle to verify the effectiveness of the proposed method.
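
    For reference, the slip ratio itself is conventionally defined from the vehicle and wheel velocities; a sketch of the standard definition (not the estimation method proposed in the paper, which avoids measuring the vehicle velocity):

```python
def slip_ratio(v_vehicle, v_wheel):
    """Longitudinal slip ratio under the common convention:
    normalize by the larger of the two velocities."""
    if v_vehicle >= v_wheel:                      # braking: wheel slower than body
        return (v_vehicle - v_wheel) / v_vehicle
    return (v_wheel - v_vehicle) / v_wheel        # driving: wheel faster than body

lam = slip_ratio(20.0, 18.0)   # decelerating at 10% slip
```

    The difficulty the paper addresses is visible in the signature: `v_vehicle` appears explicitly, yet it is the quantity that is hard to measure on a real vehicle.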

  14. Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data

    Science.gov (United States)

    Yin, Shen; Wang, Guang; Yang, Xu

    2014-07-01

    In practical industrial applications, the key performance indicator (KPI)-related prediction and diagnosis are quite important for the product quality and economic benefits. To meet these requirements, many advanced prediction and monitoring approaches have been developed which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial process. As PLS is totally based on the measured process data, the characteristics of the process data are critical for the success of PLS. Outliers and missing values are two common characteristics of the measured data which can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS to deal with outliers and missing values, simultaneously. The effectiveness of the proposed method is finally demonstrated by the application results of the KPI-related prediction and diagnosis on an industrial benchmark of Tennessee Eastman process.

  15. Differentiation of thyroid lesion detected by FDG PET/CT using SUV ratio

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Bom Sahn; Kang, Won Jun; Lee, Dong Soo; Chung, June Key; Lee, Myung Chul [Seoul National Univ. College of Medicine, Seoul (Korea, Republic of)

    2007-07-01

    We investigated the usefulness of the SUV ratio to discriminate focal thyroid lesions incidentally detected on 18F-FDG PET/CT (FDG PET) in patients with malignant disease. A total of 2167 subjects with malignant tumor underwent PET/CT for staging. Forty-five of 2167 subjects (2.1%) showed hypermetabolic thyroid lesions on FDG PET. Of 45, 21 lesions were confirmed by pathology (n = 16) or follow-up exam (n = 5). Seventeen patients had focal FDG uptakes, while 4 patients had diffuse thyroid uptakes. Standardized uptake value (SUV) was measured by drawing regions of interest (ROI) on bilateral thyroid lobes and liver. From 21 patients, 12 thyroid lesions were confirmed as malignant lesions and 9 lesions as benign lesions. All bilateral thyroid FDG uptakes were determined as benign disease such as thyroiditis. For the seventeen focal thyroid incidentalomas, FDG PET had 100% (12/12) sensitivity and 60% (3/5) specificity, respectively. Malignant nodules had a significantly higher lesion-to-liver ratio than benign nodules (2.1±0.9 vs. 1.2±0.6, p=0.029). With the ROC curve, the best cut-off value of lesion to liver was 1.0, with sensitivity of 100% and specificity of 60% (area under the curve=0.783). The SUV ratio of lesion to contralateral lobe did not reach statistical significance for determining malignancy (3.7±2.1 vs. 2.6±1.7, p=0.079). This study showed that focal thyroidal FDG uptake detected by FDG PET could be differentiated with best performance by the SUV ratio of lesion to liver.

  16. Differences in Movement Pattern and Detectability between Males and Females Influence How Common Sampling Methods Estimate Sex Ratio.

    Directory of Open Access Journals (Sweden)

    João Fabrício Mota Rodrigues

    Full Text Available Sampling the biodiversity is an essential step for conservation, and understanding the efficiency of sampling methods allows us to estimate the quality of our biodiversity data. Sex ratio is an important population characteristic, but until now, no study has evaluated how efficient the sampling methods commonly used in biodiversity surveys are at estimating the sex ratio of populations. We used a virtual ecologist approach to investigate whether active and passive capture methods are able to accurately sample a population's sex ratio and whether differences in movement pattern and detectability between males and females produce biased estimates of sex ratios when using these methods. Our simulation allowed the recognition of individuals, similar to mark-recapture studies. We found that differences in both movement patterns and detectability between males and females produce biased estimates of sex ratios. However, increasing the sampling effort or the number of sampling days improves the ability of passive or active capture methods to properly sample the sex ratio. Thus, prior knowledge regarding movement patterns and detectability for species is important information to guide field studies aiming to understand sex ratio related patterns.
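
    The bias the authors describe is easy to reproduce with a toy simulation: if one sex is more detectable, the detected sample misstates a true 1:1 ratio. All numbers below are hypothetical, not parameters from the paper's virtual ecologist model:

```python
import random

random.seed(42)
N_MALES, N_FEMALES = 500, 500           # true sex ratio is 1:1
P_DETECT_M, P_DETECT_F = 0.6, 0.3       # assumed: males twice as detectable

# Each individual is independently detected with its sex-specific probability
detected_m = sum(random.random() < P_DETECT_M for _ in range(N_MALES))
detected_f = sum(random.random() < P_DETECT_F for _ in range(N_FEMALES))

observed_ratio = detected_m / (detected_m + detected_f)  # proportion of males in sample
```

    With these detectabilities the observed male proportion lands near 2/3 even though the true proportion is 1/2, which is the kind of bias that extra sampling effort only partially corrects.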

  17. Differences in Movement Pattern and Detectability between Males and Females Influence How Common Sampling Methods Estimate Sex Ratio.

    Science.gov (United States)

    Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco

    2016-01-01

    Sampling the biodiversity is an essential step for conservation, and understanding the efficiency of sampling methods allows us to estimate the quality of our biodiversity data. Sex ratio is an important population characteristic, but until now, no study has evaluated how efficient the sampling methods commonly used in biodiversity surveys are at estimating the sex ratio of populations. We used a virtual ecologist approach to investigate whether active and passive capture methods are able to accurately sample a population's sex ratio and whether differences in movement pattern and detectability between males and females produce biased estimates of sex ratios when using these methods. Our simulation allowed the recognition of individuals, similar to mark-recapture studies. We found that differences in both movement patterns and detectability between males and females produce biased estimates of sex ratios. However, increasing the sampling effort or the number of sampling days improves the ability of passive or active capture methods to properly sample the sex ratio. Thus, prior knowledge regarding movement patterns and detectability for species is important information to guide field studies aiming to understand sex ratio related patterns.

  18. SPATIAL CLUSTER AND OUTLIER IDENTIFICATION OF GEOCHEMICAL ASSOCIATION OF ELEMENTS: A CASE STUDY IN JUIRUI COPPER MINING AREA

    Directory of Open Access Journals (Sweden)

    Tien Thanh NGUYEN

    2016-12-01

    Full Text Available Spatial clusters and spatial outliers play an important role in the study of the spatial distribution patterns of geochemical data. They characterize the fundamental properties of mineralization processes, the spatial distribution of mineral deposits, and ore element concentrations in mineral districts. In this study, a new method for the study of spatial distribution patterns of multivariate data is proposed based on a combination of robust Mahalanobis distance and local Moran’s Ii. In order to construct the spatial matrix, the Moran's I spatial correlogram was first used to determine the range. The robust Mahalanobis distances were then computed for an association of elements. Finally, local Moran’s Ii statistics were used to measure the degree of spatial association and discover the spatial distribution patterns of associations of Cu, Au, Mo, Ag, Pb, Zn, As, and Sb elements, including spatial clusters and spatial outliers. Spatial patterns were analyzed at six different spatial scales (2 km, 4 km, 6 km, 8 km, 10 km and 12 km) for both the raw data and Box-Cox transformed data. The results show that the spatial cluster and spatial outlier areas identified using local Moran’s Ii and the robust Mahalanobis distance accord with reality and conform well to known deposits in the study area.
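
    The local Moran's Ii statistic used above can be sketched for a univariate toy case (the paper applies it to robust Mahalanobis distances of element associations; here plain values and a hypothetical line-neighbor adjacency are used):

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I_i for values x with row-standardized spatial weights W.
    Positive I_i: similar neighbors (cluster); negative I_i: dissimilar (outlier)."""
    z = x - x.mean()
    m2 = (z ** 2).mean()
    return z * (W @ z) / m2

# 5 locations on a line; neighbors are adjacent sites (hypothetical concentrations)
x = np.array([1.0, 1.2, 0.9, 5.0, 1.1])   # site 3 is a high value among low neighbors
W = np.zeros((5, 5))
for i in range(5):
    for j in (i - 1, i + 1):
        if 0 <= j < 5:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)          # row-standardize the weights

I = local_morans_i(x, W)
```

    Site 3 gets a negative I value, the high-low signature of a spatial outlier, while sites inside the low-valued run get positive values, the signature of a cluster.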

  19. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    Science.gov (United States)

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  20. An angle-based subspace anomaly detection approach to high-dimensional data: With an application to industrial fault detection

    International Nuclear Information System (INIS)

    Zhang, Liangwei; Lin, Jing; Karim, Ramin

    2015-01-01

    The accuracy of traditional anomaly detection techniques implemented on full-dimensional spaces degrades significantly as dimensionality increases, thereby hampering many real-world applications. This work proposes an approach to selecting meaningful feature subspace and conducting anomaly detection in the corresponding subspace projection. The aim is to maintain the detection accuracy in high-dimensional circumstances. The suggested approach assesses the angle between all pairs of two lines for one specific anomaly candidate: the first line is connected by the relevant data point and the center of its adjacent points; the other line is one of the axis-parallel lines. Those dimensions which have a relatively small angle with the first line are then chosen to constitute the axis-parallel subspace for the candidate. Next, a normalized Mahalanobis distance is introduced to measure the local outlier-ness of an object in the subspace projection. To comprehensively compare the proposed algorithm with several existing anomaly detection techniques, we constructed artificial datasets with various high-dimensional settings and found the algorithm displayed superior accuracy. A further experiment on an industrial dataset demonstrated the applicability of the proposed algorithm in fault detection tasks and highlighted another of its merits, namely, to provide preliminary interpretation of abnormality through feature ordering in relevant subspaces. - Highlights: • An anomaly detection approach for high-dimensional reliability data is proposed. • The approach selects relevant subspaces by assessing vectorial angles. • The novel ABSAD approach displays superior accuracy over other alternatives. • Numerical illustration approves its efficacy in fault detection applications
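
    The angle-based subspace selection can be sketched as follows: compare the direction from a candidate point to the center of its adjacent points against each coordinate axis, and keep the axes with small angles. A minimal illustration with synthetic data and an assumed 45° cutoff (the original algorithm's parameters and distance normalization are not reproduced):

```python
import numpy as np

def relevant_dims(point, neighbors, angle_deg=45.0):
    """Pick the axes whose angle with the point-to-neighbor-center line is small."""
    direction = point - neighbors.mean(axis=0)
    # |cos| of the angle between the direction and each axis-parallel line
    cosines = np.abs(direction) / np.linalg.norm(direction)
    return np.where(cosines > np.cos(np.radians(angle_deg)))[0]

rng = np.random.default_rng(0)
neighbors = rng.normal(0.0, 0.05, size=(20, 5))              # tight cluster in 5-D
point = neighbors.mean(axis=0) + np.array([4.0, 0, 0, 0, 0]) # deviates along axis 0
dims = relevant_dims(point, neighbors)
```

    The selected axes then define the axis-parallel subspace in which the local outlier-ness of the candidate is scored, which is also what gives the method its interpretability: the returned dimensions name where the abnormality lives.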

  1. Improved anomaly detection using multi-scale PLS and generalized likelihood ratio test

    KAUST Repository

    Madakyaru, Muddu

    2017-02-16

    Process monitoring has a central role in the process industry to enhance productivity, efficiency, and safety, and to avoid expensive maintenance. In this paper, a statistical approach that exploits the advantages of multiscale PLS models (MSPLS) and those of a generalized likelihood ratio (GLR) test to better detect anomalies is proposed. Specifically, to consider the multivariate and multi-scale nature of process dynamics, an MSPLS algorithm combining PLS and wavelet analysis is used as the modeling framework. Then, GLR hypothesis testing is applied using the uncorrelated residuals obtained from the MSPLS model to improve the anomaly detection abilities of these latent variable based fault detection methods even further. Application to simulated distillation column data is used to evaluate the proposed MSPLS-GLR algorithm.
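
    The GLR step can be sketched for the simplest case, testing for a mean shift in i.i.d. Gaussian residuals (a textbook illustration with simulated residuals, not the MSPLS residual structure of the paper):

```python
import numpy as np

def glr_mean_shift(residuals, sigma):
    """GLR statistic for a mean shift in i.i.d. N(0, sigma^2) residuals.
    Under H0 (zero mean) the statistic is ~ chi-square with 1 degree of freedom."""
    n = len(residuals)
    return n * residuals.mean() ** 2 / sigma ** 2

rng = np.random.default_rng(3)
sigma = 1.0
normal = rng.normal(0.0, sigma, size=100)   # fault-free model residuals
faulty = rng.normal(0.8, sigma, size=100)   # residuals carrying an anomaly

s_normal = glr_mean_shift(normal, sigma)
s_faulty = glr_mean_shift(faulty, sigma)    # large value flags the anomaly
```

    Comparing the statistic against a chi-square quantile (e.g. about 6.63 for a 1% false-alarm rate) yields the alarm decision; the anomaly-bearing residuals produce a statistic far above that threshold.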

  2. Improved anomaly detection using multi-scale PLS and generalized likelihood ratio test

    KAUST Repository

    Madakyaru, Muddu; Harrou, Fouzi; Sun, Ying

    2017-01-01

    Process monitoring has a central role in the process industry to enhance productivity, efficiency, and safety, and to avoid expensive maintenance. In this paper, a statistical approach that exploits the advantages of multiscale PLS models (MSPLS) and those of a generalized likelihood ratio (GLR) test to better detect anomalies is proposed. Specifically, to consider the multivariate and multi-scale nature of process dynamics, an MSPLS algorithm combining PLS and wavelet analysis is used as the modeling framework. Then, GLR hypothesis testing is applied using the uncorrelated residuals obtained from the MSPLS model to improve the anomaly detection abilities of these latent variable based fault detection methods even further. Application to simulated distillation column data is used to evaluate the proposed MSPLS-GLR algorithm.

  3. An SPSS implementation of the nonrecursive outlier deletion procedure with shifting z score criterion (Van Selst & Jolicoeur, 1994).

    Science.gov (United States)

    Thompson, Glenn L

    2006-05-01

    Sophisticated univariate outlier screening procedures are not yet available in widely used statistical packages such as SPSS. However, SPSS can accept user-supplied programs for executing these procedures. Failing this, researchers tend to rely on simplistic alternatives that can distort data because they do not adjust to cell-specific characteristics. Despite their popularity, these simple procedures may be especially ill suited for some applications (e.g., data from reaction time experiments). A user-friendly SPSS Production Facility implementation of the shifting z score criterion procedure (Van Selst & Jolicoeur, 1994) is presented in an attempt to make it easier to use. In addition to outlier screening, optional syntax modules can be added that will perform tedious database management tasks (e.g., restructuring or computing means).
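
    The nonrecursive procedure can be sketched as follows: each observation's z score is computed from the mean and standard deviation of the other observations, and the cutoff shifts with sample size so that small cells are screened more conservatively. The criterion values below are placeholders, NOT the published Van Selst & Jolicoeur table:

```python
import statistics

def shifting_z_screen(rts, criterion):
    """Nonrecursive outlier screen: each point's z score uses the mean and SD
    of the remaining points; `criterion` maps sample size -> z cutoff."""
    cut = criterion(len(rts))
    kept = []
    for i, rt in enumerate(rts):
        rest = rts[:i] + rts[i + 1:]
        m, sd = statistics.mean(rest), statistics.stdev(rest)
        if sd == 0 or abs(rt - m) / sd <= cut:
            kept.append(rt)
    return kept

# Illustrative shifting criterion: stricter for small cells (hypothetical numbers)
crit = lambda n: 1.5 if n < 10 else 2.5

rts = [430, 450, 470, 455, 445, 460, 1900]   # one wild reaction time (ms)
clean = shifting_z_screen(rts, crit)
```

    Leaving each point out of its own mean and SD is what keeps a single extreme value from masking itself, the failure mode of the simplistic fixed-z alternatives the abstract criticizes.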

  4. An Unsupervised Anomalous Event Detection and Interactive Analysis Framework for Large-scale Satellite Data

    Science.gov (United States)

    LIU, Q.; Lv, Q.; Klucik, R.; Chen, C.; Gallaher, D. W.; Grant, G.; Shang, L.

    2016-12-01

    Due to the high volume and complexity of satellite data, computer-aided tools for fast quality assessments and scientific discovery are indispensable for scientists in the era of Big Data. In this work, we have developed a framework for automated anomalous event detection in massive satellite data. The framework consists of a clustering-based anomaly detection algorithm and a cloud-based tool for interactive analysis of detected anomalies. The algorithm is unsupervised and requires no prior knowledge of the data (e.g., expected normal pattern or known anomalies). As such, it works for diverse data sets, and performs well even in the presence of missing and noisy data. The cloud-based tool provides an intuitive mapping interface that allows users to interactively analyze anomalies using multiple features. As a whole, our framework can (1) identify outliers in a spatio-temporal context, (2) recognize and distinguish meaningful anomalous events from individual outliers, (3) rank those events based on "interestingness" (e.g., rareness or total number of outliers) defined by users, and (4) enable interactive query, exploration, and analysis of those anomalous events. In this presentation, we will demonstrate the effectiveness and efficiency of our framework in the application of detecting data quality issues and unusual natural events using two satellite datasets. The techniques and tools developed in this project are applicable for a diverse set of satellite data and will be made publicly available for scientists in early 2017.

  5. An unsupervised learning algorithm for fatigue crack detection in waveguides

    International Nuclear Information System (INIS)

    Rizzo, Piervincenzo; Cammarata, Marcello; Kent Harries; Dutta, Debaditya; Sohn, Hoon

    2009-01-01

    Ultrasonic guided waves (UGWs) are a useful tool in structural health monitoring (SHM) applications that can benefit from built-in transduction, moderately large inspection ranges, and high sensitivity to small flaws. This paper describes an SHM method based on UGWs and outlier analysis devoted to the detection and quantification of fatigue cracks in structural waveguides. The method combines the advantages of UGWs with the outcomes of the discrete wavelet transform (DWT) to extract defect-sensitive features aimed at performing a multivariate diagnosis of damage. In particular, the DWT is exploited to generate a set of relevant wavelet coefficients to construct a uni-dimensional or multi-dimensional damage index vector. The vector is fed to an outlier analysis to detect anomalous structural states. The general framework presented in this paper is applied to the detection of fatigue cracks in a steel beam. The probing hardware consists of a National Instruments PXI platform that controls the generation and detection of the ultrasonic signals by means of piezoelectric transducers made of lead zirconate titanate. The effectiveness of the proposed approach to diagnose the presence of defects as small as a few per cent of the waveguide cross-sectional area is demonstrated

  6. SHADOW DETECTION FROM VERY HIGH RESOLUTION SATELLITE IMAGE USING GRABCUT SEGMENTATION AND RATIO-BAND ALGORITHMS

    Directory of Open Access Journals (Sweden)

    N. M. S. M. Kadhim

    2015-03-01

    Very-High-Resolution (VHR) satellite imagery is a powerful source of data for detecting and extracting information about urban constructions. Shadow in the VHR satellite imageries provides vital information on urban construction forms, illumination direction, and the spatial distribution of the objects that can help to further understanding of the built environment. However, to extract shadows, the automated detection of shadows from images must be accurate. This paper reviews current automatic approaches that have been used for shadow detection from VHR satellite images and comprises two main parts. In the first part, shadow concepts are presented in terms of shadow appearance in the VHR satellite imageries, current shadow detection methods, and the usefulness of shadow detection in urban environments. In the second part, we adopted two approaches which are considered current state-of-the-art shadow detection, and segmentation algorithms using WorldView-3 and Quickbird images. In the first approach, the ratios between the NIR and visible bands were computed on a pixel-by-pixel basis, which allows for disambiguation between shadows and dark objects. To obtain an accurate shadow candidate map, we further refine the shadow map after applying the ratio algorithm on the Quickbird image. The second selected approach is the GrabCut segmentation approach for examining its performance in detecting the shadow regions of urban objects using the true colour image from WorldView-3. Further refinement was applied to attain a segmented shadow map. Although the detection of shadow regions is a very difficult task when they are derived from a VHR satellite image that comprises a visible spectrum range (RGB true colour), the results demonstrate that the detection of shadow regions in the WorldView-3 image is a reasonable separation from other objects by applying the GrabCut algorithm. In addition, the derived shadow map from the Quickbird image indicates

  7. Shadow Detection from Very High Resolution Satellite Image Using Grabcut Segmentation and Ratio-Band Algorithms

    Science.gov (United States)

    Kadhim, N. M. S. M.; Mourshed, M.; Bray, M. T.

    2015-03-01

    Very-High-Resolution (VHR) satellite imagery is a powerful source of data for detecting and extracting information about urban constructions. Shadow in the VHR satellite imageries provides vital information on urban construction forms, illumination direction, and the spatial distribution of the objects that can help to further understanding of the built environment. However, to extract shadows, the automated detection of shadows from images must be accurate. This paper reviews current automatic approaches that have been used for shadow detection from VHR satellite images and comprises two main parts. In the first part, shadow concepts are presented in terms of shadow appearance in the VHR satellite imageries, current shadow detection methods, and the usefulness of shadow detection in urban environments. In the second part, we adopted two approaches which are considered current state-of-the-art shadow detection, and segmentation algorithms using WorldView-3 and Quickbird images. In the first approach, the ratios between the NIR and visible bands were computed on a pixel-by-pixel basis, which allows for disambiguation between shadows and dark objects. To obtain an accurate shadow candidate map, we further refine the shadow map after applying the ratio algorithm on the Quickbird image. The second selected approach is the GrabCut segmentation approach for examining its performance in detecting the shadow regions of urban objects using the true colour image from WorldView-3. Further refinement was applied to attain a segmented shadow map. Although the detection of shadow regions is a very difficult task when they are derived from a VHR satellite image that comprises a visible spectrum range (RGB true colour), the results demonstrate that the detection of shadow regions in the WorldView-3 image is a reasonable separation from other objects by applying the GrabCut algorithm. 
In addition, the derived shadow map from the Quickbird image indicates significant performance of
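    The per-pixel ratio idea of the first approach can be sketched as follows: shadows are dark in the visible bands and in the NIR, while dark objects such as vegetation remain NIR-bright. The band normalisation and both thresholds below are illustrative assumptions, not the paper's calibrated algorithm.

```python
# Sketch of NIR/visible ratio-band shadow candidates on a toy 2x2 scene.

def shadow_mask(nir, visible, dark_thresh=0.2, ratio_thresh=1.5):
    """Return a boolean mask: True where a pixel is a shadow candidate."""
    mask = []
    for row_n, row_v in zip(nir, visible):
        out = []
        for n, v in zip(row_n, row_v):
            dark = v < dark_thresh                     # dark in the visible
            ratio = n / (v + 1e-6)                     # NIR / visible ratio
            out.append(dark and ratio < ratio_thresh)  # also dark in the NIR
        mask.append(out)
    return mask

# toy scene: [bright roof, shadow] / [dark vegetation (NIR-bright), shadow]
nir     = [[0.8, 0.05], [0.6, 0.02]]
visible = [[0.7, 0.10], [0.1, 0.05]]
m = shadow_mask(nir, visible)
```

    The dark-vegetation pixel is dark in the visible band but its high NIR/visible ratio keeps it out of the shadow mask, which is the disambiguation the abstract describes.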

  8. Outlier-based Health Insurance Fraud Detection for U.S. Medicaid Data

    NARCIS (Netherlands)

    Thornton, Dallas; van Capelleveen, Guido; Poel, Mannes; van Hillegersberg, Jos; Mueller, Roland

    Fraud, waste, and abuse in the U.S. healthcare system are estimated at $700 billion annually. Predictive analytics offers government and private payers the opportunity to identify and prevent or recover such billings. This paper proposes a data-driven method for fraud detection based on comparative

  9. Chapter 5

    African Journals Online (AJOL)

    Sun Botha

    Mean maximum shear force values were calculated from the shear force values recorded for ... When deviations from normality were detected, outliers were removed until data were symmetrical or .... P – Probability value of F-ratio test.

  10. Afrika Statistika ISSN 2316-090X A Bayesian significance test of ...

    African Journals Online (AJOL)

    of the generalized likelihood ratio test to detect a change in binomial ... computational simplicity to the problem of calculating posterior marginals. ... the impact of a single outlier on the performance of the Bayesian significance test of change.

  11. A practical method to detect the freezing/thawing onsets of seasonal frozen ground in Alaska

    Science.gov (United States)

    Chen, Xiyu; Liu, Lin

    2017-04-01

    Microwave remote sensing can provide useful information about the freeze/thaw state of soil at the Earth's surface. In this study, an edge detection method is applied to estimate the onsets of soil freeze/thaw state transitions using L-band space-borne radiometer data. The Soil Moisture Active Passive (SMAP) mission carries an L-band radiometer and provides daily brightness temperature (TB) at horizontal and vertical polarizations. We use the normalized polarization ratio (NPR), calculated from the SMAP Level-1C TB product (spatial resolution: 36 km), as the indicator of soil freeze/thaw state to estimate the freezing and thawing onsets in Alaska in 2015 and 2016. The NPR is derived from the difference between TB at vertical and horizontal polarizations; it is therefore strongly sensitive to changes in soil liquid water content and largely independent of soil temperature. Onset estimation is based on detecting abrupt changes of NPR during the transition seasons with an edge detection method, and validation compares the estimated onsets with onsets derived from in situ measurements. According to this comparison, the estimated onsets were generally 15 days earlier than the measured onsets in 2015, but only about 4 days earlier on average in 2016, which may be due to thinner snow cover. Moreover, we extended our estimation to the entire state of Alaska. The estimated freeze/thaw onsets showed a reasonable latitude-dependent distribution, although some outliers remain because of noisy variation of the NPR. Finally, we also attempt to remove these outliers and improve the performance of the method by smoothing the NPR time series.
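    A minimal sketch of the NPR computation and the edge-detection step, assuming a simple moving-window step detector and toy brightness temperatures (window size and values are illustrative):

```python
# NPR = (TBv - TBh) / (TBv + TBh); the onset is located at the strongest
# step edge in the NPR time series.

def npr(tb_v, tb_h):
    """Normalised polarisation ratio for paired V/H brightness temperatures."""
    return [(v - h) / (v + h) for v, h in zip(tb_v, tb_h)]

def strongest_edge(series, window=3):
    """Index where the mean of the next `window` samples differs most
    from the mean of the previous `window` samples (a step detector)."""
    best_i, best_step = None, 0.0
    for i in range(window, len(series) - window + 1):
        before = sum(series[i - window:i]) / window
        after = sum(series[i:i + window]) / window
        step = abs(after - before)
        if step > best_step:
            best_i, best_step = i, step
    return best_i

# frozen soil (low NPR) thaws at index 6: liquid water raises the NPR
tb_v = [255, 256, 254, 255, 256, 255, 262, 263, 262, 263, 262, 263]
tb_h = [250, 251, 249, 250, 251, 250, 240, 241, 240, 241, 240, 241]
series = npr(tb_v, tb_h)
onset = strongest_edge(series)
```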

  12. Orthogonal series generalized likelihood ratio test for failure detection and isolation. [for aircraft control

    Science.gov (United States)

    Hall, Steven R.; Walker, Bruce K.

    1990-01-01

    A new failure detection and isolation algorithm for linear dynamic systems is presented. This algorithm, the Orthogonal Series Generalized Likelihood Ratio (OSGLR) test, is based on the assumption that the failure modes of interest can be represented by truncated series expansions. This assumption leads to a failure detection algorithm with several desirable properties. Computer simulation results are presented for the detection of the failures of actuators and sensors of a C-130 aircraft. The results show that the OSGLR test generally performs as well as the GLR test in terms of time to detect a failure and is more robust to failure mode uncertainty. However, the OSGLR test is also somewhat more sensitive to modeling errors than the GLR test.
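    The plain GLR test that the OSGLR test generalizes can be sketched for the simplest failure mode, a constant bias appearing in zero-mean residuals; the noise level, threshold-free maximisation, and toy residuals are illustrative assumptions.

```python
# For a constant-bias failure hypothesis, the GLR statistic maximises,
# over the unknown onset time k, the squared sum of residuals since k
# normalised by the window length.

def glr_bias(residuals, sigma=1.0):
    """Return (max GLR statistic, most likely onset index)."""
    n = len(residuals)
    best, onset = 0.0, None
    for k in range(n):
        window = residuals[k:]
        s = sum(window)
        stat = s * s / (sigma ** 2 * len(window))  # likelihood ratio for onset k
        if stat > best:
            best, onset = stat, k
    return best, onset

healthy = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.1]
failed = healthy + [1.2, 0.9, 1.1, 1.0, 1.3, 0.8]  # bias appears at index 8
stat_h, _ = glr_bias(healthy)
stat_f, onset = glr_bias(failed)
```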

  13. Quantile index for gradual and abrupt change detection from CFB boiler sensor data in online settings

    NARCIS (Netherlands)

    Maslov, A.; Pechenizkiy, M.; Kärkkäinen, T.; Tähtinen, M.

    2012-01-01

    In this paper we consider the problem of online detection of gradual and abrupt changes in sensor data having high levels of noise and outliers. We propose a simple heuristic method based on the Quantile Index (QI) and study how robust this method is for detecting both gradual and abrupt changes

  14. Self-adaptive change detection in streaming data with non-stationary distribution

    KAUST Repository

    Zhang, Xiangliang

    2010-01-01

    Non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting the changes in massive streaming data with a non-stationary distribution helps to flag anomalies, to clean noise, and to report new patterns. In this paper, we employ a novel approach for detecting changes in streaming data with the purpose of improving the quality of modeling the data streams. Through observing the outliers, this approach of change detection uses a weighted standard deviation to monitor the evolution of the distribution of data streams. A cumulative statistical test, Page-Hinkley, is employed to collect the evidence of changes in distribution. The parameter used for reporting the changes is self-adaptively adjusted according to the distribution of data streams, rather than set to a fixed empirical value. The self-adaptability of the novel approach enhances the effectiveness of modeling data streams by catching distribution changes in a timely manner. We validated the approach on an online clustering framework with the benchmark KDDcup 1999 intrusion detection data set as well as with a real-world grid data set. The validation results demonstrate its better performance in achieving higher accuracy and a lower percentage of outliers compared to other change detection approaches. © 2010 Springer-Verlag.
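    The Page-Hinkley test mentioned above has a compact standard form. This sketch detects an upward mean shift; the tolerance `delta` and alarm threshold `lam` are chosen for illustration, not taken from the paper.

```python
# Page-Hinkley: accumulate deviations of each sample from the running
# mean and alarm when the drift above the historical minimum exceeds lam.

def page_hinkley(stream, delta=0.05, lam=5.0):
    """Return the 0-based index at which an upward change is detected, or None."""
    mean, m_t, m_min = 0.0, 0.0, 0.0
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t          # running mean
        m_t += x - mean - delta         # cumulative deviation
        m_min = min(m_min, m_t)
        if m_t - m_min > lam:           # drift since the minimum
            return t - 1
    return None

stable = [0.0, 0.1, -0.1] * 20
shifted = stable + [3.0] * 10           # mean jumps upward at index 60
```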

  15. Data Fusion and Fuzzy Clustering on Ratio Images for Change Detection in Synthetic Aperture Radar Images

    Directory of Open Access Journals (Sweden)

    Wenping Ma

    2014-01-01

    The unsupervised approach to change detection via synthetic aperture radar (SAR) images is becoming more and more popular. The three-step procedure is the most widely used, but it does not work well with the Yellow River Estuary dataset obtained by two synthetic aperture radars. The difference between the two radars' imaging techniques causes severe noise, which seriously affects the difference images generated by a single change detector in step two (producing the difference image). To deal with this problem, we propose a change detector that fuses the log-ratio (LR) and mean-ratio (MR) images by a context independent variable behavior (CIVB) operator and can utilize the complementary information in the two ratio images. In order to validate the effectiveness of the proposed change detector, it is compared with three other change detectors, namely the log-ratio (LR), mean-ratio (MR), and wavelet-fusion (WR) operators, on three datasets with different characteristics. The four operators are applied not only in the widely used three-step procedure but also in a new approach. The experiments show that the false alarms and overall errors of change detection are greatly reduced, and the kappa and KCC are improved considerably. Its superiority can also be observed visually.
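    The two ratio operators being fused can be written down directly. The pixel-wise average used as the fusion step below is only a placeholder for the paper's CIVB operator, which the abstract does not specify.

```python
# Log-ratio and mean-ratio difference images for co-registered SAR
# intensities, plus a placeholder fusion.
import math

def log_ratio(img1, img2):
    """|log(x1/x2)| per pixel: emphasises relative intensity change."""
    return [[abs(math.log((a + 1e-9) / (b + 1e-9))) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

def mean_ratio(img1, img2):
    """1 - min/max per pixel: bounded in [0, 1), robust to scaling."""
    return [[1.0 - min(a, b) / (max(a, b) + 1e-9) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

def fuse(d1, d2):
    """Placeholder fusion: pixel-wise average of the two difference maps."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(d1, d2)]

before = [[10.0, 10.0], [10.0, 10.0]]
after  = [[10.0, 40.0], [10.0, 10.0]]   # one pixel changed
lr, mr = log_ratio(before, after), mean_ratio(before, after)
diff = fuse(lr, mr)
changed = [(i, j) for i in range(2) for j in range(2) if diff[i][j] > 0.5]
```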

  16. Automated rice leaf disease detection using color image analysis

    Science.gov (United States)

    Pugoy, Reinald Adrian D. L.; Mariano, Vladimir Y.

    2011-06-01

    In rice-related institutions such as the International Rice Research Institute, assessing the health condition of a rice plant through its leaves, which is usually done as a manual eyeball exercise, is important to come up with good nutrient and disease management strategies. In this paper, an automated system that can detect diseases present in a rice leaf using color image analysis is presented. In the system, the outlier region is first obtained from a rice leaf image to be tested using histogram intersection between the test and healthy rice leaf images. Upon obtaining the outlier, it is then subjected to a threshold-based K-means clustering algorithm to group related regions into clusters. Then, these clusters are subjected to further analysis to finally determine the suspected diseases of the rice leaf.
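    The first stage, histogram intersection between a test image and a healthy reference, can be sketched on toy one-channel data; the bin count and values are illustrative, and the paper itself works on colour histograms.

```python
# Histogram intersection: 1.0 means identical distributions, lower scores
# mean more of the test image's distribution is unexplained by the
# healthy reference (a proxy for the outlier region).

def histogram(values, bins=4, lo=0.0, hi=1.0):
    """Normalised histogram of scalar values in [lo, hi)."""
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[k] += 1
    total = len(values)
    return [c / total for c in counts]

def intersection(h1, h2):
    """Sum of bin-wise minima of two normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

healthy = histogram([0.3, 0.35, 0.4, 0.3, 0.32, 0.38])   # mid-range greens
diseased = histogram([0.3, 0.35, 0.9, 0.95, 0.85, 0.3])  # brown-lesion values
score = intersection(healthy, diseased)
```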

  17. Robust Deep Network with Maximum Correntropy Criterion for Seizure Detection

    Directory of Open Access Journals (Sweden)

    Yu Qi

    2014-01-01

    Effective seizure detection from long-term EEG is highly important for seizure diagnosis. Existing methods usually design the feature and classifier individually, while little work has been done on the simultaneous optimization of the two parts. This work proposes a deep network to jointly learn a feature and a classifier so that they can help each other to make the whole system optimal. To deal with the challenge of the impulsive noises and outliers caused by EMG artifacts in EEG signals, we formulate a robust stacked autoencoder (R-SAE) as a part of the network to learn an effective feature. In R-SAE, the maximum correntropy criterion (MCC) is proposed to reduce the effect of noise/outliers. Unlike the mean square error (MSE), the output of the new kernel MCC increases more slowly than that of MSE when the input moves away from the center. Thus, the effect of those noises/outliers positioned far away from the center can be suppressed. The proposed method is evaluated on 33.6 hours of scalp EEG data from six patients. Our method achieves a sensitivity of 100% and a specificity of 99%, which is promising for clinical applications.
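    The MCC-versus-MSE point can be checked numerically: the Gaussian-kernel correntropy loss saturates for large errors, so a single far-away outlier barely moves it, while MSE grows quadratically. The kernel width sigma and toy errors are illustrative choices.

```python
import math

def mse_loss(errors):
    """Mean squared error: each error contributes quadratically."""
    return sum(e * e for e in errors) / len(errors)

def mcc_loss(errors, sigma=1.0):
    """Correntropy-induced loss: 1 - mean Gaussian kernel (lower is better)."""
    return 1.0 - sum(math.exp(-e * e / (2 * sigma ** 2))
                     for e in errors) / len(errors)

clean = [0.1, -0.2, 0.15, -0.1]
with_outlier = clean + [10.0]          # one EMG-artifact-like outlier

mse_ratio = mse_loss(with_outlier) / mse_loss(clean)   # blows up
mcc_ratio = mcc_loss(with_outlier) / mcc_loss(clean)   # stays moderate
```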

  18. Automated detection and classification of major retinal vessels for determination of diameter ratio of arteries and veins

    Science.gov (United States)

    Muramatsu, Chisako; Hatanaka, Yuji; Iwase, Tatsuhiko; Hara, Takeshi; Fujita, Hiroshi

    2010-03-01

    Abnormalities of retinal vasculatures can indicate health conditions in the body, such as the high blood pressure and diabetes. Providing automatically determined width ratio of arteries and veins (A/V ratio) on retinal fundus images may help physicians in the diagnosis of hypertensive retinopathy, which may cause blindness. The purpose of this study was to detect major retinal vessels and classify them into arteries and veins for the determination of A/V ratio. Images used in this study were obtained from DRIVE database, which consists of 20 cases each for training and testing vessel detection algorithms. Starting with the reference standard of vasculature segmentation provided in the database, major arteries and veins each in the upper and lower temporal regions were manually selected for establishing the gold standard. We applied the black top-hat transformation and double-ring filter to detect retinal blood vessels. From the extracted vessels, large vessels extending from the optic disc to temporal regions were selected as target vessels for calculation of A/V ratio. Image features were extracted from the vessel segments from quarter-disc to one disc diameter from the edge of optic discs. The target segments in the training cases were classified into arteries and veins by using the linear discriminant analysis, and the selected parameters were applied to those in the test cases. Out of 40 pairs, 30 pairs (75%) of arteries and veins in the 20 test cases were correctly classified. The result can be used for the automated calculation of A/V ratio.

  19. Supervised detection of anomalous light curves in massive astronomical catalogs

    International Nuclear Information System (INIS)

    Nun, Isadora; Pichara, Karim; Protopapas, Pavlos; Kim, Dae-Won

    2014-01-01

    The development of synoptic sky surveys has led to a massive amount of data for which resources needed for analysis are beyond human capabilities. In order to process this information and to extract all possible knowledge, machine learning techniques become necessary. Here we present a new methodology to automatically discover unknown variable objects in large astronomical catalogs. With the aim of taking full advantage of all information we have about known objects, our method is based on a supervised algorithm. In particular, we train a random forest classifier using known variability classes of objects and obtain votes for each of the objects in the training set. We then model this voting distribution with a Bayesian network and obtain the joint voting distribution among the training objects. Consequently, an unknown object is considered an outlier insofar as it has a low joint probability. By leaving out one of the classes from the training set, we perform a validity test and show that when the random forest classifier attempts to classify unknown light curves (the class left out), it votes with an unusual distribution among the classes. This rare voting is detected by the Bayesian network and expressed as a low joint probability. Our method is suitable for exploring massive data sets given that the training process is performed offline. We tested our algorithm on 20 million light curves from the MACHO catalog and generated a list of anomalous candidates. After analysis, we divided the candidates into two main classes of outliers: artifacts and intrinsic outliers. Artifacts were principally due to air mass variation, seasonal variation, bad calibration, or instrumental errors and were consequently removed from our outlier list and added to the training set. After retraining, we selected about 4000 objects, which we passed to a post-analysis stage by performing a cross-match with all publicly available catalogs.
Within these candidates we identified certain known

  20. Supervised Detection of Anomalous Light Curves in Massive Astronomical Catalogs

    Science.gov (United States)

    Nun, Isadora; Pichara, Karim; Protopapas, Pavlos; Kim, Dae-Won

    2014-09-01

    The development of synoptic sky surveys has led to a massive amount of data for which resources needed for analysis are beyond human capabilities. In order to process this information and to extract all possible knowledge, machine learning techniques become necessary. Here we present a new methodology to automatically discover unknown variable objects in large astronomical catalogs. With the aim of taking full advantage of all information we have about known objects, our method is based on a supervised algorithm. In particular, we train a random forest classifier using known variability classes of objects and obtain votes for each of the objects in the training set. We then model this voting distribution with a Bayesian network and obtain the joint voting distribution among the training objects. Consequently, an unknown object is considered an outlier insofar as it has a low joint probability. By leaving out one of the classes from the training set, we perform a validity test and show that when the random forest classifier attempts to classify unknown light curves (the class left out), it votes with an unusual distribution among the classes. This rare voting is detected by the Bayesian network and expressed as a low joint probability. Our method is suitable for exploring massive data sets given that the training process is performed offline. We tested our algorithm on 20 million light curves from the MACHO catalog and generated a list of anomalous candidates. After analysis, we divided the candidates into two main classes of outliers: artifacts and intrinsic outliers. Artifacts were principally due to air mass variation, seasonal variation, bad calibration, or instrumental errors and were consequently removed from our outlier list and added to the training set. After retraining, we selected about 4000 objects, which we passed to a post-analysis stage by performing a cross-match with all publicly available catalogs.
Within these candidates we identified certain known
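    The outlier criterion above, a low joint probability of a classifier's vote vector, can be reduced to a toy in which each vote dimension is modeled with an independent Gaussian. This naive independence model is only a stand-in for the paper's Bayesian network; the training votes and floor on the standard deviation are illustrative.

```python
import math

def fit(vote_vectors):
    """Per-dimension mean and (floored) std of training vote vectors."""
    dims, n = len(vote_vectors[0]), len(vote_vectors)
    means = [sum(v[d] for v in vote_vectors) / n for d in range(dims)]
    stds = [max((sum((v[d] - means[d]) ** 2 for v in vote_vectors) / n) ** 0.5,
                1e-3) for d in range(dims)]
    return means, stds

def joint_logprob(votes, means, stds):
    """Joint Gaussian log-probability assuming independent dimensions."""
    return sum(-0.5 * ((x - m) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
               for x, m, s in zip(votes, means, stds))

# vote vectors of known-class training objects (two variability classes)
train = [[0.80, 0.15, 0.05], [0.75, 0.20, 0.05], [0.10, 0.85, 0.05],
         [0.15, 0.80, 0.05], [0.82, 0.13, 0.05], [0.12, 0.83, 0.05]]
means, stds = fit(train)
normal  = joint_logprob([0.78, 0.17, 0.05], means, stds)
anomaly = joint_logprob([0.34, 0.33, 0.33], means, stds)  # unusual vote spread
```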

  1. The Outlier Sectors: Areas of Non-Free Trade in the North American Free Trade Agreement

    OpenAIRE

    Eric T. Miller

    2002-01-01

    Since its entry into force, the North American Free Trade Agreement (NAFTA) has been enormously influential as a model for trade liberalization. While trade in goods among Canada, the United States and Mexico has been liberalized to a significant degree, this most famous of agreements nonetheless contains areas of recalcitrant protectionism. The first part of this paper identifies these "outlier sectors" and classifies them by primary source advocating protectionism, i.e., producer interests ...

  2. Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.

    Science.gov (United States)

    Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram

    2017-02-01

    In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero cell counts; some of them are "true zeros", indicating that the drug-adverse event pair cannot occur, and these are distinguished from the other, modeled zero counts, which simply indicate that the drug-adverse event pair has not occurred yet or has not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation and maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
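    The zero-inflated Poisson building block can be sketched directly from its likelihood: with probability pi a cell is a structural ("true") zero, otherwise counts follow Poisson(lambda). The crude grid search below stands in for the expectation-maximization fitting used in the paper, and the toy counts are illustrative.

```python
import math

def zip_loglik(counts, pi, lam):
    """Log-likelihood of counts under a zero-inflated Poisson model."""
    ll = 0.0
    for n in counts:
        if n == 0:
            ll += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            ll += math.log(1 - pi) - lam + n * math.log(lam) - math.lgamma(n + 1)
    return ll

def fit_zip(counts, grid=50):
    """Crude grid search over (pi, lambda); a stand-in for EM fitting."""
    best_pi, best_lam, best_ll = 0.0, 1.0, -float("inf")
    top = max(counts)
    for i in range(grid):                 # pi in [0, 1)
        pi = i / grid
        for j in range(1, grid + 1):      # lambda in (0, max(counts)]
            lam = top * j / grid
            ll = zip_loglik(counts, pi, lam)
            if ll > best_ll:
                best_pi, best_lam, best_ll = pi, lam, ll
    return best_pi, best_lam

# 30 structural zeros mixed with counts from a roughly Poisson(4) signal
counts = [0] * 30 + [3, 4, 5, 4, 3, 5, 4, 4, 6, 2]
pi_hat, lam_hat = fit_zip(counts)
```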

  3. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center.

    Science.gov (United States)

    Mehra, Tarun; Müller, Christian Thomas Benedikt; Volbracht, Jörk; Seifert, Burkhardt; Moos, Rudolf

    2015-01-01

    Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant amount of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital to search for predictors enabling a better grouping under SwissDRG. 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to results of random forest analysis. 10 of these parameters were selected for quantile regression analysis to quantify their impact on earnings. Psychiatric diagnosis and admission as an emergency case were significant predictors for higher deficit with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor for a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.001, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical ventilation, and patient clinical complexity level score (PCCL) predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria as they predict large, consistent and significant losses.
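    The interquartile range method used here to define outliers has a standard textbook form; the linear-interpolation quartile convention and the 1.5 factor are the usual defaults, assumed rather than taken from the paper.

```python
# Flag cases beyond Q1 - 1.5*IQR or Q3 + 1.5*IQR.

def quantile(sorted_vals, q):
    """Linear-interpolation quantile of pre-sorted data, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def iqr_outliers(values, k=1.5):
    """Return values outside the k*IQR fences, in input order."""
    s = sorted(values)
    q1, q3 = quantile(s, 0.25), quantile(s, 0.75)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# toy per-case earnings (deficit negative, profit positive)
earnings = [-500, -200, 0, 100, 150, 200, 250, 300, 9000]
flagged = iqr_outliers(earnings)
```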

  4. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center.

    Directory of Open Access Journals (Sweden)

    Tarun Mehra

    Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant amount of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital to search for predictors enabling a better grouping under SwissDRG. 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to results of random forest analysis. 10 of these parameters were selected for quantile regression analysis to quantify their impact on earnings. Psychiatric diagnosis and admission as an emergency case were significant predictors for higher deficit with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor for a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.001, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical ventilation, and patient clinical complexity level score (PCCL) predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria as they predict large, consistent and significant losses.

  5. Outlier SNP markers reveal fine-scale genetic structuring across European hake populations (Merluccius merluccius)

    DEFF Research Database (Denmark)

    Milano, I.; Babbucci, M.; Cariani, A.

    2014-01-01

    fishery. Analysis of 850 individuals from 19 locations across the entire distribution range showed evidence for several outlier loci, with significantly higher resolving power. While 299 putatively neutral SNPs confirmed the genetic break between basins (FCT = 0.016) and weak differentiation within basins...... even when neutral markers provide genetic homogeneity across populations. Here, 381 SNPs located in transcribed regions were used to assess largeand fine-scale population structure in the European hake (Merluccius merluccius), a widely distributed demersal species of high priority for the European...

  6. Efficient Estimation of Dynamic Density Functions with Applications in Streaming Data

    KAUST Repository

    Qahtan, Abdulhakim Ali Ali

    2016-01-01

    application is to detect outliers in data streams from sensor networks based on the estimated PDF. The method detects outliers accurately and outperforms baseline methods designed for detecting and cleaning outliers in sensor data. The third application

  7. The source of prehistoric obsidian artefacts from the Polynesian outlier of Taumako in the Solomon Islands

    Energy Technology Data Exchange (ETDEWEB)

    Leach, Foss [Otago Univ., Dunedin (New Zealand). Dept. of Anthropology]

    1985-01-01

    Six obsidian artefacts from the Polynesian outlier of Taumako in the Solomon Islands, dating to between 500 and 1000 B.C., were analysed for trace elements by the PIXE-PIGME method. Four are shown to derive from Vanuatu, but the remaining two artefacts do not match any of the 66 known sources in the Pacific region. Continuing difficulties with the methodology of Pacific obsidian sourcing are discussed. 14 refs; 2 tables.

  8. Similarity ratio analysis for early stage fault detection with optical emission spectrometer in plasma etching process.

    Directory of Open Access Journals (Sweden)

    Jie Yang

    A Similarity Ratio Analysis (SRA) method is proposed for early-stage Fault Detection (FD) in plasma etching processes using real-time Optical Emission Spectrometer (OES) data as input. The SRA method can help to realise a highly precise control system by detecting abnormal etch-rate faults in real-time during an etching process. The method processes spectrum scans at successive time points and uses a windowing mechanism over the time series to alleviate problems with timing uncertainties due to process shift from one process run to another. An SRA library is first built to capture features of a healthy etching process. By comparing with the SRA library, a Similarity Ratio (SR) statistic is then calculated for each spectrum scan as the monitored process progresses. A fault detection mechanism, named 3-Warning-1-Alarm (3W1A), takes the SR values as inputs and triggers a system alarm when certain conditions are satisfied. This design reduces the chance of false alarms and provides a reliable fault reporting service. The SRA method is demonstrated on a real semiconductor manufacturing dataset. The effectiveness of SRA-based fault detection is evaluated using a time-series SR test and also using a post-process SR test. The time-series SR provides an early-stage fault detection service, so less energy and material are wasted on faulty processing. The post-process SR provides a fault detection service with higher reliability than the time-series SR, but with fault testing conducted only after each process run completes.
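    The monitoring loop can be sketched with cosine similarity standing in for the similarity ratio and a three-consecutive-warning trigger standing in for the 3W1A conditions; the abstract does not specify either, so both are illustrative assumptions.

```python
# Score each spectrum scan against a healthy reference; alarm after
# three consecutive low-similarity warnings (debounces isolated dips).

def cosine(u, v):
    """Cosine similarity between two spectra."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def monitor(scans, reference, warn_below=0.99, needed=3):
    """Return index of the alarm-triggering scan, or None."""
    streak = 0
    for i, scan in enumerate(scans):
        if cosine(scan, reference) < warn_below:
            streak += 1
            if streak >= needed:
                return i
        else:
            streak = 0
    return None

healthy = [10.0, 5.0, 2.0, 1.0]
scans = [healthy[:]] * 4 + [[10.0, 5.0, 6.0, 1.0]] * 3  # a peak drifts up
alarm_at = monitor(scans, healthy)
```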

  9. Observed to expected or logistic regression to identify hospitals with high or low 30-day mortality?

    Science.gov (United States)

    Helgeland, Jon; Clench-Aas, Jocelyne; Laake, Petter; Veierød, Marit B.

    2018-01-01

    Introduction: A common quality indicator for monitoring and comparing hospitals is based on death within 30 days of admission. An important use is to determine whether a hospital has higher or lower mortality than other hospitals. Thus, the ability to identify such outliers correctly is essential. Two approaches for detection are: 1) calculating the ratio of observed to expected number of deaths (OE) per hospital and 2) including all hospitals in a logistic regression (LR) comparing each hospital to a form of average over all hospitals. The aim of this study was to compare OE and LR with respect to correctly identifying 30-day mortality outliers. Modifications of the methods, i.e., the variance-corrected OE approach (OE-Faris), bias-corrected LR (LR-Firth), and trimmed-mean variants of LR and LR-Firth, were also studied. Materials and methods: To study the properties of OE and LR and their variants, we performed a simulation study by generating patient data from hospitals with known outlier status (low mortality, high mortality, non-outlier). Data from simulated scenarios with varying numbers of hospitals, hospital volume, and mortality outlier status were analysed by the different methods and compared by level of significance (ability to falsely claim an outlier) and power (ability to reveal an outlier). Moreover, administrative data for patients with acute myocardial infarction (AMI), stroke, and hip fracture from Norwegian hospitals for 2012–2014 were analysed. Results: None of the methods achieved the nominal (test) level of significance for both low and high mortality outliers. For low mortality outliers, the levels of significance were increased four- to fivefold for OE and OE-Faris. For high mortality outliers, OE and OE-Faris, LR 25% trimmed, and LR-Firth 10% and 25% trimmed maintained approximately the nominal level. The methods agreed with respect to outlier status for 94.1% of the AMI hospitals, 98.0% of the stroke, and 97.8% of the hip fracture hospitals
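    The OE approach can be sketched in a few lines: expected deaths are the sum of patient-level predicted risks, and hospitals are screened by the observed-to-expected ratio. The toy risks and the fixed 0.5/1.5 flagging bounds are illustrative, not the study's significance test.

```python
# Observed/expected screening of hospital 30-day mortality.

def oe_ratio(observed_deaths, predicted_risks):
    """Observed deaths divided by the sum of per-patient predicted risks."""
    expected = sum(predicted_risks)
    return observed_deaths / expected

def flag(ratio, low=0.5, high=1.5):
    """Crude screen: 'low'/'high' mortality outlier, else 'non-outlier'."""
    if ratio < low:
        return "low"
    if ratio > high:
        return "high"
    return "non-outlier"

# three hospitals, each with 100 patients at 10% predicted 30-day risk
risks = {"A": [0.1] * 100, "B": [0.1] * 100, "C": [0.1] * 100}
observed = {"A": 9, "B": 22, "C": 3}
status = {h: flag(oe_ratio(observed[h], risks[h])) for h in risks}
```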

  10. Transfer Entropy Estimation and Directional Coupling Change Detection in Biomedical Time Series

    Directory of Open Access Journals (Sweden)

    Lee Joon

    2012-04-01

    Full Text Available Abstract Background The detection of change in magnitude of directional coupling between two non-linear time series is a common subject of interest in the biomedical domain, including studies involving the respiratory chemoreflex system. Although transfer entropy is a useful tool in this avenue, no study to date has investigated how different transfer entropy estimation methods perform in typical biomedical applications featuring small sample sizes and the presence of outliers. Methods With respect to detection of increased coupling strength, we compared three transfer entropy estimation techniques using both simulated time series and respiratory recordings from lambs. The following estimation methods were analyzed: fixed-binning with ranking, kernel density estimation (KDE), and the Darbellay-Vajda (D-V) adaptive partitioning algorithm extended to three dimensions. In the simulated experiment, sample size was varied from 50 to 200, while coupling strength was increased. In order to introduce outliers, the heavy-tailed Laplace distribution was utilized. In the lamb experiment, the objective was to detect increased respiratory-related chemosensitivity to O2 and CO2 induced by a drug, domperidone. Specifically, the separate influence of end-tidal PO2 and PCO2 on minute ventilation (V˙E) before and after administration of domperidone was analyzed. Results In the simulation, KDE detected increased coupling strength at the lowest SNR among the three methods. In the lamb experiment, D-V partitioning resulted in the statistically strongest increase in transfer entropy post-domperidone for PO2→V˙E. In addition, D-V partitioning was the only method that could detect an increase in transfer entropy for PCO2→V˙E, in agreement with experimental findings. Conclusions Transfer entropy is capable of detecting directional coupling changes in non-linear biomedical time series analysis featuring a small number of observations and presence of outliers. The results
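The simplest of the three estimators, fixed binning (here with equal-frequency, i.e. rank-based, bins), can be sketched as a plug-in transfer entropy estimate. This is a generic illustration of the quantity being estimated, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y, bins=4):
    """Plug-in estimate of TE(X -> Y) in bits from the joint histogram of
    (y_{t+1}, y_t, x_t), with equal-frequency (rank-style) binning."""
    # discretise each series into `bins` equal-frequency bins
    xd = np.searchsorted(np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1]), x)
    yd = np.searchsorted(np.quantile(y, np.linspace(0, 1, bins + 1)[1:-1]), y)
    triples = Counter(zip(yd[1:], yd[:-1], xd[:-1]))   # (y_{t+1}, y_t, x_t)
    pairs_yy = Counter(zip(yd[1:], yd[:-1]))           # (y_{t+1}, y_t)
    pairs_yx = Counter(zip(yd[:-1], xd[:-1]))          # (y_t, x_t)
    singles_y = Counter(yd[:-1])                       # y_t
    n = len(x) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        # p(y1,y0,x0) * log2[ p(y1|y0,x0) / p(y1|y0) ]
        te += (c / n) * np.log2(c * singles_y[y0]
                                / (pairs_yy[(y1, y0)] * pairs_yx[(y0, x0)]))
    return te
```

A series driven by a lagged copy of X yields a much larger estimate than an independent series, which is what the coupling-change detection relies on.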

  11. Using a cross-model loadings plot to identify protein spots causing 2-DE gels to become outliers in PCA

    DEFF Research Database (Denmark)

    Kristiansen, Luise Cederkvist; Jacobsen, Susanne; Jessen, Flemming

    2010-01-01

    The multivariate method PCA is an exploratory tool often used to get an overview of multivariate data, such as the quantified spot volumes of digitized 2-DE gels. PCA can reveal hidden structures present in the data, and thus enables identification of potential outliers and clustering. Based on PCA...

  12. Real-time detection of organic contamination events in water distribution systems by principal components analysis of ultraviolet spectral data.

    Science.gov (United States)

    Zhang, Jian; Hou, Dibo; Wang, Ke; Huang, Pingjie; Zhang, Guangxin; Loáiciga, Hugo

    2017-05-01

    The detection of organic contaminants in water distribution systems is essential to protect public health from potentially harmful compounds resulting from accidental spills or intentional releases. Existing methods for detecting organic contaminants are based on quantitative analyses such as chemical testing and gas/liquid chromatography, which are time- and reagent-consuming and involve costly maintenance. This study proposes a novel procedure based on the discrete wavelet transform and principal component analysis for detecting organic contamination events from ultraviolet spectral data. Firstly, the spectrum of each observation is transformed using a discrete wavelet with a coiflet mother wavelet to capture abrupt changes along the wavelength axis. Principal component analysis is then employed to approximate the spectra and to capture and fuse their salient features. Hotelling's T² statistic is calculated and compared against a significance threshold to detect outliers. A contamination-event alarm is triggered by sequential Bayesian analysis when outliers appear continuously over several observations. The effectiveness of the proposed procedure is tested on-line using a pilot-scale setup and experimental data.
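The PCA/Hotelling's-T² detection step (leaving aside the wavelet preprocessing) can be sketched as follows; the empirical-quantile control limit used here is a simplification of the F-distribution-based limits typically used in practice.

```python
import numpy as np

def hotelling_t2_outliers(X, n_components=2, q=0.99):
    """Flag samples as outliers via PCA scores and Hotelling's T² statistic.

    X: (n_samples, n_features) matrix, e.g. wavelet-transformed UV spectra.
    Returns the T² values and a boolean mask of samples above the limit.
    """
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # PCA via SVD
    scores = Xc @ Vt[:n_components].T                   # PC scores
    var = s[:n_components] ** 2 / (len(X) - 1)          # variance per PC
    t2 = np.sum(scores ** 2 / var, axis=1)              # Hotelling's T²
    limit = np.quantile(t2, q)                          # empirical control limit
    return t2, t2 > limit
```

An observation whose spectrum departs strongly from the bulk gets a large T² and is flagged; the sequential alarm logic would then act on runs of such flags.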

  13. Online Detection of Anomalous Sub-trajectories: A Sliding Window Approach Based on Conformal Anomaly Detection and Local Outlier Factor

    OpenAIRE

    Laxhammar , Rikard; Falkman , Göran

    2012-01-01

    Part 4: First Conformal Prediction and Its Applications Workshop (COPA 2012); International audience; Automated detection of anomalous trajectories is an important problem in the surveillance domain. Various algorithms based on learning of normal trajectory patterns have been proposed for this problem. Yet, these algorithms suffer from one or more of the following limitations: First, they are essentially designed for offline anomaly detection in databases. Second, they are insensitive to loca...

  14. Detection of prostate cancer with complexed PSA and complexed/total PSA ratio - is there any advantage?

    OpenAIRE

    Strittmatter, F; Stieber, P; Nagel, D; Füllhase, C; Walther, S; Stief, CG; Waidelich, R

    2011-01-01

    Abstract Objective To evaluate the performance of total PSA (tPSA), the free/total PSA ratio (f/tPSA), complexed PSA (cPSA) and the complexed/total PSA ratio (c/tPSA) in prostate cancer detection. Methods Frozen sera of 442 patients have been analysed for tPSA, free PSA (fPSA) and cPSA. 131 patients had prostate cancer and 311 patients benign prostatic hyperplasia. Results Differences in the distribution of the biomarkers were seen as follows: tPSA, cPSA and c/tPSA were significantly higher i...

  15. Eigenvalue ratio detection based on exact moments of smallest and largest eigenvalues

    KAUST Repository

    Shakir, Muhammad; Tang, Wuchen; Rao, Anlei; Imran, Muhammad Ali; Alouini, Mohamed-Slim

    2011-01-01

    Detection based on the eigenvalues of the received-signal covariance matrix is currently one of the most effective solutions to the spectrum sensing problem in cognitive radios. However, the results of these schemes always depend on asymptotic assumptions, since the closed-form expression of the exact eigenvalue-ratio distribution is exceptionally complex to compute in practice. In this paper, a non-asymptotic spectrum sensing approach to approximate the extreme eigenvalues is introduced: a Gaussian approximation based on the exact analytical moments of the extreme eigenvalues. In this approach, the extreme eigenvalues are treated as dependent Gaussian random variables whose joint probability density function (PDF) is approximated by a bivariate Gaussian distribution for any number of cooperating secondary users and received samples. The definition of a copula is used to analyze the extent of the dependency between the extreme eigenvalues. The decision threshold based on the ratio of the dependent Gaussian extreme eigenvalues is then derived. The performance of the newly proposed approach is compared with the previously published asymptotic Tracy-Widom approximation approach. © 2011 ICST.
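The underlying test statistic, the ratio of the largest to smallest eigenvalue of the received-signal sample covariance, can be sketched with synthetic data (antenna counts, sample sizes, and the decision threshold are illustrative, not the paper's values):

```python
import numpy as np

def eigenvalue_ratio_statistic(Y):
    """Max/min eigenvalue ratio of the sample covariance of received samples.

    Y: (n_antennas, n_samples). Under noise only, the eigenvalues cluster
    and the ratio stays near 1; a primary-user signal common to the
    antennas inflates the largest eigenvalue and hence the ratio.
    """
    R = Y @ Y.conj().T / Y.shape[1]     # sample covariance matrix
    eig = np.linalg.eigvalsh(R)         # ascending real eigenvalues
    return eig[-1] / eig[0]
```

Spectrum sensing then amounts to comparing this ratio against a threshold, which is what the exact-moment Gaussian approximation in the abstract is used to set.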

  16. Performance and sensitivity analysis of the generalized likelihood ratio method for failure detection. M.S. Thesis

    Science.gov (United States)

    Bueno, R. A.

    1977-01-01

    Results of the generalized likelihood ratio (GLR) technique for the detection of failures in aircraft application are presented, and its relationship to the properties of the Kalman-Bucy filter is examined. Under the assumption that the system is perfectly modeled, the detectability and distinguishability of four failure types are investigated by means of analysis and simulations. Detection of failures is found satisfactory, but problems in identifying correctly the mode of a failure may arise. These issues are closely examined as well as the sensitivity of GLR to modeling errors. The advantages and disadvantages of this technique are discussed, and various modifications are suggested to reduce its limitations in performance and computational complexity.

  17. Tourism Demand in Catalonia: detecting external economic factors

    OpenAIRE

    Clavería González, Óscar; Datzira, Jordi

    2009-01-01

    There is a lack of studies on tourism demand in Catalonia. To fill the gap, this paper focuses on detecting the macroeconomic factors that determine tourism demand in Catalonia. We also analyse the relation between these factors and tourism demand. Despite the strong seasonal component and the outliers in the time series of some countries, overnight stays give a better indication of tourism demand in Catalonia than the number of tourists. The degree of linear association between the macroecon...

  18. Engaging children in the development of obesity interventions: Exploring outcomes that matter most among obesity positive outliers.

    Science.gov (United States)

    Sharifi, Mona; Marshall, Gareth; Goldman, Roberta E; Cunningham, Courtney; Marshall, Richard; Taveras, Elsie M

    2015-11-01

    To explore outcomes and measures of success that matter most to 'positive outlier' children who improved their body mass index (BMI) despite living in obesogenic neighborhoods. We collected residential address and longitudinal height/weight data from electronic health records of 22,657 children ages 6-12 years in Massachusetts. We defined obesity "hotspots" as zip codes where >15% of children had a BMI ≥95th percentile. Using linear mixed effects models, we generated a BMI z-score slope for each child with a history of obesity. We recruited 10-12 year-olds with negative slopes living in hotspots for focus groups. We analyzed group transcripts and discussed emerging themes in iterative meetings using an immersion/crystallization approach. We reached thematic saturation after 4 focus groups with 21 children. Children identified bullying and negative peer comparisons related to physical appearance, clothing size, and athletic ability as motivating them to achieve a healthier weight, and they measured success as improvement in these domains. Positive relationships with friends and family facilitated both behavior change initiation and maintenance. The perspectives of positive outlier children can provide insight into children's motivations leading to successful obesity management. Child/family engagement should guide the development of patient-centered obesity interventions. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  19. Entropy Measures for Stochastic Processes with Applications in Functional Anomaly Detection

    Directory of Open Access Journals (Sweden)

    Gabriel Martos

    2018-01-01

    We propose a definition of entropy for stochastic processes. We provide a reproducing kernel Hilbert space model to estimate entropy from a random sample of realizations of a stochastic process, namely functional data, and introduce two approaches to estimate minimum entropy sets. These sets are relevant to detect anomalous or outlier functional data. A numerical experiment illustrates the performance of the proposed method; in addition, we conduct an analysis of mortality rate curves as an interesting application in a real-data context to explore functional anomaly detection.

  20. Balanced detection for self-mixing interferometry to improve signal-to-noise ratio

    Science.gov (United States)

    Zhao, Changming; Norgia, Michele; Li, Kun

    2018-01-01

    We apply balanced detection to self-mixing interferometry for displacement and vibration measurement, using two photodiodes to implement a differential acquisition. The method is based on the phase opposition of the self-mixing signal measured between the two laser diode facet outputs. The balanced signal is obtained by amplifying the self-mixing signal while canceling common-mode noise, mainly due to disturbances on the laser supply and the transimpedance amplifier. Experimental results demonstrate that the signal-to-noise ratio improves significantly, with nearly a twofold signal enhancement and more than a halving of the noise. This method allows for more robust, longer-distance measurement systems, especially using fringe counting.
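The differential acquisition amounts to subtracting the two photodiode signals. A synthetic sketch (signal shapes and noise levels are illustrative only) shows how the phase-opposed fringes add while common-mode noise cancels:

```python
import numpy as np

def balanced_signal(pd_front, pd_rear):
    """Differential acquisition: the self-mixing fringes appear in phase
    opposition on the two laser facets, so the difference doubles the
    fringe amplitude while common-mode noise (supply disturbances,
    amplifier pickup) cancels."""
    return pd_front - pd_rear

t = np.linspace(0.0, 1.0, 1000)
fringes = np.sin(2 * np.pi * 40 * t)               # self-mixing fringe signal
common_noise = 0.5 * np.sin(2 * np.pi * 50 * t)    # e.g. supply ripple
pd_front = fringes + common_noise
pd_rear = -fringes + common_noise
out = balanced_signal(pd_front, pd_rear)           # equals 2 * fringes
```

The recovered signal has twice the fringe amplitude and none of the common-mode component, mirroring the SNR gain reported in the abstract.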

  1. Detection of counterfeit antiviral drug Heptodin and classification of counterfeits using isotope amount ratio measurements by multicollector inductively coupled plasma mass spectrometry (MC-ICPMS) and isotope ratio mass spectrometry (IRMS).

    Science.gov (United States)

    Santamaria-Fernandez, Rebeca; Hearn, Ruth; Wolff, Jean-Claude

    2009-06-01

    Isotope ratio mass spectrometry (IRMS) and multicollector inductively coupled plasma mass spectrometry (MC-ICP-MS) are highly important techniques that can provide forensic evidence that otherwise would not be available. MC-ICP-MS has proved to be a very powerful tool for measuring isotope amount ratios with high precision and accuracy. In this work, the potential of combining isotope amount ratio measurements performed by MC-ICP-MS and IRMS for the detection of counterfeit pharmaceutical tablets has been investigated. An extensive study for the antiviral drug Heptodin has been performed for several isotopic ratios, combining MC-ICP-MS and elemental analyser IRMS (EA-IRMS) for stable isotope amount ratio measurements. The study has been carried out for 139 batches of the antiviral drug, and analyses have been performed for C, S, N and Mg isotope ratios. Authenticity ranges have been obtained for each isotopic system and combined to generate a unique multi-isotopic pattern only present in the genuine tablets. Counterfeit tablets have then been identified as those tablets with an isotopic fingerprint outside the genuine isotopic range. The combination of these two techniques therefore has great potential for pharmaceutical counterfeit detection. A much greater power of discrimination is obtained when at least three isotopic systems are combined. The data from these studies could be presented as evidence in court, and therefore methods need to be validated to support their credibility. It is also crucial to be able to produce uncertainty values associated with the isotope amount ratio measurements so that significant differences can be identified and the genuineness of a sample can be assessed.

  2. Genomic outlier profile analysis: mixture models, null hypotheses, and nonparametric estimation.

    Science.gov (United States)

    Ghosh, Debashis; Chinnaiyan, Arul M

    2009-01-01

    In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.

  3. Optical redox ratio and endogenous porphyrins in the detection of urinary bladder cancer: A patient biopsy analysis.

    Science.gov (United States)

    Palmer, Scott; Litvinova, Karina; Dunaev, Andrey; Yubo, Ji; McGloin, David; Nabi, Ghulam

    2017-08-01

    Bladder cancer is among the most common cancers in the UK, and conventional detection techniques suffer from low sensitivity, low specificity, or both. Recent attempts to address these shortcomings have led to progress in the field of autofluorescence as a means to diagnose the disease with high efficiency; however, much remains unknown about autofluorescence profiles in the disease. The multi-functional diagnostic system "LAKK-M" was used to assess autofluorescence profiles of healthy and cancerous bladder tissue to identify novel biomarkers of the disease. Statistically significant differences were observed in the optical redox ratio (a measure of tissue metabolic activity), the amplitude of endogenous porphyrins and the NADH/porphyrin ratio between tissue types. These findings could advance understanding of bladder cancer and aid in the development of new techniques for detection and surveillance. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. The internal structure of eclogite-facies ophiolite complexes: Implications from the Austroalpine outliers within the Zermatt-Saas Zone, Western Alps

    Science.gov (United States)

    Weber, Sebastian; Martinez, Raul

    2016-04-01

    The Western Alpine Penninic domain is a classical accretionary prism that formed after the closure of the Penninic oceans in the Paleogene. Continental and oceanic nappes were telescoped into the Western Alpine stack associated with continent-continent collision. Within the Western Alpine geologic framework, the ophiolite nappes of the Zermatt-Saas Zone and the Tsate Unit are the remnants of the southern branch of the Piemonte-Liguria ocean basin. In addition, a series of continental basement slices reported as lower Austroalpine outliers have preserved an eclogitic high-pressure imprint, and are tectonically sandwiched between these oceanic nappes. Since the outliers occur at an unusual intra-ophiolitic setting and show a polymetamorphic character, this group of continental slices is of special importance for understanding the tectono-metamorphic evolution of Western Alps. Recently, more geochronological data from the Austroalpine outliers have become available that make it possible to establish a more complete picture of their complex geological history. The Lu-Hf garnet-whole rock ages for prograde growth of garnet fall into the time interval of 52 to 62 Ma (Weber et al., 2015, Fassmer et al. 2015), but are consistently higher than the Lu-Hf garnet-whole rock ages from several other locations throughout the Zermatt-Saas zone that range from 52 to 38 Ma (Skora et al., 2015). This discrepancy suggests that the Austroalpine outliers may have been subducted earlier than the ophiolites of the Zermatt-Saas Zone and therefore have been tectonically emplaced into their present intra-ophiolite position. This points to the possibility that the Zermatt-Saas Zone consists of tectonic subunits, which reached their respective pressure peaks over a prolonged time period, approximately 10-20 Ma. The pressure-temperature estimates from several members of the Austroalpine outliers indicate a complex distribution of metamorphic peak conditions, without ultrahigh

  5. Robust nonhomogeneous training samples detection method for space-time adaptive processing radar using sparse-recovery with knowledge-aided

    Science.gov (United States)

    Li, Zhihui; Liu, Hanwei; Zhang, Yongshun; Guo, Yiduo

    2017-10-01

    The performance of space-time adaptive processing (STAP) may degrade significantly when some of the training samples are contaminated by signal-like components (outliers) in nonhomogeneous clutter environments. To remove such contaminated training samples, a robust nonhomogeneous training sample detection method using sparse recovery (SR) with knowledge-aided (KA) processing is proposed. First, the reduced-dimension (RD) overcomplete spatial-temporal steering dictionary is designed with prior knowledge of the system parameters and the possible target region. Second, the clutter covariance matrix (CCM) of the cell under test is efficiently estimated using a modified focal underdetermined system solver (FOCUSS) algorithm, where the RD overcomplete spatial-temporal steering dictionary is applied. Third, the proposed statistics are formed by combining the estimated CCM with the generalized inner product (GIP) method, and the contaminated training samples can be detected and removed. Finally, several simulation results validate the effectiveness of the proposed KA-SR-GIP method.
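The GIP screening step alone, without the knowledge-aided sparse-recovery covariance estimate, can be sketched as follows (dimensions and the contamination model are hypothetical):

```python
import numpy as np

def gip_statistics(snapshots, R):
    """Generalised inner product x^H R^{-1} x for each training snapshot.

    Snapshots contaminated by signal-like outliers yield unusually large
    GIP values and can be screened out before covariance estimation.
    """
    R_inv = np.linalg.inv(R)
    return np.array([float(np.real(x.conj() @ R_inv @ x)) for x in snapshots])
```

In the full method the covariance R fed to the GIP comes from the KA-SR estimate of the cell under test, rather than from the (possibly contaminated) training data themselves.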

  6. Hybrid online sensor error detection and functional redundancy for systems with time-varying parameters.

    Science.gov (United States)

    Feng, Jianyuan; Turksoy, Kamuran; Samadi, Sediqeh; Hajizadeh, Iman; Littlejohn, Elizabeth; Cinar, Ali

    2017-12-01

    Supervision and control systems rely on signals from sensors to receive information to monitor the operation of a system and adjust manipulated variables to achieve the control objective. However, sensor performance is often limited by working conditions, and sensors may also be subject to interference from other devices. Many different types of sensor errors, such as outliers, missing values, drifts and corruption with noise, may occur during process operation. A hybrid online sensor error detection and functional redundancy system is developed to detect errors in online signals and replace erroneous or missing values with model-based estimates. The proposed hybrid system relies on two techniques, an outlier-robust Kalman filter (ORKF) and a locally-weighted partial least squares (LW-PLS) regression model, which leverage the advantages of automatic measurement error elimination with ORKF and data-driven prediction with LW-PLS. The system includes a nominal angle analysis (NAA) method to distinguish between signal faults and large changes in sensor values caused by real dynamic changes in process operation. The performance of the system is illustrated with clinical data from continuous glucose monitoring (CGM) sensors in people with type 1 diabetes. More than 50,000 CGM sensor errors were added to the original CGM signals from 25 clinical experiments, and the performance of the error detection and functional redundancy algorithms was then analyzed. The results indicate that the proposed system can successfully detect most of the erroneous signals and substitute them with reasonable estimates computed by the functional redundancy system.

  7. Engaging children in the development of obesity interventions: exploring outcomes that matter most among obesity positive outliers

    OpenAIRE

    Sharifi, Mona; Marshall, Gareth; Goldman, Roberta E.; Cunningham, Courtney; Marshall, Richard; Taveras, Elsie M

    2015-01-01

    Objective To explore outcomes and measures of success that matter most to 'positive outlier' children who improved their body mass index (BMI) despite living in obesogenic neighborhoods. Methods We collected residential address and longitudinal height/weight data from electronic health records of 22,657 children ages 6–12 years in Massachusetts. We defined obesity “hotspots” as zip codes where >15% of children had a BMI ≥95th percentile. Using linear mixed effects models, we gener...

  8. Ratio-metric sensor to detect riboflavin via fluorescence resonance energy transfer with ultrahigh sensitivity

    Science.gov (United States)

    Wang, Jilong; Su, Siheng; Wei, Junhua; Bahgi, Roya; Hope-Weeks, Louisa; Qiu, Jingjing; Wang, Shiren

    2015-08-01

    In this paper, a novel fluorescence resonance energy transfer (FRET) ratio-metric fluorescent probe based on heteroatom N, S doped carbon dots (N, S-CDs) was developed to determine riboflavin in aqueous solutions. The ratio of two emission intensities at different wavelengths is used to determine the concentration of riboflavin (RF). This approach is more effective in reducing background interference and fluctuations across diverse conditions. The probe therefore achieves high sensitivity, with a limit of detection (LOD) of 1.9 nM (0.7 ng/ml), which is among the best of all riboflavin detection approaches and far better than single-wavelength intensity detection (1.9 μM). In addition, the sensor detects riboflavin in deionized water (pH = 7) with high selectivity in the presence of other biochemicals such as amino acids. Moreover, riboflavin in aqueous solution is very sensitive to sunlight and can degrade to lumiflavin, which is toxic. Because the N, S doped carbon dots cannot serve as an energy donor for lumiflavin, the system makes it easy to determine whether the riboflavin has degraded, which is reported here for the first time. This platform may provide possibilities to build a new and facile fluorescence resonance energy transfer based sensor to detect analytes and metamorphous analytes in aqueous solution.

  9. Exploiting the information content of hydrological ''outliers'' for goodness-of-fit testing

    Directory of Open Access Journals (Sweden)

    F. Laio

    2010-10-01

    Validation of probabilistic models based on goodness-of-fit tests is an essential step for the frequency analysis of extreme events. The outcome of standard testing techniques, however, is mainly determined by the behavior of the hypothetical model, FX(x), in the central part of the distribution, while the behavior in the tails of the distribution, which is indeed very relevant in hydrological applications, is relatively unimportant for the results of the tests. The maximum-value test, originally proposed as a technique for outlier detection, is a suitable, but seldom applied, technique that addresses this problem. The test is specifically targeted to verify if the maximum (or minimum) values in the sample are consistent with the hypothesis that the distribution FX(x) is the real parent distribution. The application of this test is hindered by the fact that the critical values for the test should be numerically obtained when the parameters of FX(x) are estimated on the same sample used for verification, which is the standard situation in hydrological applications. We propose here a simple, analytically explicit, technique to suitably account for this effect, based on the application of censored L-moments estimators of the parameters. We demonstrate, with an application that uses artificially generated samples, the superiority of this modified maximum-value test with respect to the standard version of the test. We also show that the test has comparable or larger power with respect to other goodness-of-fit tests (e.g., the chi-squared test, the Anderson-Darling test, and the Fung and Paul test), in particular when dealing with small samples (sample size lower than 20–25) and when the parent distribution is similar to the distribution being tested.
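The core quantity of the maximum-value test, assuming the hypothesised distribution is fully specified, is the tail probability of the sample maximum. A minimal sketch follows; note the paper's point is precisely that when the parameters are estimated from the same sample, corrected critical values are needed, which this sketch does not do.

```python
import math

def max_value_test_pvalue(sample_max, cdf, n):
    """P-value of the maximum-value test: the probability that the largest
    of n i.i.d. draws from the hypothesised distribution F_X exceeds the
    observed sample maximum, i.e. 1 - F_X(max)^n."""
    return 1.0 - cdf(sample_max) ** n

# Example: is a maximum of 10 plausible among 100 draws from Exp(1)?
exp_cdf = lambda x: 1.0 - math.exp(-x)
p = max_value_test_pvalue(10.0, exp_cdf, 100)   # small p: max is suspicious
```

A small p-value indicates the observed maximum is inconsistent with the hypothesised parent distribution, exactly the tail-sensitive behaviour the abstract argues standard tests lack.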

  10. Robust w-Estimators for Cryo-EM Class Means

    Science.gov (United States)

    Huang, Chenxi; Tagare, Hemant D.

    2016-01-01

    A critical step in cryogenic electron microscopy (cryo-EM) image analysis is to calculate the average of all images aligned to a projection direction. This average, called the “class mean”, improves the signal-to-noise ratio in single particle reconstruction (SPR). The averaging step is often compromised because of outlier images of ice, contaminants, and particle fragments. Outlier detection and rejection in the majority of current cryo-EM methods is done using cross-correlation with a manually determined threshold. Empirical assessment shows that the performance of these methods is very sensitive to the threshold. This paper proposes an alternative: a “w-estimator” of the average image, which is robust to outliers and which does not use a threshold. Various properties of the estimator, such as consistency and influence function are investigated. An extension of the estimator to images with different contrast transfer functions (CTFs) is also provided. Experiments with simulated and real cryo-EM images show that the proposed estimator performs quite well in the presence of outliers. PMID:26841397
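A generic w-estimator of a class mean can be sketched with smooth Cauchy-type weights: outlier images are down-weighted continuously instead of being rejected by a hard cross-correlation threshold. This illustrates the idea only; the weight function, tuning constant, and data are not the paper's.

```python
import numpy as np

def robust_w_mean(images, n_iter=10, c=2.0):
    """Iteratively reweighted average of flattened images, shape (n, d).

    Each image's weight shrinks smoothly with its distance from the
    current mean (Cauchy-type weights), so no explicit threshold is needed.
    """
    mean = images.mean(axis=0)
    for _ in range(n_iter):
        resid = np.linalg.norm(images - mean, axis=1)   # distance to mean
        scale = np.median(resid) + 1e-12                # robust scale
        w = 1.0 / (1.0 + (resid / (c * scale)) ** 2)    # down-weight outliers
        mean = (w[:, None] * images).sum(axis=0) / w.sum()
    return mean
```

With a few gross outliers (e.g. ice or contaminant images) the plain average is pulled far from the true class mean, while the weighted estimate stays close to it.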

  11. A User-Adaptive Algorithm for Activity Recognition Based on K-Means Clustering, Local Outlier Factor, and Multivariate Gaussian Distribution

    Directory of Open Access Journals (Sweden)

    Shizhen Zhao

    2018-06-01

    Mobile activity recognition is significant to the development of human-centric pervasive applications including elderly care, personalized recommendations, etc. Nevertheless, the distribution of inertial sensor data can be influenced to a great extent by varying users. This means that the performance of an activity recognition classifier trained on one user's dataset will degrade when transferred to others. In this study, we focus on building a personalized classifier to detect four categories of human activities: light intensity activity, moderate intensity activity, vigorous intensity activity, and fall. In order to solve the problem caused by different distributions of inertial sensor signals, a user-adaptive algorithm based on K-Means clustering, local outlier factor (LOF), and multivariate Gaussian distribution (MGD) is proposed. To automatically cluster and annotate a specific user's activity data, an improved K-Means algorithm with a novel initialization method is designed. By quantifying the informative degree of samples in a labeled individual dataset, the most profitable samples can be selected for activity recognition model adaptation. Through experiments, we conclude that our proposed models can adapt to new users with good recognition performance.
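The LOF building block of such a pipeline can be illustrated with scikit-learn. This is a sketch of the outlier-screening stage only, omitting the K-Means and multivariate-Gaussian stages; the feature data are synthetic.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(42)
X = rng.normal(0.0, 0.5, size=(100, 2))    # one user's normal activity features
X = np.vstack([X, [[10.0, 10.0]]])         # a grossly anomalous sample

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)                # -1 = outlier, 1 = inlier
```

LOF scores each point by its local density relative to its neighbours, so it flags samples that are isolated with respect to that particular user's data distribution.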

  12. A flexible spatial scan statistic with a restricted likelihood ratio for detecting disease clusters.

    Science.gov (United States)

    Tango, Toshiro; Takahashi, Kunihiko

    2012-12-30

    Spatial scan statistics are widely used tools for detection of disease clusters. Especially, the circular spatial scan statistic proposed by Kulldorff (1997) has been utilized in a wide variety of epidemiological studies and disease surveillance. However, as it cannot detect noncircular, irregularly shaped clusters, many authors have proposed different spatial scan statistics, including the elliptic version of Kulldorff's scan statistic. The flexible spatial scan statistic proposed by Tango and Takahashi (2005) has also been used for detecting irregularly shaped clusters. However, this method sets a feasible limitation of a maximum of 30 nearest neighbors for searching candidate clusters because of heavy computational load. In this paper, we show a flexible spatial scan statistic implemented with a restricted likelihood ratio proposed by Tango (2008) to (1) eliminate the limitation of 30 nearest neighbors and (2) to have surprisingly much less computational time than the original flexible spatial scan statistic. As a side effect, it is shown to be able to detect clusters with any shape reasonably well as the relative risk of the cluster becomes large via Monte Carlo simulation. We illustrate the proposed spatial scan statistic with data on mortality from cerebrovascular disease in the Tokyo Metropolitan area, Japan. Copyright © 2012 John Wiley & Sons, Ltd.
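For a single candidate zone, the Poisson log-likelihood ratio that the scan statistic maximises over zones can be sketched as follows; the restricted variant additionally constrains the within-zone risk, which this generic sketch omits.

```python
import math

def poisson_llr(c, e, C):
    """Kulldorff-type Poisson log-likelihood ratio for one candidate zone.

    c: observed cases in the zone, e: expected cases in the zone,
    C: total observed cases (expected counts standardised to sum to C).
    Returns 0 unless the zone shows excess risk (c > e).
    """
    if c <= e or c == 0 or c == C:
        return 0.0
    return c * math.log(c / e) + (C - c) * math.log((C - c) / (C - e))
```

The scan statistic is the maximum of this quantity over all candidate zones (circles, ellipses, or the flexibly shaped sets of connected regions discussed above), with significance assessed by Monte Carlo replication.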

  13. OutRank

    DEFF Research Database (Denmark)

    Müller, Emmanuel; Assent, Ira; Steinhausen, Uwe

    2008-01-01

    Outlier detection is an important data mining task for consistency checks, fraud detection, etc. Binary decision making on whether or not an object is an outlier is not appropriate in many applications and moreover hard to parametrize. Thus, recently, methods for outlier ranking have been proposed...

  14. DC-to-AC inverter ratio failure detector

    Science.gov (United States)

    Ebersole, T. J.; Andrews, R. E.

    1975-01-01

    The failure detection technique is based upon input-output ratios, which are independent of inverter loading. Since the inverter has a fixed relationship between V-in/V-out and I-in/I-out, the failure detection criteria are based on this ratio, which is simply the inverter transformer turns ratio, K, equal to primary turns divided by secondary turns.
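The detection criterion reduces to checking the measured input/output ratio against the turns ratio K. A minimal sketch, with a hypothetical tolerance:

```python
def inverter_ok(v_in, v_out, turns_ratio, tol=0.05):
    """Flag an inverter failure from terminal measurements alone.

    A healthy inverter's V_in/V_out equals the transformer turns ratio
    K = primary turns / secondary turns, independent of loading. A
    deviation beyond a fractional tolerance (tol, hypothetical here)
    indicates a failure. The same check applies to I_in/I_out.
    """
    return abs(v_in / v_out - turns_ratio) <= tol * turns_ratio
```

Because the check uses a ratio rather than absolute levels, it does not raise false alarms when the load, and hence both voltages together, changes.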

  15. Urine Galactomannan-to-Creatinine Ratio for Detection of Invasive Aspergillosis in Patients with Hematological Malignancies

    OpenAIRE

    Reischies, Frederike M. J.; Raggam, Reinhard B.; Prattes, Juergen; Krause, Robert; Eigl, Susanne; List, Agnes; Quehenberger, Franz; Strenger, Volker; Wölfler, Albert; Hoenigl, Martin

    2016-01-01

    Galactomannan (GM) testing of urine specimens may provide important advantages, compared to serum testing, such as easy noninvasive sample collection. We evaluated a total of 632 serial urine samples from 71 patients with underlying hematological malignancies and found that the urine GM/creatinine ratio, i.e., (urine GM level × 100)/urine creatinine level, which takes urine dilution into account, reliably detected invasive aspergillosis and may be a promising diagnostic tool for patients with...

  16. ¿Se pueden predecir geográficamente los resultados electorales? Una aplicación del análisis de clusters y outliers espaciales

    Directory of Open Access Journals (Sweden)

    Carlos J. Vilalta Perdomo

    2008-01-01

    Full Text Available The results of this study show that, by applying spatial statistics to electoral geography, it is possible to predict electoral outcomes. The geographic concepts of spatial cluster and spatial outlier are used, with socioeconomic spatial segregation as the predictive variable. The statistical techniques employed are Moran's global and local indices of spatial autocorrelation and linear regression analysis. For the data analyzed it is found that: (1) Mexico City has spatial clusters of electoral support and of marginalization; (2) there are spatial outliers of marginalization; (3) the political parties exclude one another geographically; and (4) their results depend significantly on the levels of spatial segregation in the city.

  17. Feature learning and change feature classification based on deep learning for ternary change detection in SAR images

    Science.gov (United States)

    Gong, Maoguo; Yang, Hailun; Zhang, Puzhao

    2017-07-01

    Ternary change detection aims to detect changes and group them into positive change and negative change. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar (SAR) images. In this study, a sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve the ternary change detection problem without any supervision. First, the sparse autoencoder transforms the log-ratio difference image into a feature space suitable for extracting key changes while suppressing outliers and noise. The learned features are then clustered into three classes, which serve as pseudo labels for training a CNN model as the change feature classifier. Reliable training samples for the CNN are selected from the feature maps learned by the sparse autoencoder according to certain selection rules. Given the training samples and the corresponding pseudo labels, the CNN model is trained by back propagation with stochastic gradient descent. During training, the CNN is driven to learn the concept of change, yielding a more powerful model for distinguishing different types of change. Unlike traditional methods, the proposed framework integrates the merits of the sparse autoencoder and the CNN to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.
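
    A greatly simplified sketch of the first stages, a log-ratio difference image followed by unsupervised three-class clustering to produce pseudo labels, can be written with scikit-learn (the autoencoder and CNN stages are omitted; the toy images and the plain KMeans stand-in are my assumptions, not the paper's pipeline):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two co-registered toy "SAR intensity" images (strictly positive values)
img1 = rng.uniform(1.0, 2.0, size=(32, 32))
img2 = img1.copy()
img2[:10, :] *= 3.0    # region of positive change
img2[20:, :] *= 0.3    # region of negative change

# The log-ratio difference image turns multiplicative speckle into
# additive noise and centers "no change" near zero
log_ratio = np.log(img2 / img1)

# Cluster pixels into 3 classes: negative change / no change / positive change
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    log_ratio.reshape(-1, 1)).reshape(32, 32)
print(np.unique(labels))
```

    In the paper, the clustering is applied to autoencoder features rather than raw log-ratio values, and the resulting pseudo labels then supervise the CNN classifier.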

  18. Gas detection by correlation spectroscopy employing a multimode diode laser.

    Science.gov (United States)

    Lou, Xiutao; Somesfalean, Gabriel; Zhang, Zhiguo

    2008-05-01

    A gas sensor based on the gas-correlation technique has been developed using a multimode diode laser (MDL) in a dual-beam detection scheme. Measurement of CO(2) mixed with CO as an interfering gas is successfully demonstrated using a 1570 nm tunable MDL. Despite overlapping absorption spectra and occasional mode hops, the interfering signals can be effectively excluded by a statistical procedure including correlation analysis and outlier identification. The gas concentration is retrieved from several pair-correlated signals by a linear-regression scheme, yielding a reliable and accurate measurement. This demonstrates the utility of the unsophisticated MDLs as novel light sources for gas detection applications.

  19. Estimation of Non-Revenue Water Ratio for Sustainable Management Using Artificial Neural Network and Z-Score in Incheon, Republic of Korea

    Directory of Open Access Journals (Sweden)

    Dongwoo Jang

    2017-10-01

    Full Text Available The non-revenue water (NRW) ratio in a water distribution system is the ratio of the loss due to unbilled authorized consumption, apparent losses and real losses to the overall system input volume (SIV). Estimating the NRW ratio by measurement may not be possible in an area with no district metered areas (DMAs) or with unclear administrative districts. Multiple regression analysis is a statistical method for calculating the NRW ratio from the main parameters of the water distribution system, although its disadvantage is lower accuracy than the measured NRW ratio. In this study, an artificial neural network (ANN) was used to estimate the NRW ratio. The results of the study proved that the NRW ratio calculated by the ANN model was more accurate than that from multiple regression analysis. The accuracy of the developed ANN model was shown to vary with the number of neurons in the hidden layer; therefore, when using the ANN model, the optimal number of neurons must be determined. In addition, accuracy was higher when outliers were removed than when the original data were used.
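
    The z-score screening mentioned in the title can be sketched as follows (the threshold of 2 standard deviations and the sample NRW values are illustrative, not from the study):

```python
import numpy as np

def zscore_filter(y, threshold=2.0):
    """Boolean mask keeping points within `threshold` standard deviations
    of the sample mean; points outside are treated as outliers."""
    z = (y - y.mean()) / y.std()
    return np.abs(z) < threshold

# Hypothetical NRW ratios (%) for several districts, one gross outlier
nrw = np.array([12.1, 13.4, 11.8, 12.9, 45.0, 13.1, 12.5])
mask = zscore_filter(nrw)
print(nrw[mask])   # the 45.0 reading is dropped before model training
```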

  20. Deep gray matter demyelination detected by magnetization transfer ratio in the cuprizone model.

    Directory of Open Access Journals (Sweden)

    Sveinung Fjær

    Full Text Available In multiple sclerosis (MS), the correlation between lesion load on conventional magnetic resonance imaging (MRI) and clinical disability is weak. This clinico-radiological paradox might partly be due to the low sensitivity of conventional MRI for detecting gray matter demyelination. Magnetization transfer ratio (MTR) has previously been shown to detect white matter demyelination in mice. In this study, we investigated whether MTR can detect gray matter demyelination in cuprizone-exposed mice. A total of 54 female C57BL/6 mice were split into one control group and eight cuprizone-exposed groups ([Formula: see text]). The mice were exposed to [Formula: see text] (w/w) cuprizone for up to six weeks. MTR images were obtained on a 7 Tesla Bruker MR scanner before cuprizone exposure, weekly for six weeks during cuprizone exposure, and once two weeks after termination of cuprizone exposure. Immunohistochemical staining for myelin (anti-Proteolipid Protein) and oligodendrocytes (anti-Neurite Outgrowth Inhibitor Protein A) was obtained after each weekly scanning. Rates of MTR change and correlations between MTR values and histological findings were calculated in five brain regions. In the corpus callosum and the deep gray matter a significant rate of MTR decrease was found, [Formula: see text] per week ([Formula: see text]) and [Formula: see text] per week ([Formula: see text]), respectively. The MTR values correlated with myelin loss as evaluated by immunohistochemistry (corpus callosum: [Formula: see text]; deep gray matter: [Formula: see text]), but did not correlate with oligodendrocyte density. Significant results were not found in the cerebellum, the olfactory bulb or the cerebral cortex. This study shows that MTR can be used to detect demyelination in the deep gray matter, which is of particular interest for imaging of patients with MS, as deep gray matter demyelination is common in MS and is not easily detected on conventional clinical MRI.

  1. Association of apolipoprotein b/apolipoprotein A1 ratio and coronary artery stenosis and plaques detected by multi-detector computed tomography in healthy population.

    Science.gov (United States)

    Jung, Chang Hee; Hwang, Jenie Yoonoo; Shin, Mi Seon; Yu, Ji Hee; Kim, Eun Hee; Bae, Sung Jin; Yang, Dong Hyun; Kang, Joon-Won; Park, Joong-Yeol; Kim, Hong-Kyu; Lee, Woo Je

    2013-05-01

    Despite the noninvasiveness and accuracy of multidetector computed tomography (MDCT), its use as a routine screening tool for occult coronary atherosclerosis is unclear. We investigated whether the ratio of apolipoprotein B (apoB) to apolipoprotein A1 (apoA1), an indicator of the balance between atherogenic and atheroprotective cholesterol transport, could predict occult coronary atherosclerosis detected by MDCT. We collected data on 1,401 subjects (877 men and 524 women) who participated in a routine health screening examination at Asan Medical Center. Significant coronary artery stenosis, defined as >50% stenosis, was detected in 114 subjects (8.1%). An increase in apoB/A1 quartiles was associated with increased percentages of subjects with significant coronary stenosis and noncalcified plaques (NCAP). After adjustment for confounding variables, each 0.1 increase in serum apoB/A1 was significantly associated with increased odds ratios (ORs) for coronary stenosis and NCAP of 1.23 and 1.18, respectively. The optimal apoB/A1 ratio cutoff value for MDCT detection of significant coronary stenosis was 0.58, with a sensitivity of 70.2% and a specificity of 48.2% (area under the curve, 0.61; 95% CI, 0.58-0.63; P < 0.001). Our results indicate that the apoB/A1 ratio is a good indicator of occult coronary atherosclerosis detected by coronary MDCT.

  2. The B(E2;4^+1->2^+1) / B(E2;2^+1->0^+1) Ratio in Even-Even Nuclei

    Science.gov (United States)

    Loelius, C.; Sharon, Y. Y.; Zamick, L.; Gürdal, G.

    2009-10-01

    We considered 207 even-even nuclei throughout the chart of nuclides for which the NNDC tables had data on the energies and lifetimes of the 2^+1 and 4^+1 states. Using these data we calculated for each nucleus the electric quadrupole transition strengths B(E2;4^+1->2^+1) and B(E2;2^+1->0^+1), as well as their ratio. The internal conversion coefficients were obtained using the NNDC HSICC calculator. For each nucleus we plotted the B(E2) ratio against A, N, and Z. We found that for close to 90% of the nuclei considered the ratio had values between 0.5 and 2.5. Most of the outliers had magic numbers of protons or neutrons. Our ratio results were compared with the theoretical predictions for this ratio by different models--10/7 in the rotational model and 2 in the simplest vibrational model. In the rotational regions (for 150 < A < 190 and A > 220) the ratios were indeed close to 10/7. For the few nuclei thought to be vibrational the ratios were usually less than 2. Otherwise, we got a wide scatter of ratio values. Hence other models, including the NpNn scheme, must be considered in interpreting these results.
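
    The classification described above, a ratio inside the 0.5-2.5 band, a rotational benchmark of 10/7 and a vibrational benchmark of 2, can be sketched as follows (the function name and the nearest-benchmark decision rule are my own simplification of the abstract):

```python
def classify_be2_ratio(be2_42, be2_20):
    """Compare B(E2;4->2)/B(E2;2->0) against simple collective-model benchmarks."""
    ratio = be2_42 / be2_20
    if not 0.5 <= ratio <= 2.5:
        return ratio, "outlier"          # often nuclei with magic N or Z
    if abs(ratio - 10 / 7) < abs(ratio - 2):
        return ratio, "rotational-like"  # rotational model predicts 10/7
    return ratio, "vibrational-like"     # simplest vibrational model predicts 2

print(classify_be2_ratio(1.43, 1.0))
```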

  3. Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels.

    Science.gov (United States)

    Fu, Yanwei; Hospedales, Timothy M; Xiang, Tao; Xiong, Jiechao; Gong, Shaogang; Wang, Yizhou; Yao, Yuan

    2016-03-01

    The problem of estimating subjective visual properties from images and video has attracted increasing interest. A subjective visual property is useful either on its own (e.g. image and video interestingness) or as an intermediate representation for visual recognition (e.g. a relative attribute). Due to its ambiguous nature, annotating the value of a subjective visual property for learning a prediction model is challenging. To make the annotation more reliable, recent studies employ crowdsourcing tools to collect pairwise comparison labels. However, using crowdsourced data also introduces outliers. Existing methods rely on majority voting to prune annotation outliers/errors, and thus require a large number of pairwise labels to be collected. More importantly, as a local outlier detection method, majority voting is ineffective in identifying outliers that cause global ranking inconsistencies. In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning-to-rank problem, tackling outlier detection and learning to rank jointly. This differs from existing methods in that (1) the proposed method integrates local pairwise comparison labels together to minimise a cost that corresponds to global inconsistency of ranking order, and (2) the outlier detection and learning-to-rank problems are solved jointly. This not only leads to better detection of annotation outliers but also enables learning with extremely sparse annotations.

  4. Induced Voltages Ratio-Based Algorithm for Fault Detection, and Faulted Phase and Winding Identification of a Three-Winding Power Transformer

    Directory of Open Access Journals (Sweden)

    Byung Eun Lee

    2014-09-01

    Full Text Available This paper proposes an algorithm for fault detection and faulted phase and winding identification of a three-winding power transformer based on the induced voltages in the electrical power system. The ratio of the induced voltages of the primary-secondary, primary-tertiary and secondary-tertiary windings is the same as the corresponding turns ratio during normal operating conditions, magnetic inrush, and over-excitation; it differs from the turns ratio during an internal fault. For a single-phase and a three-phase power transformer with wye-connected windings, the induced voltages of each pair of windings are estimated. For a three-phase power transformer with delta-connected windings, the induced voltage differences are estimated from the line currents, because the delta winding currents are practically unavailable. Six detectors are suggested for fault detection. An additional three detectors and a rule for faulted phase and winding identification are presented as well. The proposed algorithm can not only detect an internal fault, but also identify the faulted phase and winding of a three-winding power transformer. The various test results with Electromagnetic Transients Program (EMTP)-generated data show that the proposed algorithm successfully discriminates internal faults from normal operating conditions including magnetic inrush and over-excitation. This paper concludes by implementing the algorithm into a prototype relay based on a digital signal processor.

  5. The correlation between HCN/H2O flux ratios and disk mass: evidence for protoplanet formation

    Science.gov (United States)

    Rose, Caitlin; Salyk, Colette

    2017-01-01

    We analyze hydrogen cyanide (HCN) and water vapor flux ratios in protoplanetary disks as a way to trace planet formation. Analyzing only disks in the Taurus molecular cloud, Najita et al. (2013) found a tentative correlation between protoplanetary disk mass and the HCN/H2O line flux ratio in Spitzer-IRS emission spectra. They interpret this correlation to be a consequence of more massive disks forming planetesimals more efficiently than smaller disks, as the formation of large planetesimals may lock up water ice in the cool outer disk region and prevent it from migrating, drying out the inner disk. The sequestering of water (and therefore oxygen) in the outer disk may also increase the carbon-to- oxygen ratio in the inner disk, leading to enhanced organic molecule (e.g. HCN) emission. To confirm this trend, we expand the Najita et al. sample by calculating HCN/H2O line flux ratios for 8 more sources with known disk masses from clusters besides Taurus. We find agreement with the Najita et al. trend, suggesting that this is a widespread phenomenon. In addition, we find HCN/H2O line flux ratios for 17 more sources that await disk mass measurements, which should become commonplace in the ALMA era. Finally, we investigate linear fits and outliers to this trend, and discuss possible causes.

  6. Improvement of statistical methods for detecting anomalies in climate and environmental monitoring systems

    Science.gov (United States)

    Yakunin, A. G.; Hussein, H. M.

    2018-01-01

    The article shows how known statistical methods, widely used for financial problems and in a number of other fields of science and technology, can be effectively applied after minor modification to similar problems in climate and environmental monitoring systems: detecting anomalies in the form of abrupt changes in signal level, the occurrence of positive and negative outliers, and violations of the cycle shape in periodic processes.
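
    One classic statistical tool of this kind, imported from process and financial change detection, is the CUSUM chart for abrupt level shifts. A minimal sketch (the allowance k, threshold h and the synthetic signal are illustrative, not the article's choices):

```python
import numpy as np

def cusum_shift(x, mean0, sigma, k=0.5, h=5.0):
    """One-sided CUSUM: return the index where the standardized cumulative
    excursion above the baseline first exceeds h, or -1 if no shift is
    detected. k is the allowance, h the decision threshold, both in
    standard-deviation units (illustrative defaults)."""
    s = 0.0
    for i, v in enumerate((np.asarray(x) - mean0) / sigma):
        s = max(0.0, s + v - k)
        if s > h:
            return i
    return -1

noise = np.tile([0.2, -0.3, 0.1, -0.2], 25)       # deterministic stand-in for noise
signal = np.concatenate([noise, 3.0 + noise])     # abrupt level shift at t = 100
print(cusum_shift(signal, mean0=0.0, sigma=1.0))  # fires shortly after the shift
```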

  7. Outlier detection in UV/Vis spectrophotometric data

    NARCIS (Netherlands)

    Lepot, M.J.; Aubin, Jean Baptiste; Clemens, F.H.L.R.; Mašić, Alma

    2017-01-01

    UV/Vis spectrophotometers have been used to monitor water quality since the early 2000s. Calibration of these devices requires sampling campaigns to elaborate relations between recorded spectra and measured concentrations. In order to build robust calibration data sets, several spectra must be...

  8. Evaluation of the expected moments algorithm and a multiple low-outlier test for flood frequency analysis at streamgaging stations in Arizona

    Science.gov (United States)

    Paretti, Nicholas V.; Kennedy, Jeffrey R.; Cohn, Timothy A.

    2014-01-01

    Flooding is among the costliest natural disasters in terms of loss of life and property in Arizona, which is why accurate estimation of flood frequency and magnitude is crucial for proper structural design and accurate floodplain mapping. Current guidelines for flood frequency analysis in the United States are described in Bulletin 17B (B17B), yet since B17B's publication in 1982 (Interagency Advisory Committee on Water Data, 1982), several improvements have been proposed as updates for future guidelines. Two proposed updates are the Expected Moments Algorithm (EMA), which accommodates historical and censored data, and a generalized multiple Grubbs-Beck (MGB) low-outlier test. The current guidelines use a standard Grubbs-Beck (GB) method to identify low outliers; the choice of method changes the determination of the moment estimators, because B17B handles low outliers with a conditional probability adjustment while EMA censors them. B17B and EMA estimates are identical if no historical information, censored data, or low outliers are present in the peak-flow data. EMA with the MGB test (EMA-MGB) was compared to the standard B17B method (B17B-GB) for flood frequency analysis at 328 streamgaging stations in Arizona. The methods were compared using the relative percent difference (RPD) between annual exceedance probabilities (AEPs), goodness-of-fit assessments, random resampling procedures, and Monte Carlo simulations. The AEPs were calculated and compared using both station skew and weighted skew. Streamgaging stations were classified by U.S. Geological Survey (USGS) National Water Information System (NWIS) qualification codes, used to denote historical and censored peak-flow data, to better understand the effect that nonstandard flood information has on the flood frequency analysis for each method. Streamgaging stations were also grouped according to geographic flood regions and analyzed separately to better understand regional differences caused by physiography and climate.

  9. Novelty detection for breast cancer image classification

    Science.gov (United States)

    Cichosz, Pawel; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold

    2016-09-01

    Using classification learning algorithms for medical applications may require not only refined model creation techniques and careful unbiased model evaluation, but also detecting the risk of misclassification at the time of model application. This is addressed by novelty detection, which identifies instances for which the training set is not sufficiently representative and for which it may be safer to refrain from classification and request a human expert diagnosis. The paper investigates two techniques for isolated instance identification, based on clustering and one-class support vector machines, which represent two different approaches to multidimensional outlier detection. The prediction quality for isolated instances in breast cancer image data is evaluated using the random forest algorithm and found to be substantially inferior to the prediction quality for non-isolated instances. Each of the two techniques is then used to create a novelty detection model which can be combined with a classification model and used at the time of prediction to detect instances for which the latter cannot be reliably applied. Novelty detection is demonstrated to improve random forest prediction quality and argued to deserve further investigation in medical applications.
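
    The one-class SVM side of this approach can be sketched with scikit-learn (the 2-D Gaussian toy data and the nu value are my assumptions, not the paper's image features):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(200, 2))         # region the training set covers
svm = OneClassSVM(nu=0.05, gamma="scale").fit(train)

inlier = np.array([[0.1, -0.2]])                # well inside the training support
novel = np.array([[6.0, 6.0]])                  # far outside it
print(svm.predict(inlier), svm.predict(novel))  # +1 = known region, -1 = novelty
```

    In the paper's setting, a -1 verdict would route the image to a human expert instead of the random forest classifier.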

  10. Density and SUV Ratios from PET/CT in the Detection of Mediastinal Lymph Node Metastasis in Non-small Cell Lung Cancer

    Directory of Open Access Journals (Sweden)

    Tingting SHAO

    2015-03-01

    Full Text Available Background and objective Mediastinal involvement in lung cancer is a highly significant prognostic factor for survival, and accurate staging of the mediastinum correctly identifies the patients who will benefit most from surgery. Positron emission tomography/computed tomography (PET/CT) has become the standard imaging modality for the staging of patients with lung cancer. The aim of this study was to investigate 18-fluoro-2-deoxy-glucose (18F-FDG) PET/CT imaging in the detection of mediastinal disease in lung cancer. Methods A total of 72 patients newly diagnosed with non-small cell lung cancer (NSCLC) who underwent preoperative whole-body 18F-FDG PET/CT were retrospectively included. All patients underwent radical surgery and mediastinal lymph node dissection. Mediastinal disease was histologically confirmed in 45 of 413 lymph nodes. PET/CT physicians analyzed the visual images and evaluated each lymph node's short axis, maximum standardized uptake value (SUVmax), node/aorta density ratio, node/aorta SUV ratio, and other parameters, using the histopathological results as the reference standard. The optimal cutoff value for each ratio was determined by receiver operator characteristic curve analysis. Results Using a threshold of 0.9 for the density ratio and 1.2 for the SUV ratio yielded high accuracy for the detection of mediastinal disease. Combining the lymph node's short axis, SUVmax, density ratio, and SUV ratio, the accuracy of integrated PET/CT for diagnosing mediastinal lymph nodes was 95.2%. The diagnostic accuracy with conventional PET/CT was 89.8%, whereas that of the PET/CT comprehensive analysis was 90.8%. Conclusion The node/aorta density ratio and SUV ratio may be complementary to conventional visual interpretation and SUVmax measurement. The use of the lymph node's short axis, SUVmax, and both ratios in combination is better than either conventional PET/CT analysis or PET

  11. Automatic EEG spike detection.

    Science.gov (United States)

    Harner, Richard

    2009-10-01

    Since the 1970s advances in science and technology during each succeeding decade have renewed the expectation of efficient, reliable automatic epileptiform spike detection (AESD). But even when reinforced with better, faster tools, clinically reliable unsupervised spike detection remains beyond our reach. Expert-selected spike parameters were the first and still most widely used for AESD. Thresholds for amplitude, duration, sharpness, rise-time, fall-time, after-coming slow waves, background frequency, and more have been used. It is still unclear which of these wave parameters are essential, beyond peak-peak amplitude and duration. Wavelet parameters are very appropriate to AESD but need to be combined with other parameters to achieve desired levels of spike detection efficiency. Artificial Neural Network (ANN) and expert-system methods may have reached peak efficiency. Support Vector Machine (SVM) technology focuses on outliers rather than centroids of spike and nonspike data clusters and should improve AESD efficiency. An exemplary spike/nonspike database is suggested as a tool for assessing parameters and methods for AESD and is available in CSV or Matlab formats from the author at brainvue@gmail.com. Exploratory Data Analysis (EDA) is presented as a graphic method for finding better spike parameters and for the step-wise evaluation of the spike detection process.

  12. Motion and Form Coherence Detection in Autistic Spectrum Disorder: Relationship to Motor Control and 2:4 Digit Ratio

    Science.gov (United States)

    Milne, Elizabeth; White, Sarah; Campbell, Ruth; Swettenham, John; Hansen, Peter; Ramus, Franck

    2006-01-01

    Children with autistic spectrum disorder and controls performed tasks of coherent motion and form detection, and motor control. Additionally, the ratio of the 2nd and 4th digits of these children, which is thought to be an indicator of foetal testosterone, was measured. Children in the experimental group were impaired at tasks of motor control,…

  13. Nuclear Power Plant Thermocouple Sensor-Fault Detection and Classification Using Deep Learning and Generalized Likelihood Ratio Test

    Science.gov (United States)

    Mandal, Shyamapada; Santhi, B.; Sridhar, S.; Vinolia, K.; Swaminathan, P.

    2017-06-01

    In this paper, an online fault detection and classification method is proposed for thermocouples used in nuclear power plants. In the proposed method, fault data are detected by a classification method that separates fault data from normal data. A deep belief network (DBN), a deep learning technique, is applied to classify the fault data. The DBN has a multilayer feature extraction scheme that is highly sensitive to small variations in the data. Since the classification method cannot determine which sensor is faulty, a technique is proposed to identify the faulty sensor from the fault data. Finally, a composite statistical hypothesis test, the generalized likelihood ratio test, is applied to compute the fault pattern of the faulty sensor signal based on the magnitude of the fault. The performance of the proposed method is validated with field data obtained from thermocouple sensors of the fast breeder test reactor.
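
    For the simplest fault model, a constant bias in Gaussian noise of known variance, the generalized likelihood ratio statistic reduces to a scaled squared deviation of the sample mean, distributed as chi-square with one degree of freedom under the no-fault hypothesis. A hedged sketch (the thermocouple readings and the 1% threshold are illustrative, not the paper's data):

```python
import numpy as np

def glr_bias(x, mu0, sigma):
    """GLR statistic for a constant bias fault in Gaussian noise:
    2*log(GLR) = N * ((sample mean - mu0) / sigma)^2.
    Under H0 it is ~chi-square(1); 6.63 is the 1% false-alarm threshold."""
    x = np.asarray(x)
    return len(x) * ((x.mean() - mu0) / sigma) ** 2

noise = np.tile([0.4, -0.4, 0.2, -0.2], 25)   # zero-mean stand-in for sensor noise
healthy = 500.0 + noise                       # thermocouple reading, deg C
faulty = 503.0 + noise                        # 3-degree bias fault

print(glr_bias(healthy, 500.0, 1.0))  # ~0: no fault declared
print(glr_bias(faulty, 500.0, 1.0))   # large: fault declared
```

    The statistic's magnitude also estimates the fault size, which is how a GLR test can characterize the fault pattern, not just detect it.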

  14. Least-Squares Linear Regression and Schrodinger's Cat: Perspectives on the Analysis of Regression Residuals.

    Science.gov (United States)

    Hecht, Jeffrey B.

    The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…
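
    A standard way to quantify "how deviant" an individual data point is, in the spirit of the discussion above, is the internally studentized residual, with |r| > 2 a common screening rule. A sketch with NumPy (the data and threshold are illustrative):

```python
import numpy as np

def studentized_residuals(x, y):
    """Residuals from a least-squares line, scaled by their estimated
    standard deviation, which accounts for each point's leverage."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverages (hat diagonal)
    s2 = resid @ resid / (len(x) - 2)               # residual variance estimate
    return resid / np.sqrt(s2 * (1 - h))

x = np.arange(10, dtype=float)
y = 2 * x + 1.0
y[4] += 5.0                          # a single deviant observation
r = studentized_residuals(x, y)
print(np.argmax(np.abs(r)))          # index of the suspected outlier
```

    Note the masking effect the abstract alludes to: with several suspected outliers, each one inflates the variance estimate used to judge the others, so single-point screens must be applied with care.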

  15. Signal-to-noise ratio and detective quantum efficiency determination by an alternative use of photographic detectors

    International Nuclear Information System (INIS)

    Burgudzhiev, Z.; Koleva, D.

    1986-01-01

    A known theoretical model of an alternative use of silver-halide photographic emulsions, in which the number of granules forming the photographic image is used as the detector output instead of the microdensitometric blackening density, is applied to some real photographic emulsions. It is found that with this use the signal-to-noise ratio of the photographic detector can be increased by a factor of about 5, while its detective quantum efficiency can reach about 20%, close to that of some photomultipliers.

  16. Size ratio performance in detecting cerebral aneurysm rupture status is insensitive to small vessel removal.

    Science.gov (United States)

    Lauric, Alexandra; Baharoglu, Merih I; Malek, Adel M

    2013-04-01

    The variable definition of size ratio (SR) for sidewall (SW) vs bifurcation (BIF) aneurysms causes confusion for lesions harboring small branches, such as those at carotid ophthalmic or posterior communicating locations. These aneurysms are considered SW by many clinicians, but SR methodology classifies them as BIF. The aims were to evaluate the effect of ignoring small vessels, and of SW vs stringent BIF labeling, on SR rupture-status detection performance in borderline aneurysms with small branches, and to reconcile SR-based labeling with clinical SW/BIF classification. Catheter rotational angiographic datasets of 134 consecutive aneurysms (60 ruptured) were automatically measured in 3 dimensions. Stringent BIF labeling was applied to clinically labeled aneurysms, with 21 aneurysms switching label from SW to BIF. Parent vessel size was evaluated both taking into account, and ignoring, small vessels; SR was defined accordingly as the ratio between aneurysm and parent vessel sizes. Univariate and multivariate statistics identified significant features. Regardless of the SW/BIF labeling method, SR was equally significant in discriminating aneurysm rupture status. Bivariate analysis of the alternative SR calculations showed high correlation: R(2) = 0.94 on the whole dataset and 0.98 on the 21 borderline aneurysms. Ignoring small branches in the SR calculation maintains rupture-status detection performance while reducing postprocessing complexity and removing labeling ambiguity. Aneurysms adjacent to these vessels can be considered SW for morphometric analysis. It is reasonable to use the clinical SW/BIF labeling when using SR for rupture risk evaluation.

  17. Solution Process Synthesis of High Aspect Ratio ZnO Nanorods on Electrode Surface for Sensitive Electrochemical Detection of Uric Acid

    Science.gov (United States)

    Ahmad, Rafiq; Tripathy, Nirmalya; Ahn, Min-Sang; Hahn, Yoon-Bong

    2017-04-01

    This study demonstrates a highly stable, selective and sensitive uric acid (UA) biosensor based on high aspect ratio zinc oxide nanorods (ZNRs) vertically grown on an electrode surface via a simple one-step low-temperature solution route. Uricase enzyme was immobilized on the ZNRs, followed by a Nafion covering, to fabricate UA sensing electrodes (Nafion/Uricase-ZNRs/Ag). The fabricated electrodes showed enhanced performance with attractive analytical response, such as a high sensitivity of 239.67 μA cm-2 mM-1 over a wide linear range (0.01-4.56 mM), a rapid response time (~3 s), a low detection limit (5 nM), and a low value of the apparent Michaelis-Menten constant (Kmapp, 0.025 mM). In addition, the selectivity, reproducibility and long-term storage stability of the biosensor were also demonstrated. These results can be attributed to the high aspect ratio of the vertically grown ZNRs, which provides a high surface area leading to enhanced enzyme immobilization, high electrocatalytic activity, and direct electron transfer during electrochemical detection of UA. We expect that this biosensor platform will be advantageous for fabricating ultrasensitive, robust, low-cost sensing devices for the detection of numerous analytes.

  18. Urine Galactomannan-to-Creatinine Ratio for Detection of Invasive Aspergillosis in Patients with Hematological Malignancies.

    Science.gov (United States)

    Reischies, Frederike M J; Raggam, Reinhard B; Prattes, Juergen; Krause, Robert; Eigl, Susanne; List, Agnes; Quehenberger, Franz; Strenger, Volker; Wölfler, Albert; Hoenigl, Martin

    2016-03-01

    Galactomannan (GM) testing of urine specimens may provide important advantages, compared to serum testing, such as easy noninvasive sample collection. We evaluated a total of 632 serial urine samples from 71 patients with underlying hematological malignancies and found that the urine GM/creatinine ratio, i.e., (urine GM level × 100)/urine creatinine level, which takes urine dilution into account, reliably detected invasive aspergillosis and may be a promising diagnostic tool for patients with hematological malignancies. (This study has been registered at ClinicalTrials.gov under registration no. NCT01576653.). Copyright © 2016, American Society for Microbiology. All Rights Reserved.
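
    The index defined in the abstract is a one-line computation; the sketch below simply encodes it (the sample readings are hypothetical):

```python
def gm_creatinine_ratio(urine_gm, urine_creatinine):
    """Dilution-adjusted index from the abstract:
    (urine GM level x 100) / urine creatinine level."""
    return urine_gm * 100.0 / urine_creatinine

# Hypothetical readings: same GM level at two different urine dilutions
print(gm_creatinine_ratio(0.5, 50.0))    # concentrated sample
print(gm_creatinine_ratio(0.5, 200.0))   # dilute sample, lower index
```

    Normalizing by creatinine is what lets the index compare samples of differing dilution, the point the abstract emphasizes.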

  19. Methodology for obtaining wind gusts using Doppler lidar

    DEFF Research Database (Denmark)

    Suomi, Irene; Gryning, Sven-Erik; O'Connor, Ewan J.

    2017-01-01

A new methodology is proposed for scaling Doppler lidar observations of wind gusts to make them comparable with those observed at a meteorological mast. Doppler lidars can then be used to measure wind gusts in regions and at heights where traditional meteorological mast measurements are not available. The proposed scaling reduced the bias in the Doppler lidar gust factors from 0.07 to 0.03, and the bias can be reduced further by using a realistic estimate of turbulence. Wind gust measurements are often prone to outliers in the time series, because they represent the maximum of a (moving-averaged) horizontal wind speed. The proposed outlier detection also outperformed the traditional Doppler lidar quality assurance method based on carrier-to-noise ratio, by removing additional unrealistic outliers present in the time series.

  20. Autoimmune hepatitis in a teenage boy: 'overlap' or 'outlier' syndrome--dilemma for internists.

    Science.gov (United States)

    Talukdar, Arunansu; Khanra, Dibbendhu; Mukherjee, Kabita; Saha, Manjari

    2013-02-08

    An 18-year-old boy presented with upper gastrointestinal bleeding and jaundice. Investigations revealed coarse hepatomegaly, splenomegaly and advanced oesophageal varices. Blood reports showed marked rise of alkaline phosphatase and more than twofold rise of transaminases and IgG. Liver histology was suggestive of piecemeal necrosis, interphase hepatitis and bile duct proliferation. Antinuclear antibody was positive in high titre along with positive antismooth muscle antibody and antimitochondrial antibody. The patient was positive for human leukocyte antigen DR3 type. Although an 'overlap' syndrome exists between autoimmune hepatitis (AIH) and primary biliary cirrhosis (PBC), a cholestatic variant of AIH, a rare 'outlier' syndrome could not be excluded in our case. Moreover, 'the chicken or the egg', AIH or PBC, the dilemma for the internists continued. The patient was put on steroid and ursodeoxycholic acid with unsatisfactory response. The existing international criteria for diagnosis of AIH are not generous enough to accommodate its variant forms.

  1. Automated Detection of Knickpoints and Knickzones Across Transient Landscapes

    Science.gov (United States)

    Gailleton, B.; Mudd, S. M.; Clubb, F. J.

    2017-12-01

Mountainous regions are ubiquitously dissected by river channels, which transmit climate and tectonic signals to the rest of the landscape by adjusting their long profiles. Fluvial response to allogenic forcing is often expressed through the upstream propagation of steepened reaches, referred to as knickpoints or knickzones. The identification and analysis of these steepened reaches has numerous applications in geomorphology, such as modelling long-term landscape evolution, understanding controls on fluvial incision, and constraining tectonic uplift histories. Traditionally, the identification of knickpoints or knickzones from fluvial profiles requires manual selection or calibration. This process is both time-consuming and subjective, as different workers may select different steepened reaches within the profile. We propose an objective, statistically-based method to systematically pick knickpoints/knickzones on a landscape scale using an outlier-detection algorithm. Our method integrates river profiles normalised by drainage area (Chi, using the approach of Perron and Royden, 2013), then separates the chi-elevation plots into a series of transient segments using the method of Mudd et al. (2014). This method allows the systematic detection of knickpoints across a DEM, regardless of size, using a high-performance algorithm implemented in the open-source Edinburgh Land Surface Dynamics Topographic Tools (LSDTopoTools) software package. After initial knickpoint identification, outliers are selected using several sorting and binning methods based on the Median Absolute Deviation, to avoid the influence of sample size. We test our method on a series of DEMs and grid resolutions, and show that our method consistently identifies accurate knickpoint locations across each landscape tested.
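Median Absolute Deviation (MAD) outlier selection of the kind described above can be sketched as follows; this is a generic illustration, not the actual LSDTopoTools implementation:

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.
    The factor 0.6745 rescales the MAD so it is consistent with the
    standard deviation for normally distributed data; 3.5 is a
    commonly used cutoff for the modified z-score."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values identical up to the median: nothing stands out.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]
```

Because both the center (median) and the spread (MAD) are medians themselves, a single extreme knickpoint cannot drag the threshold toward itself, which is the point of using MAD rather than mean and standard deviation.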

  2. Muscle MRS detects elevated PDE/ATP ratios prior to fatty infiltration in Becker muscular dystrophy.

    Science.gov (United States)

    Wokke, B H; Hooijmans, M T; van den Bergen, J C; Webb, A G; Verschuuren, J J; Kan, H E

    2014-11-01

Becker muscular dystrophy (BMD) is characterized by progressive muscle weakness. Muscles show structural changes (fatty infiltration, fibrosis) and metabolic changes, both of which can be assessed using MRI and MRS. It is unknown at what stage of the disease process metabolic changes arise and how this might vary for different metabolites. In this study we assessed metabolic changes in skeletal muscles of Becker patients, both with and without fatty infiltration, quantified via Dixon MRI and 31P MRS. MRI and 31P MRS scans were obtained from 25 Becker patients and 14 healthy controls using a 7 T MR scanner. Five lower-leg muscles were individually assessed for fat and muscle metabolite levels. In the peroneus, soleus and anterior tibialis muscles with non-increased fat levels, PDE/ATP ratios were higher (P < 0.02) compared with controls, whereas in all muscles with increased fat levels PDE/ATP ratios were higher compared with healthy controls (P ≤ 0.05). The Pi/ATP ratio in the peroneus muscles was higher in muscles with increased fat fractions (P = 0.005), and the PCr/ATP ratio was lower in the anterior tibialis muscles with increased fat fractions (P = 0.005). There were no other significant changes in metabolites, but an increase in tissue pH was found in all muscles of the total group of BMD patients in comparison with healthy controls (P < 0.05). These findings suggest that 31P MRS can be used to detect early changes in individual muscles of BMD patients, which are present before the onset of fatty infiltration. Copyright © 2014 John Wiley & Sons, Ltd.

  3. Robust motion correction and outlier rejection of in vivo functional MR images of the fetal brain and placenta during maternal hyperoxia

    OpenAIRE

    You, Wonsang; Serag, Ahmed; Evangelou, Iordanis E.; Andescavage, Nickie; Limperopoulos, Catherine

    2017-01-01

    Subject motion is a major challenge in functional magnetic resonance imaging studies (fMRI) of the fetal brain and placenta during maternal hyperoxia. We propose a motion correction and volume outlier rejection method for the correction of severe motion artifacts in both fetal brain and placenta. The method is optimized to the experimental design by processing different phases of acquisition separately. It also automatically excludes high-motion volumes and all the missing data are regressed ...

  4. Robust motion correction and outlier rejection of in vivo functional MR images of the fetal brain and placenta during maternal hyperoxia

    OpenAIRE

    You, Wonsang; Serag, Ahmed; Evangelou, Iordanis E.; Andescavage, Nickie; Limperopoulos, Catherine

    2015-01-01

    Subject motion is a major challenge in functional magnetic resonance imaging studies (fMRI) of the fetal brain and placenta during maternal hyperoxia. We propose a motion correction and volume outlier rejection method for the correction of severe motion artifacts in both fetal brain and placenta. The method is optimized to the experimental design by processing different phases of acquisition separately. It also automatically excludes high-motion volumes and all the missing data are regressed ...

  5. Detection of Deuterium in Icy Surfaces and the D/H Ratio of Icy Objects

    Science.gov (United States)

    Clark, Roger Nelson; Brown, Robert H.; Swayze, Gregg A.; Cruikshank, Dale P.

    2017-10-01

Water ice in crystalline or amorphous form is orientationally disordered, which results in very broad absorptions. Deuterium in trace amounts occupies an ordered position, so its absorption is not broadened like the H2O absorptions. The D-O stretch is located at 4.13 microns with a width of 0.027 micron. Laboratory spectral measurements on natural H2O and deuterium-doped ice show the absorption is slightly asymmetric, and in reflectance the band shifts from 4.132 to 4.137 microns as abundance decreases. We derive a preliminary absorption coefficient of ~80,000 cm^-1 for the D-O stretch, compared to about 560 cm^-1 in H2O ice at 4.13 microns, enabling the detection of deuterium at levels less than Vienna Standard Mean Ocean Water (VSMOW), depending on S/N. Deriving accurate D/H ratios will require additional lab work and radiative transfer modeling to simultaneously derive the grain size distribution, the abundance of any contaminants, and the deuterium abundance. To first order, the grain size distribution can be compensated for by computing the ratio of the D-O stretch band depth to the 2-micron H2O ice band depth, which we call the Dratio. Colorado fresh water (~80% of VSMOW) has a Dratio of 0.036; at a D/H = 0.0005 the Dratio = 0.15, and at a D/H = 0.0025 the Dratio = 0.42. The VSMOW Dratio is ~0.045. We have used VIMS data from the Cassini spacecraft to compute large spectral averages to detect deuterium in the rings and on the icy satellite surfaces. A B-ring average of 21,882 pixels, at 640 ms/pixel, or 3.89 hours of integration time, shows a 3.5% D-O stretch band depth and a Dratio = 0.045, indicating a deuterium abundance equal to VSMOW. Rhea, using 1.89 hours of integration time, shows a Dratio = 0.052, slightly higher than VSMOW. Phoebe has an unusually deep D-O stretch band of 1.85% considering the high abundance of dark material suppressing the ice absorptions. We measure a Dratio = 0.11, an enhancement of ~2.4 over VSMOW, but detailed radiative transfer modeling is needed to

  6. Receiver Signal to Noise Ratios for IPDA Lidars Using Sine-wave and Pulsed Laser Modulation and Direct Detections

    Science.gov (United States)

    Sun, Xiaoli; Abshire, James B.

    2011-01-01

Integrated path differential absorption (IPDA) lidar can be used to remotely measure the column density of gases in the path to a scattering target [1]. The total column gas molecular density can be derived from the ratio of the laser echo signal power with the laser wavelength on the gas absorption line (on-line) to that off the line (off-line). Both coherent detection and direct detection IPDA lidar have been used successfully in the past in horizontal-path and airborne remote sensing measurements. However, for space-based measurements, the signal propagation losses are often orders of magnitude higher, and it is important to use the most efficient laser modulation and detection technique to minimize the average laser power and the electrical power drawn from the spacecraft. This paper gives an analysis of the receiver signal to noise ratio (SNR) of several laser modulation and detection techniques versus the average received laser power under similar operating environments. Coherent detection [2] can give the best receiver performance when the local oscillator laser is relatively strong and the heterodyne mixing losses are negligible. Coherent detection has a high signal gain and a very narrow bandwidth for the background light and detector dark noise. However, coherent detection must maintain a high degree of coherence between the local oscillator laser and the received signal in both temporal and spatial modes. This often results in a high system complexity and low overall measurement efficiency. For measurements through the atmosphere, the coherence diameter of the received signal also limits the useful size of the receiver telescope. Direct detection IPDA lidars are simpler to build and have fewer constraints on the transmitter and receiver components. They can use much larger 'photon-bucket' type telescopes to reduce the demands on the laser transmitter. Here we consider the two most widely used direct detection IPDA lidar techniques. The first technique uses two CW

  7. Fusion of an Ensemble of Augmented Image Detectors for Robust Object Detection.

    Science.gov (United States)

    Wei, Pan; Ball, John E; Anderson, Derek T

    2018-03-17

A significant challenge in object detection is the accurate identification of an object's position in image space; one algorithm with one set of parameters is usually not enough, and the fusion of multiple algorithms and/or parameters can lead to more robust results. Herein, a new computational intelligence fusion approach based on the dynamic analysis of agreement among object detection outputs is proposed. Furthermore, we propose applying image augmentation online rather than only during training. Experiments comparing the results both with and without fusion are presented. We demonstrate that the augmented and fused combination results are the best, with respect to higher accuracy rates and reduction of outlier influences. The approach is demonstrated in the context of cone, pedestrian and box detection for Advanced Driver Assistance Systems (ADAS) applications.

  8. Fusion of an Ensemble of Augmented Image Detectors for Robust Object Detection

    Directory of Open Access Journals (Sweden)

    Pan Wei

    2018-03-01

A significant challenge in object detection is the accurate identification of an object's position in image space; one algorithm with one set of parameters is usually not enough, and the fusion of multiple algorithms and/or parameters can lead to more robust results. Herein, a new computational intelligence fusion approach based on the dynamic analysis of agreement among object detection outputs is proposed. Furthermore, we propose applying image augmentation online rather than only during training. Experiments comparing the results both with and without fusion are presented. We demonstrate that the augmented and fused combination results are the best, with respect to higher accuracy rates and reduction of outlier influences. The approach is demonstrated in the context of cone, pedestrian and box detection for Advanced Driver Assistance Systems (ADAS) applications.

  9. Detection of anomalous signals in temporally correlated data (Invited)

    Science.gov (United States)

    Langbein, J. O.

    2010-12-01

Detection of transient tectonic signals in data obtained from large geodetic networks requires the ability to detect signals that are both temporally and spatially coherent. In this report I will describe a modification to an existing method that estimates both the coefficients of a temporally correlated noise model and an efficient filter based on that noise model. This filter, when applied to the original time-series, effectively whitens (or flattens) the power spectrum. The filtered data provide the means to calculate running averages, which are then used to detect deviations from the background trends. For large networks, time-series of signal-to-noise ratio (SNR) can be easily constructed since, by filtering, each of the original time-series has been transformed into one that is closer to having a Gaussian distribution with a variance of 1.0. Anomalous intervals may be identified by counting the number of GPS sites for which the SNR exceeds a specified value. For example, during one time interval, if there were 5 out of 20 time-series with SNR>2, this would be considered anomalous; typically, one would expect about 1 out of 20 time-series with an SNR>2 at 95% confidence. For time intervals with an anomalously large number of high SNR values, the spatial distribution of the SNR is mapped to identify the location of the anomalous signal(s) and their degree of spatial clustering. Estimating the filter that should be used to whiten the data requires modification of the existing methods that employ maximum likelihood estimation to determine the temporal covariance of the data. In these methods, it is assumed that the noise components in the data are a combination of white, flicker and random-walk processes and that they are derived from three different and independent sources. Instead, in this new method, the covariance matrix is constructed assuming that only one source is responsible for the noise and that source can be represented as a white
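The exceedance-count reasoning above (5 of 20 series with SNR>2 is anomalous, while about 1 of 20 is expected by chance) can be checked with a binomial tail probability. A sketch under the simplifying assumption that each whitened series independently exceeds |SNR| > 2 with probability ~0.046 (the two-sided Gaussian tail):

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance that at least k of
    n independent whitened time-series exceed the SNR threshold."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# With p ~ 0.046 per series, one or more exceedances out of 20 is common,
# but five or more is very unlikely under the no-signal hypothesis.
```

This is only a back-of-the-envelope check: real GPS residuals are not fully independent across sites, which is exactly why the report then maps the spatial distribution of high-SNR sites.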

  10. Total Variation Depth for Functional Data

    KAUST Repository

    Huang, Huang

    2016-11-15

There has been extensive work on data depth-based methods for robust multivariate data analysis. Recent developments have moved to infinite-dimensional objects such as functional data. In this work, we propose a new notion of depth, the total variation depth, for functional data. As a measure of depth, its properties are studied theoretically, and the associated outlier detection performance is investigated through simulations. Compared to magnitude outliers, shape outliers are often masked among the rest of the samples and are harder to identify. We show that the proposed total variation depth has many desirable features and is well suited for outlier detection. In particular, we propose to decompose the total variation depth into two components that are associated with shape and magnitude outlyingness, respectively. This decomposition allows us to develop an effective procedure for outlier detection and useful visualization tools, while naturally accounting for the correlation in functional data. Finally, the proposed methodology is demonstrated using real datasets of curves, images, and video frames.

  11. Aircraft control surface failure detection and isolation using the OSGLR test. [orthogonal series generalized likelihood ratio

    Science.gov (United States)

    Bonnice, W. F.; Motyka, P.; Wagner, E.; Hall, S. R.

    1986-01-01

    The performance of the orthogonal series generalized likelihood ratio (OSGLR) test in detecting and isolating commercial aircraft control surface and actuator failures is evaluated. A modification to incorporate age-weighting which significantly reduces the sensitivity of the algorithm to modeling errors is presented. The steady-state implementation of the algorithm based on a single linear model valid for a cruise flight condition is tested using a nonlinear aircraft simulation. A number of off-nominal no-failure flight conditions including maneuvers, nonzero flap deflections, different turbulence levels and steady winds were tested. Based on the no-failure decision functions produced by off-nominal flight conditions, the failure detection and isolation performance at the nominal flight condition was determined. The extension of the algorithm to a wider flight envelope by scheduling on dynamic pressure and flap deflection is examined. Based on this testing, the OSGLR algorithm should be capable of detecting control surface failures that would affect the safe operation of a commercial aircraft. Isolation may be difficult if there are several surfaces which produce similar effects on the aircraft. Extending the algorithm over the entire operating envelope of a commercial aircraft appears feasible.

  12. Detecting microalbuminuria by urinary albumin/creatinine concentration ratio

    DEFF Research Database (Denmark)

    Jensen, J S; Clausen, P; Borch-Johnsen, K

    1997-01-01

BACKGROUND: Microalbuminuria, i.e. a subclinical increase of the albumin excretion rate in urine, may be a novel atherosclerotic risk factor. This study aimed to test whether microalbuminuria can be identified by measurement of the urinary albumin concentration or the urinary albumin/creatinine concentration ratio. ... not included. Urinary albumin (Ualb) and creatinine (Ucreat) concentrations were measured in an overnight collected sample by enzyme-linked immunosorbent and colorimetric assays, respectively. The urinary albumin excretion rate (UAER) and the urinary albumin/creatinine concentration ratio (Ualb/Ucreat) were calculated. ... and 73%, 97%, and 73% for Ualb/Ucreat, respectively. CONCLUSIONS: Measurement of the albumin/creatinine concentration ratio is a specific and quite sensitive alternative to measurement of the urinary albumin excretion rate in timed collections, when screening for microalbuminuria.
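The screening quantity itself is a simple concentration ratio; a hedged sketch in Python (the units and the 2.5 mg/mmol cutoff are commonly cited conventions, not necessarily the thresholds used in this study):

```python
def albumin_creatinine_ratio(ualb_mg_l, ucreat_mmol_l):
    """Urinary albumin/creatinine concentration ratio in mg/mmol,
    from albumin in mg/L and creatinine in mmol/L."""
    if ucreat_mmol_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return ualb_mg_l / ucreat_mmol_l

def flags_microalbuminuria(ratio_mg_mmol, cutoff=2.5):
    """Screening decision: the 2.5 mg/mmol cutoff is an illustrative,
    commonly cited screening threshold."""
    return ratio_mg_mmol >= cutoff
```

As with the galactomannan index above, dividing by creatinine corrects for urine dilution, which is why the ratio tracks the timed excretion rate so closely.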

  13. Damage detection in carbon composite material typical of wind turbine blades using auto-associative neural networks

    Science.gov (United States)

    Dervilis, N.; Barthorpe, R. J.; Antoniadou, I.; Staszewski, W. J.; Worden, K.

    2012-04-01

The structure of a wind turbine blade plays a vital role in the mechanical and structural operation of the turbine. As new generations of offshore wind turbines try to achieve a leading role in the energy market, key challenges such as reliable Structural Health Monitoring (SHM) of the blades are significant for the economic and structural efficiency of wind energy. Fault diagnosis of wind turbine blades is a "grand challenge" due to their composite nature, weight and length. The damage detection procedure involves additional difficulties related to aerodynamic loads, environmental conditions and gravitational loads. It will be shown that vibration dynamic response data combined with AANNs is a robust and powerful tool, offering on-line and real-time damage prediction. In this study the features used for SHM are Frequency Response Functions (FRFs) acquired via experimental methods based on an LMS system, by which identification of mode shapes and natural frequencies is accomplished. The methods used are statistical outlier analysis, which allows a diagnosis of deviation from normality, and an Auto-Associative Neural Network (AANN). Both of these techniques are trained on the FRF data for the normal and damaged conditions. The AANN is a method which has not yet been widely used in condition monitoring of the composite materials of blades. This paper introduces a new scheme for damage detection, localisation and severity assessment by adopting simple measurements such as FRFs and exploiting multilayer neural networks and outlier novelty detection.

  14. Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets

    Directory of Open Access Journals (Sweden)

    Min-Wei Huang

    2018-01-01

Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem, which is to provide estimations for the missing values by a reasoning process based on the (complete) observed data. However, if the observed data contain some noisy information or outliers, the estimations of the missing values may not be reliable or may even be quite different from the real values. The aim of this paper is to examine whether a combination of instance selection from the observed data and missing value imputation offers better performance than performing missing value imputation alone. In particular, three instance selection algorithms, DROP3, GA, and IB3, and three imputation algorithms, KNNI, MLP, and SVM, are used in order to find out the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation over the numerical data type of medical datasets, and specific combinations of instance selection and imputation methods can improve the imputation results over the mixed data type of medical datasets. However, instance selection does not have a definite positive impact on the imputation result for categorical medical datasets.

  15. Gear Fault Detection Effectiveness as Applied to Tooth Surface Pitting Fatigue Damage

    Science.gov (United States)

    Lewicki, David G.; Dempsey, Paula J.; Heath, Gregory F.; Shanthakumaran, Perumal

    2010-01-01

A study was performed to evaluate fault detection effectiveness as applied to gear-tooth-pitting-fatigue damage. Vibration and oil-debris monitoring (ODM) data were gathered from 24 sets of spur pinion and face gears run during a previous endurance evaluation study. Three common condition indicators (RMS, FM4, and NA4 [Ed.'s note: See Appendix A-Definitions]) were deduced from the time-averaged vibration data and used with the ODM to evaluate their performance for gear fault detection. The NA4 parameter proved to be a very good condition indicator for the detection of gear tooth surface pitting failures. The FM4 and RMS parameters performed average to below average in detection of gear tooth surface pitting failures. The ODM sensor was successful in detecting a significant amount of debris from all the gear tooth pitting fatigue failures. Excluding outliers, the average cumulative mass at the end of a test was 40 mg.
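Two of the condition indicators named above are easy to state: RMS is the root-mean-square of the time-averaged vibration signal, and FM4 is the normalized kurtosis of the difference signal. A sketch using the commonly published definitions, which may differ in detail from the study's Appendix A:

```python
def rms(signal):
    """Root-mean-square of a time-averaged vibration signal."""
    return (sum(x * x for x in signal) / len(signal)) ** 0.5

def fm4(difference_signal):
    """FM4: fourth statistical moment of the difference signal
    (time-synchronous average with the regular gear-mesh components
    removed), normalized by the squared variance. Values near 3
    suggest a healthy, Gaussian-like signal; larger values suggest
    localized tooth damage."""
    n = len(difference_signal)
    mean = sum(difference_signal) / n
    m2 = sum((x - mean) ** 2 for x in difference_signal)
    m4 = sum((x - mean) ** 4 for x in difference_signal)
    return n * m4 / (m2 * m2)
```

RMS responds to overall energy and so tends to miss a single pitted tooth, whereas FM4's kurtosis form is designed to react to the isolated peaks such damage produces, which is consistent with the ranking reported above.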

  16. Comparing a recursive digital filter with the moving-average and sequential probability-ratio detection methods for SNM portal monitors

    International Nuclear Information System (INIS)

    Fehlau, P.E.

    1993-01-01

The author compared a recursive digital filter, proposed as a detection method for French special nuclear material monitors, with the author's detection methods, which employ a moving-average scaler or a sequential probability-ratio test. Nine test subjects each repeatedly carried a test source through a walk-through portal monitor that had the same nuisance-alarm rate with each method. He found that the average detection probability for the test source is also the same for each method. However, the recursive digital filter may have one drawback: its exponentially decreasing response to past radiation intensity prolongs the impact of any interference from radiation sources or radiation-producing machinery. He also examined the influence of each test subject on the monitor's operation by measuring individual attenuation factors for background and source radiation, then ranked the subjects' attenuation factors against their individual probabilities for detecting the test source. The one inconsistent ranking was probably caused by that subject's unusually long stride when passing through the portal.
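A sequential probability-ratio test on count data of the kind used in portal monitors can be sketched as follows. This is a generic Wald SPRT for Poisson counts, not Fehlau's exact implementation; the rates and error probabilities are illustrative:

```python
from math import log

def sprt_update(llr, count, bkg_rate, alarm_rate):
    """One step of a sequential probability-ratio test on Poisson counts:
    accumulate the log-likelihood ratio between the alarm hypothesis
    (mean `alarm_rate` counts per interval) and background (`bkg_rate`)."""
    return llr + count * log(alarm_rate / bkg_rate) - (alarm_rate - bkg_rate)

def sprt_decide(llr, alpha=0.001, beta=0.1):
    """Wald thresholds: 'alarm' when the LLR crosses the upper bound,
    'clear' below the lower bound, otherwise keep counting.
    alpha ~ false-alarm probability, beta ~ missed-detection probability."""
    upper = log((1 - beta) / alpha)
    lower = log(beta / (1 - alpha))
    if llr >= upper:
        return "alarm"
    if llr <= lower:
        return "clear"
    return "continue"
```

Unlike a moving-average scaler, which weights a fixed window equally, the SPRT accumulates evidence only until one of the two thresholds is crossed, then resets, so its memory of past intensity is bounded by the decision itself.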

  17. A preliminary evaluation of the generalized likelihood ratio for detecting and identifying control element failures in a transport aircraft

    Science.gov (United States)

    Bundick, W. T.

    1985-01-01

    The application of the Generalized Likelihood Ratio technique to the detection and identification of aircraft control element failures has been evaluated in a linear digital simulation of the longitudinal dynamics of a B-737 aircraft. Simulation results show that the technique has potential but that the effects of wind turbulence and Kalman filter model errors are problems which must be overcome.

  18. Mixing ratio sensor of alcohol mixed fuel

    Energy Technology Data Exchange (ETDEWEB)

    Miyata, Shigeru; Matsubara, Yoshihiro

    1987-08-07

In order to improve the combustion efficiency of an internal combustion engine using gasoline-alcohol mixed fuel and to reduce harmful substances in its exhaust gas, the supplied air-fuel ratio and the ignition timing must be strictly controlled, and the control settings must change depending upon the mixing ratio of the mixed fuel. The mixing ratio has conventionally been detected by casting a ray of light into the mixed fuel and exploiting the change of critical angle associated with changes in the composition of the fluid. However, when a light-emitting diode is used as the light source, two additional kinds of sensors are needed. Regarding these two kinds of sensors, this invention offers a mixing ratio sensor for alcohol mixed fuel that eliminates the separate sensor for the environmental temperature: a single compensatory light-receiving element handles both the compensation of the light-emitting element's output for temperature change and the compensation of the critical angle caused by temperature change. (6 figs)

  19. MIDAS robust trend estimator for accurate GPS station velocities without step detection

    Science.gov (United States)

    Blewitt, Geoffrey; Kreemer, Corné; Hammond, William C.; Gazeaux, Julien

    2016-03-01

    Automatic estimation of velocities from GPS coordinate time series is becoming required to cope with the exponentially increasing flood of available data, but problems detectable to the human eye are often overlooked. This motivates us to find an automatic and accurate estimator of trend that is resistant to common problems such as step discontinuities, outliers, seasonality, skewness, and heteroscedasticity. Developed here, Median Interannual Difference Adjusted for Skewness (MIDAS) is a variant of the Theil-Sen median trend estimator, for which the ordinary version is the median of slopes vij = (xj-xi)/(tj-ti) computed between all data pairs i > j. For normally distributed data, Theil-Sen and least squares trend estimates are statistically identical, but unlike least squares, Theil-Sen is resistant to undetected data problems. To mitigate both seasonality and step discontinuities, MIDAS selects data pairs separated by 1 year. This condition is relaxed for time series with gaps so that all data are used. Slopes from data pairs spanning a step function produce one-sided outliers that can bias the median. To reduce bias, MIDAS removes outliers and recomputes the median. MIDAS also computes a robust and realistic estimate of trend uncertainty. Statistical tests using GPS data in the rigid North American plate interior show ±0.23 mm/yr root-mean-square (RMS) accuracy in horizontal velocity. In blind tests using synthetic data, MIDAS velocities have an RMS accuracy of ±0.33 mm/yr horizontal, ±1.1 mm/yr up, with a 5th percentile range smaller than all 20 automatic estimators tested. Considering its general nature, MIDAS has the potential for broader application in the geosciences.
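The core of the estimator, taking the median slope over data pairs separated by about one year, can be sketched as follows. This is a simplified illustration of the Theil-Sen idea behind MIDAS; the real algorithm also relaxes the one-year condition for gappy series, trims one-sided outliers, recomputes the median, and estimates a robust trend uncertainty:

```python
from statistics import median

def midas_like_trend(times, values, pair_span=1.0, tol=0.1):
    """Median of slopes over data pairs separated by ~`pair_span`
    (1 year for GPS series, so seasonal signals cancel in each pair).
    `times` in years, `values` in the coordinate unit; returns the
    trend in units per year."""
    slopes = []
    for i in range(len(times)):
        for j in range(i + 1, len(times)):
            dt = times[j] - times[i]
            if abs(dt - pair_span) <= tol:
                slopes.append((values[j] - values[i]) / dt)
    if not slopes:
        raise ValueError("no data pairs near the requested separation")
    return median(slopes)
```

Because the estimate is a median of pairwise slopes rather than a least-squares fit, a step discontinuity or a handful of outliers shifts only the tails of the slope distribution, not the middle.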

  20. Improving CT detection sensitivity for nodal metastases in oesophageal cancer with combination of smaller size and lymph node axial ratio

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Jianfang [Chinese Academy of Medical Sciences and Peking Union Medical College, National Cancer Center/Cancer Hospital, Beijing (China); Capital Medical University Electric Power Teaching Hospital, Beijing (China); Wang, Zhu; Qu, Dong; Yao, Libo [Chinese Academy of Medical Sciences and Peking Union Medical College, National Cancer Center/Cancer Hospital, Beijing (China); Shao, Huafei [Affiliated Yantai Yuhuangding Hospital of Qingdao University Medical College, Yantai (China); Liu, Jian [Meitan General Hospital, Beijing (China)

    2018-01-15

    To investigate the value of CT with inclusion of smaller lymph node (LN) sizes and axial ratio to improve the sensitivity in diagnosis of regional lymph node metastases in oesophageal squamous cell carcinoma (OSCC). The contrast-enhanced multidetector row spiral CT (MDCT) multiplanar reconstruction images of 204 patients with OSCC were retrospectively analysed. The long-axis and short-axis diameters of the regional LNs were measured and axial ratios were calculated (short-axis/long-axis diameters). Nodes were considered round if the axial ratio exceeded the optimal LN axial ratio, which was determined by receiver operating characteristic analysis. A positive predictive value (PPV) exceeding 50% is needed. This was achieved only with LNs larger than 9 mm in short-axis diameter, but nodes of this size were rare (sensitivity 37.3%, specificity 96.4%, accuracy 85.8%). If those round nodes (axial ratio exceeding 0.66) between 7 mm and 9 mm in size were considered metastases as well, it might improve the sensitivity to 67.2% with a PPV of 63.9% (specificity 91.6%, accuracy 87.2%). Combination of a smaller size and axial ratio for LNs in MDCT as criteria improves the detection sensitivity for LN metastases in OSCC. (orig.)
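The combined criterion reads as a simple decision rule; a sketch of our reading of it, with the thresholds taken from the abstract:

```python
def predicts_nodal_metastasis(short_axis_mm, long_axis_mm):
    """Decision rule suggested by the study: call a regional lymph node
    positive if its short axis exceeds 9 mm, or if it measures 7-9 mm
    and is round, i.e. its axial ratio (short/long axis) exceeds 0.66."""
    if long_axis_mm <= 0:
        raise ValueError("long-axis diameter must be positive")
    axial_ratio = short_axis_mm / long_axis_mm
    if short_axis_mm > 9.0:
        return True
    return 7.0 <= short_axis_mm <= 9.0 and axial_ratio > 0.66
```

The second branch is what lifts the sensitivity from 37.3% to 67.2%: smaller nodes are admitted only when their roundness makes metastasis more likely, which keeps the PPV above the 50% requirement.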

  1. Oestrus Detection in Dairy Cows Using Likelihood Ratio Tests

    DEFF Research Database (Denmark)

    Jónsson, Ragnar Ingi; Björgvinssin, Trausti; Blanke, Mogens

    2008-01-01

    This paper addresses detection of oestrus in dairy cows using methods from statistical change detection. The activity of the cows was measured by a necklace attached sensor. Statistical properties of the activity measure were investigated. Using data sets from 17 cows, diurnal activity variations...

  2. Damage Detection in an Operating Vestas V27 Wind Turbine Blade by use of Outlier Analysis

    DEFF Research Database (Denmark)

    Ulriksen, Martin Dalgaard; Tcherniak, Dmitri; Damkilde, Lars

    2015-01-01

    The present paper explores the application of a well-established vibration-based damage detection method to an operating Vestas V27 wind turbine blade. The blade is analyzed in a total of four states, namely, a healthy one plus three damaged ones in which trailing edge openings of increasing sizes...

  3. Improved nanostructure reconstruction by performing data refinement in optical scatterometry

    Science.gov (United States)

    Zhu, Jinlong; Jiang, Hao; Shi, Yating; Chen, Xiuguo; Zhang, Chuanwei; Liu, Shiyuan

    2016-01-01

    Recently, we have indirectly demonstrated that nanostructure reconstruction accuracy is degraded by the outliers in optical scatterometry, and we have applied the robust estimation method to suppress these outliers. However, a possible heavy masking effect entails a risk of low measurement accuracy, since the detection of outliers is based solely on judging residual values. In this work, a novel method is introduced to detect outliers directly, providing an intuitive display of outliers in a two-dimensional coordinate system. Moreover, a robust correction step based on the principle of least-trimmed-squares estimator regression is proposed to replace the conventional Gauss-Newton iteration step, by which a more reliable and accurate nanostructure reconstruction is achieved. The improved reconstruction of a one-dimensional etched Si grating has demonstrated the feasibility of the proposed methods.
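
    The least-trimmed-squares idea behind the robust correction step can be sketched with concentration steps: repeatedly fit ordinary least squares to the h points with the smallest squared residuals. This is a generic LTS illustration on a line fit, not the authors' scatterometry implementation; the initial-subset heuristic and iteration count are our choices:

```python
import numpy as np

def lts_line_fit(x, y, h=None, n_iter=20):
    """Least-trimmed-squares line fit by concentration steps (a simplified
    sketch of the LTS principle): at each step, fit OLS to the h points
    with the smallest squared residuals from the previous fit."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    h = h or int(0.75 * n)
    # crude initial subset: points closest to the median response
    idx = np.argsort(np.abs(y - np.median(y)))[:h]
    for _ in range(n_iter):
        slope, intercept = np.polyfit(x[idx], y[idx], 1)
        r2 = (y - (slope * x + intercept)) ** 2
        idx = np.argsort(r2)[:h]
    return slope, intercept
```

    With up to n - h contaminated points, the trimmed fit ignores the outliers that would drag an ordinary least-squares line away from the bulk of the data.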

  4. Improved nanostructure reconstruction by performing data refinement in optical scatterometry

    International Nuclear Information System (INIS)

    Zhu, Jinlong; Jiang, Hao; Shi, Yating; Chen, Xiuguo; Zhang, Chuanwei; Liu, Shiyuan

    2016-01-01

    Recently, we have indirectly demonstrated that nanostructure reconstruction accuracy is degraded by the outliers in optical scatterometry, and we have applied the robust estimation method to suppress these outliers. However, a possible heavy masking effect entails a risk of low measurement accuracy, since the detection of outliers is based solely on judging residual values. In this work, a novel method is introduced to detect outliers directly, providing an intuitive display of outliers in a two-dimensional coordinate system. Moreover, a robust correction step based on the principle of least-trimmed-squares estimator regression is proposed to replace the conventional Gauss–Newton iteration step, by which a more reliable and accurate nanostructure reconstruction is achieved. The improved reconstruction of a one-dimensional etched Si grating has demonstrated the feasibility of the proposed methods. (paper)

  5. Improved detection of sugar addition to maple syrup using malic acid as internal standard and in 13C isotope ratio mass spectrometry (IRMS).

    Science.gov (United States)

    Tremblay, Patrice; Paquin, Réal

    2007-01-24

    Stable carbon isotope ratio mass spectrometry (delta13C IRMS) was used to detect maple syrup adulteration by exogenous sugar addition (beet and cane sugar). Malic acid present in maple syrup is proposed as an isotopic internal standard to improve current adulteration detection levels. A lead precipitation method has been modified to quantitatively isolate malic acid from maple syrup using preparative reversed-phase liquid chromatography. The stable carbon isotopic ratio of malic acid isolated by this procedure shows an excellent accuracy and repeatability of 0.01 and 0.1 per thousand, respectively, confirming that the modified lead precipitation method is an isotopic fractionation-free process. A new approach is proposed to detect adulteration based on the correlation existing between delta13Cmalic acid and delta13Csugars-delta13Cmalic acid (r = 0.704). This technique has been tested on a set of 56 authentic maple syrup samples. Additionally, authentic samples were spiked with exogenous sugars. The mean theoretical detection level was statistically lowered using this technique in comparison with the usual two-standard-deviation approach, especially when maple syrup is adulterated with beet sugar: a detection level of 24 +/- 12% of adulteration versus 48 +/- 20% (t-test, p = 7.3 x 10^-15). The method was also applied to published data for pineapple juices and honey with the same improvement.
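
    The conventional two-standard-deviation screen that the internal-standard approach improves on can be sketched as follows; the reference mean and SD would come from a set of authentic syrups, and the values used in the example are made up:

```python
def adulteration_flag(d13c_sugars, d13c_malic, mean_diff, sd_diff):
    """Two-standard-deviation screen using malic acid as internal standard:
    flag a syrup when its sugars-minus-malic-acid delta13C difference
    deviates by more than 2 SD from the authentic-population difference.
    mean_diff and sd_diff must be estimated from authentic samples."""
    diff = d13c_sugars - d13c_malic
    return abs(diff - mean_diff) > 2.0 * sd_diff
```

    A syrup whose sugars are isotopically shifted relative to its own malic acid (e.g. by C3 beet sugar addition) falls outside the two-sigma band of authentic differences.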

  6. An Improved Generalized Predictive Control in a Robust Dynamic Partial Least Square Framework

    Directory of Open Access Journals (Sweden)

    Jin Xin

    2015-01-01

    Full Text Available To tackle the sensitivity to outliers in system identification, a new robust dynamic partial least squares (PLS) model based on an outlier detection method is proposed in this paper. An improved radial basis function network (RBFN) is adopted to construct the predictive model from the input and output datasets, and a hidden Markov model (HMM) is applied to detect the outliers. After the outliers are removed, a more robust dynamic PLS model is obtained. In addition, an improved generalized predictive control (GPC) with tuning weights under the dynamic PLS framework is proposed to deal with the interaction caused by model mismatch. The results of two simulations demonstrate the effectiveness of the proposed method.

  7. When the Plus Sign is a Negative: Challenging and Reinforcing Embodied Stigmas Through Outliers and Counter-Narratives.

    Science.gov (United States)

    Lippert, Alexandra

    2017-11-30

    When individuals become aware of their stigma, they attempt to manage their identity through discourses that both challenge and reinforce power. Identity management is fraught with tensions between the desire to fit normative social constructions and counter the same discourse. This essay explores identity management in the midst of the embodied stigmas concerning unplanned pregnancy during college and raising a biracial son. In doing so, this essay points to the difference between outlier narratives and counter-narratives. The author encourages health communication scholars to explore conditions under which storytelling moves beyond the personal to the political. Emancipatory intent does not guarantee emancipatory outcomes. Storytelling can function therapeutically for individuals while failing to redress forces that constrain human potential and agency.

  8. Radioactive anomaly discrimination from spectral ratios

    Science.gov (United States)

    Maniscalco, James; Sjoden, Glenn; Chapman, Mac Clements

    2013-08-20

    A method for discriminating a radioactive anomaly from naturally occurring radioactive materials includes detecting a first number of gamma photons having energies in a first range of energy values within a predetermined period of time and detecting a second number of gamma photons having energies in a second range of energy values within the predetermined period of time. The method further includes determining, in a controller, a ratio of the first number of gamma photons having energies in the first range and the second number of gamma photons having energies in the second range, and determining that a radioactive anomaly is present when the ratio exceeds a threshold value.
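
    The ratio test described in the patent abstract reduces to a few lines; the threshold value below is illustrative, not the patented setting:

```python
def is_anomaly(counts_window1, counts_window2, threshold=3.0):
    """Spectral-ratio screen: compare gamma counts collected in two energy
    windows over the same period and flag a radioactive anomaly when the
    ratio exceeds a threshold (threshold value here is illustrative)."""
    if counts_window2 == 0:
        return False  # no basis for a ratio
    return counts_window1 / counts_window2 > threshold
```

    Naturally occurring radioactive materials keep the two windows in a characteristic proportion, so a ratio well above that proportion indicates an anomaly.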

  9. Environmetrics. Part 1. Modeling of water salinity and air quality data

    International Nuclear Information System (INIS)

    Braibanti, A.; Gollapalli, N. R.; Jonnalagaddaj, S. B.; Duvvuru, S.; Rupenaguntla, S. R.

    2001-01-01

    Environmetrics utilizes advanced mathematical, statistical and information tools to extract information. Two typical environmental data sets are analysed using MVATOB (Multi Variate Tool Box). The first data set corresponds to the variable river salinity. Least median of squares (LMS) detected the outliers, whereas linear least squares (LLS) could not detect and remove them. The second data set consists of daily readings of air quality values. Outliers are detected by LMS, and unbiased regression coefficients are estimated by multi-linear regression (MLR). As the explanatory variables are not independent, principal component regression (PCR) and partial least squares regression (PLSR) are used. Both examples demonstrate the superiority of LMS over LLS.

  10. Evaluating Effect of Albendazole on Trichuris trichiura Infection: A Systematic Review Article.

    Science.gov (United States)

    Ahmadi Jouybari, Toraj; Najaf Ghobadi, Khadije; Lotfi, Bahare; Alavi Majd, Hamid; Ahmadi, Nayeb Ali; Rostami-Nejad, Mohammad; Aghaei, Abbas

    2016-01-01

    The aim of the study was to assess defaults and conduct a meta-analysis of the efficacy of single-dose oral albendazole against T. trichiura infection. We searched PubMed, ISI Web of Science, Science Direct, the Cochrane Central Register of Controlled Trials, and WHO library databases between 1983 and 2014. Data from 13 clinical trial articles were used. Each article included the effect of a single oral dose (400 mg) of albendazole and placebo in treating two groups of patients with T. trichiura infection. For both groups in each article, the sample size, the number of those with T. trichiura infection, and the number of those who recovered following intake of albendazole were identified and recorded. The relative risk and its variance were computed. Funnel plots and Begg's and Egger's tests were used to assess publication bias. The random-effect variance shift outlier model and the likelihood ratio test were applied to detect outliers. To detect influence, DFFITS values, Cook's distances and COVRATIO were used. Data were analyzed using STATA and R software. Articles 13 and 9 were identified as an outlier and an influential study, respectively. The outlier was diagnosed by the variance shift of the target study in the inferential method and by its RR value in the graphical method. The funnel plot and Begg's test did not show publication bias (P = 0.272); however, Egger's test confirmed it (P = 0.034). Meta-analysis after removal of article 13 showed a relative risk of 1.99 (95% CI 1.71-2.31). The estimated RR and our meta-analyses show that treatment of T. trichiura with single oral doses of albendazole is unsatisfactory. New anthelminthics are urgently needed.
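
    The per-trial effect measure pooled above is the relative risk; a minimal computation with a 95% CI on the log scale is shown below (the counts in the example are illustrative, not taken from the included trials):

```python
import math

def relative_risk(cured_t, n_treated, cured_p, n_placebo):
    """Relative risk of cure (treated vs. placebo) with a 95% CI computed
    on the log scale, the standard per-trial summary in such meta-analyses."""
    rr = (cured_t / n_treated) / (cured_p / n_placebo)
    # standard error of log(RR) for two independent binomial arms
    se = math.sqrt(1 / cured_t - 1 / n_treated + 1 / cured_p - 1 / n_placebo)
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, lo, hi
```

    The log-scale interval keeps the CI positive and symmetric in log(RR), which is why meta-analyses pool log relative risks rather than raw ratios.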

  11. Detection of Doppler Microembolic Signals Using High Order Statistics

    Directory of Open Access Journals (Sweden)

    Maroun Geryes

    2016-01-01

    Full Text Available Robust detection of the smallest circulating cerebral microemboli is an efficient way of preventing strokes, the second leading cause of mortality worldwide. Transcranial Doppler ultrasound is widely considered the most convenient system for the detection of microemboli. The standard detection method operates on the Doppler energy signal and depends on an empirically set constant threshold. In recent years, higher order statistics have been an extensive field of research, as they provide descriptive statistics that can be used to detect signal outliers. In this study, we propose new microembolic detectors based on windowed calculation of the third moment (skewness) and fourth moment (kurtosis) of the energy signal. During embolus-free periods the distribution of the energy is not altered, and the skewness and kurtosis signals do not exhibit peak values. In the presence of emboli, the energy distribution is distorted and the skewness and kurtosis signals exhibit peaks corresponding to those emboli. Applied to real signals, detection of microemboli through the skewness and kurtosis signals outperformed detection through standard methods. The skewness detector reached a sensitivity of 78% and a specificity of 91%, and the kurtosis detector a sensitivity of 80% and a specificity of 90%.
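
    A windowed skewness/kurtosis detector of this kind can be sketched with plain NumPy. The window length and the MAD-based peak test below are our choices for illustration, not the authors' settings:

```python
import numpy as np

def moment_detector(energy, win=64, k=5.0):
    """Sliding-window skewness/kurtosis outlier detector (sketch).

    Returns window start indices whose skewness or excess kurtosis lies
    more than k robust (MAD-scaled) deviations from the median, i.e. the
    peaks that embolus-like distortions produce in the moment signals."""
    e = np.asarray(energy, float)
    n = len(e) - win + 1
    skew, kurt = np.empty(n), np.empty(n)
    for i in range(n):
        w = e[i:i + win]
        z = (w - w.mean()) / (w.std() + 1e-12)
        skew[i] = np.mean(z ** 3)          # third standardized moment
        kurt[i] = np.mean(z ** 4) - 3.0    # excess kurtosis
    def peaks(x):
        med = np.median(x)
        mad = 1.4826 * np.median(np.abs(x - med)) + 1e-12
        return np.abs(x - med) / mad > k
    return np.where(peaks(skew) | peaks(kurt))[0]
```

    An isolated high-energy sample distorts every window that contains it, so the flagged indices cluster around the event.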

  12. An approach to the analysis of SDSS spectroscopic outliers based on self-organizing maps. Designing the outlier analysis software package for the next Gaia survey

    Science.gov (United States)

    Fustes, D.; Manteiga, M.; Dafonte, C.; Arcay, B.; Ulla, A.; Smith, K.; Borrachero, R.; Sordo, R.

    2013-11-01

    Aims: A new method applied to the segmentation and further analysis of the outliers resulting from the classification of astronomical objects in large databases is discussed. The method is being used in the framework of the Gaia satellite Data Processing and Analysis Consortium (DPAC) activities to prepare automated software tools that will be used to derive basic astrophysical information to be included in the final Gaia archive. Methods: Our algorithm has been tested by means of simulated Gaia spectrophotometry, which is based on SDSS observations and theoretical spectral libraries covering a wide sample of astronomical objects. Self-organizing map networks are used to organize the information in clusters of objects, as homogeneously as possible according to their spectral energy distributions, and to project them onto a 2D grid where the data structure can be visualized. Results: We demonstrate the usefulness of the method by analyzing the spectra that were rejected by the SDSS spectroscopic classification pipeline and thus classified as "UNKNOWN". First, our method can help distinguish between astrophysical objects and instrumental artifacts. Additionally, the application of our algorithm to SDSS objects of unknown nature has allowed us to identify classes of objects with similar astrophysical natures, and it allows for the potential discovery of hundreds of new objects, such as white dwarfs and quasars. The proposed method is therefore very promising for data exploration and knowledge discovery in very large astronomical databases, such as the archive from the upcoming Gaia mission.

  13. Data Fault Detection in Medical Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yang Yang

    2015-03-01

    Full Text Available Medical body sensors can be implanted or attached to the human body to monitor the physiological parameters of patients all the time. Inaccurate data due to sensor faults or incorrect placement on the body will seriously influence clinicians’ diagnosis, therefore detecting sensor data faults has been widely researched in recent years. Most of the typical approaches to sensor fault detection in the medical area ignore the fact that the physiological indexes of patients do not change synchronously, and fault values mixed with abnormal physiological data due to illness make it difficult to determine true faults. Based on these facts, we propose a Data Fault Detection mechanism in Medical sensor networks (DFD-M). Its mechanism includes: (1) use of a dynamic-local outlier factor (D-LOF) algorithm to identify outlying sensed data vectors; (2) use of a linear regression model based on trapezoidal fuzzy numbers to predict which readings in the outlying data vector are suspected to be faulty; (3) a novel judgment criterion of fault state according to the prediction values. The simulation results demonstrate the efficiency and superiority of DFD-M.
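
    The outlier-scoring stage in step (1) can be illustrated with a simple density-style score. The mean k-nearest-neighbour distance below is a stand-in for the D-LOF algorithm, whose details the abstract does not give:

```python
import numpy as np

def knn_outlier_scores(X, k=3):
    """Mean distance to the k nearest neighbours as an outlier score
    (a simplified stand-in for a local-outlier-factor-style scorer):
    isolated data vectors get large scores, dense ones small scores."""
    X = np.asarray(X, float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nearest = np.sort(d, axis=1)[:, 1:k + 1]  # drop the zero self-distance
    return nearest.mean(axis=1)
```

    A sensed data vector far from the cluster of normal physiological readings receives the highest score and would be passed to the regression stage for per-reading diagnosis.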

  14. Control device of air-fuel ratio of alcohol-gasoline mixed fuel

    Energy Technology Data Exchange (ETDEWEB)

    Takahashi, Kazuo

    1987-08-19

    With alcohol-gasoline blended fuel, the same quantity of fuel yields a different air-fuel ratio depending on its alcohol concentration, so the alcohol concentration must be known to hold the air-fuel ratio at a predetermined value. Although no sensor has been developed that detects alcohol concentration quickly and accurately, the alcohol concentration in the gasoline can be inferred by detecting the concentration of water in the exhaust gas, and many hygrometers are available that measure water concentration with high precision. For an internal combustion engine equipped with a fuel supply device that delivers alcohol-gasoline blended fuel into the engine intake passage, this invention offers an air-fuel ratio control device that controls the amount of fuel supplied by determining the alcohol concentration in the gasoline from the output signals of a main hygrometer and an auxiliary hygrometer. The former, set in the engine exhaust passage, detects the water concentration in the exhaust gas; the latter is installed to detect the water concentration in the intake air. (4 figs)

  15. Mitochondrial DNA heritage of Cres Islanders--example of Croatian genetic outliers.

    Science.gov (United States)

    Jeran, Nina; Havas Augustin, Dubravka; Grahovac, Blaženka; Kapović, Miljenko; Metspalu, Ene; Villems, Richard; Rudan, Pavao

    2009-12-01

    Diversity of mitochondrial DNA (mtDNA) lineages of the Island of Cres was determined by high-resolution phylogenetic analysis on a sample of 119 adult unrelated individuals from eight settlements. The composition of the mtDNA pool of this island population contrasts with other Croatian and European populations. The analysis revealed the highest frequency for haplogroup U (29.4%), with the predominance of a single lineage of subhaplogroup U2e (20.2%). Haplogroup H is the second most prevalent, at only 27.7%. Other notable features of the contemporary island population are an extremely low frequency of haplogroup J (only 0.84%) and a much higher frequency of haplogroup W (12.6%) compared with other Croatian and European populations. Especially interesting is the strikingly high frequency of haplogroup N1a (9.24%), represented by the African/south Asian branch that is almost absent in Europeans, whereas its European sister branch, shown to be highly prevalent among Neolithic farmers, is present in contemporary Europeans at only 0.2%. Haplotype analysis revealed that only five mtDNA lineages account for almost 50% of the maternal genetic heritage of this island, and they represent supposed founder lineages. These findings confirm that genetic drift, especially the founder effect, has played a significant role in shaping the genetic composition of the isolated population of the Island of Cres. Based on the presented data, the contemporary population of the Island of Cres can be considered a genetic "outlier" among Croatian populations.

  16. A quick method based on SIMPLISMA-KPLS for simultaneously selecting outlier samples and informative samples for model standardization in near infrared spectroscopy

    Science.gov (United States)

    Li, Li-Na; Ma, Chang-Ming; Chang, Ming; Zhang, Ren-Cheng

    2017-12-01

    A novel method based on SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) and Kernel Partial Least Squares (KPLS), named SIMPLISMA-KPLS, is proposed in this paper for the simultaneous selection of outlier samples and informative samples. It is a quick algorithm for model standardization (also called model transfer) in near infrared (NIR) spectroscopy. NIR data of corn samples, analysed for protein content, are used to evaluate the proposed method. Piecewise direct standardization (PDS) is employed in model transfer, and SIMPLISMA-PDS-KPLS and KS-PDS-KPLS are compared in terms of the prediction accuracy for protein content and the computation speed of each algorithm. The conclusions are that SIMPLISMA-KPLS can be used as an alternative sample selection method for model transfer. Although it has accuracy similar to Kennard-Stone (KS), it differs from KS in that it employs concentration information in the selection program. This ensures that analyte information is involved in the analysis and that the spectra (X) of the selected samples are interrelated with the concentration (y). It can also be used simultaneously for outlier sample elimination, by validation of the calibration. The timing results make clear that the sample selection process is more rapid when using KPLS. The quick SIMPLISMA-KPLS algorithm is beneficial for improving the speed of online measurement using NIR spectroscopy.
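
    For reference, the Kennard-Stone baseline compared against above selects samples purely from the spectra, by maximal minimum distance to the already-selected set. A compact sketch:

```python
import numpy as np

def kennard_stone(X, n_select):
    """Kennard-Stone sample selection: start from the two most distant
    samples, then repeatedly add the sample whose minimum distance to
    the selected set is largest. Uses spectra only - no concentration
    information, which is the difference SIMPLISMA-KPLS addresses."""
    X = np.asarray(X, float)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        mins = [d[k, selected].min() for k in remaining]
        selected.append(remaining[int(np.argmax(mins))])
    return selected
```

    Because selection ignores the response y, a KS subset can miss concentration extremes, which motivates concentration-aware selection.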

  17. Data Mining for Anomaly Detection

    Science.gov (United States)

    Biswas, Gautam; Mack, Daniel; Mylaraswamy, Dinkar; Bharadwaj, Raj

    2013-01-01

    The Vehicle Integrated Prognostics Reasoner (VIPR) program describes methods for enhanced diagnostics as well as a prognostic extension to the current state-of-the-art Aircraft Diagnostic and Maintenance System (ADMS). VIPR introduced a new anomaly detection function for discovering previously undetected and undocumented situations in which there are clear deviations from nominal behavior. Once a baseline (nominal model of operations) is established, the detection and analysis are split between on-aircraft outlier generation and off-aircraft expert analysis to characterize and classify events that may not have been anticipated by individual system providers. Offline expert analysis is supported by data curation and data mining algorithms that can be applied in the contexts of supervised and unsupervised learning. In this report, we discuss efficient methods for implementing the Kolmogorov complexity measure using compression algorithms and run a systematic empirical analysis to determine the best compression measure. Our experiments established that the combination of the DZIP compression algorithm and the CiDM distance measure provides the best results for capturing relevant properties of time series data encountered in aircraft operations. This combination was used as the basis for developing an unsupervised learning algorithm to define "nominal" flight segments using historical flight segments.

  18. Anomaly Detection for Beam Loss Maps in the Large Hadron Collider

    Science.gov (United States)

    Valentino, Gianluca; Bruce, Roderik; Redaelli, Stefano; Rossi, Roberto; Theodoropoulos, Panagiotis; Jaster-Merz, Sonja

    2017-07-01

    In the LHC, beam loss maps are used to validate collimator settings for cleaning and machine protection. This is done by monitoring the loss distribution in the ring during infrequent controlled loss map campaigns, as well as in standard operation. Due to the complexity of the system, consisting of more than 50 collimators per beam, it is difficult to identify small changes in the collimation hierarchy, which may be due to setting errors or beam orbit drifts with such methods. A technique based on Principal Component Analysis and Local Outlier Factor is presented to detect anomalies in the loss maps and therefore provide an automatic check of the collimation hierarchy.
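
    A minimal version of the first stage of this technique: fit principal components to the loss-map vectors and score each map by its reconstruction error. The paper pairs PCA with Local Outlier Factor; the error score below is a simpler stand-in for that second stage:

```python
import numpy as np

def pca_anomaly_scores(X, n_components=2):
    """PCA-based anomaly score for loss-map-like vectors: project each
    centred map onto the leading principal components and score it by
    the norm of what the projection fails to reconstruct."""
    X = np.asarray(X, float)
    Xc = X - X.mean(axis=0)
    # principal directions from the SVD of the centred data matrix
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components]
    recon = Xc @ P.T @ P
    return np.linalg.norm(Xc - recon, axis=1)
```

    Maps consistent with the usual collimation hierarchy lie close to the principal subspace and score near zero; a map distorted by a setting error or orbit drift sticks out of that subspace.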

  19. Anomaly Detection for Beam Loss Maps in the Large Hadron Collider

    International Nuclear Information System (INIS)

    Valentino, Gianluca; Bruce, Roderik; Redaelli, Stefano; Rossi, Roberto; Theodoropoulos, Panagiotis; Jaster-Merz, Sonja

    2017-01-01

    In the LHC, beam loss maps are used to validate collimator settings for cleaning and machine protection. This is done by monitoring the loss distribution in the ring during infrequent controlled loss map campaigns, as well as in standard operation. Due to the complexity of the system, consisting of more than 50 collimators per beam, it is difficult to identify small changes in the collimation hierarchy, which may be due to setting errors or beam orbit drifts with such methods. A technique based on Principal Component Analysis and Local Outlier Factor is presented to detect anomalies in the loss maps and therefore provide an automatic check of the collimation hierarchy. (paper)

  20. A Content-Adaptive Analysis and Representation Framework for Audio Event Discovery from "Unscripted" Multimedia

    Science.gov (United States)

    Radhakrishnan, Regunathan; Divakaran, Ajay; Xiong, Ziyou; Otsuka, Isao

    2006-12-01

    We propose a content-adaptive analysis and representation framework to discover events using audio features from "unscripted" multimedia such as sports and surveillance for summarization. The proposed analysis framework performs an inlier/outlier-based temporal segmentation of the content. It is motivated by the observation that "interesting" events in unscripted multimedia occur sparsely in a background of usual or "uninteresting" events. We treat the sequence of low/mid-level features extracted from the audio as a time series and identify subsequences that are outliers. The outlier detection is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We define the confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the parameters of the proposed framework and the confidence measure. Furthermore, we use the confidence measure to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out suspicious events from surveillance videos without any a priori knowledge. We show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length. Finally, we also show that the proposed framework can be used to systematically select "key audio classes" that are indicative of events of interest in the chosen domain.

  1. Detecting animal by-product intake using stable isotope ratio mass spectrometry (IRMS).

    Science.gov (United States)

    da Silva, D A F; Biscola, N P; Dos Santos, L D; Sartori, M M P; Denadai, J C; da Silva, E T; Ducatti, C; Bicudo, S D; Barraviera, B; Ferreira, R S

    2016-11-01

    Sheep are used in many countries as food and for manufacturing bioproducts. However, when these animals consume animal by-products (ABP), which is widely prohibited, there is a risk of transmitting scrapie - a fatal prion disease in human beings. Therefore, it is essential to develop sensitive methods to detect previous ABP intake to select safe animals for producing biopharmaceuticals. We used stable isotope ratio mass spectrometry (IRMS) for 13C and 15N to trace animal proteins in the serum of three groups of sheep: 1 - received only vegetable protein (VP) for 89 days; 2 - received animal and vegetable protein (AVP); and 3 - received animal and vegetable protein with animal protein subsequently removed (AVPR). Groups 2 and 3 received diets with 30% bovine meat and bone meal (MBM) added to a vegetable diet (from days 16-89 in the AVP group and until day 49 in the AVPR group, when MBM was removed). The AVPR group showed 15N equilibrium 5 days after MBM removal (54th day). Conversely, 15N equilibrium in the AVP group occurred 22 days later (76th day). The half-life differed between these groups by 3.55 days. In the AVPR group, 15N elimination required 53 days, which was similar to this isotope's incorporation time. Turnover was determined based on natural 15N signatures. IRMS followed by turnover calculations was used to evaluate the time period for the incorporation and elimination of animal protein in sheep serum. The δ13C and δ15N values were used to track animal protein in the diet. This method is biologically and economically relevant for the veterinary field because it can track protein over time or make a point assessment of animal feed with high sensitivity and resolution, providing a low-cost analysis coupled with fast detection. Isotopic profiles could be measured throughout the experimental period, demonstrating the potential to use the method for traceability and certification assessments. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Robust bivariate error detection in skewed data with application to historical radiosonde winds

    KAUST Repository

    Sun, Ying

    2017-01-18

    The global historical radiosonde archives date back to the 1920s and contain the only directly observed measurements of temperature, wind, and moisture in the upper atmosphere, but they contain many random errors. Most of the focus on cleaning these large datasets has been on temperatures, but winds are important inputs to climate models and in studies of wind climatology. The bivariate distribution of the wind vector does not have elliptical contours but is skewed and heavy-tailed, so we develop two methods for outlier detection based on the bivariate skew-t (BST) distribution, using either distance-based or contour-based approaches to flag observations as potential outliers. We develop a framework to robustly estimate the parameters of the BST and then show how the tuning parameter to get these estimates is chosen. In simulation, we compare our methods with one based on a bivariate normal distribution and a nonparametric approach based on the bagplot. We then apply all four methods to the winds observed for over 35,000 radiosonde launches at a single station and demonstrate differences in the number of observations flagged across eight pressure levels and through time. In this pilot study, the method based on the BST contours performs very well.
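
    Distance-based flagging is easiest to show for the bivariate-normal comparison method the abstract mentions (a full skew-t fit is beyond a sketch): flag wind vectors whose squared Mahalanobis distance exceeds a high chi-square quantile with 2 degrees of freedom.

```python
import numpy as np

def bivariate_normal_flags(uv, cutoff=13.8):
    """Distance-based outlier flags under a bivariate normal model.

    Flags points whose squared Mahalanobis distance from the sample mean
    exceeds `cutoff`; 13.8 is approximately the 0.999 quantile of the
    chi-square distribution with 2 degrees of freedom."""
    X = np.asarray(uv, float)
    mu = X.mean(axis=0)
    Cinv = np.linalg.inv(np.cov(X, rowvar=False))
    D = X - mu
    d2 = np.einsum('ij,jk,ik->i', D, Cinv, D)
    return d2 > cutoff
```

    For skewed, heavy-tailed wind vectors this normal-based rule over-flags one tail, which is precisely the deficiency the skew-t methods in the paper address.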

  3. Robust bivariate error detection in skewed data with application to historical radiosonde winds

    KAUST Repository

    Sun, Ying; Hering, Amanda S.; Browning, Joshua M.

    2017-01-01

    The global historical radiosonde archives date back to the 1920s and contain the only directly observed measurements of temperature, wind, and moisture in the upper atmosphere, but they contain many random errors. Most of the focus on cleaning these large datasets has been on temperatures, but winds are important inputs to climate models and in studies of wind climatology. The bivariate distribution of the wind vector does not have elliptical contours but is skewed and heavy-tailed, so we develop two methods for outlier detection based on the bivariate skew-t (BST) distribution, using either distance-based or contour-based approaches to flag observations as potential outliers. We develop a framework to robustly estimate the parameters of the BST and then show how the tuning parameter to get these estimates is chosen. In simulation, we compare our methods with one based on a bivariate normal distribution and a nonparametric approach based on the bagplot. We then apply all four methods to the winds observed for over 35,000 radiosonde launches at a single station and demonstrate differences in the number of observations flagged across eight pressure levels and through time. In this pilot study, the method based on the BST contours performs very well.

  4. Likelihood-ratio-based biometric verification

    NARCIS (Netherlands)

    Bazen, A.M.; Veldhuis, Raymond N.J.

    2002-01-01

    This paper presents results on optimal similarity measures for biometric verification based on fixed-length feature vectors. First, we show that the verification of a single user is equivalent to the detection problem, which implies that for single-user verification the likelihood ratio is optimal.
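
    For a single scalar feature with equal-variance Gaussian user and background models, the likelihood-ratio detector reduces to a few lines (a toy instance for illustration; the paper works with fixed-length feature vectors):

```python
import math

def gaussian_llr(x, mu_user, mu_background, sigma=1.0):
    """Log-likelihood ratio for a 1-D feature under equal-variance
    Gaussian user and background models; a positive value means the
    observation favours the claimed user."""
    def logpdf(v, mu):
        return (-0.5 * ((v - mu) / sigma) ** 2
                - math.log(sigma * math.sqrt(2.0 * math.pi)))
    return logpdf(x, mu_user) - logpdf(x, mu_background)
```

    Thresholding this ratio at 0 accepts exactly when the user model explains the observation better, which is the sense in which the likelihood ratio is the optimal similarity measure for single-user verification.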

  5. Likelihood Ratio-Based Biometric Verification

    NARCIS (Netherlands)

    Bazen, A.M.; Veldhuis, Raymond N.J.

    The paper presents results on optimal similarity measures for biometric verification based on fixed-length feature vectors. First, we show that the verification of a single user is equivalent to the detection problem, which implies that, for single-user verification, the likelihood ratio is optimal.
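Records 4-5 argue that, for single-user verification, the likelihood ratio is the optimal similarity measure. As a minimal, hedged illustration of that decision rule (not the authors' implementation), the sketch below scores a fixed-length feature vector by its log-likelihood ratio under a hypothetical per-user Gaussian model versus a background model; all models and numbers are illustrative assumptions:

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-likelihood of a scalar feature x under a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_likelihood_ratio(features, user_model, background_model):
    """Sum of per-feature log LRs: log p(x|user) - log p(x|background)."""
    llr = 0.0
    for x, (mu_u, var_u), (mu_b, var_b) in zip(features, user_model, background_model):
        llr += gaussian_loglik(x, mu_u, var_u) - gaussian_loglik(x, mu_b, var_b)
    return llr

# Hypothetical 3-dimensional feature models: (mean, variance) per dimension.
user = [(0.0, 1.0), (2.0, 0.5), (-1.0, 1.0)]
background = [(0.0, 4.0), (0.0, 4.0), (0.0, 4.0)]

genuine = [0.1, 1.9, -0.8]   # close to the user's model
impostor = [3.0, -2.0, 2.5]  # typical of the background population

print(log_likelihood_ratio(genuine, user, background) > 0)   # accept -> True
print(log_likelihood_ratio(impostor, user, background) < 0)  # reject -> True
```

Thresholding the log LR at 0 corresponds to equal priors; in practice the threshold is set from the desired false-accept/false-reject trade-off.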

  6. Hepatic MR imaging for in vivo differentiation of steatosis, iron deposition and combined storage disorder: Single-ratio in/opposed phase analysis vs. dual-ratio Dixon discrimination

    International Nuclear Information System (INIS)

    Bashir, Mustafa R.; Merkle, Elmar M.; Smith, Alastair D.; Boll, Daniel T.

    2012-01-01

    Objective: To assess whether in vivo dual-ratio Dixon discrimination can improve detection of diffuse liver disease, specifically steatosis, iron deposition and combined disease, over traditional single-ratio in/opposed-phase analysis. Methods: Seventy-one patients with biopsy-proven (17.7 ± 17.0 days) hepatic steatosis (n = 16), iron deposition (n = 11), combined deposition (n = 3) or neither disease (n = 41) underwent MR examinations. Dual-echo in/opposed-phase MR with Dixon water/fat reconstructions was acquired. Analysis consisted of: (a) single-ratio hepatic region-of-interest (ROI)-based assessment of in/opposed ratios; (b) dual-ratio hepatic ROI assessment of in/opposed and fat/water ratios; (c) computer-aided dual-ratio assessment evaluating all hepatic voxels. Disease-specific thresholds were determined; statistical analyses assessed disease-dependent voxel ratios based on single-ratio (a) and dual-ratio (b and c) techniques. Results: Single-ratio discrimination succeeded in identifying iron deposition and steatosis (I/O_Fat threshold > 1.15) from normal parenchyma, sensitivity 70.0%; it failed to detect combined disease. Dual-ratio discrimination succeeded in identifying abnormal hepatic parenchyma (F/W_Normal threshold > 0.05), sensitivity 96.7%; logarithmic discriminator functions for iron deposition (I/O_Iron discriminator < e^((0.01 − F/W_Iron)/0.48)) and for steatosis (I/O_Fat discriminator > e^((F/W_Fat − 0.01)/0.48)) differentiated combined from isolated diseases, sensitivity 100.0%; computer-aided dual-ratio analysis was comparably sensitive but less specific, 90.2% vs. 97.6%. Conclusion: MR two-point Dixon imaging using dual-ratio post-processing based on in/opposed and fat/water ratios improved in vivo detection of hepatic steatosis, iron deposition, and combined storage disease beyond traditional in/opposed analysis.

  7. Mixing ratio sensor for alcohol mixed fuel

    Energy Technology Data Exchange (ETDEWEB)

    Miyata, Shigeru; Matsubara, Yoshihiro

    1987-08-24

    To improve the combustion efficiency of an internal combustion engine running on gasoline-alcohol blended fuel and to reduce harmful substances in its exhaust gas, the air-fuel ratio and the ignition timing must be controlled strictly, which in turn requires detecting the mixing ratio of the blended fuel. A mixing ratio sensor has previously been proposed that detects this ratio by shining light into the fuel and exploiting the change in critical angle that accompanies changes in the composition of the fuel. However, because its transparent element was mounted in the fuel passage with sealing material in between, that sensor suffered fluid leakage as the sealing material deteriorated, and its cost was high because of the many parts to be assembled. To reduce the number of parts, lower part and assembly costs, and eliminate fluid leakage from the fuel passage, this invention forms the fuel passage and the transparent element of the mixing ratio sensor as a single integrated piece of light-transmitting resin. (3 figs)
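The sensor above infers the blend composition from the change of critical angle with the fuel's refractive index. A minimal sketch of that optical relationship (Snell's law at the window/fuel interface; the refractive indices below are illustrative assumptions, not values from the patent):

```python
import math

def critical_angle_deg(n_window, n_fuel):
    """Critical angle for total internal reflection at a window/fuel interface.
    Exists only when the window is optically denser than the fuel."""
    if n_fuel >= n_window:
        return None  # no total internal reflection possible
    return math.degrees(math.asin(n_fuel / n_window))

# Hypothetical refractive indices: a dense transparent window vs. two fuels.
n_window = 1.76
n_gasoline = 1.43
n_ethanol = 1.36

for name, n in [("gasoline", n_gasoline), ("ethanol", n_ethanol)]:
    print(name, round(critical_angle_deg(n_window, n), 1))
```

Because the two fuels yield measurably different critical angles, the angle at which reflected intensity drops off indicates the gasoline/alcohol mixing ratio.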

  8. EasyPCC: Benchmark Datasets and Tools for High-Throughput Measurement of the Plant Canopy Coverage Ratio under Field Conditions

    Directory of Open Access Journals (Sweden)

    Wei Guo

    2017-04-01

    Full Text Available Understanding interactions of genotype, environment, and management under field conditions is vital for selecting new cultivars and farming systems. Image analysis is considered a robust technique in high-throughput phenotyping with non-destructive sampling. However, analysis of digital field-derived images remains challenging because of the variety of light intensities, growth environments, and developmental stages. The plant canopy coverage (PCC) ratio is an important index of crop growth and development. Here, we present a tool, EasyPCC, for effective and accurate evaluation of the ground coverage ratio from a large number of images under variable field conditions. The core algorithm of EasyPCC is based on a pixel-based segmentation method using a decision-tree-based segmentation model (DTSM). EasyPCC was developed under the MATLAB® and R languages; thus, it can be implemented in high-performance computing to handle large numbers of images following just a single model training process. This study used an experimental set of images from a paddy field to demonstrate EasyPCC, and to show the accuracy improvement possible by adjusting key points (e.g., outlier deletion and model retraining). The accuracy (R2 = 0.99) of the calculated coverage ratio was validated against a corresponding benchmark dataset. The EasyPCC source code is released under the GPL license with benchmark datasets of several different crop types for algorithm development and for evaluating ground coverage ratios.
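The coverage ratio above is the fraction of image pixels classified as plant. As a minimal stand-in for EasyPCC's trained decision-tree model, the sketch below thresholds a simple excess-green index instead (the index, threshold, and pixel values are illustrative assumptions, not the paper's DTSM):

```python
def excess_green(r, g, b):
    """Excess-green index, a common vegetation cue: 2g - r - b on normalized RGB."""
    total = r + g + b
    if total == 0:
        return 0.0
    r, g, b = r / total, g / total, b / total
    return 2 * g - r - b

def coverage_ratio(pixels, threshold=0.05):
    """Fraction of pixels classified as plant by thresholding excess green."""
    plant = sum(1 for p in pixels if excess_green(*p) > threshold)
    return plant / len(pixels)

# Tiny synthetic "image": 30 green canopy pixels and 70 brown soil pixels.
canopy = [(60, 140, 50)] * 30
soil = [(120, 100, 80)] * 70
print(coverage_ratio(canopy + soil))  # -> 0.3
```

A trained per-pixel classifier (as in EasyPCC) replaces the fixed threshold rule but plugs into the same coverage computation.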

  9. Time-series models on somatic cell score improve detection of mastitis

    DEFF Research Database (Denmark)

    Norberg, E; Korsgaard, I R; Sloth, K H M N

    2008-01-01

    In-line detection of mastitis using frequent milk sampling was studied in 241 cows in a Danish research herd. Somatic cell scores obtained on a daily basis were analyzed using a mixture of four time-series models. Probabilities were assigned to each model for the observations to belong to a normal...... "steady-state" development, a change in "level", a change of "slope" or an "outlier". Mastitis was indicated by the sum of the probabilities for the "level" and "slope" models. The time-series models were based on the Kalman filter. Reference data were obtained from veterinary assessment of health status combined...... with bacteriological findings. At a sensitivity of 90% the corresponding specificity was 68%, which increased to 83% using a one-step-back smoothing. It is concluded that mixture models based on Kalman filters are efficient in handling in-line sensor data for detection of mastitis and may be useful for similar...
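The record above monitors a Kalman-filtered series for "level" changes. A much-simplified single-model stand-in (not the paper's four-model mixture) is a local-level Kalman filter that flags observations whose standardized one-step-ahead innovation is unusually large; all parameters and data below are illustrative assumptions:

```python
import math

def local_level_filter(ys, q=0.1, r=1.0, flag_sd=3.0):
    """Local-level Kalman filter over a scalar series; flags indices whose
    standardized one-step-ahead innovation exceeds flag_sd standard deviations
    (candidate level shifts or outliers). q = state noise, r = obs noise."""
    level, p = ys[0], 1.0
    flags = []
    for i, y in enumerate(ys[1:], start=1):
        p_pred = p + q                     # predict state variance
        innov = y - level                  # one-step-ahead innovation
        s = p_pred + r                     # innovation variance
        if abs(innov) / math.sqrt(s) > flag_sd:
            flags.append(i)
        k = p_pred / s                     # Kalman gain; measurement update
        level += k * innov
        p = (1 - k) * p_pred
    return flags

# Synthetic somatic-cell-score-like series with an abrupt rise at index 10.
series = [3.0] * 10 + [8.0] * 5
print(local_level_filter(series))  # -> [10, 11]
```

After a couple of flagged innovations the filter's level estimate catches up with the new mean, so a persistent shift stops being flagged; the mixture approach in the paper instead assigns it explicitly to a "level" model.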

  10. DETECTION OF OUTLIERS IN THE ECONOMIC-FINANCIAL PERFORMANCE OF SPORT CLUB CORINTHIANS PAULISTA IN THE PERIOD 2008 TO 2010

    Directory of Open Access Journals (Sweden)

    Marke Geisy da Silva Dantas

    2011-12-01

    Full Text Available Intangible assets permeate the football market, where the main assets of football entities are their contracts with players, and supporters are considered important users of accounting information, since they provide resources to these entities. It is in this context that the study gains relevance, aiming to analyze the presence of outliers in the accounts of Sport Club Corinthians Paulista for the years 2008 and 2009, when the club played in Série B of the Brazilian Championship and when the signing of Ronaldo took place, respectively. Regarding methodological procedures, this research constitutes an exploratory study, demonstrating the use of Grubbs' test to analyze the impact of intangible assets on Corinthians' accounts, detecting abnormalities in the years studied. Data were collected from websites and articles dealing with the measurement of football players and their classification as assets, and were processed with MICROSOFT EXCEL® spreadsheets. The results showed a large percentage increase in the accounts studied when comparing the years. Two outliers were found in 2008 ("Licensing and franchises" and "Total assets"), but in 2009 five accounts exceeded normality ("Licensing and franchises", "Sponsorship and advertising", "Match-day revenue", "TV rights" and "Championship prize money"). In 2010, only the "TV rights" account did.
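The study above applies Grubbs' test for a single outlier. A minimal sketch of the test statistic, applied to hypothetical account values (the figures are invented for illustration, and the critical value is an approximate two-sided 5% table value, not computed here):

```python
import statistics

def grubbs_statistic(xs):
    """Grubbs' statistic G = max|x_i - mean| / s, and the index achieving it."""
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)
    idx = max(range(len(xs)), key=lambda i: abs(xs[i] - mean))
    return abs(xs[idx] - mean) / s, idx

# Hypothetical yearly account values with one inflated entry (in millions).
accounts = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0]
g, idx = grubbs_statistic(accounts)

# Approximate two-sided 5% Grubbs critical value for n = 7 (published tables).
critical = 2.02
print(idx, g > critical)  # -> 6 True: the 95.0 entry is flagged
```

In practice the critical value comes from a t-distribution formula or published tables for the given n and significance level.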

  11. Automatic Pedestrian Crossing Detection and Impairment Analysis Based on Mobile Mapping System

    Science.gov (United States)

    Liu, X.; Zhang, Y.; Li, Q.

    2017-09-01

    Pedestrian crossings, as an important part of transportation infrastructure, serve to secure pedestrians' lives and possessions and to keep traffic flow in order. As a prominent feature in the street scene, detection of pedestrian crossings contributes to 3D road marking reconstruction and to diminishing the adverse impact of outliers in 3D street scene reconstruction. Since pedestrian crossings are subject to wear and tear from heavy traffic flow, it is imperative to monitor their status. On this account, an approach to automatic pedestrian crossing detection using images from a vehicle-based Mobile Mapping System is put forward, and crossing defilement and impairment are analyzed in this paper. Firstly, a pedestrian crossing classifier is trained with a low recall rate. Initial detections are then refined by utilizing projection filtering, contour information analysis, and monocular vision, yielding a pedestrian crossing detection and analysis system with high recall rate, precision and robustness. The system works under different situations and light conditions, and it recognizes defiled and impaired crossings automatically, which facilitates monitoring and maintenance of traffic facilities, so as to reduce potential traffic safety problems and secure lives and property.

  12. AUTOMATIC PEDESTRIAN CROSSING DETECTION AND IMPAIRMENT ANALYSIS BASED ON MOBILE MAPPING SYSTEM

    Directory of Open Access Journals (Sweden)

    X. Liu

    2017-09-01

    Full Text Available Pedestrian crossings, as an important part of transportation infrastructure, serve to secure pedestrians' lives and possessions and to keep traffic flow in order. As a prominent feature in the street scene, detection of pedestrian crossings contributes to 3D road marking reconstruction and to diminishing the adverse impact of outliers in 3D street scene reconstruction. Since pedestrian crossings are subject to wear and tear from heavy traffic flow, it is imperative to monitor their status. On this account, an approach to automatic pedestrian crossing detection using images from a vehicle-based Mobile Mapping System is put forward, and crossing defilement and impairment are analyzed in this paper. Firstly, a pedestrian crossing classifier is trained with a low recall rate. Initial detections are then refined by utilizing projection filtering, contour information analysis, and monocular vision, yielding a pedestrian crossing detection and analysis system with high recall rate, precision and robustness. The system works under different situations and light conditions, and it recognizes defiled and impaired crossings automatically, which facilitates monitoring and maintenance of traffic facilities, so as to reduce potential traffic safety problems and secure lives and property.

  13. A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data

    Directory of Open Access Journals (Sweden)

    Hongchao Song

    2017-01-01

    Full Text Available Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute neighborhood distances between observations and suffer from the curse of dimensionality in high-dimensional space: the distances between any pair of samples become similar, so every sample may appear to be an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble of k-nearest neighbor graph (K-NNG) based anomaly detectors. Benefiting from its capacity for nonlinear mapping, the DAE is first trained to learn the intrinsic features of the high-dimensional dataset and represent it in a more compact subspace. Several nonparametric K-NNG-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset, and the final prediction is made by combining all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves detection accuracy and reduces computational complexity.
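The ensemble half of the model above scores a point by its distance to nearest neighbors in several randomly sampled subsets. A minimal sketch of that idea (the DAE feature-learning stage is omitted, and k, subset sizes, and data are illustrative assumptions):

```python
import math, random

def knn_score(point, train, k=3):
    """Anomaly score: mean Euclidean distance to the k nearest training points."""
    dists = sorted(math.dist(point, t) for t in train)
    return sum(dists[:k]) / k

def ensemble_knn_scores(point, train, n_detectors=5, subset=50, seed=0):
    """Average k-NN score over detectors built on random subsets (bagging-style),
    mirroring the ensemble of K-NNG-based detectors described above."""
    rng = random.Random(seed)
    scores = [knn_score(point, rng.sample(train, subset)) for _ in range(n_detectors)]
    return sum(scores) / n_detectors

rng = random.Random(1)
normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
inlier, outlier = (0.1, -0.2), (6.0, 6.0)
print(ensemble_knn_scores(inlier, normal) < ensemble_knn_scores(outlier, normal))  # -> True
```

In the full method these scores would be computed in the autoencoder's compact latent space rather than in the raw coordinates.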

  14. D/H ratio for Jupiter

    International Nuclear Information System (INIS)

    Smith, H.; Schempp, W.V.; Baines, K.H.

    1989-01-01

    Observations of Jupiter's spectrum near the R5(0) HD line at 6063.88 Å are reported. A feature with an equivalent width of 0.065 ± 0.021 mÅ is coincident with the expected line. This feature is compared with HD profiles computed for inhomogeneous scattering models for Jupiter to yield a range for the Jovian D/H ratio of 1.0-2.9 × 10^-5. This D/H ratio is in the lower range of previously reported D/H values for Jupiter and corresponds to an essentially solar D/H ratio for Jupiter. The detection of HD features in the presence of probable blends with spectral features of minor atmospheric hydrocarbon molecules is discussed. Such blends may make unambiguous identification of HD features difficult. 26 references

  15. Volatility persistence in crude oil markets

    International Nuclear Information System (INIS)

    Charles, Amélie; Darné, Olivier

    2014-01-01

    Financial market participants and policy-makers can benefit from a better understanding of how shocks can affect volatility over time. This study assesses the impact of structural changes and outliers on volatility persistence of three crude oil markets – Brent, West Texas Intermediate (WTI) and Organization of Petroleum Exporting Countries (OPEC) – between January 2, 1985 and June 17, 2011. We identify outliers using a new semi-parametric test based on conditional heteroscedasticity models. These large shocks can be associated with particular event patterns, such as the invasion of Kuwait by Iraq, the Operation Desert Storm, the Operation Desert Fox, and the Global Financial Crisis as well as OPEC announcements on production reduction or US announcements on crude inventories. We show that outliers can bias (i) the estimates of the parameters of the equation governing volatility dynamics; (ii) the regularity and non-negativity conditions of GARCH-type models (GARCH, IGARCH, FIGARCH and HYGARCH); and (iii) the detection of structural breaks in volatility, and thus the estimation of the persistence of the volatility. Therefore, taking into account the outliers on the volatility modelling process may improve the understanding of volatility in crude oil markets. - Highlights: • We study the impact of outliers on volatility persistence of crude oil markets. • We identify outliers and patches of outliers due to specific events. • We show that outliers can bias (i) the estimates of the parameters of GARCH models, (ii) the regularity and non-negativity conditions of GARCH-type models, (iii) the detection of structural breaks in volatility of crude oil markets
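The study above flags large shocks with a semi-parametric test built on conditional heteroscedasticity models. As a much simpler, hedged stand-in (a common robust rule of thumb, not the paper's test), the sketch below flags returns whose deviation from the median exceeds a multiple of the MAD-based robust standard deviation; the return series and threshold are illustrative assumptions:

```python
import statistics

def mad_outliers(returns, k=5.0):
    """Flag indices whose deviation from the median exceeds k robust SDs,
    where the robust SD is 1.4826 * median absolute deviation (MAD)."""
    med = statistics.median(returns)
    mad = statistics.median(abs(r - med) for r in returns)
    robust_sd = 1.4826 * mad
    return [i for i, r in enumerate(returns) if abs(r - med) > k * robust_sd]

# Synthetic daily returns with two large shocks (e.g., geopolitical events).
returns = [0.001, -0.002, 0.003, -0.001, 0.002, -0.15, 0.001, 0.0, 0.12, -0.003]
print(mad_outliers(returns))  # -> [5, 8]
```

Median and MAD are used instead of mean and standard deviation precisely so that the shocks being hunted do not contaminate the scale estimate, echoing the paper's point that outliers bias GARCH parameter estimates.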

  16. A nontoxic, photostable and high signal-to-noise ratio mitochondrial probe with mitochondrial membrane potential and viscosity detectivity

    Science.gov (United States)

    Chen, Yanan; Qi, Jianguo; Huang, Jing; Zhou, Xiaomin; Niu, Linqiang; Yan, Zhijie; Wang, Jianhong

    2018-01-01

    Herein, we report a yellow-emission probe, 1-methyl-4-(6-morpholino-1,3-dioxo-1H-benzo[de]isoquinolin-2(3H)-yl)pyridin-1-ium iodide, which specifically stains mitochondria in living immortalized and normal cells. In comparison to a common mitochondria tracker (MitoTracker Deep Red, MTDR), this probe was nontoxic and photostable, exhibited an ultrahigh signal-to-noise ratio, and could monitor mitochondria in real time over long periods. Moreover, the probe showed high sensitivity towards changes in mitochondrial membrane potential and intramitochondrial viscosity. Consequently, it was used for imaging mitochondria and for detecting changes in mitochondrial membrane potential and intramitochondrial viscosity in physiological and pathological processes.

  17. Detection of Synthetic Testosterone Use by Novel Comprehensive Two-Dimensional Gas Chromatography Combustion Isotope Ratio Mass Spectrometry (GC×GCC-IRMS)

    Science.gov (United States)

    Tobias, Herbert J.; Zhang, Ying; Auchus, Richard J.; Brenna, J. Thomas

    2011-01-01

    We report the first demonstration of Comprehensive Two-dimensional Gas Chromatography Combustion Isotope Ratio Mass Spectrometry (GC×GCC-IRMS) for the analysis of urinary steroids to detect illicit synthetic testosterone use, of interest in sport doping. GC coupled to IRMS (GCC-IRMS) is currently used to measure the carbon isotope ratios (CIR, δ13C) of urinary steroids in anti-doping efforts; however, extensive cleanup of urine extracts is required prior to analysis to enable baseline separation of target steroids. With its greater separation capability, GC×GC has the potential to reduce sample preparation requirements and enable CIR analysis of minimally processed urine extracts. Challenges addressed include on-line reactors with minimized dimensions to retain narrow peak shapes, baseline separation of peaks in some cases, and reconstruction of isotopic information from sliced steroid chromatographic peaks. Remaining difficulties include the long-term robustness of on-line reactors and urine matrix effects that preclude baseline separation and isotopic analysis of low-concentration and trace components. In this work, steroids were extracted, acetylated, and analyzed using a refined, home-built GC×GCC-IRMS system. 11-hydroxy-androsterone (11OHA) and 11-ketoetiocholanolone (11KE) were chosen as endogenous reference compounds (ERC) because of their satisfactory signal intensity, and their CIR was compared to the target compounds (TC) androsterone (A) and etiocholanolone (E). Separately, a GC×GC-qMS system was used to measure testosterone (T)/EpiT concentration ratios. Extracts of urine pooled from professional athletes, of urine from one individual who received testosterone gel (T-gel), and of urine from one individual who received testosterone injections (T-shot) were analyzed. The average precisions of δ13C and Δδ13C measurements were SD(δ13C) approximately ± 1‰ (n=11). The T-shot sample resulted in a positive for T use with a T/EpiT ratio of > 9 and CIR

  18. Automated detection of Lupus white matter lesions in MRI

    Directory of Open Access Journals (Sweden)

    Eloy Roura Perez

    2016-08-01

    Full Text Available Brain magnetic resonance imaging provides detailed information which can be used to detect and segment white matter lesions (WML). In this work we propose an approach to automatically segment WML in Lupus patients by using T1w and fluid-attenuated inversion recovery (FLAIR) images. Lupus WML appear as small focal abnormal tissue observed as hyperintensities in the FLAIR images. The quantification of these WML is a key factor for the stratification of lupus patients and therefore both lesion detection and segmentation play an important role. In our approach, the T1w image is first used to classify the three main tissues of the brain, white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF), while the FLAIR image is then used to detect focal WML as outliers of its GM intensity distribution. A set of post-processing steps based on lesion size, tissue neighborhood, and location are used to refine the lesion candidates. The proposal is evaluated on 20 patients, presenting qualitative and quantitative results in terms of precision and sensitivity of lesion detection (True Positive Rate of 62% and Positive Prediction Value of 80%, respectively) as well as segmentation accuracy (Dice Similarity Coefficient of 72%). The obtained results illustrate the validity of the approach to automatically detect and segment lupus lesions. Besides, our approach is publicly available as an SPM8/12 toolbox extension with a simple parameter configuration.
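The lesion-candidate step above treats FLAIR hyperintensities as outliers of the gray matter intensity distribution. A minimal one-dimensional sketch of that idea (a plain mean + k·SD threshold on invented intensity values; the paper's actual pipeline is more elaborate):

```python
import statistics

def flag_hyperintense(flair_values, gm_values, k=3.0):
    """Flag FLAIR intensities lying more than k SDs above the GM mean
    (candidate white matter lesion voxels)."""
    mu = statistics.fmean(gm_values)
    sd = statistics.stdev(gm_values)
    return [i for i, v in enumerate(flair_values) if v > mu + k * sd]

gm = [100, 102, 98, 101, 99, 103, 97, 100]  # hypothetical GM intensities
voxels = [101, 99, 150, 100, 160]           # two hyperintense candidates
print(flag_hyperintense(voxels, gm))  # -> [2, 4]
```

In the full approach, candidates flagged this way are then filtered by the size, neighborhood, and location rules described in the abstract.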

  19. The model of fraud detection in financial statements by means of financial ratios

    OpenAIRE

    Kanapickienė, Rasa; Grundienė, Živilė

    2015-01-01

    Analysis of financial ratios is one of the simpler methods of identifying fraud. A theoretical survey revealed that, in the scientific literature, financial ratios are analysed in order to determine which ratios in the financial statements are the most sensitive with respect to the motives of companies' executive managers and employees to commit fraud. The empirical study included the analysis of the following: 1) 40 sets of fraudulent financial statements and 2) 125 sets of non-fraudulent financ...

  20. Detecting and monitoring water stress states in maize crops using spectral ratios obtained in the photosynthetic domain

    Science.gov (United States)

    Baranoski, Gladimir V. G.; Van Leeuwen, Spencer R.

    2017-07-01

    The reliable detection and monitoring of changes in the water status of crops composed of plants like maize, a highly adaptable C4 species in large demand for both food and biofuel production, are longstanding remote sensing goals. Existing procedures employed to achieve these goals rely predominantly on the spectral signatures of plant leaves in the infrared domain where the light absorption within the foliar tissues is dominated by water. It has been suggested that such procedures could be implemented using subsurface reflectance to transmittance ratios obtained in the visible (photosynthetic) domain with the assistance of polarization devices. However, the experiments leading to this proposition were performed on detached maize leaves, which were not influenced by the whole (living) plant's adaptation mechanisms to water stress. In this work, we employ predictive simulations of light-leaf interactions in the photosynthetic domain to demonstrate that the living specimens' physiological responses to dehydration stress should be taken into account in this context. Our findings also indicate that a reflectance to transmittance ratio obtained in the photosynthetic domain at a lower angle of incidence without the use of polarization devices may represent a cost-effective alternative for the assessment of water stress states in maize crops.

  1. Outlier detection algorithms for least squares time series regression

    DEFF Research Database (Denmark)

    Johansen, Søren; Nielsen, Bent

    We review recent asymptotic results on some robust methods for multiple regression. The regressors include stationary and non-stationary time series as well as polynomial terms. The methods include the Huber-skip M-estimator, 1-step Huber-skip M-estimators, in particular the Impulse Indicator Sat...

  2. Methods of Detecting Outliers in A Regression Analysis Model ...

    African Journals Online (AJOL)

    PROF. O. E. OSUAGWU

    2013-06-01

    Jun 1, 2013 ... especially true in observational studies ... Simple linear regression and multiple ... The simple linear ... Grubbs, F.E. (1950): Sample Criteria for Testing Outlying Observations. Annals of ... In experimental design, the Relative.

  3. Sequential probability ratio controllers for safeguards radiation monitors

    International Nuclear Information System (INIS)

    Fehlau, P.E.; Coop, K.L.; Nixon, K.V.

    1984-01-01

    Sequential hypothesis tests applied to nuclear safeguards accounting methods make the methods more sensitive to detecting diversion. The sequential tests also improve transient signal detection in safeguards radiation monitors. This paper describes three microprocessor control units with sequential probability-ratio tests for detecting transient increases in radiation intensity. The control units are designed for three specific applications: low-intensity monitoring with Poisson probability ratios, higher intensity gamma-ray monitoring where fixed counting intervals are shortened by sequential testing, and monitoring moving traffic where the sequential technique responds to variable-duration signals. The fixed-interval controller shortens a customary 50-s monitoring time to an average of 18 s, making the monitoring delay less bothersome. The controller for monitoring moving vehicles benefits from the sequential technique by maintaining more than half its sensitivity when the normal passage speed doubles
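The low-intensity controller described above applies Wald's sequential probability ratio test to Poisson counts. A minimal sketch of that test (the background/source rates, error probabilities, and count streams are illustrative assumptions, not the controllers' actual settings):

```python
import math

def poisson_sprt(counts, lam0, lam1, alpha=0.001, beta=0.1):
    """Wald SPRT for H0: Poisson rate lam0 (background) vs H1: rate lam1 (source).
    Processes counts per fixed interval; returns ("H1"|"H0"|"continue", n)."""
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0
    for n, k in enumerate(counts, start=1):
        # Poisson log-likelihood ratio increment for one counting interval.
        llr += k * math.log(lam1 / lam0) - (lam1 - lam0)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "continue", len(counts)

# Background of 5 counts/interval vs a source giving 15 counts/interval.
print(poisson_sprt([16, 14, 15, 17, 15, 14], lam0=5.0, lam1=15.0))  # -> ('H1', 1)
print(poisson_sprt([5, 4, 6, 5, 4, 5, 5, 6], lam0=5.0, lam1=15.0))  # -> ('H0', 1)
```

The appeal for monitoring is visible here: a strong signal (or its clear absence) terminates the test after very few intervals, which is how the fixed-interval controller shortens the average monitoring time.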

  4. Inclusion Detection in Aluminum Alloys Via Laser-Induced Breakdown Spectroscopy

    Science.gov (United States)

    Hudson, Shaymus W.; Craparo, Joseph; De Saro, Robert; Apelian, Diran

    2018-04-01

    Laser-induced breakdown spectroscopy (LIBS) has shown promise as a technique to quickly determine molten metal chemistry in real time. Because of these characteristics, LIBS could also be used to sense unwanted inclusions and impurities. Simulated Al2O3 inclusions were added to molten aluminum via a metal-matrix composite, and LIBS was performed in situ to determine whether the particles could be detected. Outlier analysis of the oxygen signal was performed on the LIBS data and compared to the oxide volume fraction measured through metallography. It was determined that LIBS could differentiate between melts with different amounts of inclusions by monitoring the fluctuations in signal for elements of interest. LIBS thus shows promise as an enabling tool for monitoring metal cleanliness.

  5. Combining multivariate analysis and monosaccharide composition modeling to identify plant cell wall variations by Fourier Transform Near Infrared spectroscopy

    Directory of Open Access Journals (Sweden)

    Smith-Moritz Andreia M

    2011-08-01

    Full Text Available We outline a high-throughput procedure that improves outlier detection in cell wall screens using FT-NIR spectroscopy of plant leaves. The improvement relies on generating a calibration set from a subset of a mutant population by taking advantage of the Mahalanobis distance outlier scheme to construct a monosaccharide range predictive model using PLS regression. This model was then used to identify specific monosaccharide outliers from the mutant population.
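The calibration step above ranks samples by Mahalanobis distance, which accounts for correlation between spectral variables. A minimal two-variable sketch (the PLS regression stage is omitted, the 2x2 covariance is inverted by hand, and the absorbance values are illustrative assumptions):

```python
import statistics

def mahalanobis2_sq(point, data):
    """Squared Mahalanobis distance of a 2-D point from the data's mean,
    using the sample covariance (inverted manually for the 2x2 case)."""
    xs = [p[0] for p in data]
    ys = [p[1] for p in data]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(data)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form [dx dy] * inv(S) * [dx dy]^T for the 2x2 covariance S.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical 2-band NIR absorbances forming the calibration cloud.
samples = [(1.0, 2.0), (1.1, 2.3), (0.9, 1.8), (1.2, 2.1), (0.8, 1.9)]
print(mahalanobis2_sq((2.0, 1.0), samples) > mahalanobis2_sq((1.0, 2.0), samples))  # -> True
```

Real spectra have hundreds of correlated wavelengths, so in practice the distance is computed in a reduced space (e.g., principal component or PLS scores) where the covariance is well conditioned.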

  6. Likelihood ratio decisions in memory: three implied regularities.

    Science.gov (United States)

    Glanzer, Murray; Hilford, Andrew; Maloney, Laurence T

    2009-06-01

    We analyze four general signal detection models for recognition memory that differ in their distributional assumptions. Our analyses show that a basic assumption of signal detection theory, the likelihood ratio decision axis, implies three regularities in recognition memory: (1) the mirror effect, (2) the variance effect, and (3) the z-ROC length effect. For each model, we present the equations that produce the three regularities and show, in computed examples, how they do so. We then show that the regularities appear in data from a range of recognition studies. The analyses and data in our study support the following generalization: Individuals make efficient recognition decisions on the basis of likelihood ratios.

  7. Comparative Performance of Four Single Extreme Outlier Discordancy Tests from Monte Carlo Simulations

    Directory of Open Access Journals (Sweden)

    Surendra P. Verma

    2014-01-01

    Full Text Available Using highly precise and accurate Monte Carlo simulations of 20,000,000 replications and 102 independent simulation experiments with extremely low simulation errors and total uncertainties, we evaluated the performance of four single outlier discordancy tests (Grubbs test N2, Dixon test N8, skewness test N14, and kurtosis test N15) for normal samples of sizes 5 to 20. Statistical contaminations of a single observation resulting from parameters called δ from ±0.1 up to ±20 for modeling the slippage of central tendency, or ε from ±1.1 up to ±200 for slippage of dispersion, as well as no contamination (δ=0 and ε=±1), were simulated. Because of the use of precise and accurate random and normally distributed simulated data, very large replications, and a large number of independent experiments, this paper presents a novel approach for precise and accurate estimations of power functions of four popular discordancy tests and, therefore, should not be considered as a simple simulation exercise unrelated to probability and statistics. From both the criterion of the Power of Test proposed by Hayes and Kinsella and the Test Performance Criterion of Barnett and Lewis, Dixon test N8 performs less well than the other three tests. The overall performance of these four tests could be summarized as N2≅N15>N14>N8.

  8. Australasian microtektites: Impactor identification using Cr, Co and Ni ratios

    Science.gov (United States)

    Folco, L.; Glass, B. P.; D'Orazio, M.; Rochette, P.

    2018-02-01

    Impactor identification is one of the challenges of large-scale impact cratering studies due to the dilution of meteoritic material in impactites (typically ratios in a Co/Ni vs Cr/Ni space (46 microtektites analyzed in this work by Laser Ablation-Inductively Coupled Plasma -Mass Spectrometry and 31 from literature by means of Neutron Activation Analyses with Cr, Co and Ni concentrations up to ∼370, 50 and 680 μg/g, respectively). Despite substantial overlap in Cr/Ni versus Co/Ni composition for several meteorite types with chondritic composition (chondrites and primitive achondrites), regression calculation based on ∼85% of the studied microtektites best fit a mixing line between crustal compositions and an LL chondrite. However, due to some scatter mainly in the Cr versus Ni ratios in the considered dataset, an LL chondrite may not be the best fit to the data amongst impactors of primitive compositions. Eight high Ni/Cr and five low Ni/Cr outlier microtektites (∼15% in total) deviate from the above mixing trend, perhaps resulting from incomplete homogenization of heterogeneous impactor and target precursor materials at the microtektite scale, respectively. Together with previous evidence from the ∼35 Myr old Popigai impact spherules and the ∼1 Myr old Ivory Coast microtektites, our finding suggests that at least three of the five known Cenozoic distal impact ejecta were generated by the impacts of large stony asteroids of chondritic composition, and possibly of ordinary chondritic composition. The impactor signature found in Australasian microtektites documents mixing of target and impactor melts upon impact cratering. This requires target-impactor mixing in both the two competing models in literature for the formation of the Australasian tektites/microtektites: the impact cratering and low-altitude airburst plume models.

  9. Elevated sacroiliac joint uptake ratios in systemic lupus erythematosus

    International Nuclear Information System (INIS)

    De Smet, A.A.; Mahmood, T.; Robinson, R.G.; Lindsley, H.B.

    1984-01-01

    Sacroiliac joint radiographs and radionuclide sacroiliac joint uptake ratios were obtained on 14 patients with active systemic lupus erythematosus. Elevated joint ratios were found unilaterally in two patients and bilaterally in seven patients when their lupus was active. In patients whose disease became quiescent, the uptake ratios returned to normal. Two patients had persistently elevated ratios with continued clinical and laboratory evidence of active lupus. Mild sacroiliac joint sclerosis and erosions were detected on pelvic radiographs in these same two patients. Elevated quantitative sacroiliac joint uptake ratios may occur as a manifestation of active systemic lupus erythematosus

  10. Detection of data taking anomalies for the ATLAS experiment

    CERN Document Server

    De Castro Vargas Fernandes, Julio; The ATLAS collaboration; Lehmann Miotto, Giovanna

    2015-01-01

    The physics signals produced by the ATLAS detector at the Large Hadron Collider (LHC) at CERN are acquired and selected by a distributed Trigger and Data AcQuisition (TDAQ) system, comprising a large number of hardware devices and software components. In this work, we focus on the problem of online detection of anomalies during the data taking period. Anomalies, in this context, are defined as unexpected behaviour of the TDAQ system that results in a loss of data taking efficiency; the causes of those anomalies may come from the TDAQ itself or from external sources. While the TDAQ system operates, it publishes several useful pieces of information (trigger rates, dead times, memory usage, ...). Such information over time creates a set of time series that can be monitored in order to detect (and react to) problems (or anomalies). Here, we approach TDAQ operation monitoring from a data quality perspective, i.e., an anomaly is seen as a loss of quality (an outlier) and it is reported: this information can be used to rea...

  11. Detecting instability in the volatility of carbon prices

    Energy Technology Data Exchange (ETDEWEB)

    Chevallier, Julien [Univ. Paris Dauphine (France)

    2011-01-15

    This article investigates the presence of outliers in the volatility of carbon prices. We compute three different measures of volatility for European Union Allowances, based on daily data (EGARCH model), option prices (implied volatility), and intraday data (realized volatility). Based on the methodology developed by Zeileis et al. (2003) and Zeileis (2006), we detect instability in the volatility of carbon prices based on two kinds of tests: retrospective tests (OLS-/Recursive-based CUSUM processes, F-statistics, and residual sum of squares), and forward-looking tests (by monitoring structural changes recursively or with moving estimates). We show evidence of strong shifts mainly for the EGARCH and IV models during the time period. Overall, we suggest that yearly compliance events, and growing uncertainties in post-Kyoto international agreements, may explain the instability in the volatility of carbon prices. (author)
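
    The retrospective OLS-based CUSUM test mentioned above can be sketched in a few lines. The following is a minimal numpy illustration under an assumed constant-mean model, with 1.36 as the approximate asymptotic 5% boundary; the paper's analysis uses the full structural-change toolbox of Zeileis et al., not this simplification, and the series below is synthetic.

```python
import numpy as np

def ols_cusum(y):
    """OLS-residual-based CUSUM process for a constant-mean model.

    Returns the empirical fluctuation process W and max |W|, to be
    compared against an asymptotic boundary (about 1.36 at the 5% level).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    resid = y - y.mean()                         # OLS residuals of y ~ 1
    sigma = resid.std(ddof=1)
    w = np.cumsum(resid) / (sigma * np.sqrt(n))  # empirical fluctuation process
    return w, np.abs(w).max()

# Illustrative series with a level shift halfway through (synthetic data)
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
w, stat = ols_cusum(y)
print(stat > 1.36)  # True -> instability detected at roughly the 5% level
```

    A forward-looking (monitoring) variant would recompute the process as new observations arrive and raise an alarm as soon as the boundary is crossed.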

  12. An Improvement of the Hotelling T2 Statistic in Monitoring Multivariate Quality Characteristics

    Directory of Open Access Journals (Sweden)

    Ashkan Shabbak

    2012-01-01

    The Hotelling T2 statistic is the most popular statistic used in multivariate control charts to monitor multiple quality characteristics. However, this statistic is easily affected by the existence of more than one outlier in the data set. To rectify this problem, robust control charts based on the minimum volume ellipsoid and the minimum covariance determinant have been proposed. Most researchers assess the performance of multivariate control charts based on the number of signals, without paying much attention to whether those signals are really outliers. Accordingly, we propose to evaluate control charts not only by the number of detected outliers but also by their correct positions. In this paper, an Upper Control Limit based on the median and the median absolute deviation is also proposed. The results of this study signify that the proposed Upper Control Limit improves the detection of correct outliers, but that it suffers from a swamping effect when the positions of outliers are not taken into consideration. Finally, a robust control chart based on the diagnostic robust generalised potential procedure is introduced to remedy this drawback.
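
    A minimal numpy sketch of the idea of a median/MAD-based Upper Control Limit for Hotelling T2 values. The tuning constant c and the planted-outlier data are assumptions for illustration, not values from the paper.

```python
import numpy as np

def t2_statistics(X):
    """Classical Hotelling T^2 value for each observation (mean/covariance based)."""
    mu = X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = X - mu
    return np.einsum('ij,jk,ik->i', d, S_inv, d)

def mad_ucl(t2, c=3.0):
    """Upper Control Limit from the median and scaled MAD of the T^2 values.
    The tuning constant c is an assumption, not taken from the paper."""
    med = np.median(t2)
    mad = 1.4826 * np.median(np.abs(t2 - med))
    return med + c * mad

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[:5] += 8.0                          # five planted outliers (synthetic data)
t2 = t2_statistics(X)
flagged = np.where(t2 > mad_ucl(t2))[0]
print(set(flagged) >= {0, 1, 2, 3, 4})
```

    Because the median and MAD are barely influenced by the few large T2 values, the limit stays low and the planted outliers signal; some in-control points may also be flagged, which is the swamping effect the paper discusses.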

  13. Unsupervised Scalable Statistical Method for Identifying Influential Users in Online Social Networks.

    Science.gov (United States)

    Azcorra, A; Chiroque, L F; Cuevas, R; Fernández Anta, A; Laniado, H; Lillo, R E; Romo, J; Sguera, C

    2018-05-03

    Billions of users interact intensively every day via Online Social Networks (OSNs) such as Facebook, Twitter, or Google+. This makes OSNs an invaluable source of information, and channel of actuation, for sectors like advertising, marketing, or politics. To get the most out of OSNs, analysts need to identify influential users that can be leveraged for promoting products, distributing messages, or improving the image of companies. In this report we propose a new unsupervised method, Massive Unsupervised Outlier Detection (MUOD), based on outlier detection, to support the identification of influential users. MUOD is scalable and can hence be used in large OSNs. Moreover, it labels the outliers as of shape, magnitude, or amplitude, depending on their features. This allows classifying the outlier users into multiple different classes, which are likely to include different types of influential users. Applying MUOD to a subset of roughly 400 million Google+ users allowed us to identify and discriminate automatically sets of outlier users presenting features associated with different definitions of influential users, such as the capacity to attract engagement, the capacity to attract a large number of followers, or high infection capacity.

  14. Robust Curb Detection with Fusion of 3D-Lidar and Camera Data

    Directory of Open Access Journals (Sweden)

    Jun Tan

    2014-05-01

    Curb detection is an essential component of Autonomous Land Vehicles (ALVs), and especially important for safe driving in urban environments. In this paper, we propose a fusion-based curb detection method that exploits 3D-Lidar and camera data. More specifically, we first fuse the sparse 3D-Lidar points and high-resolution camera images to recover a dense depth image of the captured scene. Based on the recovered dense depth image, we propose a filter-based method to estimate the normal direction within the image. Then, using multi-scale normal patterns based on the curb's geometric properties, curb point features fitting the patterns are detected in the normal image row by row. After that, we construct a Markov Chain to model the consistency of curb points, which exploits the continuity of the curb, so that the optimal curb path linking the curb points together can be efficiently estimated by dynamic programming. Finally, we perform post-processing operations to filter the outliers, parameterize the curbs, and assign confidence scores to the detected curbs. Extensive evaluations clearly show that our proposed method can detect curbs with strong robustness at real-time speed for both static and dynamic scenes.

  15. Major Mergers in CANDELS up to z=3: Calibrating the Close-Pair Method Using Semi-Analytic Models and Baryonic Mass Ratio Estimates

    Science.gov (United States)

    Mantha, Kameswara; McIntosh, Daniel H.; Conselice, Christopher; Cook, Joshua S.; Croton, Darren J.; Dekel, Avishai; Ferguson, Henry C.; Hathi, Nimish; Kodra, Dritan; Koo, David C.; Lotz, Jennifer M.; Newman, Jeffrey A.; Popping, Gergo; Rafelski, Marc; Rodriguez-Gomez, Vicente; Simmons, Brooke D.; Somerville, Rachel; Straughn, Amber N.; Snyder, Gregory; Wuyts, Stijn; Yu, Lu; Cosmic Assembly Near-Infrared Deep Extragalactic Legacy Survey (CANDELS) Team

    2018-01-01

    Cosmological simulations predict that the rate of merging between similar-mass massive galaxies should increase towards early cosmic time. We study the incidence of major (stellar mass ratio SMR < 4) close pairs among massive (log M* > 10.3) galaxies spanning 0 < z < 3 and find that commonly used pair-to-merger-rate conversions imply a merger rate evolution at z > 1.5 in strong disagreement with theoretical merger rate predictions. On the other hand, if we compare to a simulation-tuned, evolving timescale prescription from Snyder et al. (2017), we find that the merger rate evolution agrees with theory out to z=3. These results highlight the need for robust calibrations of the complex and presumably redshift-dependent pair-to-merger-rate conversion factors to improve constraints on the empirical merger history. To address this, we use a unique compilation of mock datasets produced by three independent state-of-the-art Semi-Analytic Models (SAMs). We present preliminary calibrations of the close-pair observability timescale and outlier fraction as a function of redshift, stellar mass, mass ratio, and local over-density. Furthermore, to verify the hypothesis of previous empirical studies that SMR selection of major pairs may be biased, we present a new analysis of the baryonic (gas+stars) mass ratios of a subset of close pairs in our sample. For the first time, our preliminary analysis highlights that a noticeable fraction of SMR-selected minor pairs (SMR > 4) have major baryonic-mass ratios (BMR < 4), which indicates that merger rates based on SMR selection may be underestimated.

  16. Systems Biology and Ratio-Based, Real-Time Disease Surveillance.

    Science.gov (United States)

    Fair, J M; Rivas, A L

    2015-08-01

    Most infectious disease surveillance methods are not well suited to early detection. To address this limitation, here we evaluated a ratio- and Systems Biology-based method that does not require prior knowledge of the identity of an infective agent. A reference group of birds experimentally infected with West Nile virus (WNV) and a problem group of unknown health status (except that the birds were WNV-negative and displayed inflammation) were followed over 22 days and tested with a system that analyses blood leucocyte ratios. To test the ability of the method to discriminate small data sets, both the reference group (n = 5) and the problem group (n = 4) were small. The questions of interest were as follows: (i) whether individuals presenting inflammation (disease-positive or D+) can be distinguished from non-inflamed (disease-negative or D-) birds, (ii) whether two or more D+ stages can be detected, and (iii) whether sample size influences detection. Within the problem group, the ratio-based method distinguished: (i) three (one D- and two D+) data classes; (ii) two (early and late) inflammatory stages; (iii) fast versus regular or slow responders; and (iv) individuals that recovered from those that remained inflamed. Because ratios differed by larger magnitudes (up to 48 times larger) than percentages, it is suggested that data patterns are more likely to be recognized when disease surveillance methods are designed to measure inflammation and utilize ratios. Published 2013. This article is a U.S. Government work and is in the public domain in the USA.
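
    The claim that ratios change in larger magnitudes than percentages can be illustrated with synthetic counts. Heterophils and lymphocytes are common avian leucocyte types; the numbers below are invented for illustration, not data from the study.

```python
# Synthetic avian leucocyte counts (invented numbers, not study data):
# a response that doubles lymphocytes while halving heterophils.
baseline = {"heterophils": 4000, "lymphocytes": 2000}
inflamed = {"heterophils": 2000, "lymphocytes": 4000}

def pct_and_ratio(counts):
    """Return the lymphocyte percentage and the lymphocyte/heterophil ratio."""
    total = counts["heterophils"] + counts["lymphocytes"]
    pct = 100.0 * counts["lymphocytes"] / total
    ratio = counts["lymphocytes"] / counts["heterophils"]
    return pct, ratio

p0, r0 = pct_and_ratio(baseline)   # 33.3 %, ratio 0.5
p1, r1 = pct_and_ratio(inflamed)   # 66.7 %, ratio 2.0
# The percentage doubles, while the ratio changes four-fold
print(round(p1 / p0, 2), round(r1 / r0, 2))
```

    Because the numerator rises while the denominator falls, a ratio compounds both changes, which is why ratio-based surveillance can amplify an inflammatory signal that a percentage partly masks.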

  17. The Role of SPINK1 in ETS Rearrangement Negative Prostate Cancers

    Science.gov (United States)

    Tomlins, Scott A.; Rhodes, Daniel R.; Yu, Jianjun; Varambally, Sooryanarayana; Mehra, Rohit; Perner, Sven; Demichelis, Francesca; Helgeson, Beth E.; Laxman, Bharathi; Morris, David S.; Cao, Qi; Cao, Xuhong; Andrén, Ove; Fall, Katja; Johnson, Laura; Wei, John T.; Shah, Rajal B.; Al-Ahmadie, Hikmat; Eastham, James A.; Eggener, Scott E.; Fine, Samson W.; Hotakainen, Kristina; Stenman, Ulf-Håkan; Tsodikov, Alex; Gerald, William L.; Lilja, Hans; Reuter, Victor E.; Kantoff, Phillip W.; Scardino, Peter T.; Rubin, Mark A.; Bjartell, Anders S.; Chinnaiyan, Arul M.

    2009-01-01

    ETS gene fusions have been characterized in a majority of prostate cancers; however, the key molecular alterations in ETS-negative cancers are unclear. Here we used an outlier meta-analysis (meta-COPA) to identify SPINK1 outlier-expression exclusively in a subset of ETS rearrangement-negative cancers (~10% of total cases). We validated the mutual exclusivity of SPINK1 expression and ETS fusion status, demonstrated that SPINK1 outlier-expression can be detected non-invasively in urine, and observed that SPINK1 outlier-expression is an independent predictor of biochemical recurrence after resection. We identified the aggressive 22RV1 cell line as a SPINK1 outlier-expression model, and demonstrate that SPINK1 knockdown in 22RV1 attenuates invasion, suggesting a functional role in ETS rearrangement-negative prostate cancers. PMID:18538735

  18. Temporal interpolation alters motion in fMRI scans: Magnitudes and consequences for artifact detection.

    Directory of Open Access Journals (Sweden)

    Jonathan D Power

    Head motion can be estimated at any point of fMRI image processing. Processing steps involving temporal interpolation (e.g., slice time correction or outlier replacement) often precede motion estimation in the literature. From first principles it can be anticipated that temporal interpolation will alter head motion estimates in a scan. Here we demonstrate this effect and its consequences in five large fMRI datasets. Estimated head motion was reduced by 10-50% or more following temporal interpolation, and the reductions were often visible to the naked eye. Such reductions make the data seem to be of improved quality. They also degrade the sensitivity of analyses aimed at detecting motion-related artifact and can cause a dataset with artifact to falsely appear artifact-free. These reduced motion estimates will be particularly problematic for studies needing estimates of motion in time, such as studies of dynamics. Based on these findings, it is sensible to obtain motion estimates prior to any image processing (regardless of subsequent processing steps and the actual timing of motion correction procedures, which need not be changed). We also find that outlier replacement procedures change signals almost entirely during times of motion and therefore have notable similarities to motion-targeting censoring strategies (which withhold or replace signals entirely during times of motion).

  19. Calculation of the Cardiothoracic Ratio from Portable Anteroposterior Chest Radiography

    Science.gov (United States)

    Chon, Sung Bin; Oh, Won Sup; Cho, Jun Hwi; Kim, Sam Soo

    2011-01-01

    Cardiothoracic ratio (CTR), the ratio of cardiac diameter (CD) to thoracic diameter (TD), is a useful screening method to detect cardiomegaly, but is reliable only on posteroanterior chest radiography (chest PA). We performed this cross-sectional 3-phase study to establish a reliable CTR from anteroposterior chest radiography (chest AP). First, CD(chest PA)/CD(chest AP) ratios were determined at different radiation distances by manipulating chest computed tomography to simulate chest PA and AP. CD(chest PA) was inferred by multiplying CD(chest AP) by this ratio. Incorporating this CD and substituting the most recent TD(chest PA), we calculated the 'corrected' CTR and compared it with the conventional one in patients who underwent both chest radiographies. Finally, its validity was investigated among critically ill patients who underwent portable chest AP. The CD(chest PA)/CD(chest AP) ratio was {0.00099 × (radiation distance [cm])} + 0.79 (n = 61, r = 1.00, P < 0.001), and the corrected CTR was validated in patients who took chest AP with an available previous chest PA. This might help physicians detect congestive cardiomegaly for patients undergoing portable chest AP. PMID:22065900
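
    Using the regression reported in the abstract, the corrected CTR can be computed as follows; the input values in the example are illustrative, not taken from the study.

```python
def corrected_ctr(cd_ap_cm, td_pa_cm, distance_cm):
    """Corrected cardiothoracic ratio from a portable AP radiograph.

    Applies the study's regression CD_PA/CD_AP = 0.00099 * distance(cm) + 0.79
    to infer a PA-equivalent cardiac diameter, then divides by the most
    recent PA thoracic diameter.
    """
    ratio = 0.00099 * distance_cm + 0.79
    return (cd_ap_cm * ratio) / td_pa_cm

# Illustrative inputs (not from the paper): CD_AP = 15 cm measured at a
# 100 cm radiation distance, previous PA thoracic diameter 30 cm.
print(corrected_ctr(15.0, 30.0, 100.0))
```

    The correction shrinks the AP cardiac diameter (here by a factor of about 0.89) to undo the magnification inherent in short-distance AP projections before the usual CTR threshold is applied.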

  20. Detection of Adulterated Vegetable Oils Containing Waste Cooking Oils Based on the Contents and Ratios of Cholesterol, β-Sitosterol, and Campesterol by Gas Chromatography/Mass Spectrometry.

    Science.gov (United States)

    Zhao, Haixiang; Wang, Yongli; Xu, Xiuli; Ren, Heling; Li, Li; Xiang, Li; Zhong, Weike

    2015-01-01

    A simple and accurate authentication method for the detection of adulterated vegetable oils containing waste cooking oil (WCO) was developed. The method is based on the determination of cholesterol, β-sitosterol, and campesterol in vegetable oils and WCO by GC/MS without any derivatization. A total of 148 samples covering 12 types of vegetable oil and WCO were analyzed. According to the results, the contents and ratios of cholesterol, β-sitosterol, and campesterol serve as criteria for detecting vegetable oils adulterated with WCO. The method could accurately detect adulterated vegetable oils containing 5% refined WCO, and has been successfully applied in a multilaboratory analysis of 81 oil samples: 75 samples were analyzed correctly, and only six adulterated samples could not be detected. The method cannot yet detect vegetable oils adulterated with WCO that had been used only for frying non-animal foods. It provides a quick way to screen edible vegetable oils for WCO adulteration.

  1. Evaluation of compression ratio using JPEG 2000 on diagnostic images in dentistry

    International Nuclear Information System (INIS)

    Jung, Gi Hun; Han, Won Jeong; Yoo, Dong Soo; Kim, Eun Kyung; Choi, Soon Chul

    2005-01-01

    To find the proper compression ratios that do not degrade image quality or affect lesion detectability in diagnostic images used in dentistry compressed with the JPEG 2000 algorithm, sixty Digora periapical images, sixty panoramic computed radiographic (CR) images, sixty computed tomography (CT) images, and sixty magnetic resonance (MR) images were compressed into JPEG 2000 at 10 ratios from 5:1 to 50:1. To evaluate lesion detectability, the images were graded on 5 levels (1: definitely absent; 2: probably absent; 3: equivocal; 4: probably present; 5: definitely present), and receiver operating characteristic analysis was performed using the original image as the gold standard. To evaluate image quality subjectively, the images were graded on 5 levels (1: definitely unacceptable; 2: probably unacceptable; 3: equivocal; 4: probably acceptable; 5: definitely acceptable), and a paired t-test was performed. In Digora, CR panoramic, and CT images, compressed images up to 15:1 showed nearly the same lesion detectability as the originals; in MR images, up to 25:1. In Digora and CR panoramic images, compressed images up to 5:1 showed little difference from the originals in subjective image quality; in CT images, up to 10:1, and in MR images, up to 15:1. We consider compression ratios up to 5:1 in Digora and CR panoramic images, up to 10:1 in CT images, and up to 15:1 in MR images as clinically applicable.

  2. Measurement of fluorescent probes concentration ratio in the cerebrospinal fluid for early detection of Alzheimer's disease

    Science.gov (United States)

    Harbater, Osnat; Gannot, Israel

    2014-03-01

    The pathogenic process of Alzheimer's Disease (AD), characterized by amyloid plaques and neurofibrillary tangles in the brain, begins years before clinical diagnosis. Here, we suggest a novel method which may detect AD up to nine years earlier than current exams and is minimally invasive, with minimal risk, pain, and side effects. The method builds on previous reports relating the concentrations of biomarkers in the Cerebrospinal Fluid (CSF) (Aβ and Tau proteins) to the future development of AD in mild cognitive impairment patients. Our method, which uses fluorescence measurements of the relative concentrations of the CSF biomarkers, replaces the lumbar puncture required for CSF drawing. The process uses a miniature needle coupled through an optical fiber to a laser source and a detector. The laser radiation excites fluorescent probes that were previously injected and bind to the CSF biomarkers. Using the ratio between the fluorescence intensities emitted from the two biomarkers, which is correlated with their concentration ratio, the patient's risk of developing AD is estimated. A theoretical model was developed and validated using Monte Carlo simulations, demonstrating the relation between fluorescence emission and biomarker concentration. The method was tested using multi-layered tissue phantoms simulating the epidural fat, the CSF in the sub-arachnoid space, and the bone. These phantoms were prepared with different scattering and absorption coefficients, thicknesses, and fluorescence concentrations in order to simulate variations in human anatomy and in the needle location. The theoretical and in-vitro results are compared and the method's accuracy is discussed.

  3. Compact point-detection fluorescence spectroscopy system for quantifying intrinsic fluorescence redox ratio in brain cancer diagnostics

    Science.gov (United States)

    Liu, Quan; Grant, Gerald; Li, Jianjun; Zhang, Yan; Hu, Fangyao; Li, Shuqin; Wilson, Christy; Chen, Kui; Bigner, Darell; Vo-Dinh, Tuan

    2011-03-01

    We report the development of a compact point-detection fluorescence spectroscopy system and two data analysis methods to quantify the intrinsic fluorescence redox ratio and diagnose brain cancer in an orthotopic brain tumor rat model. Our system employs one compact cw diode laser (407 nm) to excite two primary endogenous fluorophores, reduced nicotinamide adenine dinucleotide and flavin adenine dinucleotide. The spectra were first analyzed using a previously developed spectral filtering modulation method to derive the intrinsic fluorescence redox ratio, which has the advantages of insensitivity to optical coupling and rapid data acquisition and analysis. This method represents a convenient and rapid alternative for achieving intrinsic fluorescence-based redox measurements compared to complicated model-based methods. It is worth noting that the method can also extract total hemoglobin concentration at the same time, but only if the emission path length of fluorescence light, which depends on the illumination and collection geometry of the optical probe, is long enough that the effect of hemoglobin absorption on fluorescence intensity is significant. Then a multivariate method was used to statistically classify normal tissues and tumors. While the first method offers quantitative tissue metabolism information, the second provides high overall classification accuracy. The two methods provide complementary capabilities for understanding cancer development and noninvasively diagnosing brain cancer. The results of our study suggest that this portable system could be used to demarcate the elusive boundary between a brain tumor and the surrounding normal tissue during surgical resection.

  4. Statistical Techniques For Real-time Anomaly Detection Using Spark Over Multi-source VMware Performance Data

    Energy Technology Data Exchange (ETDEWEB)

    Solaimani, Mohiuddin [Univ. of Texas-Dallas, Richardson, TX (United States); Iftekhar, Mohammed [Univ. of Texas-Dallas, Richardson, TX (United States); Khan, Latifur [Univ. of Texas-Dallas, Richardson, TX (United States); Thuraisingham, Bhavani [Univ. of Texas-Dallas, Richardson, TX (United States); Ingram, Joey Burton [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    Anomaly detection refers to the identification of an irregular or unusual pattern that deviates from what is standard, normal, or expected. Such deviating patterns typically correspond to samples of interest and are assigned different labels in different domains, such as outliers, anomalies, exceptions, or malware. Detecting anomalies in fast, voluminous streams of data is a formidable challenge. This paper presents a novel, generic, real-time distributed anomaly detection framework for heterogeneous streaming data where anomalies appear as a group. We have developed a distributed statistical approach to build a model and later use it to detect anomalies. As a case study, we investigate group anomaly detection for a VMware-based cloud data center, which maintains a large number of virtual machines (VMs). We have built our framework using Apache Spark to get higher throughput and lower data processing time on streaming data. We have developed a window-based statistical anomaly detection technique to detect anomalies that appear sporadically. We then relaxed this constraint with higher accuracy by implementing a cluster-based technique to detect sporadic and continuous anomalies. We conclude that our cluster-based technique outperforms other statistical techniques with higher accuracy and lower processing time.
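
    As an illustration of a window-based statistical anomaly detector of the kind described (not the authors' Spark implementation), the following numpy sketch flags points whose z-score against a trailing window exceeds a threshold; the window size, threshold, and metric series are assumptions for illustration.

```python
import numpy as np

def window_anomalies(stream, window=50, z_thresh=3.0):
    """Flag indices whose z-score against the trailing window's mean/std
    exceeds z_thresh. A simple stand-in for a window-based statistical
    technique; window and z_thresh are tuning assumptions."""
    stream = np.asarray(stream, dtype=float)
    flags = []
    for i in range(window, stream.size):
        ref = stream[i - window:i]
        mu, sd = ref.mean(), ref.std(ddof=1)
        if sd > 0 and abs(stream[i] - mu) / sd > z_thresh:
            flags.append(i)
    return flags

rng = np.random.default_rng(2)
data = rng.normal(50.0, 1.0, 300)   # e.g. a steady utilization metric
data[200] = 70.0                    # injected spike (synthetic anomaly)
print(200 in window_anomalies(data))
```

    A distributed version would compute the per-window statistics on stream partitions (e.g. per VM) and aggregate them, which is essentially what a micro-batch framework such as Spark Streaming enables.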

  5. Non-rigid Point Matching using Topology Preserving Constraints for Medical Computer Vision

    Directory of Open Access Journals (Sweden)

    Jong-Ha Lee

    2017-01-01

    This work presents a novel algorithm for finding correspondences using relaxation labeling. In the variance experiments, the variance of all algorithms except the proposed one is large; the largest variance of the proposed algorithm is +0.01, in the 0.08-deformation test on a character dataset. Overall, the proposed algorithm outperforms the other algorithms. Among those, the matching-with-neighborhood algorithm performs best, except on the outlier-to-data-ratio test on the character dataset, where the proposed algorithm again performs best.

  6. The effectiveness of robust RMCD control chart as outliers’ detector

    Science.gov (United States)

    Darmanto; Astutik, Suci

    2017-12-01

    A well-known control chart for monitoring a multivariate process is Hotelling's T2, whose parameters are classically estimated; it is very sensitive to, and marred by, the masking and swamping effects of outlying data. To overcome this, robust estimators are strongly recommended. One such robust estimator is the re-weighted minimum covariance determinant (RMCD), which has the same robustness characteristics as the MCD. In this paper, effectiveness means the accuracy of the RMCD control chart in detecting outliers as real outliers; in other words, how effectively the control chart can identify and remove the masking and swamping effects of outliers. We assessed the effectiveness of the robust control chart by simulation under different scenarios: sample size n, proportion of outliers, and number p of quality characteristics. We found that in some scenarios this RMCD robust control chart works effectively.
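
    A didactic numpy-only sketch of an MCD-style estimator with a reweighting step, in the spirit of RMCD: random starts, C-steps toward the minimum-determinant half-sample, then reweighting by a chi-square cutoff. The subset count, number of C-steps, and the omitted small-sample consistency corrections are simplifications, not the paper's procedure.

```python
import numpy as np

CHI2_975_DF2 = 7.3778   # chi-square 0.975 quantile for 2 d.f. (hard-coded)

def _mahal2(X, mu, S):
    """Squared Mahalanobis distance of each row of X from (mu, S)."""
    d = X - mu
    return np.einsum('ij,jk,ik->i', d, np.linalg.inv(S), d)

def rmcd_distances(X, h=None, n_starts=20, seed=0):
    """Didactic MCD-with-reweighting sketch (no consistency corrections)."""
    n, p = X.shape
    h = h or (n + p + 1) // 2
    rng = np.random.default_rng(seed)
    best_det, best_d2 = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, h, replace=False)
        for _ in range(10):                       # C-steps: keep h closest points
            mu = X[idx].mean(axis=0)
            S = np.cov(X[idx], rowvar=False)
            idx = np.argsort(_mahal2(X, mu, S))[:h]
        mu = X[idx].mean(axis=0)
        S = np.cov(X[idx], rowvar=False)
        det = np.linalg.det(S)
        if det < best_det:
            best_det, best_d2 = det, _mahal2(X, mu, S)
    w = best_d2 < CHI2_975_DF2                    # reweighting (the "R" in RMCD)
    mu = X[w].mean(axis=0)
    S = np.cov(X[w], rowvar=False)
    return _mahal2(X, mu, S)

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 2))
X[:8] += 6.0                                      # planted outliers (synthetic)
d2 = rmcd_distances(X)
print(set(np.where(d2 > CHI2_975_DF2)[0]) >= set(range(8)))
```

    Because the half-sample estimates are driven by the clean majority, the planted outliers keep large robust distances instead of masking one another, which is exactly the failure mode of the classical estimates that the RMCD chart avoids.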

  7. Detecting Solar-like Oscillations in Red Giants with Deep Learning

    Science.gov (United States)

    Hon, Marc; Stello, Dennis; Zinn, Joel C.

    2018-05-01

    Time-resolved photometry of tens of thousands of red giant stars from space missions like Kepler and K2 has created the need for automated asteroseismic analysis methods. The first and most fundamental step in such analysis is to identify which stars show oscillations. It is critical that this step be performed with no, or little, detection bias, particularly when performing subsequent ensemble analyses that aim to compare the properties of observed stellar populations with those from galactic models. However, an efficient, automated solution to this initial detection step still has not been found, meaning that expert visual inspection of data from each star is required to obtain the highest level of detections. Hence, to mimic how an expert eye analyzes the data, we use supervised deep learning to not only detect oscillations in red giants, but also to predict the location of the frequency at maximum power, ν_max, by observing features in 2D images of power spectra. By training on Kepler data, we benchmark our deep-learning classifier against K2 data that are given detections by the expert eye, achieving a detection accuracy of 98% on K2 Campaign 6 stars and a detection accuracy of 99% on K2 Campaign 3 stars. We further find that the estimated uncertainty of our deep-learning-based ν_max predictions is about 5%. This is comparable to human-level performance using visual inspection. When examining outliers, we find that the deep-learning results are more likely to provide robust ν_max estimates than the classical model-fitting method.

  8. Replica Node Detection Using Enhanced Single Hop Detection with Clonal Selection Algorithm in Mobile Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    L. S. Sindhuja

    2016-01-01

    Security of Mobile Wireless Sensor Networks is a vital challenge, as the sensor nodes are deployed in unattended environments and are prone to various attacks. One of them is the node replication attack, in which the adversary captures physically insecure nodes, clones them with the same identity as the captured node, and deploys an unpredictable number of replicas throughout the network. Hence replica node detection is an important challenge in Mobile Wireless Sensor Networks. Various replica node detection techniques have been proposed, but these methods incur control overheads and their detection accuracy is low when a replica is selected as a witness node. This paper proposes to solve these issues by enhancing the Single Hop Detection (SHD) method using the Clonal Selection algorithm to detect clones by selecting appropriate witness nodes. The advantages of the proposed method include (i) an increase in the detection ratio, (ii) a decrease in the control overhead, and (iii) an increase in throughput. The performance of the proposed work is measured using detection ratio, false detection ratio, packet delivery ratio, average delay, control overheads, and throughput. The implementation is done in ns-2 to demonstrate the practicality of the proposed work.

  9. The isotope correlation experiment

    International Nuclear Information System (INIS)

    Koch, L.; Schoof, S.

    1983-01-01

    The ESARDA working group on Isotopic Correlation Techniques (ICT) and Reprocessing Input Analysis performed an Isotope Correlation Experiment (ICE) with the aim of checking the feasibility of the new technique. Ten input batches from the reprocessing of KWO fuel at the WAK plant were analysed by 4 laboratories. All the information needed to compare ICT with the gravimetric and volumetric methods was available; ICT combined with simplified reactor physics calculation was included. The main objectives of the statistical data evaluation were the detection of outliers and the estimation of random and systematic errors of the measurements performed by the 4 laboratories. Different methods for outlier detection, analysis of variances, Grubbs' analysis for the constant-bias model, and Jaech's non-constant-bias model were applied. Some of the results of the statistical analysis may seem inconsistent, for the following reasons: isotope abundance data (weight percent) as well as nuclear concentration data (atoms/initial metal atoms) were subjected to different outlier criteria before being used for further statistical evaluations, and none of the four data evaluation groups performed a complete statistical data analysis that would have made a comparison of the different methods possible, since no commonly agreed statistical evaluation procedure existed. The results prove that ICT is as accurate as conventional techniques, which have to rely on costly mass spectrometric isotope dilution analysis. The potential of outlier detection by ICT on the basis of results from a single laboratory is as good as outlier detection by costly interlaboratory comparison. The application of fission product or Cm-244 correlations would be more timely than remeasurements at safeguards laboratories.
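
    Grubbs' test, one of the outlier-detection methods applied in ICE, can be sketched as follows; the measurement values below are invented for illustration, not ICE data.

```python
import numpy as np
from scipy import stats

def grubbs_outlier(x, alpha=0.05):
    """One round of the two-sided Grubbs test for a single outlier.

    Returns the index of the most extreme value if it is significant
    at level alpha, else None. Critical value from the t-distribution,
    as in the standard formulation of the test.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = np.abs(x - x.mean())
    i = int(np.argmax(dev))
    g = dev[i] / x.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
    return i if g > g_crit else None

# Invented isotope-ratio-like measurements with one aberrant value
x = [0.0712, 0.0714, 0.0713, 0.0715, 0.0711, 0.0790, 0.0713, 0.0714]
print(grubbs_outlier(x))  # -> 5, the index of the aberrant measurement
```

    Repeated application after removing each detected value gives the usual iterative screening; the constant-bias and non-constant-bias models mentioned above additionally attribute systematic offsets to individual laboratories, which a single-sample test like this cannot do.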

  10. Micro- and macro-geographic scale effect on the molecular imprint of selection and adaptation in Norway spruce.

    Directory of Open Access Journals (Sweden)

    Marta Scalfi

    Forest tree species of temperate and boreal regions have undergone a long history of demographic changes and evolutionary adaptations. The main objective of this study was to detect signals of selection in Norway spruce (Picea abies [L.] Karst) at different sampling scales and to investigate, accounting for population structure, the effect of environment on species genetic diversity. A total of 384 single nucleotide polymorphisms (SNPs) representing 290 genes were genotyped at two geographic scales: across 12 populations distributed along two altitudinal transects in the Alps (micro-geographic scale), and across 27 populations belonging to the range of Norway spruce in central and south-east Europe (macro-geographic scale). At the macro-geographic scale, principal component analysis combined with Bayesian clustering revealed three major clusters, corresponding to the main areas of southern spruce occurrence, i.e. the Alps, Carpathians, and Hercynia. The populations along the altitudinal transects were not differentiated. To assess the role of selection in structuring genetic variation, we applied a Bayesian and coalescent-based F(ST)-outlier method and tested for correlations between allele frequencies and climatic variables using regression analyses. At the macro-geographic scale, the F(ST)-outlier methods detected a total of 11 F(ST)-outliers. Six outliers were detected when the same analyses were carried out taking into account the genetic structure. Regression analyses with population structure correction resulted in the identification of two (micro-geographic scale) and 38 (macro-geographic scale) SNPs significantly correlated with temperature and/or precipitation. Six of these loci overlapped with F(ST)-outliers, among them two loci encoding an enzyme involved in riboflavin biosynthesis and a sucrose synthase. The results of this study indicate a strong relationship between genetic and environmental variation at both geographic scales.

  11. Micro- and macro-geographic scale effect on the molecular imprint of selection and adaptation in Norway spruce.

    Science.gov (United States)

    Scalfi, Marta; Mosca, Elena; Di Pierro, Erica Adele; Troggio, Michela; Vendramin, Giovanni Giuseppe; Sperisen, Christoph; La Porta, Nicola; Neale, David B

    2014-01-01

    Forest tree species of temperate and boreal regions have undergone a long history of demographic changes and evolutionary adaptations. The main objective of this study was to detect signals of selection in Norway spruce (Picea abies [L.] Karst), at different sampling-scales and to investigate, accounting for population structure, the effect of environment on species genetic diversity. A total of 384 single nucleotide polymorphisms (SNPs) representing 290 genes were genotyped at two geographic scales: across 12 populations distributed along two altitudinal-transects in the Alps (micro-geographic scale), and across 27 populations belonging to the range of Norway spruce in central and south-east Europe (macro-geographic scale). At the macrogeographic scale, principal component analysis combined with Bayesian clustering revealed three major clusters, corresponding to the main areas of southern spruce occurrence, i.e. the Alps, Carpathians, and Hercynia. The populations along the altitudinal transects were not differentiated. To assess the role of selection in structuring genetic variation, we applied a Bayesian and coalescent-based F(ST)-outlier method and tested for correlations between allele frequencies and climatic variables using regression analyses. At the macro-geographic scale, the F(ST)-outlier methods detected together 11 F(ST)-outliers. Six outliers were detected when the same analyses were carried out taking into account the genetic structure. Regression analyses with population structure correction resulted in the identification of two (micro-geographic scale) and 38 SNPs (macro-geographic scale) significantly correlated with temperature and/or precipitation. Six of these loci overlapped with F(ST)-outliers, among them two loci encoding an enzyme involved in riboflavin biosynthesis and a sucrose synthase. The results of this study indicate a strong relationship between genetic and environmental variation at both geographic scales. 
It also suggests that an

  12. Principal components in the discrimination of outliers: A study in simulation sample data corrected by Pearson's and Yates's chi-square distance

    Directory of Open Access Journals (Sweden)

    Manoel Vitor de Souza Veloso

    2016-04-01

    Full Text Available The current study employs Monte Carlo simulation to build a significance test that indicates the principal components which best discriminate outliers. Samples of different sizes were generated from multivariate normal distributions with different numbers of variables and correlation structures. Corrections by Pearson's and Yates's chi-square distance were applied for each sample size. Pearson's correction showed the best performance. As the number of variables increased, significance probabilities in favor of hypothesis H0 were reduced. To illustrate the proposed method, it was applied to a multivariate time series of sales volume rates in the state of Minas Gerais, obtained from different market segments.
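
    The idea of ranking observations by their principal-component scores can be sketched as follows. This is a hedged illustration, not the authors' Monte Carlo significance test: it simply scores each observation by its standardized squared distance in the leading components (the function name and threshold-free ranking are my own choices).

```python
import numpy as np

def pc_outlier_scores(X, n_components=2):
    """Standardized squared distance of each observation in the space of
    the leading principal components (large values suggest outliers)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA via SVD
    scores = Xc @ Vt[:n_components].T                  # component scores
    var = s[:n_components] ** 2 / (len(X) - 1)         # component variances
    return (scores ** 2 / var).sum(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[0] += 8.0                       # plant one gross outlier
d2 = pc_outlier_scores(X)
print(int(np.argmax(d2)))         # the planted point gets the top score
```

    In a full test along the lines of the article, the observed scores would be compared against quantiles simulated from the fitted multivariate normal model rather than simply ranked.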

  13. Detection of Anomalies in Hydrometric Data Using Artificial Intelligence Techniques

    Science.gov (United States)

    Lauzon, N.; Lence, B. J.

    2002-12-01

    This work focuses on the detection of anomalies in hydrometric data sequences, such as 1) outliers, which are individual data having statistical properties that differ from those of the overall population; 2) shifts, which are sudden changes over time in the statistical properties of the historical records of data; and 3) trends, which are systematic changes over time in the statistical properties. For the purpose of the design and management of water resources systems, it is important to be aware of these anomalies in hydrometric data, for they can induce a bias in the estimation of water quantity and quality parameters. These anomalies may be viewed as specific patterns affecting the data, and therefore pattern recognition techniques can be used for identifying them. However, the number of possible patterns is very large for each type of anomaly and consequently large computing capacities are required to account for all possibilities using the standard statistical techniques, such as cluster analysis. Artificial intelligence techniques, such as the Kohonen neural network and fuzzy c-means, are clustering techniques commonly used for pattern recognition in several areas of engineering and have recently begun to be used for the analysis of natural systems. They require much less computing capacity than the standard statistical techniques, and therefore are well suited for the identification of outliers, shifts and trends in hydrometric data. This work constitutes a preliminary study, using synthetic data representing hydrometric data that can be found in Canada. The analysis of the results obtained shows that the Kohonen neural network and fuzzy c-means are reasonably successful in identifying anomalies. This work also addresses the problem of uncertainties inherent to the calibration procedures that fit the clusters to the possible patterns for both the Kohonen neural network and fuzzy c-means. Indeed, for the same database, different sets of clusters can be
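
    A minimal sketch of how a clustering technique such as fuzzy c-means can flag outliers in a data sequence. This is a toy implementation of mine, not the calibrated networks discussed above: points lying far from every cluster centre are reported as anomalies, and all data below are synthetic.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns cluster centres and memberships."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        w = np.maximum(d, 1e-12) ** (-2.0 / (m - 1))
        u = w / w.sum(axis=1, keepdims=True)
    return centers, u

def flag_outliers(X, centers, k=3.0):
    """Flag points whose distance to the nearest centre is extreme."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    return d > d.mean() + k * d.std()

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),     # regular regime A
               rng.normal(10, 0.5, (50, 2)),    # regular regime B
               [[30.0, -20.0]]])                # one stray observation
centers, _ = fuzzy_cmeans(X, c=2)
flags = flag_outliers(X, centers)
print(int(flags.sum()), bool(flags[-1]))        # only the stray point
```

    Detecting shifts and trends, as in the study, would additionally require clustering windows of the series rather than individual points.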

  14. STEM - software test and evaluation methods: fault detection using static analysis techniques

    International Nuclear Information System (INIS)

    Bishop, P.G.; Esp, D.G.

    1988-08-01

    STEM is a software reliability project with the objective of evaluating a number of fault detection and fault estimation methods which can be applied to high integrity software. This Report gives some interim results of applying both manual and computer-based static analysis techniques, in particular SPADE, to an early CERL version of the PODS software containing known faults. The main results of this study are that: The scope for thorough verification is determined by the quality of the design documentation; documentation defects become especially apparent when verification is attempted. For well-defined software, the thoroughness of SPADE-assisted verification for detecting a large class of faults was successfully demonstrated. For imprecisely-defined software (not recommended for high-integrity systems) the use of tools such as SPADE is difficult and inappropriate. Analysis and verification tools are helpful, through their reliability and thoroughness. However, they are designed to assist, not replace, a human in validating software. Manual inspection can still reveal errors (such as errors in specification and errors of transcription of systems constants) which current tools cannot detect. There is a need for tools to automatically detect typographical errors in system constants, for example by reporting outliers to patterns. To obtain the maximum benefit from advanced tools, they should be applied during software development (when verification problems can be detected and corrected) rather than retrospectively. (author)

  15. Early Automatic Detection of Parkinson's Disease Based on Sleep Recordings

    DEFF Research Database (Denmark)

    Kempfner, Jacob; Sorensen, Helge B D; Nikolic, Miki

    2014-01-01

    SUMMARY: Idiopathic rapid-eye-movement (REM) sleep behavior disorder (iRBD) is most likely the earliest sign of Parkinson's Disease (PD) and is characterized by REM sleep without atonia (RSWA) and consequently increased muscle activity. However, some muscle twitching in normal subjects occurs...... during REM sleep. PURPOSE: There are no generally accepted methods for evaluation of this activity and a normal range has not been established. Consequently, there is a need for objective criteria. METHOD: In this study we propose a full-automatic method for detection of RSWA. REM sleep identification...... the number of outliers during REM sleep was used as a quantitative measure of muscle activity. RESULTS: The proposed method was able to automatically separate all iRBD test subjects from healthy elderly controls and subjects with periodic limb movement disorder. CONCLUSION: The proposed work is considered...

  16. Meteor localization via statistical analysis of spatially temporal fluctuations in image sequences

    Science.gov (United States)

    Kukal, Jaromír; Klimt, Martin; Šihlík, Jan; Fliegel, Karel

    2015-09-01

    Meteor detection is one of the most important procedures in astronomical imaging. A meteor's path through Earth's atmosphere is traditionally reconstructed from a double-station video observation system generating 2D image sequences. However, atmospheric turbulence and other factors cause spatially-temporal fluctuations of the image background, which make the localization of the meteor path more difficult. Our approach is based on nonlinear preprocessing of image intensity using the Box-Cox transform, with the logarithmic transform as its particular case. The transformed image sequences are then differentiated along discrete coordinates to obtain a statistical description of the sky background fluctuations, which can be modeled by a multivariate normal distribution. After verification and hypothesis testing, we use the statistical model for outlier detection. Isolated outlier points are ignored, while a compact cluster of outliers indicates the presence of a meteoroid after ignition.
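
    The processing chain described (log transform as the Box-Cox special case, discrete differencing, and a statistical outlier test) can be sketched roughly as follows. The frame sizes, threshold and injected streak are illustrative assumptions, not the authors' calibrated model.

```python
import numpy as np

def fluctuation_outliers(frames, k=5.0):
    """Log-transform (Box-Cox with lambda = 0), difference along time,
    and flag pixels whose temporal change is statistically extreme."""
    z = np.log(frames + 1.0)      # stabilizes multiplicative fluctuations
    dz = np.diff(z, axis=0)       # temporal differences
    return np.abs(dz - dz.mean()) > k * dz.std()

rng = np.random.default_rng(3)
frames = rng.poisson(50, size=(10, 32, 32)).astype(float)  # sky background
frames[5, 10:13, 10:13] += 500.0   # a brief bright streak in frame 5
mask = fluctuation_outliers(frames)
# The streak appears as a compact cluster of outliers when frame 5
# arrives (difference index 4) and again when it vanishes (index 5).
print(int(mask[4, 10:13, 10:13].sum()), int(mask[5, 10:13, 10:13].sum()))
```

    Requiring a compact cluster of flagged pixels, as the abstract describes, suppresses the isolated single-pixel exceedances that pure thresholding lets through.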

  17. Understanding inhibitory mechanisms of lumbar spinal manipulation using H-reflex and F-wave responses: a methodological approach.

    Science.gov (United States)

    Dishman, J Donald; Weber, Kenneth A; Corbin, Roger L; Burke, Jeanmarie R

    2012-09-30

    The purpose of this research was to characterize unique neurophysiologic events following a high velocity, low amplitude (HVLA) spinal manipulation (SM) procedure. Descriptive time series analysis techniques (time plots, outlier detection and autocorrelation functions) were applied to time series of tibial nerve H-reflexes that were evoked at 10-s intervals from 100 s before until 100 s after three distinct events: an L5-S1 HVLA SM, an L5-S1 joint pre-loading procedure, or the control condition. Sixty-six subjects were randomly assigned to the three procedures, i.e., 22 time series per group. If the detection of outliers and correlograms revealed a pattern of non-randomness that was time-locked only to a single, specific event in the normalized time series, then an experimental effect would be inferred beyond the inherent variability of H-reflex responses. Tibial nerve F-wave responses were included to determine whether any new information about central nervous system function following a HVLA SM procedure could be ascertained. Time series analyses of H(max)/M(max) ratios, pre-post L5-S1 HVLA SM, substantiated the hypothesis that the specific aspects of the manipulative thrust lead to a greater attenuation of the H(max)/M(max) ratio than the non-specific aspects related to postural perturbation and joint pre-loading. The attenuation of the H(max)/M(max) ratio following the HVLA SM procedure was reliable and may hold promise as a translational tool to measure the consistency and accuracy of protocol implementation involving SM in clinical trials research. F-wave responses were not sensitive to mechanical perturbations of the lumbar spine. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Fuel failure detection device

    International Nuclear Information System (INIS)

    Katagiri, Masaki.

    1979-01-01

    Purpose: To improve the SN ratio in the detection. Constitution: An improved precipitator method is provided. Two scintillation detectors of identical function are placed, one before and one after a gas reservoir in which fission products in the cover gas are deposited onto detecting wires. The outputs from the two detectors (the output from the wire without deposited fission products and the output from the wire after deposition) are compared to eliminate background noise resulting from undecayed nuclides; a subtraction circuit is provided for this elimination. Since the background noise of the detecting wire can thus be measured and corrected on every detection, the SN ratio can be increased. (Ikeda, J.)

  19. Detecting vocal fatigue in student singers using acoustic measures of mean fundamental frequency, jitter, shimmer, and harmonics-to-noise ratio

    Science.gov (United States)

    Sisakun, Siphan

    2000-12-01

    The purpose of this study is to explore the ability of four acoustic parameters, mean fundamental frequency, jitter, shimmer, and harmonics-to-noise ratio, to detect vocal fatigue in student singers. The participants are 15 voice students, who perform two distinct tasks: a data collection task and a vocal fatiguing task. The data collection task includes sustaining the vowel /a/, reading a standard passage, and self-rating on a vocal fatigue form. The vocal fatiguing task is vocal practice of musical scores for a total of 45 minutes. The four acoustic parameters are extracted using the software EZVoicePlus. The data analyses are performed to answer eight research questions: the first four relate to correlations between the self-rating scale and each of the four parameters, and the next four relate to differences in the parameters over time, using one-factor repeated measures analysis of variance (ANOVA). The results yield a proposed acoustic profile of vocal fatigue in student singers, characterized by increased fundamental frequency, slightly decreased jitter, slightly decreased shimmer, and slightly increased harmonics-to-noise ratio. The proposed profile requires further investigation.
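
    As a rough illustration of two of these parameters: local jitter and shimmer are commonly computed as the mean absolute difference of consecutive pitch periods (or peak amplitudes) relative to their mean. This sketch is generic, not the EZVoicePlus implementation, and the sample values are invented.

```python
import numpy as np

def jitter_percent(periods):
    """Local jitter (%): mean absolute difference of consecutive pitch
    periods, relative to the mean period."""
    p = np.asarray(periods, dtype=float)
    return 100.0 * np.abs(np.diff(p)).mean() / p.mean()

def shimmer_percent(amplitudes):
    """Local shimmer (%): the same measure applied to peak amplitudes."""
    a = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.abs(np.diff(a)).mean() / a.mean()

print(round(jitter_percent([5.0, 5.1] * 5), 2))   # alternating periods
print(round(shimmer_percent([1.0] * 10), 2))      # steady amplitude
```

    Fundamental frequency is simply the reciprocal of the mean period, and harmonics-to-noise ratio requires a spectral decomposition beyond this sketch.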

  20. A positive deviance approach to early childhood obesity: cross-sectional characterization of positive outliers.

    Science.gov (United States)

    Foster, Byron Alexander; Farragher, Jill; Parker, Paige; Hale, Daniel E

    2015-06-01

    Positive deviance methodology has been applied in the developing world to address childhood malnutrition and has potential for application to childhood obesity in the United States. We hypothesized that among children at high-risk for obesity, evaluating normal weight children will enable identification of positive outlier behaviors and practices. In a community at high-risk for obesity, a cross-sectional mixed-methods analysis was done of normal weight, overweight, and obese children, classified by BMI percentile. Parents were interviewed using a semistructured format in regard to their children's general health, feeding and activity practices, and perceptions of weight. Interviews were conducted in 40 homes in the lower Rio Grande Valley in Texas with a largely Hispanic (87.5%) population. Demographics, including income, education, and food assistance use, did not vary between groups. Nearly all (93.8%) parents of normal weight children perceived their child to be lower than the median weight. Group differences were observed for reported juice and yogurt consumption. Differences in both emotional feeding behaviors and parents' internalization of reasons for healthy habits were identified as different between groups. We found subtle variations in reported feeding and activity practices by weight status among healthy children in a population at high risk for obesity. The behaviors and attitudes described were consistent with previous literature; however, the local strategies associated with a healthy weight are novel, potentially providing a basis for a specific intervention in this population.

  1. AnyOut : Anytime Outlier Detection Approach for High-dimensional Data

    DEFF Research Database (Denmark)

    Assent, Ira; Kranen, Philipp; Baldauf, Corinna

    2012-01-01

    With the increase of sensor and monitoring applications, data mining on streaming data is receiving increasing research attention. As data is continuously generated, mining algorithms need to be able to analyze the data in a one-pass fashion. In many applications the rate at which the data objects...

  2. Anomaly Detection in Smart Metering Infrastructure with the Use of Time Series Analysis

    Directory of Open Access Journals (Sweden)

    Tomasz Andrysiak

    2017-01-01

    Full Text Available The article presents solutions for anomaly detection in network traffic for critical smart metering infrastructure realized with the use of a radio sensor network. The structure of the examined smart meter network and the key security aspects which influence its correct performance as an advanced metering infrastructure (the possibility of passive and active cyberattacks) are described. An effective and quick anomaly detection method is proposed. At its initial stage, Cook's distance was used for detection and elimination of outlier observations. The data prepared in this way were used to estimate standard statistical models based on exponential smoothing, that is, Brown's, Holt's, and Winters' models. To estimate possible fluctuations in the forecasts of the implemented models, properly parameterized Bollinger Bands were used. Next, statistical relations between the estimated traffic model and its real variability were examined to detect abnormal behavior which could indicate a cyberattack attempt. An update procedure for the standard models in case of significant real network traffic fluctuations was also proposed. The choice of optimal parameter values for the statistical models was realized by forecast error minimization. The results confirmed the efficiency of the presented method and the accuracy of the choice of statistical model for the analyzed time series.
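
    A rough sketch of the detection idea, assuming Holt's linear exponential smoothing and a Bollinger-style band on the one-step forecast errors. The smoothing constants, window and band width are illustrative, not the optimized values from the article, and the traffic series is synthetic.

```python
import numpy as np

def holt_forecast(y, alpha=0.5, beta=0.3):
    """One-step-ahead forecasts from Holt's linear exponential smoothing."""
    level, trend = y[0], y[1] - y[0]
    preds = [y[0]]
    for t in range(1, len(y)):
        preds.append(level + trend)
        new_level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return np.array(preds)

def band_anomalies(y, window=20, k=3.0):
    """Flag points whose forecast error leaves a Bollinger-style band
    (rolling mean +/- k rolling std of the one-step errors)."""
    y = np.asarray(y, dtype=float)
    err = y - holt_forecast(y)
    flags = np.zeros(len(y), dtype=bool)
    for t in range(window, len(y)):
        w = err[t - window:t]
        flags[t] = abs(err[t] - w.mean()) > k * w.std()
    return flags

rng = np.random.default_rng(4)
y = 0.05 * np.arange(200) + 0.1 * rng.normal(size=200)  # smooth usage trend
y[150] += 5.0                                           # injected anomaly
flags = band_anomalies(y)
print(bool(flags[150]))  # True
```

    In the article's setting, a Cook's-distance pass would first remove outliers from the training window before the smoothing model is estimated.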

  3. Detecting unknown attacks in wireless sensor networks that contain mobile nodes.

    Science.gov (United States)

    Banković, Zorana; Fraga, David; Moya, José M; Vallejo, Juan Carlos

    2012-01-01

    As wireless sensor networks are usually deployed in unattended areas, security policies cannot be updated in a timely fashion upon identification of new attacks. This gives attackers enough time to cause significant damage. Thus, it is of great importance to provide protection from unknown attacks. However, existing solutions are mostly concentrated on known attacks. On the other hand, mobility can make the sensor network more resilient to failures, reactive to events, and able to support disparate missions with a common set of sensors, yet the problem of security becomes more complicated. In order to address the issue of security in networks with mobile nodes, we propose a machine learning solution for anomaly detection, along with a feature extraction process that tries to detect temporal and spatial inconsistencies in the sequences of sensed values and the routing paths used to forward these values to the base station. We also propose a special way to treat mobile nodes, which is the main novelty of this work. The data produced in the presence of an attacker are treated as outliers and detected using clustering techniques. These techniques are further coupled with a reputation system, in this way isolating compromised nodes in a timely fashion. The proposal exhibits good performance at detecting and confining previously unseen attacks, including cases where mobile nodes are compromised.

  4. Much of the variation in breast pathology quality assurance data in the UK can be explained by the random order in which cases arrive at individual centres, but some true outliers do exist.

    Science.gov (United States)

    Cross, Simon S; Stephenson, Timothy J; Harrison, Robert F

    2011-10-01

    To investigate the role of random temporal order of patient arrival at screening centres in the variability seen in rates of node positivity and breast cancer grade between centres in the NHS Breast Screening Programme. Computer simulations were performed of the variation in node positivity and breast cancer grade with the random temporal arrival of patients at screening centres based on national UK audit data. Cumulative mean graphs of these data were plotted. Confidence intervals for the parameters were generated, using the binomial distribution. UK audit data were plotted on these control limit graphs. The results showed that much of the variability in the audit data could be accounted for by the effects of random order of arrival of cases at the screening centres. Confidence intervals of 99.7% identified true outliers in the data. Much of the variation in breast pathology quality assurance data in the UK can be explained by the random order in which cases arrive at individual centres. Control charts with confidence intervals of 99.7% plotted against the number of reported cases are useful tools for identification of true outliers. 2011 Blackwell Publishing Limited.
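
    The control-chart idea can be sketched with a normal approximation to the binomial (the exact limits would use binomial quantiles; the rates and case counts below are made up for illustration):

```python
from statistics import NormalDist

def control_limits(p, n, coverage=0.997):
    """Approximate limits within which a centre's observed proportion
    should fall after n cases, if the true underlying rate is p."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    half = z * (p * (1.0 - p) / n) ** 0.5
    return max(0.0, p - half), min(1.0, p + half)

# Against a hypothetical 40% national node-positivity rate, an observed
# 55% looks unremarkable after 50 cases but is an outlier after 200:
print(0.55 > control_limits(0.40, 50)[1])    # False
print(0.55 > control_limits(0.40, 200)[1])   # True
```

    The widening of the limits at small n is exactly why much of the apparent between-centre variability can be explained by the random order of case arrival.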

  5. Detection of adulterated honey produced by honeybee (Apis mellifera L.) colonies fed with different levels of commercial industrial sugar (C₃ and C₄ plants) syrups by the carbon isotope ratio analysis.

    Science.gov (United States)

    Guler, Ahmet; Kocaokutgen, Hasan; Garipoglu, Ali V; Onder, Hasan; Ekinci, Deniz; Biyik, Selim

    2014-07-15

    In the present study, one hundred pure and adulterated honey samples obtained by feeding honeybee colonies with different levels (5, 20 and 100 L/colony) of various commercial sugar syrups, including High Fructose Corn Syrup 85 (HFCS-85), High Fructose Corn Syrup 55 (HFCS-55), Bee Feeding Syrup (BFS), Glucose Monohydrate Sugar (GMS) and Sucrose Sugar (SS), were evaluated in terms of the δ(13)C value of honey and its protein, the difference between the δ(13)C values of protein and honey (Δδ(13)C), and the C4% sugar ratio. Sugar type, sugar level and the sugar type*sugar level interaction were found to be statistically significant. The δ(13)C value of honey, Δδ(13)C (protein-honey), and the C4% sugar ratio were used as criteria according to the AOAC standards. However, it was possible to detect the adulteration by using the same criteria in the honeys taken from the 20 and 100 L/colony of HFCS-85 and the 100 L/colony of HFCS-55. Adulteration at a low syrup level (20 L/colony) was more easily detected when the fructose content of the HFCS syrup increased. As a result, the official methods (AOAC 978.17, 1995; AOAC 991.41, 1995; AOAC 998.12, 2005) and Internal Standard Carbon Isotope Ratio Analysis could not efficiently detect the indirect adulteration of honey obtained by feeding the bee colonies with syrups produced from C3 plants such as sugar beet (Beta vulgaris) and wheat (Triticum vulgare). For this reason, novel methods and standards that can detect the presence and the level of such indirect adulteration are strongly needed. Copyright © 2014 Elsevier Ltd. All rights reserved.
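
    The internal-standard method referred to estimates apparent C4 sugar from the protein-honey δ(13)C difference. A sketch of the usual calculation follows; the −9.7‰ mean C4 value is the conventional figure, stated here as an assumption rather than taken from the abstract, and the sample values are invented.

```python
def c4_sugar_percent(delta_protein, delta_honey, delta_c4=-9.7):
    """Apparent C4 sugar (%) from the protein (internal standard) and
    honey delta-13C values; -9.7 per mil is a conventional C4 mean."""
    return 100.0 * (delta_protein - delta_honey) / (delta_protein - delta_c4)

# Pure honey: protein and honey values agree, so the estimate is ~0%.
print(round(c4_sugar_percent(-25.0, -25.0), 2))
# A honey delta-13C shifted toward C4 values suggests added C4 syrup:
print(round(c4_sugar_percent(-25.0, -21.9), 2))
```

    AOAC practice commonly treats an apparent C4 sugar content above about 7% as indicating adulteration; the formula is blind, as the abstract stresses, to syrups from C3 plants whose δ(13)C matches that of honey.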

  6. Anomaly Detection Based on Sensor Data in Petroleum Industry Applications

    Directory of Open Access Journals (Sweden)

    Luis Martí

    2015-01-01

    Full Text Available Anomaly detection is the problem of finding patterns in data that do not conform to an a priori expected behavior. This is related to the problem in which some samples are distant, in terms of a given metric, from the rest of the dataset, where these anomalous samples are indicated as outliers. Anomaly detection has recently attracted the attention of the research community, because of its relevance in real-world applications, like intrusion detection, fraud detection, fault detection and system health monitoring, among many others. Anomalies themselves can have a positive or negative nature, depending on their context and interpretation. However, in either case, it is important for decision makers to be able to detect them in order to take appropriate actions. The petroleum industry is one of the application contexts where these problems are present. The correct detection of such types of unusual information empowers the decision maker with the capacity to act on the system in order to correctly avoid, correct or react to the situations associated with them. In that application context, heavy extraction machines for pumping and generation operations, like turbomachines, are intensively monitored by hundreds of sensors each that send measurements with a high frequency for damage prevention. In this paper, we propose a combination of yet another segmentation algorithm (YASA), a novel fast and high quality segmentation algorithm, with a one-class support vector machine approach for efficient anomaly detection in turbomachines. The proposal is meant for dealing with the aforementioned task and to cope with the lack of labeled training data. As a result, we perform a series of empirical studies comparing our approach to other methods applied to benchmark problems and a real-life application related to oil platform turbomachinery anomaly detection.

  7. Smartphone-Based Indoor Localization with Bluetooth Low Energy Beacons.

    Science.gov (United States)

    Zhuang, Yuan; Yang, Jun; Li, You; Qi, Longning; El-Sheimy, Naser

    2016-04-26

    Indoor wireless localization using Bluetooth Low Energy (BLE) beacons has attracted considerable attention after the release of the BLE protocol. In this paper, we propose an algorithm that uses the combination of channel-separate polynomial regression model (PRM), channel-separate fingerprinting (FP), outlier detection and extended Kalman filtering (EKF) for smartphone-based indoor localization with BLE beacons. The proposed algorithm uses FP and PRM to estimate the target's location and the distances between the target and BLE beacons, respectively. We compare the performance of distance estimation that uses a separate PRM for the three advertisement channels (i.e., the separate strategy) with that using an aggregate PRM generated through the combination of information from all channels (i.e., the aggregate strategy). The FP-based location estimation results of the separate strategy and the aggregate strategy are also compared. It was found that the separate strategy provides higher accuracy; thus, it is preferable to adopt PRM and FP for each BLE advertisement channel separately. Furthermore, to enhance the robustness of the algorithm, a two-level outlier detection mechanism is designed. Distance and location estimates obtained from PRM and FP are passed to the first outlier detection stage to generate improved distance estimates for the EKF. After the EKF process, a second outlier detection algorithm based on statistical testing is performed to remove remaining outliers. The proposed algorithm was evaluated by various field experiments. Results show that the proposed algorithm is 15.77% more accurate than the EKF algorithm, and with sparse deployment (1 beacon per 18 m) it is 21.41% more accurate than the EKF algorithm. Therefore, the proposed algorithm is especially useful for improving localization accuracy in environments with sparse beacon deployment.

  8. Detection of algorithmic trading

    Science.gov (United States)

    Bogoev, Dimitar; Karam, Arzé

    2017-10-01

    We develop a new approach to reflect the behavior of algorithmic traders. Specifically, we provide an analytical and tractable way to infer patterns of quote volatility and price momentum consistent with different types of strategies employed by algorithmic traders, and we propose two ratios to quantify these patterns. The quote volatility ratio is based on the rate of oscillation of the best ask and best bid quotes over an extremely short period of time, whereas the price momentum ratio is based on identifying patterns of rapid upward or downward movement in prices. The two ratios are evaluated across several asset classes. We further run a two-stage Artificial Neural Network experiment on the quote volatility ratio; the first stage is used to detect the quote volatility patterns resulting from algorithmic activity, while the second is used to validate the quality of signal detection provided by our measure.
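
    One plausible formalization of a quote-oscillation measure is sketched below; the paper's exact ratio definitions are not reproduced here. This version simply counts direction reversals among successive changes of the best quote, so that rapid flickering scores near 1.

```python
def quote_volatility_ratio(quotes):
    """Fraction of successive quote changes that reverse direction;
    rapid oscillation is one signature of algorithmic activity."""
    changes = [b - a for a, b in zip(quotes, quotes[1:]) if b != a]
    if len(changes) < 2:
        return 0.0
    reversals = sum(1 for a, b in zip(changes, changes[1:]) if a * b < 0)
    return reversals / (len(changes) - 1)

# A best ask flickering between two price levels scores 1.0:
print(quote_volatility_ratio([10.0, 10.1, 10.0, 10.1, 10.0]))
# A steadily rising best ask scores 0.0:
print(quote_volatility_ratio([10.0, 10.1, 10.2, 10.3]))
```

    A momentum counterpart would instead look for long monotone runs of changes over the same short horizon.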

  9. Unbalance detection in rotor systems with active bearings using self-sensing piezoelectric actuators

    Science.gov (United States)

    Ambur, Ramakrishnan; Rinderknecht, Stephan

    2018-03-01

    Machines developed today are highly automated due to the increased use of mechatronic systems. To ensure their reliable operation, fault detection and isolation (FDI) is an important feature, along with better control. This research work aims to achieve and integrate both these functions with a minimum number of components in a mechatronic system. This article investigates a rotating machine with active bearings equipped with piezoelectric actuators. There is an inherent coupling between their electrical and mechanical properties, because of which they can also be used as sensors. Mechanical deflection can be reconstructed from these self-sensing actuators from measured voltage and current signals. These virtual sensor signals are utilised to detect unbalance in a rotor system. The parameters of the unbalance, its magnitude and phase, are detected by a parametric estimation method in the frequency domain. The unbalance location is identified using the hypothesis of fault localization. Robustness of the estimates against outliers in the measurements is improved using a weighted least squares method. Unbalances are detected both in simulations using the system model and on a real test bench. Experiments are performed in the stationary as well as the transient case. As a further step, unbalances are estimated during simultaneous actuation of the actuators in closed loop with an adaptive algorithm for vibration minimisation. This strategy could be used in systems which aim for both fault detection and control action.
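
    A sketch of robust unbalance estimation in the frequency domain, assuming known complex influence coefficients H relating the unbalance phasor to the measured responses. The IRLS/Huber weighting here stands in for the article's weighted least squares, and all values are synthetic.

```python
import numpy as np

def estimate_unbalance(H, r, iters=10, c=1.345):
    """Robust (IRLS with Huber weights) weighted least-squares estimate
    of a complex unbalance phasor u from responses r = H*u + noise."""
    w = np.ones(len(r))
    for _ in range(iters):
        u = np.sum(w * np.conj(H) * r) / np.sum(w * np.abs(H) ** 2)
        res = np.abs(r - H * u)                       # residual magnitudes
        s = np.median(res) / 0.6745 + 1e-12           # robust scale
        w = np.minimum(1.0, c * s / np.maximum(res, 1e-12))  # Huber weights
    return u

rng = np.random.default_rng(2)
true_u = 3.0 * np.exp(1j * np.deg2rad(40))   # magnitude 3, phase 40 degrees
H = rng.normal(size=8) + 1j * rng.normal(size=8)
r = H * true_u + 0.01 * (rng.normal(size=8) + 1j * rng.normal(size=8))
r[0] += 5.0                                  # one corrupted measurement
u = estimate_unbalance(H, r)
print(round(abs(u), 2), round(np.rad2deg(np.angle(u)), 1))
```

    The downweighting of the corrupted channel is what the article's weighted least squares achieves against measurement outliers; with equal weights throughout, the estimate would be visibly biased by the bad channel.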

  10. Development of air fuel ratio sensor; A/F sensor no kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    Sakawa, T; Hori, M [Denso Corp., Aichi (Japan)]; Nakamura, Y [Toyota Motor Corp., Aichi (Japan)]

    1997-10-01

    The Air Fuel Ratio Sensor (A/F sensor) applied to a 1997 model year Low Emission Vehicle (LEV) was developed. This sensor enables detection of the exhaust gas air fuel ratio both lean and rich of stoichiometric, with an effective air fuel ratio range from 12 to 18 as required by the LEV regulation. It has a fast light-off (within 20 seconds) to minimize exhaust hydrocarbon content, and a fast response time (less than 200 ms) to improve air fuel ratio controllability. 3 refs., 7 figs.

  11. GasBench/isotope ratio mass spectrometry: a carbon isotope approach to detect exogenous CO(2) in sparkling drinks.

    Science.gov (United States)

    Cabañero, Ana I; San-Hipólito, Tamar; Rupérez, Mercedes

    2007-01-01

    A new procedure for the determination of carbon dioxide (CO(2)) (13)C/(12)C isotope ratios, using direct injection into a GasBench/isotope ratio mass spectrometry (GasBench/IRMS) system, has been developed to improve isotopic methods devoted to the study of the authenticity of sparkling drinks. Thirty-nine commercial sparkling drink samples from various origins were analyzed. Values of delta(13)C(cava) ranged from -20.30 per thousand to -23.63 per thousand, when C3 sugar addition was performed for a second alcoholic fermentation. Values of delta(13)C(water) ranged from -5.59 per thousand to -6.87 per thousand in the case of naturally carbonated water or water fortified with gas from the spring, and delta(13)C(water) ranged from -29.36 per thousand to -42.09 per thousand when industrial CO(2) was added. It has been demonstrated that the addition of C4 sugar to semi-sparkling wine (aguja) and industrial CO(2) addition to sparkling wine (cava) or water can be detected. The new procedure has advantages over existing methods in terms of analysis time and sample treatment. In addition, it is the first isotopic method developed that allows (13)C/(12)C determination directly from a liquid sample without previous CO(2) extraction. No significant isotopic fractionation was observed nor any influence by secondary compounds present in the liquid phase. Copyright (c) 2007 John Wiley & Sons, Ltd.

  12. GAE detection for mass measurement for D-T ratio control

    International Nuclear Information System (INIS)

    Lister, J.B.; Villard, L.; Ridder, G. de

    1997-09-01

    This report includes two papers by the authors Lister, Villard and de Ridder: 1) Measurement of the effective plasma ion mass in large tokamaks using Global Alfven Eigenmodes, 2) GAE detection for mass measurement for plasma density control. The second paper represents the final report of JET article 14 contract 950104. figs., tabs., refs

  13. Alpha-in-air monitor for continuous monitoring based on alpha to beta ratio

    International Nuclear Information System (INIS)

    Somayaji, K.S.; Venkataramani, R.; Swaminathan, N.; Pushparaja

    1997-01-01

    Measurement of long-lived alpha activity collected on a filter paper in continuous air monitoring of the ambient working environment is difficult due to interference from much larger concentrations of short-lived alpha-emitting daughter products of 222Rn and 220Rn. However, the ratio between natural alpha and beta activity is approximately constant, and this constancy is used to discriminate against short-lived natural radioactivity in continuous air monitoring. A detection system was specially designed for simultaneous counting of the alpha and beta activity deposited on the filter paper during continuous monitoring. The activity ratios were calculated and plotted against the monitoring duration for up to about six hours. Monitoring was carried out in three facilities with different ventilation conditions. The presence of any long-lived alpha contamination on the filter paper results in an increase in the alpha-to-beta ratio. Long-lived 239Pu contamination of about 16 DAC.h could be detected about 45 minutes after commencement of sampling. Experimental results using prototype units have shown that the approach of using the alpha-to-beta activity ratio to detect long-lived alpha activity in the presence of short-lived natural activity is satisfactory. (author)
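    The ratio test described above can be sketched as a simple threshold rule; the baseline ratio, the 30% trigger margin, and the count series below are illustrative assumptions, not values from the study.

```python
# Sketch of the alpha-to-beta ratio idea: natural radon/thoron progeny
# give a roughly constant alpha/beta count ratio, so a sustained rise in
# that ratio flags long-lived alpha contamination on the filter paper.

def contamination_flags(alpha_counts, beta_counts, baseline_ratio, margin=0.3):
    """Flag each sampling interval whose alpha/beta ratio exceeds baseline*(1+margin)."""
    flags = []
    for a, b in zip(alpha_counts, beta_counts):
        ratio = a / b if b else float("inf")
        flags.append(ratio > baseline_ratio * (1.0 + margin))
    return flags

alpha = [12, 13, 12, 25, 30]     # long-lived alpha activity builds up at the end
beta = [100, 105, 98, 102, 99]   # natural beta activity stays steady
print(contamination_flags(alpha, beta, baseline_ratio=0.12))
# last two intervals exceed the 0.12 * 1.3 = 0.156 threshold
```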

  14. EXPLORING THE UNUSUALLY HIGH BLACK-HOLE-TO-BULGE MASS RATIOS IN NGC 4342 AND NGC 4291: THE ASYNCHRONOUS GROWTH OF BULGES AND BLACK HOLES

    International Nuclear Information System (INIS)

    Bogdán, Ákos; Forman, William R.; Kraft, Ralph P.; Li, Zhiyuan; Vikhlinin, Alexey; Nulsen, Paul E. J.; Jones, Christine; Zhuravleva, Irina; Churazov, Eugene; Mihos, J. Christopher; Harding, Paul; Guo, Qi; Schindler, Sabine

    2012-01-01

    We study two nearby early-type galaxies, NGC 4342 and NGC 4291, that host unusually massive black holes relative to their low stellar mass. The observed black-hole-to-bulge mass ratios of NGC 4342 and NGC 4291 are 6.9 (+3.8/-2.3)% and 1.9% ± 0.6%, respectively, which significantly exceed the typical observed ratio of ∼0.2%. As a consequence of these exceedingly large black-hole-to-bulge mass ratios, NGC 4342 and NGC 4291 are ≈5.1σ and ≈3.4σ outliers from the MBH-Mbulge scaling relation, respectively. In this paper, we explore the origin of the unusually high black-hole-to-bulge mass ratios. Based on Chandra X-ray observations of the hot gas content of NGC 4342 and NGC 4291, we compute gravitating mass profiles and conclude that both galaxies reside in massive dark matter halos, which extend well beyond the stellar light. The presence of dark matter halos around NGC 4342 and NGC 4291, and a deep optical image of the environment of NGC 4342, indicate that tidal stripping, in which ≳90% of the stellar mass was lost, cannot explain the observed high black-hole-to-bulge mass ratios. Therefore, we conclude that these galaxies formed with low stellar masses, implying that the bulge and black hole did not grow in tandem. We also find that the black hole mass correlates well with the properties of the dark matter halo, suggesting that dark matter halos may play a major role in regulating the growth of supermassive black holes.

  15. Evaluation of electrical impedance ratio measurements in accuracy of electronic apex locators.

    Science.gov (United States)

    Kim, Pil-Jong; Kim, Hong-Gee; Cho, Byeong-Hoon

    2015-05-01

    The aim of this paper was to evaluate the ratios of electrical impedance measurements reported in previous studies through a correlation analysis, in order to establish them as a contributing factor to the accuracy of electronic apex locators (EALs). The literature regarding electrical property measurements of EALs was screened using Medline and Embase. All acquired data were plotted to identify correlations between impedance and log-scaled frequency. The accuracy of the impedance-ratio method used to detect the apical constriction (APC) in most EALs was evaluated using linear ramp function fitting. Changes of impedance ratios at various frequencies were evaluated for a variety of file positions. Among the ten papers selected in the search process, the first-order equations between log-scaled frequency and impedance had negative slopes. When the model for the ratios was assumed to be a linear ramp function, the ratio values decreased as the file went deeper, and the average ratio values of the left and right horizontal zones were significantly different in 8 out of 9 studies. The APC was located within the interval of linear relation between the left and right horizontal zones of the linear ramp model. Using the ratio method, the APC was located within a linear interval. Therefore, using the impedance ratio between electrical impedance measurements at different frequencies was a robust method for detection of the APC.
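    The linear-ramp idea above can be sketched numerically: two plateaus joined by a linear descent, fitted by brute-force least squares. The synthetic ratio profile and the fitting procedure are illustrative assumptions, not the study's algorithm.

```python
# Fit a linear ramp (plateau / linear descent / plateau) to impedance-ratio
# vs. file-depth data; the APC is expected inside the linear interval.

def ramp(x, x0, x1, y0, y1):
    """Left plateau y0, linear descent on [x0, x1], right plateau y1."""
    if x <= x0:
        return y0
    if x >= x1:
        return y1
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def fit_ramp(xs, ys):
    """Brute-force least-squares search over candidate breakpoints."""
    best = None
    for i in range(len(xs) - 1):
        for j in range(i + 1, len(xs)):
            x0, x1 = xs[i], xs[j]
            y0 = sum(y for x, y in zip(xs, ys) if x <= x0) / max(1, sum(x <= x0 for x in xs))
            y1 = sum(y for x, y in zip(xs, ys) if x >= x1) / max(1, sum(x >= x1 for x in xs))
            sse = sum((ramp(x, x0, x1, y0, y1) - y) ** 2 for x, y in zip(xs, ys))
            if best is None or sse < best[0]:
                best = (sse, x0, x1, y0, y1)
    return best[1:]

depth = [0, 1, 2, 3, 4, 5, 6, 7]                   # file depth (arbitrary units)
ratio = [0.9, 0.9, 0.9, 0.7, 0.5, 0.3, 0.3, 0.3]   # ratio falls as the file goes deeper
x0, x1, y0, y1 = fit_ramp(depth, ratio)
print(x0, x1)  # the linear interval bracketing the APC
```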

  16. Study on the ratio of signal to noise for single photon resolution time spectrometer

    International Nuclear Information System (INIS)

    Wang Zhaomin; Huang Shengli; Xu Zizong; Wu Chong

    2001-01-01

    The signal-to-noise ratio of a single-photon-resolution time spectrometer and the factors that influence it were studied. A method to suppress the background, shorten the measurement time, and increase the signal-to-noise ratio is discussed. Results show that the signal-to-noise ratio is proportional to the solid angle subtended by the detector at the source and to the detection efficiency, and inversely proportional to the electronics noise. Choosing the activity of the source appropriately was important for reducing the random coincidence counting. Using a coincidence gate and a single-photon discriminator was an effective way to increase measurement accuracy and detection efficiency

  17. Potential for waist-to-height ratio to detect overfat adolescents from a Pacific Island, even those within the normal BMI range.

    Science.gov (United States)

    Frayon, Stéphane; Cavaloc, Yolande; Wattelez, Guillaume; Cherrier, Sophie; Lerrant, Yannick; Ashwell, Margaret; Galy, Olivier

    2017-12-15

    Waist-to-height ratio (WHtR) is a simple anthropometric proxy for central body fat; it is easy to use from a health education perspective. A WHtR value >0.5 has been proposed as a first-level indicator of health risk. The first aim of this study was to compare WHtR with values based on body mass index (BMI) in their prediction of the percentage of body fat (%BF) in a multi-ethnic population of adolescents from New Caledonia (ages 11-16 years). The second was to see whether WHtR >0.5 could be used to detect overfat subjects whose BMI was in the normal range. Body fat percentage (%BF, based on skinfold measurements), BMI and WHtR were calculated for New Caledonian adolescents from different ethnic backgrounds. The relationship between %BF, BMI and WHtR was determined using quadratic models and from linear regression equations. The sensitivity and specificity of WHtR for detecting overfat adolescents (%BF >25% in boys and >30% in girls) were assessed and compared with those from the BMI-based classification. WHtR showed better correlation with %BF than BMI-based measurements. WHtR >0.5 was also more accurate than BMI in detecting overfat adolescents. Moreover, using this boundary value, 8% of adolescents in the normal BMI range were shown to be overfat. WHtR is a good anthropometric proxy for detecting overfat adolescents. Detecting overfat adolescents within the normal BMI range is particularly important for preventing non-communicable diseases. We therefore recommend using WHtR for health education programs in the Pacific area and more generally. Copyright © 2017 Asia Oceania Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
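    A small sketch of the WHtR > 0.5 screening logic and its sensitivity/specificity against a %BF gold standard, using the abstract's cutoffs (25% in boys, 30% in girls); the six records are invented for illustration.

```python
# Evaluate WHtR > 0.5 as a screen for overfat status against skinfold-based
# %BF. All anthropometric records below are fabricated for illustration.

def whtr(waist_cm, height_cm):
    return waist_cm / height_cm

def is_overfat(pct_bf, sex):
    return pct_bf > (25.0 if sex == "M" else 30.0)

def sens_spec(records):
    tp = fp = tn = fn = 0
    for waist, height, pct_bf, sex in records:
        pred = whtr(waist, height) > 0.5
        truth = is_overfat(pct_bf, sex)
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif not pred and truth:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

data = [
    (85, 160, 32.0, "F"),  # overfat, flagged by WHtR
    (70, 165, 18.0, "M"),  # lean, not flagged
    (88, 170, 28.0, "M"),  # overfat, flagged
    (78, 160, 22.0, "F"),  # lean, WHtR 0.4875, not flagged
    (82, 158, 33.0, "F"),  # overfat, flagged
    (80, 175, 27.0, "M"),  # overfat but WHtR 0.457: a miss (false negative)
]
sens, spec = sens_spec(data)
print(sens, spec)
```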

  18. A Multilayer Perceptron-Based Impulsive Noise Detector with Application to Power-Line-Based Sensor Networks

    KAUST Repository

    Chien, Ying-Ren

    2018-04-10

    For power-line-based sensor networks, impulsive noise (IN) will dramatically degrade the data transmission rate in the power line. In this paper, we present a multilayer perceptron (MLP)-based approach to detect IN in orthogonal frequency-division multiplexing (OFDM)-based baseband power line communications (PLCs). Combining the MLP-based IN detection method with the outlier detection theory allows more accurate identification of the harmful residual IN. For OFDM-based PLC systems, the high peak-to-average power ratio (PAPR) of the received signal makes detection of harmful residual IN more challenging. The detection mechanism works in an iterative receiver that contains a pre-IN mitigation and a post-IN mitigation. The pre-IN mitigation is meant to null the stronger portion of IN, while the post-IN mitigation suppresses the residual portion of IN using an iterative process. Compared with previously reported IN detectors, the simulation results show that our MLP-based IN detector improves the resulting bit error rate (BER) performance.

  19. Uranium isotope ratio measurements in field settings

    International Nuclear Information System (INIS)

    Shaw, R.W.; Barshick, C.M.; Young, J.P.; Ramsey, J.M.

    1997-01-01

    The authors have developed a technique for uranium isotope ratio measurements of powder samples in field settings. Such a method will be invaluable for environmental studies, radioactive waste operations, and decommissioning and decontamination operations. Immediate field data can help guide an ongoing sampling campaign. The measurement encompasses glow discharge sputtering from pressed sample hollow cathodes, high resolution laser spectroscopy using conveniently tunable diode lasers, and optogalvanic detection. At 10% 235U enrichment and above, the measurement precision for 235U/(235U+238U) isotope ratios was ±3%; it declined to ±15% for 0.3% (i.e., depleted) samples. A prototype instrument was constructed and is described

  20. Men and Women Wage Differences in Spain and Poland

    Directory of Open Access Journals (Sweden)

    Aleksandra Matuszewska-Janica

    2018-03-01

    Full Text Available The difference between men's and women's wages is a widely discussed topic in the literature. Some authors point out that the size of the gender pay gap (GPG) is not the same across the wage distribution: the GPG ratio accelerates at the top of it. Thus, the main goal of the presented study is to analyse the impact of outliers (top earners) on the values of the GPG ratio and on the results of its decomposition; in addition, we compare the outcomes obtained for Spain and Poland. Eliminating outliers from the sample reduces the men-women wage gap ratios not only in unadjusted form but in adjusted form as well. The study is based upon Eurostat's Structure of Earnings Survey (SES) individual data for 2014. In the paper we discuss the results of the Oaxaca-Blinder decomposition (in the extension proposed by Oaxaca and Ransom) obtained for Spain and Poland. The results indicate two points above all. Firstly, although the unadjusted GPG ratios for Spain and Poland differ significantly, the adjusted GPG ratios are at the same level (about 15%). This shows that the real differences between men's and women's wages are at the same level in both countries, an additional indication that women's situation on the Polish labour market is similar to the Spanish one. Secondly, after elimination of outliers, the values of the GPG measures (in adjusted and unadjusted form) decreased, as expected. These falls came to approximately 3 p.p., which can be considered a significant change.

  1. Diagnostic value of transmural perfusion ratio derived from dynamic CT-based myocardial perfusion imaging for the detection of haemodynamically relevant coronary artery stenosis

    Energy Technology Data Exchange (ETDEWEB)

    Coenen, Adriaan; Lubbers, Marisa M.; Dedic, Admir; Chelu, Raluca G.; Geuns, Robert-Jan M. van; Nieman, Koen [Erasmus University Medical Center, Department of Radiology, Rotterdam (Netherlands); Erasmus University Medical Center, Department of Cardiology, Rotterdam (Netherlands); Kurata, Akira; Kono, Atsushi; Dijkshoorn, Marcel L. [Erasmus University Medical Center, Department of Radiology, Rotterdam (Netherlands); Rossi, Alexia [Erasmus University Medical Center, Department of Radiology, Rotterdam (Netherlands); Barts Health NHS Trust, NIHR Cardiovascular Biomedical Research Unit at Barts, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London and Department of Cardiology, London (United Kingdom)

    2017-06-15

    To investigate the additional value of transmural perfusion ratio (TPR) in dynamic CT myocardial perfusion imaging for detection of haemodynamically significant coronary artery disease compared with fractional flow reserve (FFR). Subjects with suspected or known coronary artery disease were prospectively included and underwent a CT-MPI examination. From the CT-MPI time-point data absolute myocardial blood flow (MBF) values were temporally resolved using a hybrid deconvolution model. An absolute MBF value was measured in the suspected perfusion defect. TPR was defined as the ratio between the subendocardial and subepicardial MBF. TPR and MBF results were compared with invasive FFR using a threshold of 0.80. Forty-three patients and 94 territories were analysed. The area under the receiver operator curve was larger for MBF (0.78) compared with TPR (0.65, P = 0.026). No significant differences were found in diagnostic classification between MBF and TPR with a territory-based accuracy of 77 % (67-86 %) for MBF compared with 70 % (60-81 %) for TPR. Combined MBF and TPR classification did not improve the diagnostic classification. Dynamic CT-MPI-based transmural perfusion ratio predicts haemodynamically significant coronary artery disease. However, diagnostic performance of dynamic CT-MPI-derived TPR is inferior to quantified MBF and has limited incremental value. (orig.)
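    The TPR definition reduces to a simple ratio of subendocardial to subepicardial blood flow; the sketch below illustrates it with a hypothetical 0.85 abnormality cutoff and made-up MBF values, neither taken from the study.

```python
# Transmural perfusion ratio (TPR) = subendocardial MBF / subepicardial MBF.
# The 0.85 cutoff and the MBF values are illustrative assumptions only.

def transmural_perfusion_ratio(mbf_endo, mbf_epi):
    return mbf_endo / mbf_epi

def territory_abnormal(mbf_endo, mbf_epi, tpr_cutoff=0.85):
    """Flag a territory whose TPR falls below the (hypothetical) cutoff."""
    return transmural_perfusion_ratio(mbf_endo, mbf_epi) < tpr_cutoff

print(transmural_perfusion_ratio(0.9, 1.2))            # 0.75: reduced inner-wall flow
print(territory_abnormal(0.9, 1.2), territory_abnormal(1.1, 1.2))
```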

  2. Incremental Activation Detection for Real-Time fMRI Series Using Robust Kalman Filter

    Directory of Open Access Journals (Sweden)

    Liang Li

    2014-01-01

    Full Text Available Real-time functional magnetic resonance imaging (rt-fMRI) is a technique that enables us to observe human brain activations in real time. However, unexpected noise that emerges during fMRI data collection, such as acute swallowing, head movement, and human manipulations, causes considerable confusion and non-robustness in the activation analysis. In this paper, a new activation detection method for rt-fMRI data is proposed based on a robust Kalman filter. The idea is to add a variation to the extended Kalman filter to handle the additional sparse measurement noise, with a sparse noise term in the measurement update step. The robust Kalman filter is thus designed to improve robustness to outliers and can be computed separately for each voxel. The algorithm can compute activation maps on each scan within a repetition time, which meets the requirement for real-time analysis. Experimental results show that this new algorithm delivers high performance in robustness and in real-time activation detection.
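    A toy scalar analogue of the robust update: innovations larger than a gate are clipped, so a sparse spike (e.g., a motion artifact) cannot drag the state estimate. The noise levels, gate constant, and data are assumptions for illustration; the paper's filter is multivariate and applied per voxel.

```python
# Scalar Kalman filter with an outlier-clipped measurement update,
# mimicking the "sparse noise term" idea on a random-walk signal.

def robust_kalman(measurements, q=0.01, r=1.0, gate=3.0):
    x, p = measurements[0], 1.0
    out = []
    for z in measurements:
        p += q                           # predict (random-walk state)
        innov = z - x
        sigma = (p + r) ** 0.5
        if abs(innov) > gate * sigma:    # sparse-noise handling: clip outlier
            innov = gate * sigma * (1 if innov > 0 else -1)
        k = p / (p + r)
        x += k * innov                   # measurement update
        p *= (1 - k)
        out.append(x)
    return out

data = [1.0, 1.1, 0.9, 1.0, 25.0, 1.1, 1.0]   # one spike at index 4
est = robust_kalman(data)
print(est[4])  # stays far below the 25.0 spike
```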

  3. Serum Free Light Chains in Neoplastic Monoclonal Gammopathies: Relative Under-Detection of Lambda Dominant Kappa/Lambda Ratio, and Underproduction of Free Lambda Light Chains, as Compared to Kappa Light Chains, in Patients With Neoplastic Monoclonal Gammopathies.

    Science.gov (United States)

    Lee, Won Sok; Singh, Gurmukh

    2018-07-01

    Quantitative evaluation of serum free light chains is recommended for the workup of monoclonal gammopathies. Immunoglobulin light chains are generally produced in excess of heavy chains. In patients with monoclonal gammopathy, the κ/λ ratio is abnormal less frequently with lambda chain lesions. This study was undertaken to ascertain whether the levels of overproduction of the two light chain types and their detection rates differ in patients with neoplastic monoclonal gammopathies. Results of serum protein electrophoresis (SPEP), serum protein immunofixation electrophoresis (SIFE), urine protein electrophoresis (UPEP), urine protein immunofixation electrophoresis (UIFE), and the serum free light chain assay (SFLCA) in patients with monoclonal gammopathies were examined retrospectively. The κ/λ ratios were appropriately abnormal more often in kappa chain lesions. Ratios of κ/λ were normal in about 25% of patients with lambda chain lesions in whom free homogeneous lambda light chains were detectable in urine. An illustrative case suggests underproduction of free lambda light chains in some instances. The lower prevalence of a lambda-dominant κ/λ ratio in lesions with lambda light chains is estimated to be due to relative under-detection of a lambda-dominant κ/λ ratio in about 25% of the patients, and because lambda chains are not produced in as much excess of heavy chains as are kappa chains in about 5% of the patients. The results question the medical necessity and clinical usefulness of the serum free light chain assay. UPEP/UIFE is under-utilized.

  4. Combination of sugar analysis and stable isotope ratio mass spectrometry to detect the use of artificial sugars in royal jelly production.

    Science.gov (United States)

    Wytrychowski, Marine; Daniele, Gaëlle; Casabianca, Hervé

    2012-05-01

    The effects of feeding bees artificial sugars and/or proteins on the sugar compositions and (13)C isotopic measurements of royal jellies (RJs) were evaluated. The sugars fed to the bees were two C4 sugars (cane sugar and maize hydrolysate), two C3 sugars (sugar beet, cereal starch hydrolysate), and honey. The proteins fed to them were pollen, soybean, and yeast powder proteins. To evaluate the influence of the sugar and/or protein feeding over time, samples were collected during six consecutive harvests. (13)C isotopic ratio measurements of natural RJs gave values of around -25 ‰, which were also seen for RJs obtained when the bees were fed honey or C3 sugars. However, the RJs obtained when the bees were fed cane sugar or corn hydrolysate (regardless of whether they were also fed proteins) gave values of up to -17 ‰. Sugar content analysis revealed that the composition of maltose, maltotriose, sucrose, and erlose varied significantly over time in accordance with the composition of the syrup fed to the bees. When corn and cereal starch hydrolysates were fed to the bees, the maltose and maltotriose contents of the RJs increased up to 5.0 and 1.3 %, respectively, compared to the levels seen in authentic samples (i.e., samples obtained when the bees were fed natural food: honey and pollen), which were below 0.2% and undetectable, respectively. The sucrose and erlose contents of natural RJs were around 0.2 %, whereas those in RJs obtained when the bees were fed cane or beet sugar were as much as 4.0 and 1.3 %, respectively. The combination of sugar analysis and (13)C isotopic ratio measurements represents a very efficient analytical methodology for detecting (from early harvests onward) the use of C4 and C3 artificial sugars in the production of RJ.

  5. Real-Time Monitoring System Using Smartphone-Based Sensors and NoSQL Database for Perishable Supply Chain

    Directory of Open Access Journals (Sweden)

    Ganjar Alfian

    2017-11-01

    Full Text Available Since customer attention is increasing due to growing customer health awareness, it is important for the perishable food supply chain to monitor food quality and safety. This study proposes a real-time monitoring system that utilizes smartphone-based sensors and a big data platform. Firstly, we develop a smartphone-based sensor to gather temperature, humidity, GPS, and image data. The IoT-generated sensor data on the smartphone have characteristics such as large storage volume, an unstructured format, and continuous generation. Thus, in this study, we propose an effective big data platform design to handle IoT-generated sensor data. Furthermore, abnormal sensor data generated by failed sensors, called outliers, may arise in real cases. The proposed system utilizes outlier detection based on statistical and clustering approaches to filter out the outlier data. The proposed system was evaluated for system and gateway performance and tested on the kimchi supply chain in Korea. The results showed that the proposed system is capable of processing a massive input/output of sensor data efficiently when the number of sensors and clients increases. Current commercial smartphones are sufficiently capable of combining their normal operations with simultaneous performance as gateways for transmitting sensor data to the server. In addition, outlier detection based on the 3-sigma rule and DBSCAN successfully detected and classified outlier data as separate from normal sensor data. This study is expected to help those who are responsible for developing real-time monitoring systems and implementing critical strategies related to the perishable supply chain.
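    The statistical (3-sigma) filter mentioned above can be sketched as follows; the temperature stream is invented for illustration, and the DBSCAN-based variant is omitted here.

```python
# 3-sigma outlier filter on a cold-chain temperature stream: readings
# more than three sample standard deviations from the mean are flagged.

from statistics import mean, stdev

def three_sigma_outliers(values):
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > 3 * sigma]

base = [4.0, 4.1, 3.9, 4.2, 3.8] * 6   # 30 normal refrigerated readings
temps = base + [40.0]                   # one failed-sensor reading at the end
print(three_sigma_outliers(temps))      # index of the bad reading
```

    Note that with a single extreme value in a very small sample, the sample standard deviation itself inflates, so the rule needs a few dozen baseline points to flag a lone spike reliably.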

  6. Computational Methods for Large Spatio-temporal Datasets and Functional Data Ranking

    KAUST Repository

    Huang, Huang

    2017-07-16

    This thesis focuses on two topics, computational methods for large spatial datasets and functional data ranking. Both are tackling the challenges of big and high-dimensional data. The first topic is motivated by the prohibitive computational burden in fitting Gaussian process models to large and irregularly spaced spatial datasets. Various approximation methods have been introduced to reduce the computational cost, but many rely on unrealistic assumptions about the process and retaining statistical efficiency remains an issue. We propose a new scheme to approximate the maximum likelihood estimator and the kriging predictor when the exact computation is infeasible. The proposed method provides different types of hierarchical low-rank approximations that are both computationally and statistically efficient. We explore the improvement of the approximation theoretically and investigate the performance by simulations. For real applications, we analyze a soil moisture dataset with 2 million measurements with the hierarchical low-rank approximation and apply the proposed fast kriging to fill gaps for satellite images. The second topic is motivated by rank-based outlier detection methods for functional data. Compared to magnitude outliers, it is more challenging to detect shape outliers as they are often masked among samples. We develop a new notion of functional data depth by taking the integration of a univariate depth function. Having a form of the integrated depth, it shares many desirable features. Furthermore, the novel formation leads to a useful decomposition for detecting both shape and magnitude outliers. Our simulation studies show the proposed outlier detection procedure outperforms competitors in various outlier models. We also illustrate our methodology using real datasets of curves, images, and video frames. 
Finally, we introduce the functional data ranking technique to spatio-temporal statistics for visualizing and assessing covariance properties, such as

  7. Evaluation of electrical impedance ratio measurements in accuracy of electronic apex locators

    Directory of Open Access Journals (Sweden)

    Pil-Jong Kim

    2015-05-01

    Full Text Available Objectives: The aim of this paper was to evaluate the ratios of electrical impedance measurements reported in previous studies through a correlation analysis, in order to establish them as a contributing factor to the accuracy of electronic apex locators (EALs). Materials and Methods: The literature regarding electrical property measurements of EALs was screened using Medline and Embase. All acquired data were plotted to identify correlations between impedance and log-scaled frequency. The accuracy of the impedance-ratio method used to detect the apical constriction (APC) in most EALs was evaluated using linear ramp function fitting. Changes of impedance ratios at various frequencies were evaluated for a variety of file positions. Results: Among the ten papers selected in the search process, the first-order equations between log-scaled frequency and impedance had negative slopes. When the model for the ratios was assumed to be a linear ramp function, the ratio values decreased as the file went deeper, and the average ratio values of the left and right horizontal zones were significantly different in 8 out of 9 studies. The APC was located within the interval of linear relation between the left and right horizontal zones of the linear ramp model. Conclusions: Using the ratio method, the APC was located within a linear interval. Therefore, using the impedance ratio between electrical impedance measurements at different frequencies is a robust method for detection of the APC.

  8. Short-term change detection for UAV video

    Science.gov (United States)

    Saur, Günter; Krüger, Wolfgang

    2012-11-01

    In the last years, there has been an increased use of unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. An important application in this context is change detection in UAV video data. Here we address short-term change detection, in which the time between observations ranges from several minutes to a few hours. We distinguish this task from video motion detection (shorter time scale) and from long-term change detection, based on time series of still images taken between several days, weeks, or even years. Examples for relevant changes we are looking for are recently parked or moved vehicles. As a pre-requisite, a precise image-to-image registration is needed. Images are selected on the basis of the geo-coordinates of the sensor's footprint and with respect to a certain minimal overlap. The automatic image-based fine-registration adjusts the image pair to a common geometry by using a robust matching approach to handle outliers. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are stereo disparity at 3D structures of the scene, changed length of shadows, and compression or transmission artifacts. To detect changes in image pairs we analyzed image differencing, local image correlation, and a transformation-based approach (multivariate alteration detection). As input we used color and gradient magnitude images. To cope with local misalignment of image structures we extended the approaches by a local neighborhood search. The algorithms are applied to several examples covering both urban and rural scenes. The local neighborhood search in combination with intensity and gradient magnitude differencing clearly improved the results. Extended image differencing performed better than both the correlation-based approach and the multivariate alteration detection. 
The algorithms are adapted to be used in semi-automatic workflows for the ABUL video exploitation system of Fraunhofer
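    The local neighborhood search combined with image differencing can be sketched on toy grayscale grids: each pixel's difference is taken as the minimum absolute difference over small shifts, so slight misregistration is tolerated while a genuinely new object still stands out. The grids, shift radius, and change threshold below are illustrative assumptions.

```python
# Extended image differencing with a local neighborhood search: for each
# pixel of img_a, compare against all pixels of img_b within `radius`
# and keep the smallest absolute difference. New objects in img_a that
# have no counterpart anywhere nearby in img_b yield large differences.

def neighborhood_diff(img_a, img_b, radius=1):
    h, w = len(img_a), len(img_a[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = float("inf")
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        best = min(best, abs(img_a[y][x] - img_b[yy][xx]))
            out[y][x] = best
    return out

before = [[10, 10, 10, 10],
          [10, 10, 10, 10],
          [10, 10, 10, 10]]
after_ = [[10, 10, 10, 10],
          [10, 10, 90, 10],      # a newly appeared bright object ("vehicle")
          [10, 10, 10, 10]]
diff = neighborhood_diff(after_, before)   # detects appearances in after_
changed = [(y, x) for y in range(3) for x in range(4) if diff[y][x] > 20]
print(changed)
```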

  9. Strontium isotope detection of brine contamination in the East Poplar oil field, Montana

    Science.gov (United States)

    Peterman, Zell E.; Thamke, Joanna N.; Futa, Kiyoto; Oliver, Thomas A.

    2010-01-01

    Brine contamination of groundwater in the East Poplar oil field was first documented in the mid-1980s by the U.S. Geological Survey by using hydrochemistry, with an emphasis on chloride (Cl) and total dissolved solids concentrations. Supply wells for the City of Poplar are located downgradient from the oil field, are completed in the same shallow aquifers that are documented as contaminated, and therefore are potentially at risk of being contaminated. In cooperation with the Office of Environmental Protection of the Fort Peck Tribes, groundwater samples were collected in 2009 and 2010 from supply wells, monitor wells, and the Poplar River for analyses of major and trace elements, including strontium (Sr) concentrations and isotopic compositions. The ratio of strontium-87 to strontium-86 (87Sr/86Sr) is used extensively as a natural tracer in groundwater to detect mixing among waters from different sources and to study the effects of water/rock interaction. On a plot of the reciprocal strontium concentration against the 87Sr/86Sr ratio, mixtures of two end members will produce a linear array. Using this plotting method, data for samples from most of the wells, including the City of Poplar wells, define an array with reciprocal strontium values ranging from 0.08 to 4.15 and 87Sr/86Sr ratios ranging from 0.70811 to 0.70828. This array is composed of a brine end member with an average 87Sr/86Sr of 0.70822, strontium concentrations in excess of 12.5 milligrams per liter (mg/L), and chloride concentrations exceeding 8,000 mg/L mixing with uncontaminated water similar to that in USGS06-08 with 18.0 mg/L chloride, 0.24 mg/L strontium, and a 87Sr/86Sr ratio of 0.70811. The position of samples from the City of Poplar public-water supply wells within this array indicates that brine contamination has reached all three wells. Outliers from this array are EPU-4G (groundwater from the Cretaceous Judith River Formation), brine samples from disposal wells (Huber 5-D and EPU 1-D
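    The reciprocal-strontium plot works because two-end-member mixing is exactly linear in (1/Sr, 87Sr/86Sr) space; a short sketch using the end-member values quoted in the abstract, with illustrative mixing fractions.

```python
# Two-end-member mixing: concentration mixes linearly with mass fraction,
# while the isotope ratio is concentration-weighted, which makes mixtures
# collinear in (1/Sr, 87Sr/86Sr) space. End-member values follow the
# abstract; the mixing fractions are illustrative.

def mix(f, c1, r1, c2, r2):
    """Sr concentration and isotope ratio of a mass fraction f of end member 1."""
    c = f * c1 + (1 - f) * c2
    r = (f * c1 * r1 + (1 - f) * c2 * r2) / c
    return c, r

brine = (12.5, 0.70822)    # Sr mg/L, 87Sr/86Sr
fresh = (0.24, 0.70811)
points = [mix(f, *brine, *fresh) for f in (0.0, 0.05, 0.2, 1.0)]

# Collinearity check: slope between consecutive (1/Sr, ratio) points.
xy = [(1.0 / c, r) for c, r in points]
slopes = [(xy[i + 1][1] - xy[i][1]) / (xy[i + 1][0] - xy[i][0])
          for i in range(len(xy) - 1)]
print(all(abs(s - slopes[0]) < 1e-9 for s in slopes))
```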

  10. Feynman-α correlation analysis by prompt-photon detection

    International Nuclear Information System (INIS)

    Hashimoto, Kengo; Yamada, Sumasu; Hasegawa, Yasuhiro; Horiguchi, Tetsuo

    1998-01-01

    Two-detector Feynman-α measurements were carried out using the UTR-KINKI reactor, a light-water-moderated and graphite-reflected reactor, by detecting high-energy prompt gamma rays. For comparison, conventional measurements by detecting neutrons were also performed. These measurements were carried out in the subcriticality range from 0 to $1.8. The gate-time dependence of the variance- and covariance-to-mean ratios measured by gamma-ray detection was nearly identical with that obtained using standard neutron-detection techniques. Consequently, the prompt-neutron decay constants inferred from the gamma-ray correlation data agreed with those from the neutron data. Furthermore, the correlated-to-uncorrelated amplitude ratios obtained by gamma-ray detection depended significantly on the low-energy discriminator level of the single-channel analyzer. The discriminator level was set to the optimum for obtaining a maximum value of the amplitude ratio. The maximum amplitude ratio was much larger than that obtained by neutron detection. The subcriticality dependence of the decay constant obtained by gamma-ray detection was consistent with that obtained by neutron detection and followed the linear relation based on the one-point kinetic model in the vicinity of delayed critical. These experimental results suggest that the gamma-ray correlation technique can be applied to measure reactor kinetic parameters more efficiently
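    The variance-to-mean statistic underlying these measurements can be sketched on toy gate counts: the Feynman Y value is near zero for uncorrelated (Poisson-like) counts and positive when counts are correlated. Both count series below are invented for illustration.

```python
# Feynman-alpha excess-variance statistic: Y(T) = Var/Mean - 1 for counts
# collected in gates of width T. Purely Poisson counts give Y near 0;
# correlated (bursty) counts give Y > 0.

from statistics import mean, pvariance

def feynman_Y(gate_counts):
    m = mean(gate_counts)
    return pvariance(gate_counts) / m - 1.0

uncorrelated = [2, 5, 3, 1, 4, 6, 3, 2, 4, 5, 1, 6]    # roughly Poisson spread
correlated = [0, 9, 1, 8, 0, 10, 1, 7, 0, 9, 1, 8]     # bursty, over-dispersed
print(feynman_Y(uncorrelated), feynman_Y(correlated))  # near zero vs. clearly positive
```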

  11. A coupled classification - evolutionary optimization model for contamination event detection in water distribution systems.

    Science.gov (United States)

    Oliker, Nurit; Ostfeld, Avi

    2014-03-15

    This study describes a decision support system that issues alerts for contamination events in water distribution systems. The developed model comprises a weighted support vector machine (SVM) for the detection of outliers, followed by a sequence analysis for the classification of contamination events. The contribution of this study is an improvement in contamination event detection ability and a multi-dimensional analysis of the data, differing from the parallel one-dimensional analyses conducted so far. The multivariate analysis examines the relationships between water quality parameters and detects changes in their mutual patterns. The weights of the SVM model accomplish two goals: blurring the difference between the sizes of the two classes' data sets (as there are many more normal/regular measurements than event-time measurements), and incorporating the time factor through a time-decay coefficient that ascribes higher importance to recent observations when classifying a time-step measurement. All model parameters were determined by data-driven optimization, so the calibration of the model was completely autonomous. The model was trained and tested on a real water distribution system (WDS) data set with randomly simulated events superimposed on the original measurements. The model is prominent in its ability to detect events that were only partly expressed in the data (i.e., affecting only some of the measured parameters). The model showed high accuracy and better detection ability compared with previous modeling attempts at contamination event detection. Copyright © 2013 Elsevier Ltd. All rights reserved.
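    A sketch of the two weighting goals described above (class balancing plus exponential time decay) on toy labels; the decay rate and the balancing formula are illustrative assumptions, and the SVM itself is omitted.

```python
# Per-observation weights combining a class-balance factor (rare event
# samples weigh more) with exponential time decay (recent samples weigh
# more). These weights could then be passed to a weighted classifier.

from math import exp

def observation_weights(labels, ages, decay=0.1):
    """labels: 1 = event, 0 = normal; ages: time steps since each observation."""
    n_event = sum(labels) or 1
    n_normal = (len(labels) - sum(labels)) or 1
    weights = []
    for y, age in zip(labels, ages):
        balance = len(labels) / (2.0 * (n_event if y else n_normal))
        weights.append(balance * exp(-decay * age))
    return weights

labels = [0, 0, 0, 0, 1]   # rare event class
ages = [4, 3, 2, 1, 0]     # most recent sample last
w = observation_weights(labels, ages)
print([round(x, 3) for x in w])  # recent and event-class samples dominate
```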

  12. Outliers in American juvenile justice: the need for statutory reform in North Carolina and New York.

    Science.gov (United States)

    Tedeschi, Frank; Ford, Elizabeth

    2015-05-01

    There is a well-established and growing body of evidence from research that adolescents who commit crimes differ in many regards from their adult counterparts and are more susceptible to the negative effects of adjudication and incarceration in adult criminal justice systems. The age of criminal court jurisdiction in the United States has varied throughout history; yet, there are only two remaining states, New York and North Carolina, that continue to automatically charge 16-year-olds as adults. This review traces the statutory history of juvenile justice in these two states with an emphasis on political and social factors that have contributed to their outlier status related to the age of criminal court jurisdiction. The neurobiological, psychological, and developmental aspects of the adolescent brain and personality, and how those issues relate both to a greater likelihood of rehabilitation in appropriate settings and to greater vulnerability in adult correctional facilities, are also reviewed. The importance of raising the age in New York and North Carolina not only lies in protecting incarcerated youths but also in preventing the associated stigma following release. Mental health practitioners are vital to the process of local and national juvenile justice reform. They can serve as experts on and advocates for appropriate mental health care and as experts on the adverse effects of the adult criminal justice system on adolescents.

  13. System and method for high precision isotope ratio destructive analysis

    Science.gov (United States)

    Bushaw, Bruce A; Anheier, Norman C; Phillips, Jon R

    2013-07-02

    A system and process are disclosed that provide high accuracy and high precision destructive analysis measurements for isotope ratio determination of relative isotope abundance distributions in liquids, solids, and particulate samples. The invention utilizes a collinear probe beam to interrogate a laser ablated plume. This invention provides enhanced single-shot detection sensitivity approaching the femtogram range, and isotope ratios that can be determined at approximately 1% or better precision and accuracy (relative standard deviation).

  14. High-precision branching-ratio measurement for the superallowed β+ emitter 74Rb

    Science.gov (United States)

    Dunlop, R.; Ball, G. C.; Leslie, J. R.; Svensson, C. E.; Towner, I. S.; Andreoiu, C.; Chagnon-Lessard, S.; Chester, A.; Cross, D. S.; Finlay, P.; Garnsworthy, A. B.; Garrett, P. E.; Glister, J.; Hackman, G.; Hadinia, B.; Leach, K. G.; Rand, E. T.; Starosta, K.; Tardiff, E. R.; Triambak, S.; Williams, S. J.; Wong, J.; Yates, S. W.; Zganjar, E. F.

    2013-10-01

    A high-precision branching-ratio measurement for the superallowed β+ decay of 74Rb was performed at the TRIUMF Isotope Separator and Accelerator (ISAC) radioactive ion-beam facility. The scintillating electron-positron tagging array (SCEPTAR), composed of 10 thin plastic scintillators, was used to detect the emitted β particles; the 8π spectrometer, an array of 20 Compton-suppressed HPGe detectors, was used for detecting γ rays that were emitted following Gamow-Teller and nonanalog Fermi β+ decays of 74Rb; and the Pentagonal Array of Conversion Electron Spectrometers (PACES), an array of 5 Si(Li) detectors, was employed for measuring β-delayed conversion electrons. Twenty-three excited states were identified in 74Kr following 8.241(4)×108 detected 74Rb β decays. A total of 58 γ-ray and electron transitions were placed in the decay scheme, allowing the superallowed branching ratio to be determined as B0=99.545(31)%. Combined with previous half-life and Q-value measurements, the superallowed branching ratio measured in this work leads to a superallowed ft value of 3082.8(65) s. Comparisons between this superallowed ft value and the world-average-corrected Ft¯ value, as well as the nonanalog Fermi branching ratios determined in this work, provide guidance for theoretical models of the isospin-symmetry-breaking corrections in this mass region.

  15. Locally Grown, Natural Ingredients? The Isotope Ratio Can Reveal a Lot!

    Science.gov (United States)

    Rossier, Joël S; Maury, Valérie; Pfammatter, Elmar

    2016-01-01

    This communication gives an overview of selected isotope analyses applied to food authenticity assessment. Different isotope ratio detection technologies such as isotope ratio mass spectrometry (IRMS) and cavity ring-down spectroscopy (CRDS) are briefly described. It will be explained how the δ(18)O of water contained in fruits and vegetables can be used to assess their country of production, and why asparagus grown in Valais, in the centre of the Alps, carries much less heavy water than asparagus grown closer to the sea coast. On the other hand, the use of δ(13)C can reveal whether a product is natural or adulterated. Applications including honey and sparkling wine adulteration detection will be briefly presented.

  16. 40 CFR 53.35 - Test procedure for Class II and Class III methods for PM2.5 and PM-2.5

    Science.gov (United States)

    2010-07-01

    ... purposes of this outlier test only. (i) Calculate the quantities 2 × R1,j/(R1,j + R2,j) and 2 × R1,j/(R1,j... of the interval, (0.93, 1.07), then R2,j is an outlier. (iii) Calculate the quantities 2 × R3,j/(R3,j... site B shall be in a western city characterized by a high ratio of PM10−2.5 to PM2.5, with exposure to...

  17. Clustering analysis of line indices for LAMOST spectra with AstroStat

    Science.gov (United States)

    Chen, Shu-Xin; Sun, Wei-Min; Yan, Qi

    2018-06-01

    The application of data mining in astronomical surveys, such as the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) survey, provides an effective approach to automatically analyze a large amount of complex survey data. Unsupervised clustering can help astronomers find associations and outliers in a big data set. In this paper, we employ the k-means method to perform clustering on the line indices of LAMOST spectra with the powerful software AstroStat. The line index approach is an effective way to extract spectral features from low-resolution spectra, and these features can represent the main spectral characteristics of stars. A total of 144 340 line indices for A-type stars are analyzed by calculating the intra- and inter-class distances between pairs of stars. For the intra-class distance, we use the Mahalanobis distance to explore the degree of clustering for each class, while for outlier detection we define a local outlier factor for each spectrum. AstroStat furnishes a set of visualization tools for illustrating the analysis results. Checking the spectra detected as outliers, we find that most of them are problematic data and only a few correspond to rare astronomical objects. We show two examples of these outliers: a spectrum with an abnormal continuum and a spectrum with emission lines. Our work demonstrates that line index clustering is a good method for examining data quality and identifying rare objects.
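    A minimal pure-NumPy sketch of two ingredients named above, k-means clustering and a Mahalanobis intra-class distance; the synthetic four-dimensional "line index" data and the fixed initial centers are assumptions for illustration (the actual analysis used AstroStat).

    ```python
    import numpy as np

    def kmeans(X, k, init_idx, iters=50):
        """Plain k-means: fixed initial centers, then alternate assign/update."""
        centers = X[np.asarray(init_idx)].copy()
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        return labels, centers

    def mahalanobis_intra(X):
        """Mean Mahalanobis distance of cluster members to the cluster mean."""
        mu = X.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
        diff = X - mu
        return np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff)).mean()

    # Synthetic two-cluster stand-in for the A-type-star line indices.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 4)), rng.normal(5.0, 1.0, (100, 4))])
    labels, _ = kmeans(X, 2, init_idx=[0, 150])
    spread = [mahalanobis_intra(X[labels == j]) for j in range(2)]
    ```

    A small `spread` value for a class indicates tight clustering; spectra far from every cluster center would be candidates for the outlier inspection described in the abstract.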

  18. Image fusion in dual energy computed tomography for detection of various anatomic structures - Effect on contrast enhancement, contrast-to-noise ratio, signal-to-noise ratio and image quality

    Energy Technology Data Exchange (ETDEWEB)

    Paul, Jijo, E-mail: jijopaul1980@gmail.com [Department of Diagnostic Radiology, Goethe University Hospital, Theodor-Stern-Kai 7, 60590 Frankfurt am Main (Germany); Department of Biophysics, Goethe University, Max von Laue-Str.1, 60438 Frankfurt am Main (Germany); Bauer, Ralf W. [Department of Diagnostic Radiology, Goethe University Hospital, Theodor-Stern-Kai 7, 60590 Frankfurt am Main (Germany); Maentele, Werner [Department of Biophysics, Goethe University, Max von Laue-Str.1, 60438 Frankfurt am Main (Germany); Vogl, Thomas J. [Department of Diagnostic Radiology, Goethe University Hospital, Theodor-Stern-Kai 7, 60590 Frankfurt am Main (Germany)

    2011-11-15

    Objective: The purpose of this study was to evaluate image fusion in dual energy computed tomography for detecting various anatomic structures based on the effect on contrast enhancement, contrast-to-noise ratio, signal-to-noise ratio and image quality. Material and methods: Forty patients underwent a CT neck in dual energy mode (DECT) on a Somatom Definition Flash dual-source CT scanner (Siemens, Forchheim, Germany). Tube voltage: 80 kV and Sn140 kV; tube current: 110 and 290 mA s; collimation: 2 × 32 × 0.6 mm. Raw data were reconstructed using a soft convolution kernel (D30f). Fused images were calculated using a spectrum of weighting factors (0.0, 0.3, 0.6, 0.8 and 1.0) generating different ratios between the 80- and Sn140-kV images (e.g. factor 0.6 corresponds to 60% of the information from the 80-kV image and 40% from the Sn140-kV image). CT values and SNRs were measured in the ascending aorta, thyroid gland, fat, muscle, CSF, spinal cord, bone marrow and brain. In addition, CNR values were calculated for aorta, thyroid, muscle and brain. Subjective image quality was evaluated using a 5-point grading scale. Results were compared using paired t-tests and the nonparametric paired Wilcoxon-Wilcox test. Results: Statistically significant increases in mean CT values were noted in anatomic structures when increasing weighting factors were used (all P ≤ 0.001). For example, mean CT values derived from the contrast-enhanced aorta were 149.2 ± 12.8 Hounsfield Units (HU), 204.8 ± 14.4 HU, 267.5 ± 18.6 HU, 311.9 ± 22.3 HU and 347.3 ± 24.7 HU when the weighting factors 0.0, 0.3, 0.6, 0.8 and 1.0 were used. The highest SNR and CNR values were found when the weighting factor 0.6 was used. The difference in CNR between the weighting factors 0.6 and 0.3 was statistically significant in the contrast-enhanced aorta and thyroid gland (P = 0.012 and P = 0.016, respectively). Visual image assessment for image quality showed the highest score for the data reconstructed using the
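    The weighting-factor fusion and the SNR/CNR figures of merit can be sketched as below; the patch sizes and noise levels are illustrative, loosely echoing the aorta values reported above.

    ```python
    import numpy as np

    def fuse(img_80kv, img_sn140kv, w):
        """Weighted average: fraction w from the 80-kV image, 1-w from Sn140-kV."""
        return w * img_80kv + (1.0 - w) * img_sn140kv

    def snr(roi):
        """Signal-to-noise ratio of a region of interest."""
        return roi.mean() / roi.std()

    def cnr(roi, background):
        """Contrast-to-noise ratio: mean difference over background noise."""
        return (roi.mean() - background.mean()) / background.std()

    rng = np.random.default_rng(1)
    # Synthetic 64x64 "aorta" patches at the two tube voltages; the HU means
    # loosely echo the w = 1.0 and w = 0.0 aorta values reported in the abstract.
    img80 = rng.normal(347.0, 25.0, (64, 64))
    img140 = rng.normal(149.0, 13.0, (64, 64))
    means = [fuse(img80, img140, w).mean() for w in (0.0, 0.3, 0.6, 0.8, 1.0)]
    # The mean fused CT value rises monotonically with the weighting factor.
    ```

    The study's finding that SNR and CNR peak at an intermediate factor (0.6) reflects the trade-off between the stronger contrast of the 80-kV image and the lower noise of the Sn140-kV image.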

  19. Image fusion in dual energy computed tomography for detection of various anatomic structures - Effect on contrast enhancement, contrast-to-noise ratio, signal-to-noise ratio and image quality

    International Nuclear Information System (INIS)

    Paul, Jijo; Bauer, Ralf W.; Maentele, Werner; Vogl, Thomas J.

    2011-01-01

    Objective: The purpose of this study was to evaluate image fusion in dual energy computed tomography for detecting various anatomic structures based on the effect on contrast enhancement, contrast-to-noise ratio, signal-to-noise ratio and image quality. Material and methods: Forty patients underwent a CT neck in dual energy mode (DECT) on a Somatom Definition Flash dual-source CT scanner (Siemens, Forchheim, Germany). Tube voltage: 80 kV and Sn140 kV; tube current: 110 and 290 mA s; collimation: 2 × 32 × 0.6 mm. Raw data were reconstructed using a soft convolution kernel (D30f). Fused images were calculated using a spectrum of weighting factors (0.0, 0.3, 0.6, 0.8 and 1.0) generating different ratios between the 80- and Sn140-kV images (e.g. factor 0.6 corresponds to 60% of the information from the 80-kV image and 40% from the Sn140-kV image). CT values and SNRs were measured in the ascending aorta, thyroid gland, fat, muscle, CSF, spinal cord, bone marrow and brain. In addition, CNR values were calculated for aorta, thyroid, muscle and brain. Subjective image quality was evaluated using a 5-point grading scale. Results were compared using paired t-tests and the nonparametric paired Wilcoxon-Wilcox test. Results: Statistically significant increases in mean CT values were noted in anatomic structures when increasing weighting factors were used (all P ≤ 0.001). For example, mean CT values derived from the contrast-enhanced aorta were 149.2 ± 12.8 Hounsfield Units (HU), 204.8 ± 14.4 HU, 267.5 ± 18.6 HU, 311.9 ± 22.3 HU and 347.3 ± 24.7 HU when the weighting factors 0.0, 0.3, 0.6, 0.8 and 1.0 were used. The highest SNR and CNR values were found when the weighting factor 0.6 was used. The difference in CNR between the weighting factors 0.6 and 0.3 was statistically significant in the contrast-enhanced aorta and thyroid gland (P = 0.012 and P = 0.016, respectively). Visual image assessment for image quality showed the highest score for the data reconstructed using the weighting factor 0

  20. Raman scattering method and apparatus for measuring isotope ratios and isotopic abundances

    International Nuclear Information System (INIS)

    Harney, R.C.; Bloom, S.D.

    1978-01-01

    Raman scattering is used to measure isotope ratios and/or isotopic abundances. A beam of quasi-monochromatic photons is directed onto the sample to be analyzed, and the resulting Raman-scattered photons are detected and counted for each isotopic species of interest. These photon counts are treated mathematically to yield the desired isotope ratios or isotopic abundances

  1. Sex ratios of Mountain Plovers from egg production to fledging

    Directory of Open Access Journals (Sweden)

    Margaret M. Riordan

    2015-12-01

    Skewed sex ratios can have negative implications for population growth if they do not match a species' life history. A skewed tertiary sex ratio has been detected in a population of Mountain Plover (Charadrius montanus), a grassland shorebird experiencing population declines. To study the cause of the observed male skew, we examined three early life stages between egg and fledgling in eastern Colorado from 2010 to 2012. This allowed us to distinguish between egg production and chick survival as explanations for the observed skew. We examined the primary sex ratio in eggs produced and the secondary sex ratio in hatched chicks to see if the sex ratio bias occurs before hatching. We also determined the sex ratio at fledging to reveal sex-specific mortality of nestlings. The primary sex ratio was 1.01 (± 0.01) males per female. The secondary sex ratio was 1.10 (± 0.02) males per female. The probability of a chick surviving to fledging differed between males (0.55 ± 0.13) and females (0.47 ± 0.15), but the precision of these survival estimates was low. Sex ratios in early life stages of the Mountain Plover do not explain the skewed sex ratio observed in adults in this breeding population.

  2. A Comparison of the Cheater Detection and the Unrelated Question Models: A Randomized Response Survey on Physical and Cognitive Doping in Recreational Triathletes

    Science.gov (United States)

    Schröter, Hannes; Studzinski, Beatrix; Dietz, Pavel; Ulrich, Rolf; Striegel, Heiko; Simon, Perikles

    2016-01-01

    Purpose This study assessed the prevalence of physical and cognitive doping in recreational triathletes with two different randomized response models, that is, the Cheater Detection Model (CDM) and the Unrelated Question Model (UQM). Since both models have been employed in assessing doping, the major objective of this study was to investigate whether the estimates of these two models converge. Material and Methods An anonymous questionnaire was distributed to 2,967 athletes at two triathlon events (Frankfurt and Wiesbaden, Germany). Doping behavior was assessed either with the CDM (Frankfurt sample, one Wiesbaden subsample) or the UQM (one Wiesbaden subsample). A generalized likelihood-ratio test was employed to check whether the prevalence estimates differed significantly between models. In addition, we compared the prevalence rates of the present survey with those of a previous study on a comparable sample. Results After exclusion of incomplete questionnaires and outliers, the data of 2,017 athletes entered the final data analysis. Twelve-month prevalence for physical doping ranged from 4% (Wiesbaden, CDM and UQM) to 12% (Frankfurt CDM), and for cognitive doping from 1% (Wiesbaden, CDM) to 9% (Frankfurt CDM). The generalized likelihood-ratio test indicated no differences in prevalence rates between the two methods. Furthermore, there were no significant differences in prevalences between the present (undertaken in 2014) and the previous survey (undertaken in 2011), although the estimates tended to be smaller in the present survey. Discussion The results suggest that the two models can provide converging prevalence estimates. The high rate of cheaters estimated by the CDM, however, suggests that the present results must be seen as a lower bound and that the true prevalence of doping might be considerably higher. PMID:27218830
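    A sketch of the Unrelated Question Model point estimate, assuming the standard UQM design in which each respondent answers the sensitive question with a known probability and an innocuous question otherwise; the numbers are invented for illustration, not survey data.

    ```python
    def uqm_prevalence(n_yes, n_total, p_sensitive, pi_unrelated):
        """Point estimate under the Unrelated Question Model.

        Each respondent answers the sensitive question with probability
        p_sensitive, otherwise an innocuous question whose known 'yes'
        probability is pi_unrelated, so the observed 'yes' proportion is
        lam = p*pi + (1 - p)*pi_u.  Solving for pi gives the estimator.
        """
        lam = n_yes / n_total
        pi_hat = (lam - (1.0 - p_sensitive) * pi_unrelated) / p_sensitive
        # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
        return min(max(pi_hat, 0.0), 1.0)

    # Invented numbers: 160 'yes' among 1000 respondents, p = 2/3,
    # unrelated-question prevalence 0.25  ->  estimate 0.115 (11.5%).
    est = uqm_prevalence(160, 1000, 2/3, 0.25)
    ```

    The randomization protects anonymity because an individual "yes" never reveals which question was answered; only the aggregate proportion identifies the prevalence.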

  3. Multicenter validation of PCR-based method for detection of Salmonella in chicken and pig samples

    DEFF Research Database (Denmark)

    Malorny, B.; Cook, N.; D'Agostino, M.

    2004-01-01

    As part of a standardization project, an interlaboratory trial including 15 laboratories from 13 European countries was conducted to evaluate the performance of a nonproprietary polymerase chain reaction (PCR)-based method for the detection of Salmonella on artificially contaminated chicken rinse...... or positive. Outlier results caused, for example, by gross departures from the experimental protocol, were omitted from the analysis. For both the chicken rinse and the pig swab samples, the diagnostic sensitivity was 100%, with 100% accordance (repeatability) and concordance (reproducibility). The diagnostic...... specificity was 80.1% (with 85.7% accordance and 67.5% concordance) for chicken rinse, and 91.7% (with 100% accordance and 83.3% concordance) for pig swab. Thus, the interlaboratory variation due to personnel, reagents, thermal cyclers, etc., did not affect the performance of the method, which......

  4. checkCIF/PLATON report Datablock: I

    Indian Academy of Sciences (India)

    Moiety formula C24 H16 Cl2 Mn N4 O8 ... has ADP max/min Ratio . ... 4.0 Ratio. PLAT241_ALERT_2_C High 'MainMol' Ueq as Compared to Neighbors of ... carefully designed to identify outliers and unusual parameters, but every test has its ...

  5. 42 CFR 419.43 - Adjustments to national program payment and beneficiary copayment amounts.

    Science.gov (United States)

    2010-10-01

    ... departments within the hospital. (5) Cost-to-charge ratios for calculating charges adjusted to cost. For... to calculate an overall ancillary cost-to-charge ratio are not available to the Medicare contractor... calculating copayment amounts. (6) Outliers. The payment adjustment in paragraph (g)(2) of this section is...

  6. checkCIF/PLATON report Datablock: I

    Indian Academy of Sciences (India)

    Moiety formula C14 H18 Mo O2 S2 ? ... has ADP max/min Ratio . ... PLAT250_ALERT_2_C Large U3/U1 Ratio for Average U(i,j) Tensor . .... carefully designed to identify outliers and unusual parameters, but every test has its limitations and.

  7. Radionuclide angiocardiography. Improved diagnosis and quantitation of left-to-right shunts using area ratio techniques in children

    International Nuclear Information System (INIS)

    Alderson, P.O.; Jost, R.G.; Strauss, A.W.; Boonvisut, S.; Markham, J.

    1975-01-01

    A comparison of several reported methods for detection and quantitation of left-to-right shunts by radionuclides was performed in 50 children. Count ratio (C2/C1) techniques were compared with the exponential extrapolation and gamma function area ratio techniques. C2/C1 ratios accurately detected shunts and could reliably separate shunts from normals, but there was a high rate of false positives in children with valvular heart disease. The area ratio methods provided more accurate shunt quantitation and a better separation of patients with valvular heart disease than did the C2/C1 ratio. The gamma function method showed a higher correlation with oximetry than the exponential method, but the difference was not statistically significant. For accurate shunt quantitation and a reliable separation of patients with valvular heart disease from those with shunts, area ratio calculations are preferable to the C2/C1 ratio

  8. Experience from long-term monitoring of RAKR ratios in 192Ir brachytherapy

    International Nuclear Information System (INIS)

    Carlsson Tedgren, Asa; Bengtsson, Emil; Hedtjaern, Hakan; Johansson, Asa; Karlsson, Leif; Lamm, Inger-Lena; Lundell, Marie; Mejaddem, Younes; Munck af Rosenschoeld, Per; Nilsson, Josef; Wieslander, Elinore; Wolke, Jeanette

    2008-01-01

    Background: Ratios of brachytherapy source-strength values, as measured by hospitals and vendors, embody constant differences such as systematic errors in ion-chamber calibration factors and measurement setup. Such ratios therefore have the potential to reveal systematic changes in routines or calibration services at either the hospital or the vendor laboratory that could otherwise be hidden by the uncertainty in the source-strength values. Methods: The RAKR of each new source in 13 afterloading units at five hospitals was measured by well-type ion chambers and compared to the value for the same source stated on the vendor certificate. Results: Differences from unity in the ratios of RAKR values determined by hospitals and vendors are most often small and stable around their mean values to within ±1.5%. Larger deviations are rare but occur. A decreasing ratio, seen at two hospitals for the same source, was useful in detecting an erroneous pressure gauge at the vendor's site. Conclusions: Establishing a mean ratio of RAKR values, as measured at the hospital and supplied on the vendor certificate, and monitoring it as a function of time is an easy way to achieve early detection of problems with equipment or routines at either the hospital or the vendor site.
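    The monitoring idea in the conclusions can be sketched as a simple running-mean check against the ±1.5% band mentioned above; the ratio values are invented for illustration.

    ```python
    def flag_rakr_drift(ratios, tol=0.015):
        """Flag sources whose hospital/vendor RAKR ratio deviates from the
        running mean of the earlier sources by more than tol (±1.5% here)."""
        flags = []
        for i, r in enumerate(ratios):
            prior = ratios[:i]
            if prior:
                mean = sum(prior) / len(prior)
                flags.append(abs(r / mean - 1.0) > tol)
            else:
                flags.append(False)  # first source: nothing to compare against
        return flags

    # Invented ratios: stable near 1.00, then a drift of the kind that
    # revealed the faulty pressure gauge at the vendor's site.
    flags = flag_rakr_drift([1.002, 0.998, 1.001, 0.999, 0.975])
    ```

    A flagged exchange would prompt checking both the hospital's well-chamber setup and the vendor's calibration chain before accepting the certificate value.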

  9. Effective case/infection ratio of poliomyelitis in vaccinated populations.

    Science.gov (United States)

    Bencskó, G; Ferenci, T

    2016-07-01

    Recent polio outbreaks in Syria and Ukraine, and isolation of poliovirus from asymptomatic carriers in Israel, have raised concerns that polio might endanger Europe. We devised a model to calculate the time needed to detect the first case should the disease be imported into Europe, taking the effect of vaccine coverage (from both inactivated and oral polio vaccines, also considering their differences) on the length of silent transmission into account by deriving an 'effective' case/infection ratio that is applicable to vaccinated populations. Using vaccine coverage data and the newly developed model, the relationship between this ratio and vaccine coverage is derived theoretically and is also numerically determined for European countries. This shows that unnoticed transmission is longer for countries with higher vaccine coverage and a higher proportion of IPV-vaccinated individuals among those vaccinated. Assuming borderline transmission (R = 1.1), the expected time to detect the first case is between 326 days and 512 days in different countries, with the number of infected individuals between 235 and 1439. Imperfect surveillance further increases these numbers, especially the number of individuals infected until detection. While longer silent transmission does not increase the number of clinical diseases, it can make the application of traditional outbreak response methods more complicated, among other consequences.

  10. checkCIF/PLATON report Datablock: exp_1c

    Indian Academy of Sciences (India)

    Sum formula. C42 H34 Cl4 I4 ... has ADP max/min Ratio ..... 3.5 prolat ... PLAT250_ALERT_2_C Large U3/U1 Ratio for Average U(i,j) Tensor . ... outliers and unusual parameters, but every test has its limitations and alerts that are not important.

  11. checkCIF/PLATON report Datablock: 8

    Indian Academy of Sciences (India)

    Moiety formula C88 H36 F20 O4 S4, 2(O) ... 6.5 Ratio. PLAT241_ALERT_2_B High 'MainMol' Ueq as Compared to Neighbors of ... has ADP max/min Ratio . ... outliers and unusual parameters, but every test has its limitations and alerts that ...

  12. Robust Non-Local TV-L1 Optical Flow Estimation with Occlusion Detection.

    Science.gov (United States)

    Zhang, Congxuan; Chen, Zhen; Wang, Mingrun; Li, Ming; Jiang, Shaofeng

    2017-06-05

    In this paper, we propose a robust non-local TV-L1 optical flow method with occlusion detection to address the weak robustness of optical flow estimation under motion occlusion. Firstly, a TV-L1 form for flow estimation is defined using a combination of the brightness constancy and gradient constancy assumptions in the data term and by varying the weight under the Charbonnier function in the smoothing term. Secondly, to handle potential outliers in the flow field, a general non-local term is added to the TV-L1 optical flow model to yield the non-local TV-L1 form. Thirdly, an occlusion detection method based on triangulation is presented to detect the occlusion regions of the sequence. The proposed non-local TV-L1 optical flow model is solved in a linearized iterative scheme using improved median filtering and a coarse-to-fine computing strategy. Experimental results indicate that the proposed method can overcome the significant influence of non-rigid motion, motion occlusion, and large-displacement motion. Experiments comparing the proposed method with existing state-of-the-art methods on the Middlebury and MPI Sintel test sequences show that the proposed method has higher accuracy and better robustness.

  13. High-precision branching ratio measurement for the superallowed β+ emitter 62Ga

    International Nuclear Information System (INIS)

    Finlay, P.E.J.

    2007-01-01

    A high-precision branching ratio measurement for the superallowed β + decay of 62 Ga was performed at the Isotope Separator and Accelerator radioactive ion beam facility. An array of 20 high-purity germanium detectors known as the 8π spectrometer was employed to detect the γ rays emitted following the Gamow-Teller and non-analog Fermi decays of 62 Ga, while the plastic scintillator array known as SCEPTAR was used to detect the emitted β particles. A total of 32 γ rays were identified, establishing the superallowed branching ratio to be 99.859(8)%. Combined with the most recent half-life and Q-value measurements for 62 Ga, this branching ratio yields an ft-value of 3074.3 ± 1.1 s. Comparisons between the superallowed ft-value determined in this work and the world average Ft-bar are made, providing a benchmark for the refinement of theoretical models used to describe isospin-symmetry breaking in A ≥ 62 nuclei. (author)

  14. Identifying genetic signatures of selection in a non-model species, alpine gentian (Gentiana nivalis L.), using a landscape genetic approach

    DEFF Research Database (Denmark)

    Bothwell, H.; Bisbing, S.; Therkildsen, Nina Overgaard

    2013-01-01

    It is generally accepted that most plant populations are locally adapted. Yet, understanding how environmental forces give rise to adaptive genetic variation is a challenge in conservation genetics and crucial to the preservation of species under rapidly changing climatic conditions. Environmental...... loci, we compared outlier locus detection methods with a recently developed landscape genetic approach. We analyzed 157 loci from samples of the alpine herb Gentiana nivalis collected across the European Alps. Principal coordinates of neighbour matrices (PCNM), eigenvectors that quantify multi...... variables identified eight more potentially adaptive loci than models run without spatial variables. 3) When compared to outlier detection methods, the landscape genetic approach detected four of the same loci plus 11 additional loci. 4) Temperature, precipitation, and solar radiation were the three major...

  15. Shallow Transits—Deep Learning. I. Feasibility Study of Deep Learning to Detect Periodic Transits of Exoplanets

    Science.gov (United States)

    Zucker, Shay; Giryes, Raja

    2018-04-01

    Transits of habitable planets around solar-like stars are expected to be shallow, and to have long periods, which means low information content. The current bottleneck in the detection of such transits is caused in large part by the presence of red (correlated) noise in the light curves obtained from the dedicated space telescopes. Based on the groundbreaking results deep learning achieves in many signal and image processing applications, we propose to use deep neural networks to solve this problem. We present a feasibility study, in which we applied a convolutional neural network on a simulated training set. The training set comprised light curves received from a hypothetical high-cadence space-based telescope. We simulated the red noise by using Gaussian Processes with a wide variety of hyper-parameters. We then tested the network on a completely different test set simulated in the same way. Our study proves that very difficult cases can indeed be detected. Furthermore, we show how detection trends can be studied and detection biases quantified. We have also checked the robustness of the neural-network performance against practical artifacts such as outliers and discontinuities, which are known to affect space-based high-cadence light curves. Future work will allow us to use the neural networks to characterize the transit model and identify individual transits. This new approach will certainly be an indispensable tool for the detection of habitable planets in the future planet-detection space missions such as PLATO.
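    Red-noise simulation with a Gaussian Process, as described above, can be sketched with a squared-exponential kernel and a Cholesky draw; the kernel choice and its hyper-parameter values are illustrative assumptions (the study varied the hyper-parameters widely).

    ```python
    import numpy as np

    def sample_red_noise(t, amplitude=1.0, length_scale=0.5, seed=0):
        """One realization of correlated ('red') noise drawn from a Gaussian
        Process with a squared-exponential kernel (illustrative choice of
        kernel and hyper-parameter values)."""
        dt = t[:, None] - t[None, :]
        K = amplitude**2 * np.exp(-0.5 * (dt / length_scale) ** 2)
        K += 1e-6 * np.eye(len(t))      # jitter for numerical stability
        L = np.linalg.cholesky(K)
        rng = np.random.default_rng(seed)
        return L @ rng.standard_normal(len(t))

    t = np.linspace(0.0, 10.0, 200)     # hypothetical high-cadence time stamps
    noise = sample_red_noise(t)
    ```

    Such realizations, added to simulated transit light curves, give the network training examples whose noise is correlated across neighboring cadences rather than white.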

  16. Seasonal Adjustment with the R Packages x12 and x12GUI

    OpenAIRE

    Kowarik, Alexander; Meraner, Angelika; Templ, Matthias; Schopfhauser, Daniel

    2014-01-01

    The X-12-ARIMA seasonal adjustment program of the US Census Bureau extracts the different components (mainly: seasonal component, trend component, outlier component and irregular component) of a monthly or quarterly time series. It is the state-of-the-art technology for seasonal adjustment used in many statistical offices. It is possible to include a moving holiday effect, a trading day effect and user-defined regressors, and additionally incorporate automatic outlier detection. The procedu...

  17. RATIO_TOOL - SOFTWARE FOR COMPUTING IMAGE RATIOS

    Science.gov (United States)

    Yates, G. L.

    1994-01-01

    Geological studies analyze spectral data in order to gain information on surface materials. RATIO_TOOL is an interactive program for viewing and analyzing large multispectral image data sets that have been created by an imaging spectrometer. While the standard approach to classification of multispectral data is to match the spectrum for each input pixel against a library of known mineral spectra, RATIO_TOOL uses ratios of spectral bands in order to spot significant areas of interest within a multispectral image. Each image band can be viewed iteratively, or a selected image band of the data set can be requested and displayed. When the image ratios are computed, the result is displayed as a gray scale image. At this point a histogram option helps in viewing the distribution of values. A thresholding option can then be used to segment the ratio image result into two to four classes. The segmented image is then color coded to indicate threshold classes and displayed alongside the gray scale image. RATIO_TOOL is written in C language for Sun series computers running SunOS 4.0 and later. It requires the XView toolkit and the OpenWindows window manager (version 2.0 or 3.0). The XView toolkit is distributed with Open Windows. A color monitor is also required. The standard distribution medium for RATIO_TOOL is a .25 inch streaming magnetic tape cartridge in UNIX tar format. An electronic copy of the documentation is included on the program media. RATIO_TOOL was developed in 1992 and is a copyrighted work with all copyright vested in NASA. Sun, SunOS, and OpenWindows are trademarks of Sun Microsystems, Inc. UNIX is a registered trademark of AT&T Bell Laboratories.
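    The core RATIO_TOOL operations, computing a ratio of two spectral bands and thresholding the result into two to four classes, can be sketched as follows; the synthetic image cube and the quantile-based thresholds are assumptions for illustration (RATIO_TOOL itself is a C/XView program with interactive thresholding).

    ```python
    import numpy as np

    def band_ratio(cube, num_band, den_band, eps=1e-6):
        """Ratio of two spectral bands of an image cube (bands, rows, cols)."""
        return cube[num_band] / (cube[den_band] + eps)

    def segment(ratio_img, n_classes=4):
        """Threshold the ratio image into n_classes classes at equal quantiles,
        mirroring RATIO_TOOL's two-to-four-class segmentation."""
        qs = np.quantile(ratio_img, np.linspace(0, 1, n_classes + 1)[1:-1])
        return np.digitize(ratio_img, qs)

    rng = np.random.default_rng(2)
    cube = rng.uniform(0.1, 1.0, (5, 32, 32))    # synthetic 5-band image
    r = band_ratio(cube, 3, 1)
    classes = segment(r, n_classes=4)
    ```

    The class map could then be color-coded and displayed alongside the gray-scale ratio image, as the program does.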

  18. Evolution of Sangiovese Wines With Varied Tannin and Anthocyanin Ratios During Oxidative Aging

    Science.gov (United States)

    Gambuti, Angelita; Picariello, Luigi; Rinaldi, Alessandra; Moio, Luigi

    2018-01-01

    Changes in phenolic compounds, chromatic characteristics, acetaldehyde, and protein-reactive tannins associated with oxidative aging were studied in Sangiovese wines with varied tannin (T)/anthocyanin (A) ratios. For this purpose, three Sangiovese vineyards located in Tuscany were considered in the 2016 vintage. To obtain wines with different T/A ratios, two red wines were produced from each vinification batch: a free-run juice with a lower T/A ratio and a marc-pressed wine with a higher T/A ratio. A total of six wines with T/A ratios ranging between 5 and 23 were produced. An oxidation treatment (four saturation cycles) was applied to each wine. Average and initial oxygen consumption rates (OCR) were positively correlated with VRF/mA (vanillin reactive flavans/monomeric anthocyanins) and T/A ratios, while OCRs were negatively related to the wine content of monomeric and total anthocyanins. The higher the A content was, the greater the loss of total and free anthocyanins. A significantly lower production of polymeric pigments was detected in all pressed wines with respect to the corresponding free-run ones. A gradual decrease in tannin reactivity toward saliva proteins after the application of oxygen saturation cycles was detected. The results obtained in this experiment indicate that the VRF/mA and T/A ratios are among the fundamental parameters to evaluate before choosing the antioxidant protection to be used and the right oxidation level to apply for a longer shelf-life of red wine. PMID:29600246

  20. Evolution of Sangiovese Wines with Varied Tannin and Anthocyanin Ratios during Oxidative Aging

    Science.gov (United States)

    Gambuti, Angelita; Picariello, Luigi; Rinaldi, Alessandra; Moio, Luigi

    2018-03-01

    Changes in phenolic compounds, chromatic characteristics, acetaldehyde, and protein-reactive tannins associated with oxidative aging were studied in Sangiovese wines with varied tannin (T)/anthocyanin (A) ratios. For this purpose, three Sangiovese vineyards located in Tuscany were considered in the 2016 vintage. To obtain wines with different T/A ratios, two red wines were produced from each vinification batch: a free-run juice with a lower T/A ratio and a marc-pressed wine with a higher T/A ratio. A total of six wines with T/A ratios ranging between 5 and 23 were produced. An oxidation treatment (four saturation cycles) was applied to each wine. Average and initial oxygen consumption rates (OCR) were positively correlated with VRF/mA (vanillin reactive flavans/monomeric anthocyanins) and T/A ratios, while OCRs were negatively related to the wine content of monomeric and total anthocyanins. The higher the A content was, the greater the loss of total and free anthocyanins. A significantly lower production of polymeric pigments was detected in all pressed wines with respect to the corresponding free-run ones. A gradual decrease in tannin reactivity towards saliva proteins after the application of oxygen saturation cycles was detected. The results obtained in this experiment indicate that the VRF/mA and T/A ratios are among the fundamental parameters to evaluate before choosing the antioxidant protection to be used and the right oxidation level to apply for a longer shelf-life of red wine.

  1. Evolution of Sangiovese Wines With Varied Tannin and Anthocyanin Ratios During Oxidative Aging

    Directory of Open Access Journals (Sweden)

    Angelita Gambuti

    2018-03-01

    Full Text Available Changes in phenolic compounds, chromatic characteristics, acetaldehyde, and protein-reactive tannins associated with oxidative aging were studied in Sangiovese wines with varied tannin (T)/anthocyanin (A) ratios. For this purpose, three Sangiovese vineyards located in Tuscany were considered in the 2016 vintage. To obtain wines with different T/A ratios, two red wines were produced from each vinification batch: a free-run juice with a lower T/A ratio and a marc-pressed wine with a higher T/A ratio. A total of six wines with T/A ratios ranging between 5 and 23 were produced. An oxidation treatment (four saturation cycles) was applied to each wine. Average and initial oxygen consumption rates (OCR) were positively correlated with VRF/mA (vanillin reactive flavans/monomeric anthocyanins) and T/A ratios, while OCRs were negatively related to the wine content of monomeric and total anthocyanins. The higher the A content was, the greater the loss of total and free anthocyanins. A significantly lower production of polymeric pigments was detected in all pressed wines with respect to the corresponding free-run ones. A gradual decrease in tannin reactivity toward saliva proteins after the application of oxygen saturation cycles was detected. The results obtained in this experiment indicate that the VRF/mA and T/A ratios are among the fundamental parameters to evaluate before choosing the antioxidant protection to be used and the right oxidation level to apply for a longer shelf-life of red wine.

  2. Isotope ratios of lead as pollutant source indicators

    International Nuclear Information System (INIS)

    Chow, T.J.; Snyder, C.B.; Earal, J.L.

    1975-01-01

    Each lead ore deposit has its characteristic isotope ratios, which are fixed during mineral ore genesis, and this unique property can be used to indicate the source of lead pollutants in the environment. The world production of primary lead is tabulated, and the geochemical significance of lead isotope ratios is discussed. The manufacture of lead alkyl additives for gasoline, which is the major source of lead pollutants, utilizes about 10% of the world's annual consumption of lead. The isotope ratios of lead in gasoline, aerosols, soils and plants are correlated. Lead additives in various brands of gasoline sold in one region do not have the same isotope ratios. Regional variations in the isotope ratios of lead additives were observed. This reflects the fact that petroleum refineries obtained the additives from various lead alkyl manufacturers which utilized lead from different mining districts. A definite changing trend in the isotope ratios of lead pollutants in the San Diego, California (USA), area was detected. The lead shows a gradual increase in its radiogenic components during the past decade. This trend can be explained by the change of lead sources used by the additive manufacturers: lead isotope ratios of the mid-1960s gasoline additives in the United States of America reflected those of less radiogenic leads imported from Canada, Australia, Peru and Mexico. Since then, U.S. lead production has doubled, mainly from the Missouri district of highly radiogenic lead. Meanwhile, there has been a decrease in total lead imports. These combined effects result in changes in the isotope ratios, from less to more radiogenic, of the pooled lead. (author)

  3. The stopping rules for winsorized tree

    Science.gov (United States)

    Ch'ng, Chee Keong; Mahat, Nor Idayu

    2017-11-01

    The winsorized tree is a modified tree-based classifier that investigates and handles outliers in every node during the construction of the tree. It overcomes the tedious process of constructing a classical tree, in which the splitting of branches and pruning proceed concurrently so that the constructed tree does not grow bushy; this mechanism is controlled by the proposed algorithm. In a winsorized tree, the data are screened to identify outliers. If an outlier is detected, its value is neutralized using the winsorize approach. Both outlier identification and value neutralization are executed recursively in every node until a predetermined stopping criterion is met. The aim of this paper is to search for a significant stopping criterion that stops the tree from splitting further before it overfits. The results of an experiment conducted on the Pima Indian dataset showed that a node can produce its final successor nodes (leaves) when it has reached the range of 70% in information gain.
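    The winsorizing step described above can be sketched briefly. This is an illustrative interpretation, not the paper's exact algorithm: values beyond chosen percentiles are pulled back to those percentiles rather than discarded, so the sample size is preserved while extreme values are neutralized.

```python
import numpy as np

def winsorize(values, lower_pct=5, upper_pct=95):
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)

data = np.array([1.0, 2.0, 2.5, 3.0, 2.2, 2.8, 100.0])  # 100.0 is an outlier
cleaned = winsorize(data)  # the outlier is neutralized, not removed
```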

  4. Optimization of a Compton-suppression system by escape-peak ratio

    International Nuclear Information System (INIS)

    Niu, H.; Chao, J.H.; Wu, S.-C.

    1996-01-01

    A Compton-suppression system consisting of an HPGe central detector surrounded by eight BGO scintillators in an annular geometry was assembled. This system is dedicated to in-beam γ-ray measurements. The ratios of full-energy to single-escape peak and of full-energy to double-escape peak, for γ-rays at 2754, 4443 and 6130 keV, were used to derive the associated suppression factors in order to optimize the detection conditions of the system. The suppression factors derived both from the escape-peak ratios and from the corresponding peak-to-Compton ratios of the γ-ray spectra are compared and discussed. This optimization technique may be of great significance for analyzing complicated spectra in which high-energy γ-rays are considered for analytical use. (Author)

  5. Laser assisted ratio analysis - An alternative to GC/IRMS for CO2

    International Nuclear Information System (INIS)

    Murnick, D.E.

    2001-01-01

    A new technique for laser based analysis of carbon isotope ratios, with the acronym LARA, based on large isotope shifts in molecular spectra, the use of fixed frequency isotopic lasers, and sensitive detection via the laser optogalvanic effect is reviewed and compared with GC/IRMS for carbon dioxide in specific applications. The possibility for development of new classes of isotope ratio measurement systems with LARA is explored. (author)

  6. Statistical analysis of COMPTEL maximum likelihood-ratio distributions: evidence for a signal from previously undetected AGN

    International Nuclear Information System (INIS)

    Williams, O. R.; Bennett, K.; Much, R.; Schoenfelder, V.; Blom, J. J.; Ryan, J.

    1997-01-01

    The maximum likelihood-ratio method is frequently used in COMPTEL analysis to determine the significance of a point source at a given location. In this paper we do not consider whether the likelihood-ratio at a particular location indicates a detection, but rather whether distributions of likelihood-ratios derived from many locations depart from that expected for source-free data. We have constructed distributions of likelihood-ratios by reading values from standard COMPTEL maximum likelihood-ratio maps at positions corresponding to the locations of different categories of AGN. Distributions derived from the locations of Seyfert galaxies are indistinguishable, according to a Kolmogorov-Smirnov test, from those obtained from "random" locations, but differ slightly from those obtained from the locations of flat-spectrum radio-loud quasars, OVVs, and BL Lac objects. This difference is not due to known COMPTEL sources, since regions near these sources are excluded from the analysis. We suggest that it might arise from a number of sources with fluxes below the COMPTEL detection threshold
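    The distribution comparison described above can be mimicked with a two-sample Kolmogorov-Smirnov statistic. The sketch below hand-rolls the statistic in NumPy and feeds it synthetic likelihood-ratio values, not COMPTEL data; it only illustrates how a weakly shifted distribution departs from a "source-free" one.

```python
import numpy as np

def ks_statistic(x, y):
    """Maximum absolute difference between two empirical CDFs."""
    grid = np.sort(np.concatenate([x, y]))
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return np.abs(cdf_x - cdf_y).max()

rng = np.random.default_rng(0)
source_free = rng.chisquare(df=1, size=500)      # "random" locations
agn_like = rng.chisquare(df=1, size=500) + 0.5   # a weak excess signal
d = ks_statistic(source_free, agn_like)          # clearly nonzero
```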

  7. Detecting New Pedestrian Facilities from VGI Data Sources

    Science.gov (United States)

    Zhong, S.; Xie, Z.

    2017-12-01

    Pedestrian facility information (e.g. footbridges, pedestrian crossings and underground passages) is important basic data for location-based services (LBS) for pedestrians. However, keeping this information up to date is challenging because facilities change frequently. Collecting and updating pedestrian facility information has traditionally been done by highly trained specialists, but this conventional approach has several disadvantages, such as high cost and a long update cycle. Volunteered Geographic Information (VGI) has proven efficient at providing new, free and fast-growing spatial data. Pedestrian trajectories, which can be seen as measurements of the real pedestrian road network, are among the most valuable VGI data. Although the accuracy of individual trajectories is not high, the large number of measurements allows an improvement in the quality of the road information. We therefore develop a method for detecting new pedestrian facilities based on the current road network and pedestrian trajectories. Specifically, 1) outlying trajectory points are removed by analyzing speed, distance and direction, 2) a road-network matching algorithm eliminates redundant trajectories, and 3) a space-time clustering algorithm detects new walking facilities. The performance of the method was evaluated in a series of experiments conducted on part of the road network of Hefei and a large number of real pedestrian trajectories, and the results were verified using the Tencent Street map. The results show that the proposed method can detect new pedestrian facilities from VGI data accurately. We believe that the proposed method provides an alternative way of acquiring general road data and can improve the quality of LBS for pedestrians.
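    The first step (removing trajectory outliers by speed) can be sketched as follows. This is a minimal illustration; the coordinates, timestamps and the 3 m/s walking-speed cap are assumptions for the example, not the paper's actual parameters.

```python
import numpy as np

def speed_outliers(xy, t, max_speed=3.0):
    """Indices of points reached at an implausible speed (m/s)."""
    d = np.hypot(np.diff(xy[:, 0]), np.diff(xy[:, 1]))  # step lengths, metres
    v = d / np.diff(t)                                  # step speeds
    return np.where(v > max_speed)[0] + 1               # offending points

# A short walk with one GPS glitch: the fix jumps to x=500 and back.
xy = np.array([[0, 0], [1, 0], [2, 0], [500, 0], [3, 0]], dtype=float)
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
bad = speed_outliers(xy, t)  # flags the jump out and the jump back
```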

  8. Predictive value of spot urine albumin-to-creatinine ratio for ...

    African Journals Online (AJOL)

    ABEOLUGBENGAS

    diagnosed hypertensive patients. ... Keywords: Hypertension, microalbuminuria, albumin-to-creatinine ratio, left ventricular hypertrophy ... an average blood pressure of ≥140 mmHg ... be due to variation in methods of detecting ... Unexpectedly high prevalence of target organ damage in newly diagnosed.

  9. Adaptive Framework for Classification and Novel Class Detection over Evolving Data Streams with Limited Labeled Data.

    Energy Technology Data Exchange (ETDEWEB)

    Haque, Ahsanul [Univ. of Texas, Dallas, TX (United States); Khan, Latifur [Univ. of Texas, Dallas, TX (United States); Baron, Michael [Univ. of Texas, Dallas, TX (United States); Ingram, Joey Burton [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    Most approaches to classifying evolving data streams either divide the stream of data into fixed-size chunks or use gradual forgetting to address the problems of infinite length and concept drift. Finding the fixed size of the chunks or choosing a forgetting rate without prior knowledge about time-scale of change is not a trivial task. As a result, these approaches suffer from a trade-off between performance and sensitivity. To address this problem, we present a framework which uses change detection techniques on the classifier performance to determine chunk boundaries dynamically. Though this framework exhibits good performance, it is heavily dependent on the availability of true labels of data instances. However, labeled data instances are scarce in realistic settings and not readily available. Therefore, we present a second framework which is unsupervised in nature, and exploits change detection on classifier confidence values to determine chunk boundaries dynamically. In this way, it avoids the use of labeled data while still addressing the problems of infinite length and concept drift. Moreover, both of our proposed frameworks address the concept evolution problem by detecting outliers having similar values for the attributes. We provide theoretical proof that our change detection method works better than other state-of-the-art approaches in this particular scenario. Results from experiments on various benchmark and synthetic data sets also show the efficiency of our proposed frameworks.
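    The unsupervised variant described above (change detection on classifier confidence values) can be illustrated with a deliberately simple rule: declare a chunk boundary when the recent mean confidence drops well below the running baseline. The window size and drop threshold here are illustrative assumptions, not the framework's actual change-detection test.

```python
import numpy as np

def chunk_boundary(confidences, window=20, drop=0.15):
    """True if the recent mean confidence fell well below the baseline."""
    c = np.asarray(confidences)
    if len(c) < 2 * window:
        return False
    baseline = c[:-window].mean()   # older confidence values
    recent = c[-window:].mean()     # most recent window
    return baseline - recent > drop

rng = np.random.default_rng(1)
stable = list(rng.normal(0.9, 0.02, size=60))                  # no drift
drifted = stable + list(rng.normal(0.6, 0.02, size=20))        # concept drift
```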

  10. Detection of material property errors in handbooks and databases using artificial neural networks with hidden correlations

    Science.gov (United States)

    Zhang, Y. M.; Evans, J. R. G.; Yang, S. F.

    2010-11-01

    The authors have discovered a systematic, intelligent and potentially automatic method to detect errors in handbooks and stop their transmission using unrecognised relationships between materials properties. The scientific community relies on the veracity of scientific data in handbooks and databases, some of which have a long pedigree covering several decades. Although various outlier-detection procedures are employed to detect and, where appropriate, remove contaminated data, errors, which had not been discovered by established methods, were easily detected by our artificial neural network in tables of properties of the elements. We started using neural networks to discover unrecognised relationships between materials properties and quickly found that they were very good at finding inconsistencies in groups of data. They reveal variations from 10 to 900% in tables of property data for the elements and point out those that are most probably correct. Compared with the statistical method adopted by Ashby and co-workers [Proc. R. Soc. Lond. Ser. A 454 (1998) p. 1301, 1323], this method locates more inconsistencies and could be embedded in database software for automatic self-checking. We anticipate that our suggestion will be a starting point to deal with this basic problem that affects researchers in every field. The authors believe it may eventually moderate the current expectation that data field error rates will persist at between 1 and 5%.

  11. VISION BASED OBSTACLE DETECTION IN UAV IMAGING

    Directory of Open Access Journals (Sweden)

    S. Badrloo

    2017-08-01

    Full Text Available Detecting and avoiding collisions with obstacles is crucial in UAV navigation and control. Most common obstacle detection techniques are currently sensor-based, but small UAVs cannot carry obstacle detection sensors such as radar; therefore, vision-based methods are considered, which can be divided into stereo-based and mono-based techniques. Mono-based methods fall into two groups: foreground-background separation, and brain-inspired methods. Brain-inspired methods are highly efficient in obstacle detection; hence, this research aims to detect obstacles using brain-inspired techniques, which detect an obstacle by its apparent enlargement as it is approached. Recent research in this field has concentrated on matching SIFT points, along with the SIFT size-ratio factor and the area ratio of convex hulls, in two consecutive frames to detect obstacles. That method cannot distinguish between near and far obstacles, or obstacles in complex environments, and it is sensitive to wrongly matched points. To solve these problems, this research calculates the dist-ratio of matched points, and each point is then examined to distinguish between far and close obstacles. The results demonstrated the high efficiency of the proposed method in complex environments.

  12. Download this PDF file

    African Journals Online (AJOL)

    OLUWOLE

    included clay, silt, silt + clay and silt/clay ratio (SCR), whereas fine sand and saturated hydraulic conductivity were highly ... Semivariance was calculated using ... outliers did not dominate the measure of central tendency. Shulka et ...

  13. Association between triglyceride/HDL cholesterol ratio and carotid atherosclerosis in postmenopausal middle-aged women.

    Science.gov (United States)

    Masson, Walter; Siniawski, Daniel; Lobo, Martín; Molinero, Graciela; Huerín, Melina

    2016-01-01

    The triglyceride/HDL cholesterol ratio, as a surrogate marker of insulin resistance, may be associated with the presence of subclinical carotid atherosclerosis in postmenopausal women. The aim of this study was to explore this association. Women in primary prevention up to 65 years of age, whose last menstrual period was ≥2 years earlier, were recruited. The association between the triglyceride/HDL cholesterol (HDL-C) ratio and the presence of carotid plaque, assessed by ultrasonography, was analyzed. ROC analysis was performed to determine the accuracy of this ratio for detecting carotid plaque. A total of 332 women (age 57±5 years) were recruited. The triglyceride/HDL-C ratio was 2.35±1.6. The prevalence of carotid plaque was 29%. Women with carotid plaque had higher triglyceride/HDL-C ratios than women without (3.33±1.96 vs. 2.1±1.2, P<.001). A positive relationship was seen between quintiles of this ratio and the prevalence of carotid plaque (P<.001). Regardless of other risk factors, women with higher triglyceride/HDL-C ratios were more likely to have carotid plaque (odds ratio 1.47, 95% confidence interval 1.20-1.79, P<.001). The area under the curve of the triglyceride/HDL-C ratio for detecting carotid plaque was .71 (95% confidence interval .65 to .76), and the optimal cut-off point was 2.04. In postmenopausal women in primary prevention, insulin resistance, estimated from the triglyceride/HDL-C ratio, was independently associated with a greater probability of carotid plaque. A value of this ratio greater than 2 may be used for assessing cardiovascular risk in this particular group of women. Copyright © 2016 SEEN. Publicado por Elsevier España, S.L.U. All rights reserved.
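    The reported area under the ROC curve can be illustrated by computing AUC directly from the rank-sum (Mann-Whitney) identity: AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The sketch below uses synthetic ratio values, not the study's measurements.

```python
import numpy as np

def auc(scores, labels):
    """AUC = P(score of a random positive > score of a random negative)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Synthetic TG/HDL-C ratios; 1 = carotid plaque present, 0 = absent.
tg_hdl = np.array([1.1, 1.6, 2.5, 2.3, 2.8, 3.4, 4.0, 1.4])
plaque = np.array([0,   0,   0,   1,   1,   1,   1,   0])
area = auc(tg_hdl, plaque)
```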

  14. Oxygenated hemoglobin diffuse reflectance ratio for in vitro detection of human gastric pre-cancer

    Science.gov (United States)

    Li, L. Q.; Wei, H. J.; Guo, Z. Y.; Yang, H. Q.; Wu, G. Y.; Xie, S. S.; Zhong, H. Q.; Li, X. Y.; Zhao, Q. L.; Guo, X.

    2010-07-01

    An oxygenated hemoglobin diffuse reflectance (DR) ratio (R540/R575) method based on DR spectral signatures is used for early diagnosis of malignant lesions of human gastric epithelial tissues in vitro. The DR spectra of four different kinds of gastric epithelial tissue were measured using a spectrometer with an integrating-sphere detector in the spectral range from 400 to 650 nm. The measurements showed that the average DR spectral intensity for normal gastric epithelial tissue is higher than that for chronic and malignant tissue, and that for chronic gastric ulcer tissue is higher than that for malignant tissue. The average DR spectra for the four kinds of gastric epithelial tissue show dips at 542 and 577 nm owing to absorption by oxygenated hemoglobin (HbO2). The differences in the mean R540/R575 ratios of the HbO2 bands are 6.84% between the epithelial tissues of normal stomach and chronic gastric ulcer, 14.7% between the epithelial tissues of normal stomach and poorly differentiated gastric adenocarcinoma, and 22.6% between the epithelial tissues of normal stomach and undifferentiated gastric adenocarcinoma. It is evident from the results that there were significant differences in the mean R540/R575 ratios of the HbO2 bands for the four kinds of gastric epithelial tissue in vitro (P < 0.01).

  15. Quality assurance tool for organ at risk delineation in radiation therapy using a parametric statistical approach.

    Science.gov (United States)

    Hui, Cheukkai B; Nourzadeh, Hamidreza; Watkins, William T; Trifiletti, Daniel M; Alonso, Clayton E; Dutta, Sunil W; Siebers, Jeffrey V

    2018-02-26

    To develop a quality assurance (QA) tool that identifies inaccurate organ at risk (OAR) delineations. The QA tool computed volumetric features from prior OAR delineation data from 73 thoracic patients to construct a reference database. All volumetric features of the OAR delineation are computed in three-dimensional space. Volumetric features of a new OAR are compared with those in the reference database to discern delineation outliers. A multicriteria outlier detection system warns users of specific delineation outliers based on combinations of deviant features. Fifteen independent experimental sets including automatic, propagated, and clinically approved manual delineation sets were used for verification. The verification OARs included manipulations to mimic common errors. Three experts reviewed the experimental sets to identify and classify errors, first without the QA tool, and then, 1 week later, with it. In the cohort of manual delineations with manual manipulations, the QA tool detected 94% of the mimicked errors. Overall, it detected 37% of the minor and 85% of the major errors. The QA tool improved reviewer error detection sensitivity from 61% to 68% for minor errors (P = 0.17), and from 78% to 87% for major errors (P = 0.02). The QA tool assists users in detecting potential delineation errors. QA tool integration into clinical procedures may reduce the frequency of inaccurate OAR delineation, and potentially improve safety and quality of radiation treatment planning. © 2018 American Association of Physicists in Medicine.
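    The multicriteria idea above (warn when several volumetric features deviate jointly from a reference database) can be sketched as follows. The feature names, reference statistics, and the 2.5-sigma / two-feature rule are illustrative assumptions, not the paper's actual criteria.

```python
import numpy as np

def deviant_features(features, ref_mean, ref_std, z_cut=2.5):
    """Boolean mask of features whose z-score exceeds the cut."""
    z = np.abs((features - ref_mean) / ref_std)
    return z > z_cut

def flag_delineation(features, ref_mean, ref_std, min_deviant=2):
    """Warn when at least min_deviant features are jointly deviant."""
    return deviant_features(features, ref_mean, ref_std).sum() >= min_deviant

# Hypothetical reference statistics: volume (cc), extent (cm), sphericity.
ref_mean = np.array([120.0, 6.0, 0.8])
ref_std = np.array([15.0, 0.5, 0.05])
ok = np.array([118.0, 6.2, 0.79])   # close to the reference population
bad = np.array([40.0, 6.1, 0.55])   # shrunken, misshapen contour
```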

  16. Screening Test for Shed Skin Cells by Measuring the Ratio of Human DNA to Staphylococcus epidermidis DNA.

    Science.gov (United States)

    Nakanishi, Hiroaki; Ohmori, Takeshi; Hara, Masaaki; Takahashi, Shirushi; Kurosu, Akira; Takada, Aya; Saito, Kazuyuki

    2016-05-01

    A novel screening method for shed skin cells, based on detecting Staphylococcus epidermidis (S. epidermidis), a resident bacterium of the skin, was developed. S. epidermidis was detected using real-time PCR and was found in all 20 human skin surface samples. Although not present in blood and urine samples, S. epidermidis was detected in 6 of 20 saliva samples and 5 of 18 semen samples. The ratio of human DNA to S. epidermidis DNA was significantly smaller in human skin surface samples than in the saliva and semen samples in which S. epidermidis was detected. Therefore, although skin cells could not be identified by detecting S. epidermidis alone, they could be distinguished by measuring the S. epidermidis DNA to human DNA ratio. This method could be applied to casework touch samples, which suggests that it is useful for screening whether skin cells and human DNA are present on potential evidentiary touch samples. © 2016 American Academy of Forensic Sciences.

  17. The Liquidity Coverage Ratio: the need for further complementary ratios?

    OpenAIRE

    Ojo, Marianne

    2013-01-01

    This paper considers components of the Liquidity Coverage Ratio – as well as certain prevailing gaps which may necessitate the introduction of a complementary liquidity ratio. The definitions and objectives accorded to the Liquidity Coverage Ratio (LCR) and Net Stable Funding Ratio (NSFR) highlight the focus which is accorded to time horizons for funding bank operations. A ratio which would focus on the rate of liquidity transformations and which could also serve as a complementary metric gi...

  18. Sources of Artefacts in Synthetic Aperture Radar Interferometry Data Sets

    Science.gov (United States)

    Becek, K.; Borkowski, A.

    2012-07-01

    In recent years, much attention has been devoted to digital elevation models (DEMs) produced using Synthetic Aperture Radar Interferometry (InSAR). This has been triggered by the relative novelty of the InSAR method and its world-famous product—the Shuttle Radar Topography Mission (SRTM) DEM. However, much less attention, if at all, has been paid to sources of artefacts in SRTM. In this work, we focus not on the missing pixels (null pixels) due to shadows or the layover effect, but rather on outliers that were undetected by the SRTM validation process. The aim of this study is to identify some of the causes of the elevation outliers in SRTM. Such knowledge may be helpful to mitigate similar problems in future InSAR DEMs, notably the ones currently being developed from data acquired by the TanDEM-X mission. We analysed many cross-sections derived from SRTM. These cross-sections were extracted over the elevation test areas, which are available from the Global Elevation Data Testing Facility (GEDTF) whose database contains about 8,500 runways with known vertical profiles. Whenever a significant discrepancy between the known runway profile and the SRTM cross-section was detected, a visual interpretation of the high-resolution satellite image was carried out to identify the objects causing the irregularities. A distance and a bearing from the outlier to the object were recorded. Moreover, we considered the SRTM look direction parameter. A comprehensive analysis of the acquired data allows us to establish that large metallic structures, such as hangars or car parking lots, are causing the outliers. Water areas or plain wet terrains may also cause an InSAR outlier. The look direction and the depression angle of the InSAR system in relation to the suspected objects influence the magnitude of the outliers. We hope that these findings will be helpful in designing the error detection routines of future InSAR or, in fact, any microwave aerial- or space-based survey. The

  19. SOURCES OF ARTEFACTS IN SYNTHETIC APERTURE RADAR INTERFEROMETRY DATA SETS

    Directory of Open Access Journals (Sweden)

    K. Becek

    2012-07-01

    Full Text Available In recent years, much attention has been devoted to digital elevation models (DEMs) produced using Synthetic Aperture Radar Interferometry (InSAR). This has been triggered by the relative novelty of the InSAR method and its world-famous product—the Shuttle Radar Topography Mission (SRTM) DEM. However, much less attention, if at all, has been paid to sources of artefacts in SRTM. In this work, we focus not on the missing pixels (null pixels) due to shadows or the layover effect, but rather on outliers that were undetected by the SRTM validation process. The aim of this study is to identify some of the causes of the elevation outliers in SRTM. Such knowledge may be helpful to mitigate similar problems in future InSAR DEMs, notably the ones currently being developed from data acquired by the TanDEM-X mission. We analysed many cross-sections derived from SRTM. These cross-sections were extracted over the elevation test areas, which are available from the Global Elevation Data Testing Facility (GEDTF), whose database contains about 8,500 runways with known vertical profiles. Whenever a significant discrepancy between the known runway profile and the SRTM cross-section was detected, a visual interpretation of the high-resolution satellite image was carried out to identify the objects causing the irregularities. A distance and a bearing from the outlier to the object were recorded. Moreover, we considered the SRTM look direction parameter. A comprehensive analysis of the acquired data allows us to establish that large metallic structures, such as hangars or car parking lots, are causing the outliers. Water areas or plain wet terrains may also cause an InSAR outlier. The look direction and the depression angle of the InSAR system in relation to the suspected objects influence the magnitude of the outliers. We hope that these findings will be helpful in designing the error detection routines of future InSAR or, in fact, any microwave aerial- or space

  20. The lumbar lordosis index: a new ratio to detect spinal malalignment with a therapeutic impact for sagittal balance correction decisions in adult scoliosis surgery.

    Science.gov (United States)

    Boissière, Louis; Bourghli, Anouar; Vital, Jean-Marc; Gille, Olivier; Obeid, Ibrahim

    2013-06-01

    Sagittal malalignment is frequently observed in adult scoliosis. The C7 plumb line, lumbar lordosis and pelvic tilt are the main factors used to evaluate sagittal balance and the need for a vertebral osteotomy to correct it. We described a ratio, the lumbar lordosis index (LLI; the ratio of lumbar lordosis to pelvic incidence), and analyzed its relationships with spinal malalignment and vertebral osteotomies. Fifty-three consecutive patients with surgically treated adult scoliosis had preoperative and postoperative full-spine EOS radiographs to measure spino-pelvic parameters and the LLI. The lack of lordosis was calculated after prediction of the theoretical lumbar lordosis. Correlation analysis between the different parameters was performed. All parameters were correlated with spinal malalignment, but the LLI was the most strongly correlated (r = -0.978). It was also the best parameter in this study for predicting the need for a spinal osteotomy (r = 1 if LLI < 0.5). The LLI is a statistically validated parameter for sagittal malalignment analysis. It can be used as a mathematical tool to detect spinal malalignment in adult scoliosis, and it guides the surgeon's decision to perform a vertebral osteotomy for sagittal correction in adult scoliosis surgery. It can also be used for the interpretation of clinical series in adult scoliosis.
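    The decision rule reported above (an osteotomy predicted whenever LLI < 0.5) reduces to a one-line ratio check; the angle values below are invented examples, not patient data.

```python
def lumbar_lordosis_index(lordosis_deg, pelvic_incidence_deg):
    """LLI = lumbar lordosis / pelvic incidence."""
    return lordosis_deg / pelvic_incidence_deg

def osteotomy_indicated(lli, threshold=0.5):
    """The paper reports r = 1 between LLI < 0.5 and the need for osteotomy."""
    return lli < threshold

severe = lumbar_lordosis_index(20.0, 55.0)  # marked loss of lordosis
mild = lumbar_lordosis_index(50.0, 55.0)    # near-theoretical lordosis
```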