WorldWideScience

Sample records for nearest neighbor classification

  1. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    National Research Council Canada - National Science Library

    Han, Euihong; Karypis, George; Kumar, Vipin

    1999-01-01

    .... The authors present a nearest neighbor classification scheme for text categorization in which the importance of discriminating words is learned using mutual information and weight adjustment techniques...

  2. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectral imaging and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program, and the Large Synoptic Survey Telescope (LSST) program), celestial observational data are arriving in torrents. To use these data effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigate how to recognize galaxies and quasars from their spectra using the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far from Earth, and their spectra are usually contaminated by various kinds of noise. Recognizing these two types of spectra is therefore a typical problem in automatic spectral classification. Furthermore, the nearest neighbor method is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and it is often used as a benchmark when developing novel algorithms. Regarding practical applicability, the recognition rate of the nearest neighbor (NN) method is shown to be comparable to the best results reported in the literature for more complicated methods. The advantage of NN is that it does not need to be trained, which is useful for incremental learning and parallel computation in massive spectral data processing. In conclusion, the results of this work are helpful for studying the classification of galaxy and quasar spectra.
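
    The training-free property described in this abstract can be illustrated with a minimal sketch. The two-feature "spectra", the labels, and the function name `nearest_neighbor` are assumptions made for illustration; real spectra have far more flux bins, and this is not the authors' implementation.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the stored sample closest to `query` (Euclidean)."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical toy data: (feature vector, class label).
train = [((0.1, 0.9), "galaxy"), ((0.2, 0.8), "galaxy"), ((0.9, 0.2), "quasar")]

# No training step exists, so "incremental learning" is just storing a new sample.
train.append(((0.8, 0.1), "quasar"))

print(nearest_neighbor(train, (0.15, 0.85)))
```

    Because every query is independent of the others, the loop over queries can be split across workers without coordination, which is the parallelism the abstract alludes to.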

  3. On Competitiveness of Nearest-Neighbor-Based Music Classification: A Methodological Critique

    DEFF Research Database (Denmark)

    Pálmason, Haukur; Jónsson, Björn Thór; Amsaleg, Laurent

    2017-01-01

    The traditional role of nearest-neighbor classification in music classification research is that of a straw man opponent for the learning approach of the hour. Recent work in high-dimensional indexing has shown that approximate nearest-neighbor algorithms are extremely scalable, yielding results...... of reasonable quality from billions of high-dimensional features. With such efficient large-scale classifiers, the traditional music classification methodology of aggregating and compressing the audio features is incorrect; instead the approximate nearest-neighbor classifier should be given an extensive data...... collection to work with. We present a case study, using a well-known MIR classification benchmark with well-known music features, which shows that a simple nearest-neighbor classifier performs very competitively when given ample data. In this position paper, we therefore argue that nearest...

  4. Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm

    Directory of Open Access Journals (Sweden)

    E. Parvinnia

    2014-01-01

    Electroencephalogram (EEG) signals are often used to diagnose conditions such as seizures, Alzheimer's disease, and schizophrenia. One main problem with recorded EEG samples is that they are not equally reliable because of artifacts introduced at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue, and adaptive classifiers appear to be useful for biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. This classification algorithm assigns a weight to each training sample to control its influence in classifying test samples. The weights of the training samples are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects are analyzed for the classification of these two groups. Several features, including fractal dimension, band power, and autoregressive (AR) model coefficients, are extracted from the EEG signals. The classification results are evaluated using leave-one-subject-out cross-validation for reliable estimation. The results indicate that the combination of WDNN and the selected features can significantly outperform the basic nearest neighbor method and other methods previously proposed for the classification of these two groups. Therefore, this method can be a complementary tool for specialists in distinguishing schizophrenia.
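
    One way to picture per-sample weighting is the sketch below. It assumes the weights are already given and that a sample's distance is divided by its weight, so unreliable samples (small weight) are pushed away from every query; the abstract does not state the exact weighting formula or how the weights are learned, so both are assumptions.

```python
import math

def wdnn_predict(train, weights, query):
    """1-NN in which each training sample's distance is divided by its weight."""
    best_label, best_score = None, math.inf
    for (features, label), w in zip(train, weights):
        if w <= 0:
            continue  # a non-positive weight removes the sample entirely
        score = math.dist(features, query) / w
        if score < best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical feature vectors; the second sample is treated as artifact-ridden.
train = [((1.0, 1.0), "patient"), ((1.2, 1.1), "control"), ((3.0, 3.0), "control")]
query = (1.15, 1.1)

print(wdnn_predict(train, [1.0, 1.0, 1.0], query))  # plain nearest neighbor
print(wdnn_predict(train, [1.0, 0.1, 1.0], query))  # noisy sample down-weighted
```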

  5. A new approach to very short term wind speed prediction using k-nearest neighbor classification

    International Nuclear Information System (INIS)

    Yesilbudak, Mehmet; Sagiroglu, Seref; Colak, Ilhami

    2013-01-01

    Highlights: ► The wind speed parameter was predicted from n-tupled inputs using k-NN classification. ► The effects of input parameters, nearest neighbors and distance metrics were analyzed. ► Many useful and reasonable inferences were uncovered using the developed model. - Abstract: Wind energy is an inexhaustible energy source, and wind power production has been growing rapidly in recent years. However, wind power is non-schedulable because of wind speed variations. Hence, wind speed prediction is an indispensable requirement for power system operators. This paper predicts the wind speed parameter from n-tupled inputs using k-nearest neighbor (k-NN) classification and analyzes the effects of input parameters, nearest neighbors and distance metrics on wind speed prediction. The k-NN classification model was developed using object-oriented programming techniques and, unlike the existing literature, includes the Manhattan and Minkowski distance metrics in addition to the Euclidean distance metric. The k-NN classification model that uses wind direction, air temperature, atmospheric pressure and relative humidity parameters in a 4-tupled space achieved the best wind speed prediction for k = 5 with the Manhattan distance metric. In contrast, the k-NN classification model that uses wind direction, air temperature and atmospheric pressure parameters as 3-tupled inputs gave the worst wind speed prediction for k = 1 with the Minkowski distance metric.
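
    The three metrics named in this abstract are members of one family: the Minkowski distance with exponent p reduces to the Manhattan distance at p = 1 and the Euclidean distance at p = 2. The sketch below shows a k-NN majority vote with a configurable p; the toy inputs and the class names "calm" and "windy" are hypothetical and stand in for the paper's discretized wind speed classes.

```python
from collections import Counter

def minkowski(a, b, p):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train, query, k, p):
    """Majority vote among the k training samples nearest to `query`."""
    ranked = sorted(train, key=lambda sample: minkowski(sample[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-tupled inputs with a class label per sample.
train = [((0, 0), "calm"), ((1, 0), "calm"), ((0, 1), "calm"),
         ((5, 5), "windy"), ((6, 5), "windy")]

print(knn_predict(train, (1, 1), k=3, p=1))
```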

  6. Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance

    Science.gov (United States)

    Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi

    2017-11-01

    The K-nearest neighbors (KNN) algorithm is a common algorithm used for classification and also a subroutine in various complicated machine learning tasks. In this paper, we present a quantum algorithm (QKNN) that implements this algorithm based on the Hamming distance metric. We put forward a quantum circuit for computing the Hamming distance between a testing sample and each feature vector in the training set. Taking advantage of this method, we realize a good analog of the classical KNN algorithm by setting a distance threshold value t to select the k nearest neighbors. As a result, QKNN achieves O(n^3) performance, which depends only on the dimension of the feature vectors, together with high classification accuracy, and it outperforms Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
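
    The threshold idea has a simple classical counterpart, sketched below: instead of sorting to find exactly k neighbors, every training vector within Hamming distance t of the query gets a vote. This is a classical illustration only, not the quantum circuit from the paper; the bit strings, labels, and function names are assumptions.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def threshold_knn(train, query, t):
    """Vote among all training vectors within Hamming distance t of `query`."""
    votes = Counter(label for bits, label in train if hamming(bits, query) <= t)
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical binary feature vectors.
train = [("1100", "A"), ("1110", "A"), ("0011", "B"), ("0111", "B")]

print(threshold_knn(train, "1101", t=1))
```

    A threshold that is too small selects no neighbors at all (the function then returns `None`), so in practice t has to be tuned so that roughly k vectors fall inside it.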

  7. Predicting Audience Location on the Basis of the k-Nearest Neighbor Multilabel Classification

    Directory of Open Access Journals (Sweden)

    Haitao Wu

    2014-01-01

    Understanding audience location information in online social networks is important for designing recommendation systems, improving information dissemination, and so on. In this paper, we focus on predicting the location distribution of audiences on YouTube. We transform this problem into a multilabel classification problem and find that three problems arise when the classical k-nearest neighbor based algorithm for multilabel classification (ML-kNN) is used to predict location distribution. Firstly, feature weights are not considered when measuring the degree of similarity. Secondly, finding similar items by traversing the whole training set consumes considerable computing time. Thirdly, the goal of ML-kNN is to find the relevant labels for every sample, which differs from audience location prediction. To solve these problems, we propose methods for measuring similarity based on weights, quickly finding similar items, and ranking a specific number of labels. On the basis of these methods and ML-kNN, the k-nearest neighbor based model for audience location prediction (AL-kNN) is proposed. Experiments based on massive YouTube data show that the proposed model predicts the location of YouTube video audiences more accurately than the ML-kNN, MLNB, and Rank-SVM methods.

  8. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method

    Directory of Open Access Journals (Sweden)

    D.A. Adeniyi

    2016-01-01

    A major problem of many online web sites is that they present many choices to the client at once, which usually makes finding the right product or information on the site a strenuous and time-consuming task. In this work, we present a study of an automatic web usage data mining and recommendation system based on the current user's behavior, captured through his/her click stream data on a newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest Neighbor (KNN) classification method has been trained to be used online and in real time to identify clients'/visitors' click stream data, match it to a particular user group, and recommend a tailored browsing option that meets the needs of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted and grouped into meaningful sessions, and a data mart was developed. Our results show that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.

  9. Prototype Generation Using Multiobjective Particle Swarm Optimization for Nearest Neighbor Classification.

    Science.gov (United States)

    Hu, Weiwei; Tan, Ying

    2016-12-01

    The nearest neighbor (NN) classifier suffers from high time complexity when classifying a test instance because it must search the whole training set. Prototype generation is a widely used approach to reduce the classification time; it generates a small set of prototypes that is used to classify a test instance instead of the whole training set. In this paper, particle swarm optimization is applied to prototype generation and two novel methods for improving the classification performance are presented: 1) a fitness function named error rank and 2) the multiobjective (MO) optimization strategy. Error rank is proposed to enhance the generalization ability of the NN classifier; it takes the ranks of misclassified instances into consideration when designing the fitness function. The MO optimization strategy pursues performance on multiple subsets of data simultaneously, in order to keep the classifier from overfitting the training set. Experimental results over 31 UCI data sets and 59 additional data sets show that the proposed algorithm outperforms nearly 30 existing prototype generation algorithms.

  10. Nearest neighbors by neighborhood counting.

    Science.gov (United States)

    Wang, Hui

    2006-06-01

    Finding nearest neighbors is a general idea that underlies many artificial intelligence tasks, including machine learning, data mining, natural language understanding, and information retrieval. This idea is explicitly used in the k-nearest neighbors algorithm (kNN), a popular classification method. In this paper, this idea is adopted in the development of a general methodology, neighborhood counting, for devising similarity functions. We turn our focus from neighbors to neighborhoods, a region in the data space covering the data point in question. To measure the similarity between two data points, we consider all neighborhoods that cover both data points. We propose to use the number of such neighborhoods as a measure of similarity. Neighborhood can be defined for different types of data in different ways. Here, we consider one definition of neighborhood for multivariate data and derive a formula for such similarity, called neighborhood counting measure or NCM. NCM was tested experimentally in the framework of kNN. Experiments show that NCM is generally comparable to VDM and its variants, the state-of-the-art distance functions for multivariate data, and, at the same time, is consistently better for relatively large k values. Additionally, NCM consistently outperforms HEOM (a mixture of Euclidean and Hamming distances), the "standard" and most widely used distance function for multivariate data. NCM has a computational complexity in the same order as the standard Euclidean distance function and NCM is task independent and works for numerical and categorical data in a conceptually uniform way. The neighborhood counting methodology is proven sound for multivariate data experimentally. We hope it will work for other types of data.
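
    For reference, the HEOM baseline mentioned in this abstract can be sketched as follows. The sketch follows the commonly cited definition (overlap distance for categorical attributes, range-normalized absolute difference for numeric ones, distance 1 for missing values); the argument layout, with `ranges[i]` set to `None` for a categorical attribute, is an assumption made for illustration. The NCM formula itself is not given in the abstract and is not reproduced here.

```python
import math

def heom(a, b, ranges):
    """Heterogeneous Euclidean-Overlap Metric between two mixed-type records.

    ranges[i] is (max - min) of numeric attribute i over the training data,
    or None when attribute i is categorical.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            d = 1.0                        # missing value: maximal distance
        elif r is None:
            d = 0.0 if x == y else 1.0     # categorical: overlap (Hamming-style)
        else:
            d = abs(x - y) / r if r else 0.0  # numeric: range-normalized
        total += d * d
    return math.sqrt(total)

# One numeric attribute with range 4.0 and one categorical attribute.
print(heom((1.0, "red"), (3.0, "blue"), (4.0, None)))
```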

  11. Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds

    Directory of Open Access Journals (Sweden)

    Chin-Hsing Chen

    2015-06-01

    A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds. However, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician's subjective diagnostic experience. This study has developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications.
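
    The home-care warning rule in this abstract is a simple threshold on per-frame classifier output. A minimal sketch, assuming the upstream K-nearest neighbor classifier has already produced one label per frame and that the label string "normal" marks a healthy frame (both the label names and the function name are hypothetical):

```python
def should_warn(frame_labels, threshold=0.30):
    """True when the fraction of abnormal frames strictly exceeds `threshold`."""
    if not frame_labels:
        return False  # no frames, nothing to report
    abnormal = sum(1 for label in frame_labels if label != "normal")
    return abnormal / len(frame_labels) > threshold

# Hypothetical per-frame labels from the classifier.
frames = ["normal"] * 6 + ["crackle"] * 2 + ["wheeze"] * 2
print(should_warn(frames))
```

    The comparison is strict, matching the "> 30%" wording of the abstract, so a signal with exactly 30% abnormal frames does not trigger a warning.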

  12. Dimensionality reduction with unsupervised nearest neighbors

    CERN Document Server

    Kramer, Oliver

    2013-01-01

    This book is devoted to a novel approach for dimensionality reduction based on the famous nearest neighbor method, which is a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as an efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, ranging from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons to related methodologies, taking into account artificial test data sets and also real-world data, demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustr...

  13. A Coupled k-Nearest Neighbor Algorithm for Multi-Label Classification

    Science.gov (United States)

    2015-05-22

    classification, an image may contain several concepts simultaneously, such as beach, sunset and kangaroo. Such tasks are usually denoted as multi-label...informatics, a gene can belong to both metabolism and transcription classes; and in music categorization, a song may be labeled as Mozart and sad. In the

  14. Classification of matrix-product ground states corresponding to one-dimensional chains of two-state sites of nearest neighbor interactions

    International Nuclear Information System (INIS)

    Fatollahi, Amir H.; Khorrami, Mohammad; Shariati, Ahmad; Aghamohammadi, Amir

    2011-01-01

    A complete classification is given for one-dimensional chains with nearest-neighbor interactions having two states in each site, for which a matrix product ground state exists. The Hamiltonians and their corresponding matrix product ground states are explicitly obtained.

  15. Frog sound identification using extended k-nearest neighbor classifier

    Science.gov (United States)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction methods and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with an Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample to be one of their nearest neighbors. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation, and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results show that the EKNN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers, KNN, FKNN, KGNN and MKNN, in all cases.
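
    The two-sided neighborhood idea described above can be sketched as a simplified vote: collect the query's own k nearest training samples, add every training sample that would rank the query among its k nearest, and take the majority label. This is a simplification for illustration and should not be read as the published EKNN decision rule; the data and names are hypothetical.

```python
import math
from collections import Counter

def k_nearest(points, target, k, skip=None):
    """Indices of the k points nearest to `target`, optionally skipping one index."""
    idx = [i for i in range(len(points)) if i != skip]
    idx.sort(key=lambda i: math.dist(points[i], target))
    return idx[:k]

def extended_vote(X, y, query, k):
    # Forward set: training samples that are among the query's k nearest.
    forward = set(k_nearest(X, query, k))
    # Reverse set: training samples for which the query is at least as close
    # as their current k-th nearest training neighbor.
    reverse = set()
    for i, xi in enumerate(X):
        others = k_nearest(X, xi, k, skip=i)
        kth_dist = math.dist(X[others[-1]], xi)
        if math.dist(query, xi) <= kth_dist:
            reverse.add(i)
    votes = Counter(y[i] for i in forward | reverse)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features for two species.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(extended_vote(X, y, (0.5, 0.5), k=2))
```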

  16. Lectures on the nearest neighbor method

    CERN Document Server

    Biau, Gérard

    2015-01-01

    This text presents a wide-ranging and rigorous overview of nearest neighbor methods, one of the most important paradigms in machine learning. Now in one self-contained volume, this book systematically covers key statistical, probabilistic, combinatorial and geometric ideas for understanding, analyzing and developing nearest neighbor methods. Gérard Biau is a professor at Université Pierre et Marie Curie (Paris). Luc Devroye is a professor at the School of Computer Science at McGill University (Montreal).

  17. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications

    Directory of Open Access Journals (Sweden)

    Jaime Vitola

    2017-02-01

    Civil and military structures are susceptible and vulnerable to damage due to environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions for damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated great versatility and benefit, since the inspection system can be automated. This automation is carried out through signal processing tasks aimed at pattern recognition analysis. This work presents a detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation, and a discussion of the results from two different structures are included and analyzed.

  18. [Classification of Children with Attention-Deficit/Hyperactivity Disorder and Typically Developing Children Based on Electroencephalogram Principal Component Analysis and k-Nearest Neighbor].

    Science.gov (United States)

    Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling

    2016-04-01

    This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using an electroencephalogram signal detection method. Firstly, in our experiments, we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop, and we completed electroencephalogram data preprocessing including filtering, segmentation, removal of artifacts and so on. Secondly, we selected the subset of electroencephalogram electrodes using the principal component analysis (PCA) method, and we collected the common channels of the optimal electrodes whose occurrence rates were more than 90% in each kind of stimulation. We then extracted the latency (200~450 ms) mean amplitude features of the common electrodes. Finally, we used the k-nearest neighbor (KNN) classifier based on Euclidean distance and the support vector machine (SVM) classifier based on the radial basis kernel function for classification. In the experiment, for the same kind of interference control task, the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction times. The N2 emerged in the prefrontal cortex while the P2 appeared in the inferior parietal area for all kinds of stimuli. Meanwhile, the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2 amplitudes compared to typically developing children. KNN resulted in better classification accuracy than the SVM classifier, and the best classification rate was 89.29% in the StI task. The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task, which provided a scientific basis for the clinical diagnosis of attention

  19. The nearest neighbor and the bayes error rates.

    Science.gov (United States)

    Loizou, G; Maybank, S J

    1987-02-01

    The (k, l) nearest neighbor method of pattern classification is compared to the Bayes method. If the two acceptance rates are equal, then the asymptotic error rates satisfy the inequalities $E_{k,l+1} \le E^*(\lambda) \le E_{k,l} \le d\,E^*(\lambda)$, where $d$ is a function of $k$, $l$, and the number of pattern classes, and $\lambda$ is the reject threshold for the Bayes method. An explicit expression for $d$ is given which is optimal in the sense that for some probability distributions $E_{k,l}$ and $d\,E^*(\lambda)$ are equal.

  20. Common Nearest Neighbor Clustering—A Benchmark

    Directory of Open Access Journals (Sweden)

    Oliver Lemke

    2018-02-01

    Cluster analyses are often conducted with the goal of characterizing an underlying probability density, for which the data-point density serves as an estimate. We here test and benchmark the common nearest neighbor (CNN) cluster algorithm. This algorithm assigns a spherical neighborhood R to each data point and estimates the data-point density between two data points as the number of data points N in the overlapping region of their neighborhoods (step 1). The main principle in the CNN cluster algorithm is cluster growing. This grows the clusters by sequentially adding data points and thereby effectively positions the border of the clusters along an iso-surface of the underlying probability density. This yields a strict partitioning with outliers, in which the clusters represent peaks in the underlying probability density, termed core sets (step 2). The removal of the outliers on the basis of a threshold criterion is optional (step 3). The benchmark datasets address a series of typical challenges, including datasets with a very high dimensional state space and datasets in which the cluster centroids are aligned along an underlying structure (Birch sets). The performance of the CNN algorithm is evaluated with respect to these challenges. The results indicate that the CNN cluster algorithm can be useful in a wide range of settings. Cluster algorithms are particularly important for the analysis of molecular dynamics (MD) simulations. We demonstrate how the CNN cluster results can be used as a discretization of the molecular state space for the construction of a core-set model of the MD, improving the accuracy compared to conventional full-partitioning models. The software for the CNN clustering is available on GitHub.
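
    Step 1 of the procedure, the density estimate between two points, can be sketched as a count of shared neighbors. The sketch assumes Euclidean distance and assumes the two points themselves are excluded from the count; neither detail is stated in the abstract, and this is not the authors' released software.

```python
import math

def common_neighbors(points, i, j, radius):
    """Count points (other than i and j) within `radius` of both point i and point j."""
    return sum(
        1
        for m, p in enumerate(points)
        if m not in (i, j)
        and math.dist(p, points[i]) <= radius
        and math.dist(p, points[j]) <= radius
    )

# Hypothetical 2-D data: a small dense group plus one distant point.
points = [(0, 0), (1, 0), (0.5, 0.2), (0.5, -0.2), (4, 4)]
print(common_neighbors(points, 0, 1, radius=1.0))
```

    In a cluster-growing scheme of this kind, a pair of points would be linked into the same cluster when this count reaches the chosen similarity cutoff N.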

  1. Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio

    Science.gov (United States)

    Nababan, A. A.; Sitompul, O. S.; Tulus

    2018-04-01

    K-Nearest Neighbor (KNN) is a good classifier, but several studies report that its accuracy is still lower than that of other methods. One cause of the low accuracy is that each attribute has the same effect on the classification process, so less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio, in which the Gain Ratio serves as a parameter to assess the correlation between each attribute and the class in the data and is also used as the basis for weighting each attribute of the dataset. The accuracy of the results is compared to the accuracy of the original KNN method using 10-fold cross-validation on several datasets from the UCI Machine Learning Repository and the KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the test results, the proposed method was able to increase the classification accuracy of KNN: the largest accuracy gain, 12.73%, was obtained on the hayes-roth dataset, and the smallest, 0.07%, on the abalone dataset. Averaged over all datasets, accuracy increased by 5.33%.
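
    Gain ratio is the information gain of an attribute divided by the entropy of the attribute's own value distribution (its split information). The sketch below computes it for discrete attributes and uses the result as a per-attribute weight in a mismatch distance. Numeric attributes would need discretization first, and the way the weights enter the distance here is an assumption rather than the paper's exact formula.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(column, labels):
    """Information gain of `column` about `labels`, divided by split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(column)
    return (entropy(labels) - conditional) / split_info if split_info else 0.0

def weighted_mismatch(a, b, weights):
    """Distance that adds an attribute's weight whenever the two values differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)

# Hypothetical data: the first attribute determines the class, the second is noise.
labels = ["y", "y", "n", "n"]
informative = ["a", "a", "b", "b"]
noise = ["a", "b", "a", "b"]
weights = [gain_ratio(informative, labels), gain_ratio(noise, labels)]
print(weights)
```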

  2. The Islands Approach to Nearest Neighbor Querying in Spatial Networks

    DEFF Research Database (Denmark)

    Huang, Xuegang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2005-01-01

    , and versatile approach to k nearest neighbor computation that obviates the need for using several k nearest neighbor approaches for supporting a single service scenario. The experimental comparison with the existing techniques uses real-world road network data and considers both I/O and CPU performance...

  3. Dimensional testing for reverse k-nearest neighbor search

    DEFF Research Database (Denmark)

    Casanova, Guillaume; Englmeier, Elias; Houle, Michael E.

    2017-01-01

    Given a query object q, reverse k-nearest neighbor (RkNN) search aims to locate those objects of the database that have q among their k-nearest neighbors. In this paper, we propose an approximation method for solving RkNN queries, where the pruning operations and termination tests are guided...... by a characterization of the intrinsic dimensionality of the data. The method can accommodate any index structure supporting incremental (forward) nearest-neighbor search for the generation and verification of candidates, while avoiding impractically-high preprocessing costs. We also provide experimental evidence...
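
    The RkNN definition in the first sentence can be made concrete with an exact brute-force baseline, which is the quadratic-cost computation that approximate methods such as the one in this paper aim to avoid. The function name and the one-dimensional toy data are assumptions for illustration.

```python
import math

def reverse_knn(points, query, k):
    """Indices of points that would have `query` among their k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        d_query = math.dist(p, query)
        # Count database objects strictly closer to p than the query is.
        closer = sum(
            1 for j, other in enumerate(points)
            if j != i and math.dist(p, other) < d_query
        )
        if closer < k:
            result.append(i)
    return result

# Hypothetical 1-D database (as 1-tuples).
points = [(0,), (1,), (2,), (10,)]
print(reverse_knn(points, (1.4,), k=1))
```

    Note the asymmetry: the point at 10 has the point at 2 as its own nearest neighbor, so it is not a reverse neighbor of the query even though the query's forward neighbor list is unaffected by it.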

  4. Scalable Nearest Neighbor Algorithms for High Dimensional Data.

    Science.gov (United States)

    Muja, Marius; Lowe, David G

    2014-11-01

    For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

  5. Implementation of Nearest Neighbor using HSV to Identify Skin Disease

    Science.gov (United States)

    Gerhana, Y. A.; Zulfikar, W. B.; Ramdani, A. H.; Ramdhani, M. A.

    2018-01-01

    Today, Android is one of the most widely used operating systems in the world. Most Android devices have a camera that can capture images, and this feature can be used to identify skin disease. Skin disease is a health problem caused by bacteria, fungi, and viruses, and its symptoms are usually visible. In this work, the symptoms are captured as an image in which every pixel carries HSV values. The HSV values are extracted and used to compute a Euclidean distance. The nearest neighbor algorithm then compares these distances to find the training image closest to the test image, which decides the class label, that is, the type of skin disease. The test results show that 166 of 200 cases, or 83%, were classified correctly. Several factors influence the result of the classification model, such as the number of training images and the quality of the Android device's camera.
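
    The pipeline described above (HSV extraction, Euclidean distance, nearest neighbor) can be sketched with the standard-library `colorsys` module. Summarizing an image by its mean HSV value is an assumption made for brevity, as are the class names; the abstract does not say how per-pixel values are aggregated.

```python
import colorsys
import math

def mean_hsv(pixels):
    """Mean (h, s, v) over an iterable of (r, g, b) pixels with channels in 0..255.

    Averaging hue arithmetically ignores that hue is circular; acceptable for a
    sketch, but a real system would need to handle the wrap-around at red.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    return tuple(sum(c[i] for c in hsv) / n for i in range(3))

def classify(train, pixels):
    """Label of the training image whose mean HSV is nearest to the test image's."""
    feature = mean_hsv(pixels)
    return min(train, key=lambda item: math.dist(item[0], feature))[1]

# Hypothetical training images reduced to one pixel each, with placeholder labels.
train = [
    (mean_hsv([(255, 0, 0)]), "condition_a"),
    (mean_hsv([(0, 255, 0)]), "condition_b"),
]
print(classify(train, [(250, 10, 10)]))
```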

  6. Haldane to Dimer Phase Transition in the Spin-1 Haldane System with Bond-Alternating Nearest-Neighbor and Uniform Next-Nearest-Neighbor Exchange Interactions

    OpenAIRE

    Takashi, Tonegawa; Makoto, Kaburagi; Takeshi, Nakao; Department of Physics, Faculty of Science, Kobe University; Faculty of Cross-Cultural Studies, Kobe University; Department of Physics, Faculty of Science, Kobe University

    1995-01-01

    The Haldane to dimer phase transition is studied in the spin-1 Haldane system with bond-alternating nearest-neighbor and uniform next-nearest-neighbor exchange interactions, where both interactions are antiferromagnetic and thus compete with each other. By using a method of exact diagonalization, the ground-state phase diagram on the ratio of the next-nearest-neighbor interaction constant to the nearest-neighbor one versus the bond-alternation parameter of the nearest-neighbor interactions is...

  7. Multiple k Nearest Neighbor Query Processing in Spatial Network Databases

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2006-01-01

    This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case where an upper bound on k is known a priori and then extends the techniques to the case where this is not so. Based on empirical studies with real-world data, the paper offers insight into the circumstances under which the different proposed techniques can be used with advantage for multiple k nearest...

  8. The Application of Determining Students’ Graduation Status of STMIK Palangkaraya Using K-Nearest Neighbors Method

    Science.gov (United States)

    Rusdiana, Lili; Marfuah

    2017-12-01

    The K-Nearest Neighbors method is a classification method that computes distances to find the closest records. It is used here to group data such as students’ graduation status, derived from the number of course credits taken, the grade point average (AVG), and the mini-thesis grade. This study examines the results of using the K-Nearest Neighbors method in an application that determines students’ graduation status, analyzed in terms of the method used, the data, and the application constructed, using data from STMIK Palangkaraya students. The software was developed with Extreme Programming, which suited the goal of finishing the project quickly. The application was created using Microsoft Office Excel 2007 for the training data and Matlab 7 for the implementation. The result of the K-Nearest Neighbors method in determining students’ graduation status was 92.5%. It determined the graduation predicate for the 94 records used, taken from an initial 136 records before processing, with a maximum of 50 training records. Because the K-Nearest Neighbors method groups data based on the closest value, it was well suited to this study.

  9. Nearest unlike neighbor (NUN): an aid to decision confidence estimation

    Science.gov (United States)

    Dasarathy, Belur V.

    1995-09-01

    The concept of nearest unlike neighbor (NUN), proposed and explored previously in the design of nearest neighbor (NN) based decision systems, is further exploited in this study to develop a measure of confidence in the decisions made by NN-based decision systems. This measure of confidence, on the basis of comparison with a user-defined threshold, may be used to determine the acceptability of the decision provided by the NN-based decision system. The concepts, associated methodology, and some illustrative numerical examples using the now classical Iris data to bring out the ease of implementation and effectiveness of the proposed innovations are presented.
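
    One way to picture a NUN-based confidence check is sketched below. The abstract does not give the formula, so the score used here, one minus the ratio of the nearest neighbor distance to the nearest unlike neighbor distance, is an assumption chosen for illustration and should not be read as the measure proposed in the paper. The sample points, labels, and the threshold of 0.5 are likewise hypothetical.

```python
import math

def nn_and_nun(query, samples):
    """Return (label, d_nn, d_nun) for a query.

    label is the nearest neighbor's class, d_nn its distance, and d_nun the
    distance to the nearest sample of any other class. Assumes at least two
    classes are present in samples.
    """
    ranked = sorted((math.dist(query, x), y) for x, y in samples)
    d_nn, label = ranked[0]
    d_nun = next(d for d, y in ranked if y != label)
    return label, d_nn, d_nun

def confidence(d_nn, d_nun):
    """Hypothetical score in [0, 1]; higher when the unlike neighbor is far away."""
    return 1.0 - d_nn / d_nun if d_nun > 0 else 0.0

samples = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((5.0, 5.0), "b")]
threshold = 0.5  # user-defined acceptance threshold (illustrative value)

for query in [(0.5, 0.0), (3.0, 3.0)]:
    label, d_nn, d_nun = nn_and_nun(query, samples)
    score = confidence(d_nn, d_nun)
    print(query, label, round(score, 3), "accept" if score >= threshold else "reject")
```

    The first query sits well inside class `a` and is accepted; the second lies between the classes, so the decision is flagged as low confidence even though a nearest neighbor label is still produced.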

  10. Finger vein identification using fuzzy-based k-nearest centroid neighbor classifier

    Science.gov (United States)

    Rosdi, Bakhtiar Affendi; Jaafar, Haryati; Ramli, Dzati Athiar

    2015-02-01

    In this paper, a new approach for personal identification using finger vein images is presented. Finger vein is an emerging type of biometric that is attracting the attention of researchers in the biometrics area. Compared to other biometric traits such as face, fingerprint and iris, finger vein is more secure and harder to counterfeit since the features are inside the human body. So far, most researchers have focused on how to extract robust features from the captured vein images, and little research has been conducted on the classification of the extracted features. In this paper, a new classifier called fuzzy-based k-nearest centroid neighbor (FkNCN) is applied to classify the finger vein image. The proposed FkNCN employs a surrounding rule to obtain the k-nearest centroid neighbors based on the spatial distributions of the training images and their distance to the test image. Then, the fuzzy membership function is utilized to assign the test image to the class which is most frequently represented by the k-nearest centroid neighbors. Experimental evaluation using our own database, which was collected from 492 fingers, shows that the proposed FkNCN has better performance than the k-nearest neighbor, k-nearest-centroid neighbor and fuzzy-based-k-nearest neighbor classifiers. This shows that the proposed classifier is able to identify the finger vein image effectively.

  11. Secure Nearest Neighbor Query on Crowd-Sensing Data

    Directory of Open Access Journals (Sweden)

    Ke Cheng

    2016-09-01

    Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, the sensing data terminals acting as data owners are numerous and mistrustful, while, on the other hand, in most cases, the terminals find it difficult to complete many security operations due to computation and storage capability constraints. In light of the Multi-Owners and Multi-Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed from protocols of secure two-party computation and a secure Voronoi diagram algorithm. It not only preserves data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes.

  12. k-Nearest Neighbors Algorithm in Profiling Power Analysis Attacks

    Directory of Open Access Journals (Sweden)

    Z. Martinasek

    2016-06-01

    Power analysis presents the typical example of successful attacks against trusted cryptographic devices such as RFID (Radio-Frequency IDentification) and contact smart cards. In recent years, the cryptographic community has explored new approaches in power analysis based on machine learning models such as Support Vector Machine (SVM), Random Forest (RF) and Multi-Layer Perceptron (MLP). In this paper, we made an extensive comparison of machine learning algorithms in power analysis. For this purpose, we implemented a verification program that always chooses the optimal settings of individual machine learning models in order to obtain the best classification accuracy. In our research, we used three datasets, the first containing the power traces of an unprotected AES (Advanced Encryption Standard) implementation. The second and third datasets were created independently from publicly available power traces corresponding to a masked AES implementation (DPA Contest v4). The obtained results revealed some interesting facts, namely, that an elementary k-NN (k-Nearest Neighbors) algorithm, which has not been commonly used in power analysis yet, shows great application potential in practice.

  13. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related.
It is particularly attractive due to its simplicity, its success in the

  14. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Science.gov (United States)

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  15. Diagnosis of Diabetes Diseases Using an Artificial Immune Recognition System2 (AIRS2) with Fuzzy K-nearest Neighbor

    OpenAIRE

    CHIKH, Mohamed Amine; SAIDI, Meryem; SETTOUTI, Nesma

    2012-01-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2 where we replace the K-nearest neighbors algorithm with the fuzzy K-nearest neighbors to improve the diagnostic accuracy of diabetes diseases. The diabetes disea...

  16. Using K-Nearest Neighbor in Optical Character Recognition

    Directory of Open Access Journals (Sweden)

    Veronica Ong

    2016-03-01

    The growth in computer vision technology has aided society with various kinds of tasks. One of these tasks is recognizing text contained in an image, usually referred to as Optical Character Recognition (OCR). There are many kinds of algorithms that can be implemented in an OCR system. K-Nearest Neighbor, one of the most influential machine learning algorithms, is one of them. This research aims to explain the process behind an OCR mechanism that uses the K-Nearest Neighbor algorithm. It also aims to find out how precise the algorithm is in an OCR program. To do that, a simple OCR program that classifies capital letters was built to produce and compare real results. The research yielded a maximum accuracy of 76.9% with 200 training samples per letter. A set of reasons is also given as to why the program is able to reach this level of accuracy.

  17. An Improvement To The k-Nearest Neighbor Classifier For ECG Database

    Science.gov (United States)

    Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul

    2018-03-01

    The k nearest neighbor (kNN) is a non-parametric classifier that has been widely used for pattern classification. However, in practice, the performance of kNN often degrades due to the lack of information on how the samples are distributed. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN concerns the weighting used in assigning the class label before classification. Thus, to address these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of the sample distribution. Then, a surrounding rule is employed to obtain the nearest centroid neighbors based on the distribution of the training samples and their distance to the query point. Subsequently, the fuzzy membership function is employed to assign the query point to the class label which is most frequently represented by the nearest centroid neighbors. Experiments on electrocardiogram (ECG) signals are conducted in this study. The classification performance is evaluated in two experimental steps, i.e., different values of k and different sizes of feature dimensions. A comparative study of the kNN, kNCN, FkNN and MFkNCN classifiers is then conducted to evaluate the performance of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds that of kNN, kNCN and FkNN, with a best classification rate of 96.5%.

  18. Thermodynamics of alternating spin chains with competing nearest- and next-nearest-neighbor interactions: Ising model

    Science.gov (United States)

    Pini, Maria Gloria; Rettori, Angelo

    1993-08-01

    The thermodynamical properties of an alternating spin (S,s) one-dimensional (1D) Ising model with competing nearest- and next-nearest-neighbor interactions are exactly calculated using a transfer-matrix technique. In contrast to the case S=s=1/2, previously investigated by Harada, the alternation of different spins (S≠s) along the chain is found to give rise to two-peaked static structure factors, signaling the coexistence of different short-range-order configurations. The relevance of our calculations with regard to recent experimental data by Gatteschi et al. in quasi-1D molecular magnetic materials, R(hfac)3NITEt (R = Gd, Tb, Dy, Ho, Er, ...), is discussed; hfac is hexafluoro-acetylacetonate and NITEt is 2-Ethyl-4,4,5,5-tetramethyl-4,5-dihydro-1H-imidazolyl-1-oxyl-3-oxide.

  19. Enhanced Approximate Nearest Neighbor via Local Area Focused Search.

    Energy Technology Data Exchange (ETDEWEB)

    Gonzales, Antonio [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Blazier, Nicholas Paul [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-02-01

    Approximate Nearest Neighbor (ANN) algorithms are increasingly important in machine learning, data mining, and image processing applications. There is a large family of space-partitioning ANN algorithms, such as randomized KD-Trees, that work well in practice but are limited by an exponential increase in similarity comparisons required to optimize recall. Additionally, they only support a small set of similarity metrics. We present Local Area Focused Search (LAFS), a method that enhances the way queries are performed using an existing ANN index. Instead of a single query, LAFS performs a number of smaller (fewer similarity comparisons) queries and focuses on a local neighborhood which is refined as candidates are identified. We show that our technique improves performance on several well known datasets and is easily extended to general similarity metrics using kernel projection techniques.

  20. Nearest Neighbor Estimates of Entropy for Multivariate Circular Distributions

    Directory of Open Access Journals (Sweden)

    Neeraj Misra

    2010-05-01

    In molecular sciences, the estimation of entropies of molecules is important for the understanding of many chemical and biological processes. Motivated by these applications, we consider the problem of estimating the entropies of circular random vectors and introduce non-parametric estimators based on circular distances between n sample points and their k th nearest neighbors (NN), where k (≤ n – 1) is a fixed positive integer. The proposed NN estimators are based on two different circular distances, and are proven to be asymptotically unbiased and consistent. The performance of one of the circular-distance estimators is investigated and compared with that of the already established Euclidean-distance NN estimator using Monte Carlo samples from an analytic distribution of six circular variables of an exactly known entropy and a large sample of seven internal-rotation angles in the molecule of tartaric acid, obtained by a realistic molecular-dynamics simulation.

  1. Introduction to machine learning: k-nearest neighbors.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-06-01

    Machine learning techniques have been widely used in many scientific fields, but their use in the medical literature is limited, partly because of technical difficulties. k-nearest neighbors (kNN) is a simple machine learning method. The article introduces some basic ideas underlying the kNN algorithm, and then focuses on how to perform kNN modeling with R. The dataset should be prepared before running the knn() function in R. After predicting the outcome with the kNN algorithm, the diagnostic performance of the model should be checked. Average accuracy is the most widely used statistic to reflect the performance of the kNN algorithm. Factors such as the k value, the distance calculation and the choice of appropriate predictors all have a significant impact on model performance.
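
    The basic kNN workflow summarized above (prepare labeled data, predict by majority vote among the k closest training points, then check accuracy on held-out cases) can be sketched in a few lines. The article itself works in R with knn(); the version below is a plain Python illustration under assumed toy data, and the function names `knn_predict` and `accuracy` as well as the labels `neg` and `pos` are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to query (Euclidean)."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

# Toy two-class data with two numeric predictors (values are made up).
train = [
    ((1.0, 1.0), "neg"), ((1.2, 0.8), "neg"), ((0.8, 1.1), "neg"),
    ((3.0, 3.2), "pos"), ((3.1, 2.9), "pos"), ((2.9, 3.0), "pos"),
]
test = [
    ((1.1, 1.0), "neg"), ((3.0, 3.0), "pos"),
    ((0.9, 0.9), "neg"), ((2.8, 3.1), "pos"),
]

print(accuracy(train, test, k=3))
```

    Using an odd k with two classes avoids tied votes. On real data the predictors would normally be rescaled first, since Euclidean distance is dominated by whichever predictor has the largest range.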

  2. Morphological type correlation between nearest neighbor pairs of galaxies

    Science.gov (United States)

    Yamagata, Tomohiko

    1990-01-01

    Although the morphological type of galaxies is one of the most fundamental properties of galaxies, its origin and evolutionary processes, if any, are not yet fully understood. It has been established that the galaxy morphology strongly depends on the environment in which the galaxy resides (e.g., Dressler 1980). Galaxy pairs correspond to the smallest scales of galaxy clustering and may provide important clues to how the environment influences the formation and evolution of galaxies. Several investigators pointed out that there is a tendency for pair galaxies to have similar morphological types (Karachentsev and Karachentseva 1974, Page 1975, Noerdlinger 1979). Here, researchers analyze morphological type correlation for 18,364 nearest neighbor pairs of galaxies identified in the magnetic tape version of the Center for Astrophysics Redshift Catalogue.

  3. Designing lattice structures with maximal nearest-neighbor entanglement

    Energy Technology Data Exchange (ETDEWEB)

    Navarro-Munoz, J C; Lopez-Sandoval, R [Instituto Potosino de Investigacion CientIfica y Tecnologica, Camino a la presa San Jose 2055, 78216 San Luis Potosi (Mexico); Garcia, M E [Theoretische Physik, FB 18, Universitaet Kassel and Center for Interdisciplinary Nanostructure Science and Technology (CINSaT), Heinrich-Plett-Str.40, 34132 Kassel (Germany)

    2009-08-07

    In this paper, we study the numerical optimization of nearest-neighbor concurrence of bipartite one- and two-dimensional lattices, as well as non-bipartite two-dimensional lattices. These systems are described in the framework of a tight-binding Hamiltonian while the optimization of concurrence was performed using genetic algorithms. Our results show that the concurrence of the optimized lattice structures is considerably higher than that of non-optimized systems. In the case of one-dimensional chains, the concurrence increases dramatically when the system begins to dimerize, i.e., it undergoes a structural phase transition (Peierls distortion). This result is consistent with the idea that entanglement is maximal or shows a singularity near quantum phase transitions. Moreover, the optimization of concurrence in two-dimensional bipartite and non-bipartite lattices is achieved when the structures break into smaller subsystems, which are arranged in geometrically distinguishable configurations.

  4. Credit scoring analysis using weighted k nearest neighbor

    Science.gov (United States)

    Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.

    2018-05-01

    Credit scoring is a quantitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience with customers who have similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment by considering the use of several kernels. We use credit data from a private bank in Indonesia. The results show that the Gaussian kernel and the rectangular kernel perform better, each with a percentage of correctly classified cases of 82.4%.
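
    To make the role of the kernel concrete, the sketch below shows one common form of weighted kNN: distances to the k nearest neighbors are divided by the distance to the (k+1)-th neighbor, passed through a kernel, and summed per class. This normalization and the toy data are assumptions for illustration; the abstract does not specify the exact weighting scheme or the bank's features, and the labels `good` and `bad` are hypothetical.

```python
import math
from collections import defaultdict

def rectangular(d):
    """Rectangular kernel: equal weight for every neighbor within the window."""
    return 0.5 if abs(d) <= 1 else 0.0

def gaussian(d):
    """Gaussian kernel: closer neighbors receive larger weights."""
    return math.exp(-0.5 * d * d) / math.sqrt(2 * math.pi)

def wknn_predict(train, query, k, kernel):
    """Class with the largest summed kernel weight among the k nearest neighbors.

    Requires len(train) > k, because the (k+1)-th neighbor sets the distance scale.
    """
    ranked = sorted((math.dist(x, query), y) for x, y in train)
    scale = ranked[k][0] or 1.0  # distance to the (k+1)-th neighbor
    scores = defaultdict(float)
    for d, y in ranked[:k]:
        scores[y] += kernel(d / scale)
    return max(scores, key=scores.get)

# Toy applicants around a query at the origin: two very close "good" cases
# and three more distant "bad" cases, plus one point that sets the scale.
train = [
    ((0.05, 0.0), "good"), ((0.0, 0.05), "good"),
    ((0.95, 0.0), "bad"), ((0.0, 0.95), "bad"), ((-0.95, 0.0), "bad"),
    ((0.0, -1.0), "bad"),
]
query = (0.0, 0.0)

print(wknn_predict(train, query, 5, rectangular))
print(wknn_predict(train, query, 5, gaussian))
```

    On this constructed example the rectangular kernel reduces to a plain majority vote, while the Gaussian kernel lets the two closest neighbors outweigh three distant ones. That contrast is a property of the toy data only and says nothing about the results reported for the bank data.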

  5. Quality and efficiency in high dimensional Nearest neighbor search

    KAUST Repository

    Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos

    2009-01-01

    Nearest neighbor (NN) search in high dimensional space is an important problem in many applications. Ideally, a practical solution (i) should be implementable in a relational database, and (ii) its query cost should grow sub-linearly with the dataset size, regardless of the data and query distributions. Despite the bulk of NN literature, no solution fulfills both requirements, except locality sensitive hashing (LSH). The existing LSH implementations are either rigorous or adhoc. Rigorous-LSH ensures good quality of query results, but requires expensive space and query cost. Although adhoc-LSH is more efficient, it abandons quality control, i.e., the neighbor it outputs can be arbitrarily bad. As a result, currently no method is able to ensure both quality and efficiency simultaneously in practice. Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast high-dimensional NN search with excellent quality. The combination of several LSB-trees leads to a structure called the LSB-forest that ensures the same result quality as rigorous-LSH, but reduces its space and query cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though the latter has no quality guarantee. Besides its appealing theoretical properties, the LSB-tree itself also serves as an effective index that consumes linear space, and supports efficient updates. Our extensive experiments confirm that the LSB-tree is faster than (i) the state of the art of exact NN search by two orders of magnitude, and (ii) the best (linear-space) method of approximate retrieval by an order of magnitude, and at the same time, returns neighbors with much better quality. © 2009 ACM.

  6. A Hybrid Instance Selection Using Nearest-Neighbor for Cross-Project Defect Prediction

    Institute of Scientific and Technical Information of China (English)

    Duksan Ryu; Jong-In Jang; Jongmoon Baik (Member, ACM, IEEE)

    2015-01-01

    Software defect prediction (SDP) is an active research field in software engineering that aims to identify defect-prone modules. Thanks to SDP, limited testing resources can be effectively allocated to defect-prone modules. Although SDP requires sufficient local data within a company, there are cases where local data are not available, e.g., pilot projects. Companies without local data can employ cross-project defect prediction (CPDP), using external data to build classifiers. The major challenge of CPDP is the difference in distributions between training and test data. To tackle this, instances of source data similar to the target data are selected to build classifiers. Software datasets also have a class imbalance problem, meaning that the ratio of the defective class to the clean class is very low, which usually lowers the performance of classifiers. We propose a Hybrid Instance Selection Using Nearest-Neighbor (HISNN) method that performs a hybrid classification, selectively learning local knowledge (via k-nearest neighbor) and global knowledge (via naïve Bayes). Instances having strong local knowledge are identified via nearest neighbors with the same class label. Previous studies showed low PD (probability of detection) or high PF (probability of false alarm), which makes them impractical to use. The experimental results show that HISNN produces high overall performance as well as high PD and low PF.

  7. Forecasting of steel consumption with use of nearest neighbors method

    Directory of Open Access Journals (Sweden)

    Rogalewicz Michał

    2017-01-01

    In the process of building a steel construction, its design is usually commissioned to the design office. Then a quotation is made and the finished offer is delivered to the customer. Its final shape is influenced by steel consumption to a great extent. Correct determination of the potential consumption of this material most often determines the profitability of the project. Because of a long waiting time for a final project from the design office, it is worthwhile to pre-analyze the project’s profitability and feasibility using historical data on already realized orders. The paper presents an innovative approach to decision-making support in one of the Polish construction companies. The authors have defined and prioritized the most important factors that differentiate the executed orders and have the greatest impact on steel consumption. These are, among others: height and width of steel structure, number of aisles, type of roof, etc. Then they applied and adapted the method of k-nearest neighbors to the specificity of the discussed problem. The goal was to search a set of historical orders and find the most similar to the analyzed one. On this basis, consumption of steel can be estimated. The method was programmed within the EXPLOR application.

  8. River Flow Prediction Using the Nearest Neighbor Probabilistic Ensemble Method

    Directory of Open Access Journals (Sweden)

    H. Sanikhani

    2016-02-01

    Introduction: In recent years, researchers have become interested in probabilistic forecasting of hydrologic variables such as river flow. A probabilistic approach aims at quantifying the prediction reliability through a probability distribution function or a prediction interval for the unknown future value. The evaluation of the uncertainty associated with the forecast is seen as fundamental information, not only to correctly assess the prediction, but also to compare forecasts from different methods and to evaluate actions and decisions conditionally on the expected values. Several probabilistic approaches have been proposed in the literature, including (1) methods that use resampling techniques to assess parameter and model uncertainty, such as the Metropolis algorithm or the Generalized Likelihood Uncertainty Estimation (GLUE) methodology for an application to runoff prediction, (2) methods based on processing the forecast errors of past data to produce the probability distributions of future values and (3) methods that evaluate how the uncertainty propagates from the rainfall forecast to the river discharge prediction, such as the Bayesian forecasting system. Materials and Methods: In this study, two different probabilistic methods are used for river flow prediction. Then the uncertainty related to the forecast is quantified. One approach is based on linear predictors, and the other uses nearest neighbors. The nonlinear probabilistic ensemble can be used for nonlinear time series analysis using locally linear predictors, while NNPE utilizes a method adapted for one-step-ahead nearest neighbor methods. In this regard, daily river discharge data (twelve years) of the Dizaj and Mashin Stations on the Baranduz-Chay basin in West Azerbaijan and the Zard-River basin in Khouzestan provinces were used, respectively. The first six years of data were used for fitting the model. The next three years were used for calibration and the remaining three years for testing the models

  9. Nearest neighbor 3D segmentation with context features

    Science.gov (United States)

    Hristova, Evelin; Schulz, Heinrich; Brosch, Tom; Heinrich, Mattias P.; Nickisch, Hannes

    2018-03-01

    Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are - for a nearest neighbor method - surprisingly accurate, robust as well as data and time efficient.

  10. Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor

    OpenAIRE

    Samir Brahim Belhaouari

    2009-01-01

    By taking advantage of both k-NN, which is highly accurate, and K-means clustering, which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as a "variable k"-NN dealing with the centroid or mean point of all subclasses generated by a clustering algorithm. In general, the K-means clustering algorithm is not stable in terms of accuracy; for that reason, we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less ...

  11. Polymers with nearest- and next nearest-neighbor interactions on the Husimi lattice

    Science.gov (United States)

    Oliveira, Tiago J.

    2016-04-01

    The exact grand-canonical solution of a generalized interacting self-avoiding walk (ISAW) model, placed on a Husimi lattice built with squares, is presented. In this model, beyond the traditional interaction $\omega_1 = e^{\epsilon_1/k_B T}$ between (nonconsecutive) monomers on nearest-neighbor (NN) sites, an additional energy $\epsilon_2$ is associated to next-NN (NNN) monomers. Three definitions of NNN sites/interactions are considered, where each monomer can have, effectively, at most two, four, or six NNN monomers on the Husimi lattice. The phase diagrams found in all cases have (qualitatively) the same thermodynamic properties: a non-polymerized (NP) and a polymerized (P) phase separated by a critical and a coexistence surface that meet at a tricritical (θ-) line. This θ-line is found even when one of the interactions is repulsive, existing for $\omega_1$ in the range $[0,\infty)$, i.e., for $\epsilon_1/k_B T$ in the range $[-\infty,\infty)$. Thus, counterintuitively, a θ-point exists even for an infinite repulsion between NN monomers ($\omega_1 = 0$), being associated to a coil-‘soft globule’ transition. In the limit of an infinite repulsive force between NNN monomers, however, the coil-globule transition disappears, and only a continuous NP-P transition is observed. This particular case, with $\omega_2 = 0$, is also solved exactly on the square lattice, using a transfer matrix calculation where a discontinuous NP-P transition is found. For attractive and repulsive forces between NN and NNN monomers, respectively, the model becomes quite similar to the semiflexible-ISAW one, whose crystalline phase is not observed here, as a consequence of the frustration due to competing NN and NNN forces. The mapping of the phase diagrams into canonical ones is discussed and compared with recent results from Monte Carlo simulations on the square lattice.

  12. Nearest Neighbor Search in the Metric Space of a Complex Network for Community Detection

    Directory of Open Access Journals (Sweden)

    Suman Saha

    2016-03-01

    Full Text Available The objective of this article is to bridge the gap between two important research directions: (1) nearest neighbor search, which is a fundamental computational tool for large data analysis; and (2) complex network analysis, which deals with large real graphs but is generally studied via graph theoretic analysis or spectral analysis. In this article, we have studied the nearest neighbor search problem in a complex network by the development of a suitable notion of nearness. The computation of efficient nearest neighbor search among the nodes of a complex network using the metric tree and locality sensitive hashing (LSH) is also studied and tested experimentally. For evaluation of the proposed nearest neighbor search in a complex network, we applied it to a network community detection problem. Experiments are performed to verify the usefulness of nearness measures for the complex networks, the role of metric tree and LSH to compute fast and approximate node nearness and the efficiency of community detection using nearest neighbor search. We observed that nearest neighbor search between network nodes is a very efficient tool for better exploring the community structure of real networks. Several efficient approximation schemes are very useful for large networks: they cause hardly any degradation of results while saving a lot of computation time, and the nearest neighbor based community detection approach is very competitive in terms of efficiency and time.

  13. Diagnosis of diabetes diseases using an Artificial Immune Recognition System2 (AIRS2) with fuzzy K-nearest neighbor.

    Science.gov (United States)

    Chikh, Mohamed Amine; Saidi, Meryem; Settouti, Nesma

    2012-10-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2 where we replace the K-nearest neighbors algorithm with the fuzzy K-nearest neighbors to improve the diagnostic accuracy of diabetes diseases. The diabetes disease dataset used in our work is retrieved from the UCI machine learning repository. The performances of AIRS2 and MAIRS2 are evaluated regarding classification accuracy, sensitivity and specificity values. The highest classification accuracies obtained when applying AIRS2 and MAIRS2 using 10-fold cross-validation were 82.69% and 89.10%, respectively.

  14. Anderson localization in one-dimensional quasiperiodic lattice models with nearest- and next-nearest-neighbor hopping

    International Nuclear Information System (INIS)

    Gong, Longyan; Feng, Yan; Ding, Yougen

    2017-01-01

    Highlights: • Quasiperiodic lattice models with next-nearest-neighbor hopping are studied. • Shannon information entropies are used to reflect state localization properties. • Phase diagrams are obtained for the inverse bronze and golden means, respectively. • Our studies present a more complete picture than existing works. - Abstract: We explore the reduced relative Shannon information entropies SR for a quasiperiodic lattice model with nearest- and next-nearest-neighbor hopping, where an irrational number is in the mathematical expression of incommensurate on-site potentials. Based on SR, we respectively unveil the phase diagrams for two irrationalities, i.e., the inverse bronze mean and the inverse golden mean. The corresponding phase diagrams include regions of purely localized phase, purely delocalized phase, pure critical phase, and regions with mobility edges. The boundaries of different regions depend on the values of irrational number. These studies present a more complete picture than existing works.

  15. Anderson localization in one-dimensional quasiperiodic lattice models with nearest- and next-nearest-neighbor hopping

    Energy Technology Data Exchange (ETDEWEB)

    Gong, Longyan, E-mail: lygong@njupt.edu.cn [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093 (China); Feng, Yan; Ding, Yougen [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China)

    2017-02-12

    Highlights: • Quasiperiodic lattice models with next-nearest-neighbor hopping are studied. • Shannon information entropies are used to reflect state localization properties. • Phase diagrams are obtained for the inverse bronze and golden means, respectively. • Our studies present a more complete picture than existing works. - Abstract: We explore the reduced relative Shannon information entropies SR for a quasiperiodic lattice model with nearest- and next-nearest-neighbor hopping, where an irrational number is in the mathematical expression of incommensurate on-site potentials. Based on SR, we respectively unveil the phase diagrams for two irrationalities, i.e., the inverse bronze mean and the inverse golden mean. The corresponding phase diagrams include regions of purely localized phase, purely delocalized phase, pure critical phase, and regions with mobility edges. The boundaries of different regions depend on the values of irrational number. These studies present a more complete picture than existing works.

  16. Nearest neighbors EPR superhyperfine interaction in divalent iridium complexes in alkali halide host lattice

    International Nuclear Information System (INIS)

    Pinhal, N.M.; Vugman, N.V.

    1983-01-01

    Further splitting of chlorine superhyperfine lines on the EPR spectrum of the $[\mathrm{Ir(CN)_4Cl_2}]^{4-}$ molecular species in NaCl lattice indicates a super-superhyperfine interaction with the nearest neighbors sodium atoms. (Author) [pt

  17. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space

    KAUST Repository

    Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos

    2010-01-01

    Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii

  18. Mixed random walks with a trap in scale-free networks including nearest-neighbor and next-nearest-neighbor jumps

    Science.gov (United States)

    Zhang, Zhongzhi; Dong, Yuze; Sheng, Yibin

    2015-10-01

    Random walks including non-nearest-neighbor jumps appear in many real situations such as the diffusion of adatoms and have found numerous applications including the PageRank search algorithm; however, theoretical results for this dynamical process are much scarcer. In this paper, we present a study of mixed random walks in a family of fractal scale-free networks, where both nearest-neighbor and next-nearest-neighbor jumps are included. We focus on the trapping problem in the network family, which is a particular case of random walks with a perfect trap fixed at the central high-degree node. We derive analytical expressions for the average trapping time (ATT), a quantitative indicator measuring the efficiency of the trapping process, by using two different methods, the results of which are consistent with each other. Furthermore, we analytically determine all the eigenvalues and their multiplicities for the fundamental matrix characterizing the dynamical process. Our results show that although next-nearest-neighbor jumps have no effect on the leading scaling of the trapping efficiency, they can strongly affect the prefactor of ATT, providing insight into better understanding of random-walk process in complex systems.

  19. Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization

    Science.gov (United States)

    Jamaluddin; Siringoringo, Rimbun

    2017-12-01

    Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawback of FkNN is that its parameters are difficult to determine. These parameters are the number of neighbors (k) and the fuzzy strength (m). Both parameters are very sensitive, and no theory or guide can deduce what the proper values of ‘m’ and ‘k’ should be, which makes FkNN difficult to control. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best values of ‘k’ and ‘m’. MPSO is focused on the Constriction Factor Method, an improvement of PSO intended to avoid local optima. The model proposed in this study was tested on the German Credit Dataset. The test data have been standardized by the UCI Machine Learning Repository and are widely applied to classification problems. The application of MPSO to the determination of FkNN parameters is expected to increase classification performance. The experiments indicate that the model offered in this research results in better classification performance than the FkNN model alone. The model offered in this study has an accuracy rate of 81%, while the FkNN model has an accuracy of 70%. Finally, the research model is compared with two other classification models, Naive Bayes and Decision Tree. This research model has a better performance level: Naive Bayes has an accuracy of 75%, and the decision tree model has 70%
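
    To make the roles of ‘k’ and ‘m’ concrete, here is a minimal sketch of a fuzzy k-NN membership computation. It assumes the commonly used inverse-distance weighting with exponent 2/(m-1) and crisp training labels; the abstract does not state which variant the authors use, and the function name and toy data are hypothetical. The MPSO search itself is not shown.

```python
import math


def fuzzy_knn(train, query, k=3, m=2.0, eps=1e-12):
    """Return class memberships of `query` from crisp-labelled training points.

    train: list of (feature_tuple, label) pairs.
    k:     number of neighbors.
    m:     fuzzy strength, must be greater than 1.
    """
    # Keep the k training points closest to the query (Euclidean distance).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    weights = {}
    total = 0.0
    for features, label in neighbors:
        d = math.dist(features, query)
        # Closer neighbors get larger weights; eps guards against d == 0.
        w = 1.0 / (d ** (2.0 / (m - 1.0)) + eps)
        weights[label] = weights.get(label, 0.0) + w
        total += w
    # Normalize so the memberships sum to 1.
    return {label: w / total for label, w in weights.items()}


# Hypothetical two-class toy data.
train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((3.0, 3.0), "b"), ((4.0, 4.0), "b")]
memberships = fuzzy_knn(train, (0.5, 0.5), k=3, m=2.0)
print(max(memberships, key=memberships.get))  # a
```

    Raising m flattens the weights so that distant neighbors count almost as much as close ones, which is one reason the two parameters interact and are hard to tune by hand.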

  20. Clustered K nearest neighbor algorithm for daily inflow forecasting

    NARCIS (Netherlands)

    Akbari, M.; Van Overloop, P.J.A.T.M.; Afshar, A.

    2010-01-01

    Instance based learning (IBL) algorithms are a common choice among data driven algorithms for inflow forecasting. They are based on the similarity principle and prediction is made by the finite number of similar neighbors. In this sense, the similarity of a query instance is estimated according to
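
    The similarity principle mentioned here can be sketched as a plain k-nearest-neighbor forecast on a single series. This is only an illustration under simple assumptions (Euclidean distance between fixed-length windows, unweighted mean of the successors); it is not the clustered algorithm of the paper, and the function name and sample series are hypothetical.

```python
def knn_forecast(series, window=3, k=2):
    """Predict the next value as the mean successor of the k most similar past windows."""
    query = series[-window:]
    candidates = []
    # Every earlier window that has a known successor is a candidate neighbor.
    for start in range(len(series) - window):
        past = series[start:start + window]
        dist = sum((a - b) ** 2 for a, b in zip(past, query)) ** 0.5
        candidates.append((dist, series[start + window]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical periodic inflow record; the last two values are 1, 2.
inflow = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
print(knn_forecast(inflow, window=2, k=2))  # 3.0
```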

  1. Estimating forest attribute parameters for small areas using nearest neighbors techniques

    Science.gov (United States)

    Ronald E. McRoberts

    2012-01-01

    Nearest neighbors techniques have become extremely popular, particularly for use with forest inventory data. With these techniques, a population unit prediction is calculated as a linear combination of observations for a selected number of population units in a sample that are most similar, or nearest, in a space of ancillary variables to the population unit requiring...

  2. Kinetic Models for Topological Nearest-Neighbor Interactions

    Science.gov (United States)

    Blanchet, Adrien; Degond, Pierre

    2017-12-01

    We consider systems of agents interacting through topological interactions. These have been shown to play an important part in animal and human behavior. Precisely, the system consists of a finite number of particles characterized by their positions and velocities. At random times a randomly chosen particle, the follower, adopts the velocity of its closest neighbor, the leader. We study the limit of a system size going to infinity and, under the assumption of propagation of chaos, show that the limit kinetic equation is a non-standard spatial diffusion equation for the particle distribution function. We also study the case wherein the particles interact with their K closest neighbors and show that the corresponding kinetic equation is the same. Finally, we prove that these models can be seen as a singular limit of the smooth rank-based model previously studied in Blanchet and Degond (J Stat Phys 163:41-60, 2016). The proofs are based on a combinatorial interpretation of the rank as well as some concentration of measure arguments.
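
    The interaction rule described in this abstract (a randomly chosen follower adopts the velocity of its closest neighbor) can be sketched as a single update event. The sketch assumes one-dimensional positions and omits the random timing and the free motion between events; all names are hypothetical.

```python
import random


def follow_step(positions, velocities, rng):
    """One interaction event: a random follower copies its closest neighbor's velocity."""
    follower = rng.randrange(len(positions))
    # The leader is the other particle closest in position to the follower.
    leader = min(
        (j for j in range(len(positions)) if j != follower),
        key=lambda j: abs(positions[j] - positions[follower]),
    )
    velocities[follower] = velocities[leader]
    return follower, leader


# Hypothetical three-particle system.
positions = [0.0, 1.0, 5.0]
velocities = [1.0, -1.0, 2.0]
follower, leader = follow_step(positions, velocities, random.Random(0))
print(follower, leader, velocities)
```

    The rule is topological because only the rank of the neighbor matters (closest, second closest, and so on), not the metric distance to it.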

  3. Utilization of Singularity Exponent in Nearest Neighbor Based Classifier

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2013-01-01

    Vol. 30, No. 1 (2013), pp. 3-29. ISSN 0176-4268. Grant - others: Czech Technical University (CZ) CZ68407700. Institutional support: RVO:67985807. Keywords: multivariate data; probability density estimation; classification; probability distribution mapping function; probability density mapping function; power approximation. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.571, year: 2013

  4. Alpha centauri unveiling the secrets of our nearest stellar neighbor

    CERN Document Server

    Beech, Martin

    2015-01-01

    As our closest stellar companion and composed of two Sun-like stars and a third small dwarf star, Alpha Centauri is an ideal testing ground of astrophysical models and has played a central role in the history and development of modern astronomy—from the first guesses at stellar distances to understanding how our own star, the Sun, might have evolved. It is also the host of the nearest known exoplanet, an ultra-hot, Earth-like planet recently discovered. Just 4.4 light years away Alpha Centauri is also the most obvious target for humanity’s first directed interstellar space probe. Such a mission could reveal the small-scale structure of a new planetary system and also represent the first step in what must surely be humanity’s greatest future adventure—exploration of the Milky Way Galaxy itself. For all of its closeness, α Centauri continues to tantalize astronomers with many unresolved mysteries, such as how did it form, how many planets does it contain and where are they, and how might we view its ex...

  5. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract. Background: As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods: In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results: We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion: Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
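
    The idea of a similarity metric built as a weighted combination of base similarity measures can be sketched as follows. The weights are fixed by hand here, whereas the paper infers them by regression; the two base measures, the feature names and the data are hypothetical, and a plain majority vote stands in for the paper's confidence-scoring scheme.

```python
import math


def combined_similarity(x, y, measures, weights):
    """Weighted sum of base similarity measures (weights assumed given)."""
    return sum(w * s(x, y) for s, w in zip(measures, weights))


def knn_predict(target, labelled, measures, weights, k=3):
    """Majority vote among the k items most similar to `target`."""
    ranked = sorted(
        labelled,
        key=lambda item: combined_similarity(target, item[0], measures, weights),
        reverse=True,
    )
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Two hypothetical base measures on heterogeneous data: expression profile and genome position.
def expr_sim(a, b):
    return -math.dist(a["expr"], b["expr"])


def pos_sim(a, b):
    return -abs(a["pos"] - b["pos"]) / 100.0


labelled = [
    ({"expr": (1.0, 0.0), "pos": 10}, "transport"),
    ({"expr": (0.9, 0.1), "pos": 12}, "transport"),
    ({"expr": (0.0, 1.0), "pos": 200}, "metabolism"),
    ({"expr": (0.1, 0.9), "pos": 210}, "metabolism"),
]
target = {"expr": (0.8, 0.2), "pos": 15}
print(knn_predict(target, labelled, [expr_sim, pos_sim], [0.7, 0.3], k=3))  # transport
```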

  6. A Novel Preferential Diffusion Recommendation Algorithm Based on User’s Nearest Neighbors

    Directory of Open Access Journals (Sweden)

    Fuguo Zhang

    2017-01-01

    Full Text Available Recommender systems are a very efficient way to deal with the problem of information overload for online users. In recent years, network based recommendation algorithms have demonstrated much better performance than the standard collaborative filtering methods. However, most network based algorithms do not give a high enough weight to the influence of the target user’s nearest neighbors in the resource diffusion process, while a user or an object with high degree will obtain larger influence in the standard mass diffusion algorithm. In this paper, we propose a novel preferential diffusion recommendation algorithm considering the significance of the target user’s nearest neighbors and evaluate it on three real-world data sets: MovieLens 100k, MovieLens 1M, and Epinions. Experimental results demonstrate that the novel preferential diffusion recommendation algorithm based on user’s nearest neighbors can significantly improve the recommendation accuracy and diversity.

  7. A Nearest Neighbor Classifier Employing Critical Boundary Vectors for Efficient On-Chip Template Reduction.

    Science.gov (United States)

    Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi

    2016-05-01

    Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using a field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k-means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k-means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.

  8. CATEGORIZATION OF GELAM, ACACIA AND TUALANG HONEY ODORPROFILE USING K-NEAREST NEIGHBORS

    Directory of Open Access Journals (Sweden)

    Nurdiyana Zahed

    2018-02-01

    Full Text Available Honey authenticity with respect to honey type is an issue of great importance and interest in agriculture. In current research, several studies document that specific types of honey have their own uses in the medical field. However, it is quite a challenging task to classify different types of honey with the naked eye alone. This work demonstrated a successful electronic nose (E-nose) application as an instrument for identifying the odor profile patterns of three common honeys in Malaysia (Gelam, Acacia and Tualang honey). The applied E-nose produced signals for odor measurement in the form of numeric resistance (Ω). The data readings were pre-processed using a normalization technique to put the unique features on a standardized scale. Mean features were extracted, and boxplots were used as the statistical tool to present the data pattern according to the three types of honey. The extracted mean features were fed into a K-Nearest Neighbors classifier as input features and evaluated using several splitting ratios. Excellent results were obtained, with 100% accuracy, sensitivity and specificity of classification from KNN using weight (k=1), ratio 90:10 and Euclidean distance. The findings confirmed the ability of the KNN classifier as an intelligent classifier to distinguish different honey types from E-nose calibration. Outperforming other classifiers, KNN required less parameter optimization and achieved promising results.

  9. Applying an efficient K-nearest neighbor search to forest attribute imputation

    Science.gov (United States)

    Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek

    2006-01-01

    This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby, decreasing the time needed to discover the NN subset. Results of five trials show gains...

  10. Estimating cavity tree and snag abundance using negative binomial regression models and nearest neighbor imputation methods

    Science.gov (United States)

    Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett

    2009-01-01

    Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....

  11. Mapping change of older forest with nearest-neighbor imputation and Landsat time-series

    Science.gov (United States)

    Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Warren B. Cohen; Robert E. Kennedy; Zhiqiang Yang

    2012-01-01

    The Northwest Forest Plan (NWFP), which aims to conserve late-successional and old-growth forests (older forests) and associated species, established new policies on federal lands in the Pacific Northwest USA. As part of monitoring for the NWFP, we tested nearest-neighbor imputation for mapping change in older forest, defined by threshold values for forest attributes...

  12. Moderate-resolution data and gradient nearest neighbor imputation for regional-national risk assessment

    Science.gov (United States)

    Kenneth B. Jr. Pierce; C. Kenneth Brewer; Janet L. Ohmann

    2010-01-01

    This study was designed to test the feasibility of combining a method designed to populate pixels with inventory plot data at the 30-m scale with a new national predictor data set. The new national predictor data set was developed by the USDA Forest Service Remote Sensing Applications Center (hereafter RSAC) at the 250-m scale. Gradient Nearest Neighbor (GNN)...

  13. Recursive nearest neighbor search in a sparse and multiscale domain for comparing audio signals

    DEFF Research Database (Denmark)

    Sturm, Bob L.; Daudet, Laurent

    2011-01-01

    We investigate recursive nearest neighbor search in a sparse domain at the scale of audio signals. Essentially, to approximate the cosine distance between the signals we make pairwise comparisons between the elements of localized sparse models built from large and redundant multiscale dictionaries...

  14. Collective Behaviors of Mobile Robots Beyond the Nearest Neighbor Rules With Switching Topology.

    Science.gov (United States)

    Ning, Boda; Han, Qing-Long; Zuo, Zongyu; Jin, Jiong; Zheng, Jinchuan

    2018-05-01

    This paper is concerned with the collective behaviors of robots beyond the nearest neighbor rules, i.e., dispersion and flocking, when robots interact with others by applying an acute angle test (AAT)-based interaction rule. Different from a conventional nearest neighbor rule or its variations, the AAT-based interaction rule allows interactions with some far-neighbors and excludes unnecessary nearest neighbors. The resulting dispersion and flocking hold the advantages of scalability, connectivity, robustness, and effective area coverage. For the dispersion, a spring-like controller is proposed to achieve collision-free coordination. With switching topology, a new fixed-time consensus-based energy function is developed to guarantee the system stability. An upper bound of settling time for energy consensus is obtained, and a uniform time interval is accordingly set so that energy distribution is conducted in a fair manner. For the flocking, based on a class of generalized potential functions taking nonsmooth switching into account, a new controller is proposed to ensure that the same velocity for all robots is eventually reached. A co-optimizing problem is further investigated to accomplish additional tasks, such as enhancing communication performance, while maintaining the collective behaviors of mobile robots. Simulation results are presented to show the effectiveness of the theoretical results.

  15. A two-step nearest neighbors algorithm using satellite imagery for predicting forest structure within species composition classes

    Science.gov (United States)

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques have been shown to be useful for predicting multiple forest attributes from forest inventory and Landsat satellite image data. However, in regions lacking good digital land cover information, nearest neighbors selected to predict continuous variables such as tree volume must be selected without regard to relevant categorical variables such...

  16. Multi-strategy based quantum cost reduction of linear nearest-neighbor quantum circuit

    Science.gov (United States)

    Tan, Ying-ying; Cheng, Xue-yun; Guan, Zhi-jin; Liu, Yang; Ma, Haiying

    2018-03-01

    With the development of reversible and quantum computing, the study of reversible and quantum circuits has also developed rapidly. Due to physical constraints, most quantum circuits require quantum gates to interact on adjacent quantum bits. However, many existing nearest-neighbor quantum circuits have a large quantum cost. Therefore, how to effectively reduce quantum cost is becoming a popular research topic. In this paper, we propose multiple optimization strategies to reduce the quantum cost of the circuit, that is, we reduce the quantum cost through MCT gate decomposition, nearest neighbor and circuit simplification, respectively. The experimental results show that the proposed strategies can effectively reduce the quantum cost, and the maximum optimization rate is 30.61% compared to the corresponding results.

  17. ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms

    DEFF Research Database (Denmark)

    Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander

    2017-01-01

    This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory approximate nearest neighbor algorithms. It provides a standard interface for measuring the performance and quality achieved by nearest neighbor algorithms on different standard data sets. It supports several...... visualise these as images, plots, and websites with interactive plots. ANN-Benchmarks aims to provide a constantly updated overview of the current state of the art of k-NN algorithms. In the short term, this overview allows users to choose the correct k-NN algorithm and parameters...... for their similarity search task; in the longer term, algorithm designers will be able to use this overview to test and refine automatic parameter tuning. The paper gives an overview of the system, evaluates the results of the benchmark, and points out directions for future work. Interestingly, very different...

  18. Chaotic Synchronization in Nearest-Neighbor Coupled Networks of 3D CNNs

    OpenAIRE

    Serrano-Guerrero, H.; Cruz-Hernández, C.; López-Gutiérrez, R.M.; Cardoza-Avendaño, L.; Chávez-Pérez, R.A.

    2013-01-01

    In this paper, synchronization of Cellular Neural Networks (CNNs) in nearest-neighbor coupled arrays is numerically studied. Synchronization of multiple chaotic CNNs is achieved by appealing to complex systems theory. In particular, we consider dynamical networks composed of 3D CNNs, as interconnected nodes, where the interactions in the networks are defined by coupling the first state of each node. Four cases of interest are considered: i) synchronization without chaotic master, ii) maste...

  19. FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

    OpenAIRE

    Lu Si; Jie Yu; Shasha Li; Jun Ma; Lei Luo; Qingbo Wu; Yongqi Ma; Zhengji Liu

    2017-01-01

    Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rul...

  20. Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting

    Science.gov (United States)

    Zhang, Ningning; Lin, Aijing; Shang, Pengjian

    2017-07-01

    In this paper, we propose a new two-stage methodology that combines ensemble empirical mode decomposition (EEMD) with a multidimensional k-nearest neighbor model (MKNN) in order to forecast the closing price and high price of stocks simultaneously. Modified k-nearest neighbors (KNN) algorithms are increasingly widely applied to prediction across fields. Empirical mode decomposition (EMD) decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs); however, because of mode mixing it cannot reveal the characteristic information of the signal accurately. Ensemble empirical mode decomposition (EEMD), an improved version of EMD, resolves this weakness by adding white noise to the original data. With EEMD, the components with true physical meaning can be extracted from the time series. Combining the advantages of EEMD and MKNN, the proposed EEMD-MKNN model has high predictive precision for short-term forecasting. Moreover, we extend this methodology to the two-dimensional case to forecast the closing price and high price of four stock indices (NAS, S&P500, DJI and STI) at the same time. The results indicate that the proposed EEMD-MKNN model has a higher forecast precision than EMD-KNN, KNN, and ARIMA.

  1. Distance-Constraint k-Nearest Neighbor Searching in Mobile Sensor Networks.

    Science.gov (United States)

    Han, Yongkoo; Park, Kisung; Hong, Jihye; Ulamin, Noor; Lee, Young-Koo

    2015-07-27

    The κ-Nearest Neighbors (κNN) query is an important spatial query in mobile sensor networks. In this work we extend κNN to include a distance constraint, calling it an l-distant κ-nearest-neighbors (l-κNN) query, which finds the κ sensor nodes nearest to a query point that are also at distance l or greater from each other. The query results indicate the objects nearest to the area of interest that are scattered from each other by at least distance l. The l-κNN query can be used in most κNN applications where well-distributed query results are needed. To process an l-κNN query, we must discover all sets of κNN sensor nodes and then find all pairs of sensor nodes in each set that are separated by at least a distance l. Given the limited battery and computing power of sensor nodes, this l-κNN query processing is prohibitively expensive in terms of energy consumption. In this paper, we propose a greedy approach for l-κNN query processing in mobile sensor networks. The key idea of the proposed approach is to divide the search space into subspaces whose sides all have length l. By selecting κ sensor nodes from the other subspaces near the query point, we guarantee accurate query results for l-κNN. In our experiments, we show that the proposed method exhibits superior performance compared with a post-processing based method using the κNN query in terms of energy efficiency, query latency, and accuracy.
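
The distance-constrained selection described above can be illustrated with a small centralized sketch. This is not the paper's in-network algorithm (which partitions the search space into subspaces of side l); it is a plain greedy filter, and the function name `greedy_l_knn` and the sample coordinates are hypothetical.

```python
import math

def greedy_l_knn(query, nodes, k, l):
    """Pick up to k nodes nearest to `query` that are pairwise at least `l` apart."""
    chosen = []
    # Visit candidates from nearest to farthest from the query point.
    for node in sorted(nodes, key=lambda p: math.dist(p, query)):
        # Keep a candidate only if it is at least l away from every node kept so far.
        if all(math.dist(node, c) >= l for c in chosen):
            chosen.append(node)
            if len(chosen) == k:
                break
    return chosen

# Hypothetical sensor positions in the plane.
sensors = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (0.0, 2.5), (3.0, 3.0)]
result = greedy_l_knn(query=(0.0, 0.0), nodes=sensors, k=3, l=2.0)
# result == [(0.0, 0.0), (2.0, 0.0), (0.0, 2.5)]; (0.5, 0.0) is rejected as too close to (0.0, 0.0)
```

A greedy pass like this always satisfies the spacing constraint but may return fewer than k nodes, and it needs every node position in one place, which is the cost the abstract's in-network approach is meant to avoid.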

  2. A γ dose distribution evaluation technique using the k-d tree for nearest neighbor searching

    International Nuclear Information System (INIS)

    Yuan Jiankui; Chen Weimin

    2010-01-01

    Purpose: The authors propose an algorithm based on the k-d tree for nearest neighbor searching to improve the γ calculation time for 2D and 3D dose distributions. Methods: The γ calculation method has been widely used for comparisons of dose distributions in clinical treatment plans and quality assurance. By specifying the acceptable dose and distance-to-agreement criteria, the method provides a quantitative measurement of the agreement between the reference and evaluation dose distributions. The γ value indicates the acceptability. In regions where γ≤1, the predefined criterion is satisfied and thus the agreement is acceptable; otherwise, the agreement fails. Although the concept of the method is not complicated and a quick naive implementation is straightforward, an efficient and robust implementation is not trivial. Recent algorithms based on exhaustive searching within a maximum radius, the geometric Euclidean distance, and the table lookup method have been proposed to improve the computational time for multidimensional dose distributions. Motivated by the fact that the least searching time for finding a nearest neighbor can be an O(log N) operation with a k-d tree, where N is the total number of the dose points, the authors propose an algorithm based on the k-d tree for the γ evaluation in this work. Results: In the experiment, the authors found that the average k-d tree construction time per reference point is O(log N), while the nearest neighbor searching time per evaluation point is proportional to O(N^{1/k}), where k is between 2 and 3 for two-dimensional and three-dimensional dose distributions, respectively. Conclusions: Compared with O(N) algorithms such as exhaustive search and sorted list, the k-d tree algorithm for γ evaluation is much more efficient.
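
As a rough illustration of the idea, γ at an evaluation point can be treated as a nearest-neighbor distance once positions are divided by the distance-to-agreement criterion and doses by the dose criterion. The sketch below is a minimal pure-Python k-d tree over discrete reference points, with no interpolation between them; the function names, the 1D profile, and the criteria values are assumptions for illustration, not the authors' implementation.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest_distance(tree, target, depth=0, best=math.inf):
    """Return the distance from `target` to its nearest neighbor in the tree."""
    if tree is None:
        return best
    point, left, right = tree
    axis = depth % len(target)
    best = min(best, math.dist(point, target))
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_distance(near, target, depth + 1, best)
    # Descend into the far side only if the splitting plane is closer than the best so far.
    if abs(diff) < best:
        best = nearest_distance(far, target, depth + 1, best)
    return best

def gamma_values(reference, evaluation, dta, dose_tol):
    """Gamma at each evaluation point; points are (position, dose) pairs."""
    def scale(pts):
        # Normalize so that one unit equals one acceptance criterion on each axis.
        return [(x / dta, d / dose_tol) for x, d in pts]
    tree = build_kdtree(scale(reference))
    return [nearest_distance(tree, p) for p in scale(evaluation)]

# Hypothetical 1D dose profiles: the evaluation is shifted by 0.3 and offset by 1 dose unit.
reference = [(float(x), 100.0 - 2.0 * x) for x in range(11)]
evaluation = [(x + 0.3, 101.0 - 2.0 * x) for x in range(10)]
gammas = gamma_values(reference, evaluation, dta=3.0, dose_tol=3.0)
passed = [g <= 1.0 for g in gammas]
```

The tree is built once from the reference points and queried once per evaluation point, matching the roles the abstract assigns to the two distributions.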

  3. Penerapan Metode K-nearest Neighbor pada Penentuan Grade Dealer Sepeda Motor

    OpenAIRE

    Leidiyana, Henny

    2017-01-01

    Mutually beneficial cooperation is very important for a leasing company and its dealers. Marketing incentives are given in order to attract as many consumers as possible. But sometimes surveyor objectivity is lost because of collusion in the field between marketing staff and surveyors. To overcome this, leasing companies use a variety of approaches, one of which is ranking the dealers. In this study the application of the k-Nearest Neighbor method and Euclidean distance measurement to determine the grade deal...

  4. Seismic clusters analysis in Northeastern Italy by the nearest-neighbor approach

    Science.gov (United States)

    Peresan, Antonella; Gentili, Stefania

    2018-01-01

    The main features of earthquake clusters in Northeastern Italy are explored, with the aim of gaining new insights into local scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters, which are identified by a statistical method, based on nearest-neighbor distances of events in the space-time-energy domain. The method permits us to highlight and investigate the internal structure of earthquake sequences, and to differentiate the spatial properties of seismicity according to the different topological features of the cluster structure. To analyze seismicity of Northeastern Italy, we use information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. A preliminary reappraisal of the earthquake bulletins is carried out and the area of sufficient completeness is outlined. Various techniques are considered to estimate the scaling parameters that characterize earthquake occurrence in the region, namely the b-value and the fractal dimension of the epicenter distribution, required for the application of the nearest-neighbor technique. Specifically, average robust estimates of the parameters of the Unified Scaling Law for Earthquakes, USLE, are assessed for the whole outlined region and are used to compute the nearest-neighbor distances. Cluster identification by the nearest-neighbor method turns out to be quite reliable and robust with respect to the minimum magnitude cutoff of the input catalog; the identified clusters are consistent with those obtained from manual aftershock identification of selected sequences. We demonstrate that the earthquake clusters have distinct preferred geographic locations, and we identify two areas that differ substantially in the examined clustering properties. Specifically, burst-like sequences are associated with the north-western part and swarm-like sequences with the south-eastern part of the study

  5. A Novel Quantum Solution to Privacy-Preserving Nearest Neighbor Query in Location-Based Services

    Science.gov (United States)

    Luo, Zhen-yu; Shi, Run-hua; Xu, Min; Zhang, Shun

    2018-04-01

    We present a cheating-sensitive quantum protocol for Privacy-Preserving Nearest Neighbor Query based on Oblivious Quantum Key Distribution and Quantum Encryption. Compared with related classical protocols, our proposed protocol has higher security, because its security is based on basic physical principles of quantum mechanics rather than on computational hardness assumptions. In particular, our protocol takes single photons as quantum resources and only needs to perform single-photon projective measurements. Therefore, it is feasible to implement this protocol with present technologies.

  6. Chaotic synchronization of nearest-neighbor diffusive coupling Hindmarsh-Rose neural networks in noisy environments

    International Nuclear Information System (INIS)

    Fang Xiaoling; Yu Hongjie; Jiang Zonglai

    2009-01-01

    The chaotic synchronization of Hindmarsh-Rose (HR) neural networks linked by a nonlinear coupling function is discussed. HR neural networks with nearest-neighbor diffusive coupling are treated as numerical examples. By the construction of a special nonlinear-coupled term, the chaotic system is coupled symmetrically. For networks of three and four neurons, a certain region of coupling strength corresponding to full synchronization is given, and the effects of network structure and noise position are analyzed. For networks of five or more neurons, full synchronization is very difficult to realize. All the results have been verified by calculating the maximum conditional Lyapunov exponent.

  7. Aftershock identification problem via the nearest-neighbor analysis for marked point processes

    Science.gov (United States)

    Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.

    2007-12-01

    A century of observations of world seismicity has revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide the most reliable information about earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on the hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in the space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time-oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from the ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.
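
A minimal sketch of the asymmetric distance may help here. It assumes the commonly used Baiesi-Paczuski form, in which the proximity of a later event j to an earlier event i is the product of the time gap, the epicentral distance raised to a fractal dimension d_f, and 10^(-b m_i); the function name, the toy catalog, and the values b = 1.0 and d_f = 1.6 are illustrative assumptions, not parameters from this abstract.

```python
import math

def nearest_neighbor_parents(events, b=1.0, d_f=1.6):
    """events: list of (time, x, y, magnitude). Returns one (parent_index, eta) per event."""
    links = []
    for tj, xj, yj, _mj in events:
        best_i, best_eta = None, math.inf
        for i, (ti, xi, yi, mi) in enumerate(events):
            dt = tj - ti
            if dt <= 0:
                continue  # asymmetric: only strictly earlier events can be parents
            r = math.hypot(xj - xi, yj - yi)
            # Proximity grows with time gap and distance, shrinks with parent magnitude.
            eta = dt * r ** d_f * 10.0 ** (-b * mi)
            if eta < best_eta:
                best_i, best_eta = i, eta
        links.append((best_i, best_eta))
    return links

# Hypothetical catalog: one large event followed by three smaller ones.
catalog = [(0.0, 0.0, 0.0, 6.0), (1.0, 1.0, 0.0, 3.0),
           (2.0, 1.5, 0.0, 3.2), (50.0, 80.0, 60.0, 4.0)]
links = nearest_neighbor_parents(catalog)
parents = [i for i, _ in links]  # [None, 0, 0, 0]: every later event attaches to the large one
```

Because a parent must be strictly earlier than its child, following parent links can never form a cycle, which is why the resulting graph is a time-oriented tree (or forest). Separating clustered from background events then amounts to thresholding the link distances, a step this sketch leaves out.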

  8. Sistem Rekomendasi Pada E-Commerce Menggunakan K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    Chandra Saha Dewa Prasetya

    2017-09-01

    The growing amount of product information available on the internet brings challenges to both customers and online businesses in the e-commerce environment. Customers often have difficulty finding products on the internet because of the sheer number of products sold there. In addition, online businesses often struggle because they hold large amounts of data about products, customers, and transactions, which makes it difficult to promote the right product to a particular target customer. Recommendation systems were developed to address these problems using various methods such as collaborative filtering, content-based, and hybrid. The collaborative filtering method uses customers' rating data, the content-based method uses product content such as title or description, and the hybrid method uses both as the basis of the recommendation. In this research, the k-nearest neighbor algorithm is used to determine the top-n product recommendations for each buyer. The results show that the content-based method outperforms the other methods because of data sparsity, that is, the condition in which the number of ratings given by customers is relatively small compared with the number of products available in e-commerce. Keywords: recommendation system, k-nearest neighbor, collaborative filtering, content based.

  9. Competing growth processes induced by next-nearest-neighbor interactions: Effects on meandering wavelength and stiffness

    Science.gov (United States)

    Blel, Sonia; Hamouda, Ajmi BH.; Mahjoub, B.; Einstein, T. L.

    2017-02-01

    In this paper we explore the meandering instability of vicinal steps with a kinetic Monte Carlo (kMC) simulation model that includes attractive next-nearest-neighbor (NNN) interactions. kMC simulations show that an increase of the NNN interaction strength leads to a considerable reduction of the meandering wavelength and to a weaker dependence of the wavelength on the deposition rate F. The dependences of the meandering wavelength on the temperature and the deposition rate obtained with simulations are in good quantitative agreement with the experimental result on the meandering instability of Cu(0 2 24) [T. Maroutian et al., Phys. Rev. B 64, 165401 (2001), 10.1103/PhysRevB.64.165401]. The effective step stiffness is found to depend not only on the strength of NNN interactions and the Ehrlich-Schwoebel barrier, but also on F. We argue that attractive NNN interactions intensify the incorporation of adatoms at step edges and enhance step roughening. Competition between NNN and nearest-neighbor interactions results in an alternative form of meandering instability which we call "roughening-limited" growth, rather than the attachment-detachment-limited growth that governs the Bales-Zangwill instability. The computed effective wavelength and the effective stiffness behave as λ_eff ~ F^{-q} and β̃_eff ~ F^{-p}, respectively, with q ≈ p/2.

  10. Sequential nearest-neighbor effects on computed ¹³C^α chemical shifts

    Energy Technology Data Exchange (ETDEWEB)

    Vila, Jorge A. [Cornell University, Baker Laboratory of Chemistry and Chemical Biology (United States); Serrano, Pedro; Wuethrich, Kurt [The Scripps Research Institute, Department of Molecular Biology (United States); Scheraga, Harold A., E-mail: has5@cornell.ed [Cornell University, Baker Laboratory of Chemistry and Chemical Biology (United States)

    2010-09-15

    To evaluate sequential nearest-neighbor effects on quantum-chemical calculations of ¹³C^α chemical shifts, we selected the structure of the nucleic acid binding (NAB) protein from the SARS coronavirus determined by NMR in solution (PDB id 2K87). NAB is a 116-residue α/β protein, which contains 9 prolines and has 50% of its residues located in loops and turns. Overall, the results presented here show that sizeable nearest-neighbor effects are seen only for residues preceding proline, where Pro introduces an overestimation, on average, of 1.73 ppm in the computed ¹³C^α chemical shifts. A new ensemble of 20 conformers representing the NMR structure of the NAB, which was calculated with an input containing backbone torsion angle constraints derived from the theoretical ¹³C^α chemical shifts as supplementary data to the NOE distance constraints, exhibits very similar topology and comparable agreement with the NOE constraints as the published NMR structure. However, the two structures differ in the patterns of differences between observed and computed ¹³C^α chemical shifts, Δ_{ca,i}, for the individual residues along the sequence. This indicates that the Δ_{ca,i} values for the NAB protein are primarily a consequence of the limited sampling by the bundles of 20 conformers used, as in common practice, to represent the two NMR structures, rather than of local flaws in the structures.

  11. Predicting the severity of nuclear power plant transients using nearest neighbors modeling optimized by genetic algorithms on a parallel computer

    International Nuclear Information System (INIS)

    Lin, J.; Bartal, Y.; Uhrig, R.E.

    1995-01-01

    The importance of automatic diagnostic systems for nuclear power plants (NPPs) has been discussed in numerous studies, and various such systems have been proposed. None of those systems was designed to predict the severity of the diagnosed scenario. A classification and severity prediction system for NPP transients is developed. The system is based on nearest neighbors modeling, which is optimized using genetic algorithms. The optimization process is used to determine the most important variables for each of the transient types analyzed. An enhanced version of the genetic algorithms is used in which a local downhill search is performed to further increase the accuracy achieved. The genetic algorithm search was implemented on a massively parallel supercomputer, the KSR1-64, to perform the analysis in a reasonable time. The data for this study were supplied by the high-fidelity simulator of the San Onofre unit 1 pressurized water reactor.

  12. Obstacle Detection for Intelligent Transportation Systems Using Deep Stacked Autoencoder and k-Nearest Neighbor Scheme

    KAUST Repository

    Dairi, Abdelkader; Harrou, Fouzi; Sun, Ying; Senouci, Mohamed

    2018-01-01

    Obstacle detection is an essential element for the development of intelligent transportation systems so that accidents can be avoided. In this study, we propose a stereovision-based method for detecting obstacles in urban environments. The proposed method uses a deep stacked auto-encoder (DSA) model that combines greedy feature learning with dimensionality reduction capacity and employs an unsupervised k-nearest neighbors (KNN) algorithm to accurately and reliably detect the presence of obstacles. We consider obstacle detection as an anomaly detection problem. We evaluated the proposed method using practical data from three publicly available datasets: the Malaga stereovision urban dataset (MSVUD), the Daimler urban segmentation dataset (DUSD), and the Bahnhof dataset. We also compared the efficiency of the DSA-KNN approach with that of deep belief network (DBN)-based clustering schemes. Results show that the DSA-KNN is suitable for visually monitoring urban scenes.

  13. Phosphorous vacancy nearest neighbor hopping induced instabilities in InP capacitors II. Computer simulation

    International Nuclear Information System (INIS)

    Juang, M.T.; Wager, J.F.; Van Vechten, J.A.

    1988-01-01

    Drain current drift in InP metal insulator semiconductor devices displays distinct activation energies and pre-exponential factors. The authors have given evidence that these result from two physical mechanisms: thermionic tunneling of electrons into native oxide traps and phosphorous vacancy nearest neighbor hopping (PVNNH). They here present a computer simulation of the effect of the PVNNH mechanism on flatband voltage shift vs. bias stress time measurements. The simulation is based on an analysis of the kinetics of the PVNNH defect reaction sequence in which the electron concentration in the channel is related to the applied bias by a solution of the Poisson equation. The simulation demonstrates quantitatively that the temperature dependence of the flatband shift is associated with PVNNH for temperatures above room temperature.

  14. False-nearest-neighbors algorithm and noise-corrupted time series

    International Nuclear Information System (INIS)

    Rhodes, C.; Morari, M.

    1997-01-01

    The false-nearest-neighbors (FNN) algorithm was originally developed to determine the embedding dimension for autonomous time series. For noise-free computer-generated time series, the algorithm does a good job in predicting the embedding dimension. However, the problem of predicting the embedding dimension when the time-series data are corrupted by noise was not fully examined in the original studies of the FNN algorithm. Here it is shown that with large data sets, even small amounts of noise can lead to incorrect prediction of the embedding dimension. Surprisingly, as the length of the time series analyzed by FNN grows larger, the cause of incorrect prediction becomes more pronounced. An analysis of the effect of noise on the FNN algorithm and a solution for dealing with the effects of noise are given here. Some results on the theoretically correct choice of the FNN threshold are also presented. © 1997 The American Physical Society
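
For readers unfamiliar with the basic FNN test, the following is a minimal sketch of the commonly cited distance-ratio criterion: a nearest neighbor in a d-dimensional delay embedding is counted as false if adding the (d+1)-th coordinate stretches the separation by more than a threshold. The function name, the threshold `r_tol = 10.0`, and the Hénon-map test signal are illustrative assumptions; the noise analysis and threshold results of the paper are not reproduced here.

```python
import math

def fnn_fraction(series, dim, delay=1, r_tol=10.0):
    """Fraction of nearest neighbors in `dim` dimensions that are false."""
    n = len(series) - dim * delay  # vectors that still have one more coordinate available
    vectors = [[series[i + k * delay] for k in range(dim)] for i in range(n)]
    false_count = counted = 0
    for i in range(n):
        # Brute-force nearest neighbor of vector i in the dim-dimensional embedding.
        dist, j = min((math.dist(vectors[i], vectors[m]), m) for m in range(n) if m != i)
        if dist == 0.0:
            continue  # identical vectors: the ratio below is undefined
        counted += 1
        # Growth of the separation when the (dim+1)-th coordinate is added.
        extra = abs(series[i + dim * delay] - series[j + dim * delay])
        if extra / dist > r_tol:
            false_count += 1
    return false_count / counted if counted else 0.0

# Noise-free test signal: the x-coordinate of the Hénon map (a = 1.4, b = 0.3).
x = [0.0, 0.0]
for _ in range(300):
    x.append(1.0 - 1.4 * x[-1] ** 2 + 0.3 * x[-2])
x = x[50:]  # drop the initial transient

fractions = {d: fnn_fraction(x, d) for d in (1, 2, 3)}
```

For this noise-free series each value is fully determined by the previous two, so the false-neighbor fraction falls to zero from dimension 2 onward. With noisy data the fractions no longer drop so cleanly, which is the regime the abstract addresses.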

  15. Algoritma Interpolasi Nearest-Neighbor untuk Pendeteksian Sampul Pulsa Oscilometri Menggunakan Mikrokontroler Berbiaya Rendah

    Directory of Open Access Journals (Sweden)

    Firdaus Firdaus

    2017-12-01

    Non-invasive blood pressure measurement devices are widely available in the marketplace. Most of these devices use the oscillometric principle, storing and analyzing oscillometric waveforms during cuff deflation to obtain mean arterial pressure, systolic blood pressure and diastolic blood pressure. Those pressure values are determined from the oscillometric waveform envelope. Several methods for detecting the envelope of oscillometric pulses use complex algorithms that require a large memory capacity and are therefore difficult to run on an embedded system with little memory. A simple nearest-neighbor interpolation method is applied to oscillometric pulse envelope detection in non-invasive blood pressure measurement using a microcontroller such as the ATmega328. In the experiment, the computation takes 59 seconds on average, with a 3.6% average percent error in blood pressure measurement.
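
Nearest-neighbor interpolation of an envelope can be sketched in a few lines: every sample takes the amplitude of the closest detected pulse peak. The sketch below is in Python for readability rather than microcontroller C, and the function name, peak positions, and amplitudes are hypothetical; it assumes the peak positions are already sorted.

```python
def nearest_neighbor_envelope(peak_positions, peak_amplitudes, length):
    """Fill an envelope of `length` samples with the amplitude of the nearest peak."""
    envelope = []
    k = 0  # index of the peak currently nearest to sample n
    for n in range(length):
        # Advance while the next peak is at least as close as the current one.
        while (k + 1 < len(peak_positions)
               and abs(peak_positions[k + 1] - n) <= abs(peak_positions[k] - n)):
            k += 1
        envelope.append(peak_amplitudes[k])
    return envelope

# Three hypothetical pulse peaks at samples 1, 4 and 8.
env = nearest_neighbor_envelope([1, 4, 8], [10, 30, 20], 10)
# env == [10, 10, 10, 30, 30, 30, 20, 20, 20, 20]
```

The loop needs only one index of state and no multiplications or divisions, which is the kind of property that makes this interpolation attractive on a small microcontroller.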

  16. Nearest neighbor spacing distributions of low-lying levels of vibrational nuclei

    International Nuclear Information System (INIS)

    Abul-Magd, A.Y.; Simbel, M.H.

    1996-01-01

    Energy-level statistics are considered for nuclei whose Hamiltonian is divided into intrinsic and collective-vibrational terms. The levels are described as a random superposition of independent sequences, each corresponding to a given number of phonons. The intrinsic motion is assumed chaotic. The level spacing distribution is found to be intermediate between the Wigner and Poisson distributions and similar in form to the spacing distribution of a system with classical phase space divided into separate regular and chaotic domains. We have obtained approximate expressions for the nearest neighbor spacing and cumulative spacing distribution valid when the level density is described by a constant-temperature formula and not involving additional free parameters. These expressions have been able to achieve good agreement with the experimental spacing distributions. © 1996 The American Physical Society
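
For reference, the two limiting distributions named above have standard closed forms, written here for a spacing s measured in units of the mean level spacing. These are textbook expressions, not the intermediate formulas derived in the paper, which the abstract does not give.

```latex
P_{\mathrm{Poisson}}(s) = e^{-s},
\qquad
P_{\mathrm{Wigner}}(s) = \frac{\pi}{2}\, s\, e^{-\pi s^{2}/4}
```

Both are normalized and have unit mean spacing. The Wigner form vanishes linearly as s → 0 (level repulsion, characteristic of chaotic motion), whereas the Poisson form is largest at s = 0 (uncorrelated levels, characteristic of regular motion).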

  17. K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

    Directory of Open Access Journals (Sweden)

    Cheng Lu

    2015-01-01

    The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it is not directly applicable to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is modified to handle the interval data, so that the improved AP algorithm becomes applicable to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.

  18. Obstacle Detection for Intelligent Transportation Systems Using Deep Stacked Autoencoder and k-Nearest Neighbor Scheme

    KAUST Repository

    Dairi, Abdelkader

    2018-04-30

    Obstacle detection is an essential element for the development of intelligent transportation systems so that accidents can be avoided. In this study, we propose a stereovision-based method for detecting obstacles in urban environments. The proposed method uses a deep stacked auto-encoder (DSA) model that combines greedy feature learning with dimensionality reduction capacity and employs an unsupervised k-nearest neighbors (KNN) algorithm to accurately and reliably detect the presence of obstacles. We consider obstacle detection as an anomaly detection problem. We evaluated the proposed method using practical data from three publicly available datasets: the Malaga stereovision urban dataset (MSVUD), the Daimler urban segmentation dataset (DUSD), and the Bahnhof dataset. We also compared the efficiency of the DSA-KNN approach with that of deep belief network (DBN)-based clustering schemes. Results show that the DSA-KNN is suitable for visually monitoring urban scenes.

  19. Rapid and Robust Cross-Correlation-Based Seismic Phase Identification Using an Approximate Nearest Neighbor Method

    Science.gov (United States)

    Tibi, R.; Young, C. J.; Gonzales, A.; Ballard, S.; Encarnacao, A. V.

    2016-12-01

    The matched filtering technique involving the cross-correlation of a waveform of interest with archived signals from a template library has proven to be a powerful tool for detecting events in regions with repeating seismicity. However, waveform correlation is computationally expensive, and therefore impractical for large template sets unless dedicated distributed computing hardware and software are used. In this study, we introduce an Approximate Nearest Neighbor (ANN) approach that enables the use of very large template libraries for waveform correlation without requiring a complex distributed computing system. Our method begins with a projection into a reduced dimensionality space based on correlation with a randomized subset of the full template archive. Searching for a specified number of nearest neighbors is accomplished by using randomized K-dimensional trees. We used the approach to search for matches to each of 2700 analyst-reviewed signal detections reported for May 2010 for the IMS station MKAR. The template library in this case consists of a dataset of more than 200,000 analyst-reviewed signal detections for the same station from 2002-2014 (excluding May 2010). Of these signal detections, 60% are teleseismic first P, and 15% regional phases (Pn, Pg, Sn, and Lg). The analyses, performed on a standard desktop computer, show that the proposed approach searches the large template libraries about 20 times faster than the standard full linear search, while achieving recall rates greater than 80%, with the recall rate increasing for higher correlation values. To decide whether to confirm a match, we use a hybrid method involving a cluster approach for queries with two or more matches, and correlation score for single matches. Of the signal detections that passed our confirmation process, 52% were teleseismic first P, and 30% were regional phases.

  20. A Distributed Approach to Continuous Monitoring of Constrained k-Nearest Neighbor Queries in Road Networks

    Directory of Open Access Journals (Sweden)

    Hyung-Ju Cho

    2012-01-01

    Given two positive parameters k and r, a constrained k-nearest neighbor (CkNN) query returns the k closest objects within a network distance r of the query location in road networks. In terms of the scalability of monitoring these CkNN queries, existing solutions based on central processing at a server suffer from a sudden and sharp rise in server load as well as messaging cost as the number of queries increases. In this paper, we propose a distributed and scalable scheme called DAEMON for the continuous monitoring of CkNN queries in road networks. Our query processing is distributed among clients (query objects) and the server. Specifically, the server evaluates CkNN queries issued at intersections of road segments, retrieves the objects on the road segments between neighboring intersections, and sends responses to the query objects. Finally, each client computes its own query result from this server response. As a result, our distributed scheme achieves close-to-optimal communication costs and scales well to large numbers of monitoring queries. Extensive experimental results demonstrate that our scheme substantially outperforms its competitor in terms of query processing time and messaging cost.
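
The CkNN definition in the first sentence can be made concrete with a small one-shot sketch: a Dijkstra expansion from the query node that stops after k objects and never goes beyond network distance r. This is a centralized illustration only, not the distributed DAEMON scheme; the function name, the toy road graph, and the simplification that objects sit exactly at intersections are all assumptions.

```python
import heapq

def constrained_knn(graph, objects, source, k, r):
    """Return up to k (distance, object) pairs within network distance r of `source`.

    graph: {node: [(neighbor, edge_length), ...]}; objects: {object_name: node}.
    """
    at_node = {}
    for name, node in objects.items():
        at_node.setdefault(node, []).append(name)
    dist = {source: 0.0}
    heap = [(0.0, source)]
    found = []
    while heap and len(found) < k:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        # Nodes are popped in order of increasing distance, so objects arrive sorted.
        for name in sorted(at_node.get(node, [])):
            if len(found) < k:
                found.append((d, name))
        for nbr, w in graph.get(node, []):
            nd = d + w
            # Never expand beyond the distance constraint r.
            if nd <= r and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return found

# Hypothetical road network (undirected, so each edge is listed in both directions).
roads = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("A", 2.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("D", 1.0)],
    "D": [("B", 4.0), ("C", 1.0)],
}
places = {"cafe": "B", "fuel": "C", "bank": "D"}
answer = constrained_knn(roads, places, source="A", k=2, r=3.5)
# answer == [(2.0, "cafe"), (3.0, "fuel")]; "bank" lies at network distance 4.0, beyond r
```

Re-running such a search for every moving query at a central server is the cost that grows sharply with the number of queries, which motivates pushing part of the work to the clients.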

  1. A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality.

    Science.gov (United States)

    Wang, Xueyi

    2012-02-08

    The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
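
The two stages can be sketched as follows, under stated assumptions: a basic Lloyd-style k-means for the buildup stage, and a search stage that prunes with the bound d(q, p) >= d(q, c) - d(p, c) for a point p in the cluster with center c. The function names, cluster count, iteration count, and random test data are illustrative, and the details may differ from the published kMkNN.

```python
import heapq
import math
import random

def kmeans(points, n_clusters, iters=10, seed=0):
    """Plain Lloyd iterations; returns the final centers and the last grouping."""
    centers = random.Random(seed).sample(points, n_clusters)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to its group mean (an empty group keeps its old center).
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups

def build_index(points, n_clusters):
    """Buildup stage: per cluster, store (distance to center, point), farthest first."""
    centers, groups = kmeans(points, n_clusters)
    return [(c, sorted(((math.dist(p, c), p) for p in g), reverse=True))
            for c, g in zip(centers, groups)]

def knn_search(index, query, k):
    """Searching stage: exact k-NN, returning the neighbors and the distance count."""
    best = []  # max-heap of the k best so far, stored as (-distance, point)
    computed = 0
    for center, members in sorted(index, key=lambda cm: math.dist(query, cm[0])):
        d_qc = math.dist(query, center)
        for d_pc, p in members:
            # Triangle inequality: d(q, p) >= d(q, c) - d(p, c).
            if len(best) == k and d_qc - d_pc >= -best[0][0]:
                break  # remaining members are closer to c, so their bound is even larger
            computed += 1
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), computed

rng = random.Random(1)
data = [(rng.random(), rng.random(), rng.random()) for _ in range(300)]
index = build_index(data, n_clusters=8)
neighbors, n_computed = knn_search(index, query=(0.5, 0.5, 0.5), k=5)
```

The pruning is safe for any assignment of points to centers, because the bound only relies on each stored distance being measured to the same center that is used at query time. How many distance computations are actually skipped depends on the data and the number of clusters.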

  2. Elliptic Painlevé equations from next-nearest-neighbor translations on the E_8^{(1)} lattice

    Science.gov (United States)

    Joshi, Nalini; Nakazono, Nobutaka

    2017-07-01

    The well-known elliptic discrete Painlevé equation of Sakai is constructed by a standard translation on the E_8^{(1)} lattice, given by nearest neighbor vectors. In this paper, we give a new elliptic discrete Painlevé equation obtained by translations along next-nearest-neighbor vectors. This equation is a generic (8-parameter) version of a 2-parameter elliptic difference equation found by reduction from Adler's partial difference equation, the so-called Q4 equation. We also provide a projective reduction of the well-known equation of Sakai.

  3. Linear perturbation renormalization group for the two-dimensional Ising model with nearest- and next-nearest-neighbor interactions in a field

    Science.gov (United States)

    Sznajd, J.

    2016-12-01

    The linear perturbation renormalization group (LPRG) is used to study the phase transition of weakly coupled Ising chains with intrachain (J) and interchain nearest-neighbor (J1) and next-nearest-neighbor (J2) interactions forming triangular and rectangular lattices in a field. Phase diagrams with the frustration point at J2=-J1/2 for a rectangular lattice and J2=-J1 for a triangular lattice have been found. The LPRG calculations support the idea that the phase transition is always continuous except at the frustration point and is accompanied by a divergence of the specific heat. For the antiferromagnetic chains, the external field does not substantially change the shape of the phase diagram. The critical temperature is suppressed to zero according to a power law when approaching the frustration point, with an exponent that depends on the value of the field.

  4. Monte Carlo study of a ferrimagnetic mixed-spin (2, 5/2) system with the nearest and next-nearest neighbors exchange couplings

    Science.gov (United States)

    Bi, Jiang-lin; Wang, Wei; Li, Qi

    2017-07-01

    In this paper, the effects of the next-nearest-neighbor exchange couplings on the magnetic and thermal properties of the ferrimagnetic mixed-spin (2, 5/2) Ising model on a 3D honeycomb lattice are investigated by Monte Carlo simulation. In particular, the influences of the exchange couplings (Ja, Jb, Jan) and the single-ion anisotropy (Da) on the phase diagrams, the total magnetization, the sublattice magnetization, the total susceptibility, the internal energy and the specific heat are discussed in detail. The results clearly show that the system can exhibit critical and compensation behavior when the next-nearest-neighbor exchange coupling is included. Many types of M curves, such as N-, Q-, P- and L-types, have been found, owing to the competition between the exchange coupling and the temperature. Our results are in excellent agreement with other theoretical and experimental works.

  5. Local Order in the Unfolded State: Conformational Biases and Nearest Neighbor Interactions

    Directory of Open Access Journals (Sweden)

    Siobhan Toal

    2014-07-01

    The discovery of Intrinsically Disordered Proteins (IDPs), which contain significant levels of disorder yet perform complex biological functions, as well as unwanted aggregation, has motivated numerous experimental and theoretical studies aimed at describing residue-level conformational ensembles. Multiple lines of evidence gathered over the last 15 years strongly suggest that amino acid residues display unique and restricted conformational preferences in the unfolded state of peptides and proteins, contrary to one of the basic assumptions of the canonical random coil model. To fully understand residue-level order/disorder, however, one has to gain a quantitative, experimentally based picture of conformational distributions and to determine the physical basis underlying residue-level conformational biases. Here, we review the experimental, computational and bioinformatic evidence for conformational preferences of amino acid residues in (mostly short) peptides that can be utilized as suitable model systems for unfolded states of peptides and proteins. In this context particular attention is paid to the alleged high polyproline II preference of alanine. We discuss how these conformational propensities may be modulated by peptide-solvent interactions and so-called nearest-neighbor interactions. The relevance of conformational propensities for the protein folding problem and the understanding of IDPs is briefly discussed.

  6. Disordering scaling and generalized nearest-neighbor approach in the thermodynamics of Lennard-Jones systems

    International Nuclear Information System (INIS)

    Vorob'ev, V.S.

    2003-01-01

    We suggest a concept of multiple disordering scaling of the crystalline state. Such a scaling procedure applied to a crystal leads to the liquid and (in the low-density limit) gas states. This approach provides an explanation for the high value of configuration (common) entropy of liquefied noble gases, which can be deduced from experimental data. We use the generalized nearest-neighbor approach to calculate the free energy and pressure of Lennard-Jones systems after performing this scaling procedure. These thermodynamic functions depend only on one parameter characterizing the disordering. Condensed states of the system (liquid and solid) correspond to small values of this parameter. When this parameter tends to unity, we get an asymptotically exact equation of state for a gas involving the second virial coefficient. A reasonable choice of the values for the disordering parameter (ranging between zero and unity) allows us to find the lines of coexistence between different phase states in Lennard-Jones systems, which are in good agreement with the available experimental data.

  7. Fracton topological order from nearest-neighbor two-spin interactions and dualities

    Science.gov (United States)

    Slagle, Kevin; Kim, Yong Baek

    2017-10-01

    Fracton topological order describes a remarkable phase of matter, which can be characterized by fracton excitations with constrained dynamics and a ground-state degeneracy that increases exponentially with the length of the system on a three-dimensional torus. However, previous models exhibiting this order require many-spin interactions, which may be very difficult to realize in a real material or cold atom system. In this work, we present a more physically realistic model which has the so-called X-cube fracton topological order [Vijay, Haah, and Fu, Phys. Rev. B 94, 235157 (2016), 10.1103/PhysRevB.94.235157] but only requires nearest-neighbor two-spin interactions. The model lives on a three-dimensional honeycomb-based lattice with one to two spin-1/2 degrees of freedom on each site and a unit cell of six sites. The model is constructed from two orthogonal stacks of Z2 topologically ordered Kitaev honeycomb layers [Kitaev, Ann. Phys. 321, 2 (2006), 10.1016/j.aop.2005.10.005], which are coupled together by a two-spin interaction. It is also shown that a four-spin interaction can be included to instead stabilize 3+1D Z2 topological order. We also find dual descriptions of four quantum phase transitions in our model, all of which appear to be discontinuous first-order transitions.

  8. Third nearest neighbor parameterized tight binding model for graphene nano-ribbons

    Directory of Open Access Journals (Sweden)

    Van-Truong Tran

    2017-07-01

    The existing tight binding models can very well reproduce the ab initio band structure of a 2D graphene sheet. For graphene nano-ribbons (GNRs), the current sets of tight binding parameters can successfully describe the semi-conducting behavior of all armchair GNRs. However, they still fail to reproduce accurately the slope of the bands, which is directly associated with the group velocity and the effective mass of electrons. In this work, both density functional theory and tight binding calculations were performed and a new set of tight binding parameters up to the third nearest neighbors including overlap terms is introduced. The results obtained with this model offer excellent agreement with the predictions of density functional theory in most cases of ribbon structures, even in the high-energy region. Moreover, this set can induce electron-hole asymmetry as manifested in results from density functional theory. Relevant outcomes are also achieved for armchair ribbons of various widths as well as for zigzag structures, thus opening a route for multi-scale atomistic simulation of large systems that cannot be considered using density functional theory.

  9. Spatiotemporal distribution of Oklahoma earthquakes: Exploring relationships using a nearest-neighbor approach

    Science.gov (United States)

    Vasylkivska, Veronika S.; Huerta, Nicolas J.

    2017-07-01

    Determining the spatiotemporal characteristics of natural and induced seismic events holds the opportunity to gain new insights into why these events occur. Linking the seismicity characteristics with other geologic, geographic, natural, or anthropogenic factors could help to identify the causes and suggest mitigation strategies that reduce the risk associated with such events. The nearest-neighbor approach utilized in this work represents a practical first step toward identifying statistically correlated clusters of recorded earthquake events. Detailed study of the Oklahoma earthquake catalog's inherent errors, empirical model parameters, and model assumptions is presented. We found that the cluster analysis results are stable with respect to empirical parameters (e.g., fractal dimension) but were sensitive to epicenter location errors and seismicity rates. Most critically, we show that the patterns in the distribution of earthquake clusters in Oklahoma are primarily defined by spatial relationships between events. This observation is a stark contrast to California (also known for induced seismicity) where a comparable cluster distribution is defined by both spatial and temporal interactions between events. These results highlight the difficulty in understanding the mechanisms and behavior of induced seismicity but provide insights for future work.

  10. Geometric k-nearest neighbor estimation of entropy and mutual information

    Science.gov (United States)

    Lord, Warren M.; Sun, Jie; Bollt, Erik M.

    2018-03-01

    Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induce a bias due to a poor description of the local geometry of the underlying probability measure. We introduce a new class of knn estimators that we call geometric knn estimators (g-knn), which use more complex local volume elements to better model the local geometry of the probability measures. As an example of this class of estimators, we develop a g-knn estimator of entropy and mutual information based on elliptical volume elements, capturing the local stretching and compression common to a wide range of dynamical system attractors. A series of numerical examples in which the thickness of the underlying distribution and the sample sizes are varied suggest that local geometry is a source of problems for knn methods such as the Kraskov-Stögbauer-Grassberger estimator when local geometric effects cannot be removed by global preprocessing of the data. The g-knn method performs well despite the manipulation of the local geometry. In addition, the examples suggest that the g-knn estimators can be of particular relevance to applications in which the system is large, but the data size is limited.
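
For orientation, the kind of "geometrically regular" k-NN estimator that this abstract contrasts with g-knn can be sketched as below. The block implements the classical Kozachenko-Leonenko entropy estimator with spherical volume elements, not the g-knn estimator itself; the function names and the brute-force neighbor search are assumptions chosen to keep the example dependency-free.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j<n} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(samples, k=3):
    """Kozachenko-Leonenko estimate of differential entropy (in nats).

    samples: list of equal-length tuples. Uses Euclidean balls as the local
    volume elements and an O(N^2) neighbor search.
    """
    n = len(samples)
    d = len(samples[0])
    # log volume of the d-dimensional unit ball
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_eps_sum = 0.0
    for i, x in enumerate(samples):
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_eps_sum += math.log(dists[k - 1])  # distance to the k-th neighbor
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_eps_sum / n
```

For a standard normal sample in one dimension the estimate should be close to the exact value 0.5 * log(2 * pi * e), about 1.42 nats. The spherical volume element is exactly the ingredient that the abstract proposes to replace with elliptical ones when the local geometry is stretched.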

  11. An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.

    Science.gov (United States)

    Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu

    2017-08-05

    The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
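
As background for the LANDMARC baseline mentioned in this abstract, a weighted k-nearest-neighbor position estimate over reference tags can be sketched as follows. This shows only the plain LANDMARC-style step (Euclidean distance in RSS space, weights proportional to 1/E^2); it does not include the Gaussian filtering or Bayesian stage of BKNN, and the function name and data layout are assumptions.

```python
import math


def landmarc_estimate(tag_rss, reference_tags, k=4):
    """Estimate (x, y) of a tracked tag from its RSS vector.

    tag_rss:        RSS readings of the tracked tag, one per reader.
    reference_tags: list of ((x, y), rss_vector) for tags at known positions.
    """
    # E_j: Euclidean distance in signal space to each reference tag
    scored = sorted(
        (math.dist(tag_rss, rss), pos) for pos, rss in reference_tags
    )[:k]
    if scored[0][0] == 0.0:
        return scored[0][1]  # identical readings: reuse that tag's position
    weights = [1.0 / (e * e) for e, _ in scored]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, scored)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, scored)) / total
    return (x, y)
```

A tracked tag whose readings are equally far from four reference tags at the corners of a unit square is placed at the center of the square; noisy RSS values shift the weights, which is the fluctuation the abstract sets out to reduce.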

  12. An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    He Xu

    2017-08-01

    The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.

  13. Reentrant behavior in the nearest-neighbor Ising antiferromagnet in a magnetic field

    Science.gov (United States)

    Neto, Minos A.; de Sousa, J. Ricardo

    2004-12-01

    Motivated by the H-T phase diagram of the bcc Ising antiferromagnet with nearest-neighbor interactions obtained by Monte Carlo simulation [Landau, Phys. Rev. B 16, 4164 (1977)], which shows a reentrant behavior at low temperature, with two critical temperatures in a magnetic field about 2% greater than the critical value Hc=8J, we apply the effective field renormalization group (EFRG) approach to this model on three-dimensional lattices (simple cubic, sc, and body-centered cubic, bcc). We find that the critical curve TN(H) exhibits a maximum around H≃Hc only in the bcc lattice case. We also discuss the critical behavior by the effective field theory in clusters with one (EFT-1) and two (EFT-2) spins, and a reentrant behavior is observed for the sc and bcc lattices. We have compared our EFRG results for the bcc lattice with Monte Carlo and series expansion, and we observe good agreement between the methods.

  14. Magnetization reversal in magnetic dot arrays: Nearest-neighbor interactions and global configurational anisotropy

    Energy Technology Data Exchange (ETDEWEB)

    Van de Wiele, Ben [Department of Electrical Energy, Systems and Automation, Ghent University, Technologiepark 913, B-9052 Ghent-Zwijnaarde (Belgium); Fin, Samuele [Dipartimento di Fisica e Scienze della Terra, Università degli Studi di Ferrara, 44122 Ferrara (Italy); Pancaldi, Matteo [CIC nanoGUNE, E-20018 Donostia-San Sebastian (Spain); Vavassori, Paolo [CIC nanoGUNE, E-20018 Donostia-San Sebastian (Spain); IKERBASQUE, Basque Foundation for Science, E-48013 Bilbao (Spain); Sarella, Anandakumar [Physics Department, Mount Holyoke College, 211 Kendade, 50 College St., South Hadley, Massachusetts 01075 (United States); Bisero, Diego [Dipartimento di Fisica e Scienze della Terra, Università degli Studi di Ferrara, 44122 Ferrara (Italy); CNISM, Unità di Ferrara, 44122 Ferrara (Italy)

    2016-05-28

    Various proposals for future magnetic memories, data processing devices, and sensors rely on a precise control of the magnetization ground state and magnetization reversal process in periodically patterned media. In finite dot arrays, such control is hampered by the magnetostatic interactions between the nanomagnets, leading to the non-uniform magnetization state distributions throughout the sample while reversing. In this paper, we evidence how during reversal typical geometric arrangements of dots in an identical magnetization state appear that originate in the dominance of either Global Configurational Anisotropy or Nearest-Neighbor Magnetostatic interactions, which depends on the fields at which the magnetization reversal sets in. Based on our findings, we propose design rules to obtain the uniform magnetization state distributions throughout the array, and also suggest future research directions to achieve non-uniform state distributions of interest, e.g., when aiming at guiding spin wave edge-modes through dot arrays. Our insights are based on the Magneto-Optical Kerr Effect and Magnetic Force Microscopy measurements as well as the extensive micromagnetic simulations.

  15. Evidence of codon usage in the nearest neighbor spacing distribution of bases in bacterial genomes

    Science.gov (United States)

    Higareda, M. F.; Geiger, O.; Mendoza, L.; Méndez-Sánchez, R. A.

    2012-02-01

    Statistical analysis of whole genomic sequences usually assumes a homogeneous nucleotide density throughout the genome, an assumption that has been proved incorrect for several organisms since the nucleotide density is only locally homogeneous. To avoid giving a single numerical value to this variable property, we propose the use of spectral statistics, which characterizes the density of nucleotides as a function of its position in the genome. We show that the cumulative density of bases in bacterial genomes can be separated into an average (or secular) plus a fluctuating part. Bacterial genomes can be divided into two groups according to the qualitative description of their secular part: linear and piecewise linear. These two groups of genomes show different properties when their nucleotide spacing distribution is studied. In order to analyze genomes having a variable nucleotide density statistically, the use of unfolding is necessary, i.e., to get a separation between the secular part and the fluctuations. The unfolding allows an adequate comparison with the statistical properties of other genomes. With this methodology, four genomes were analyzed: Burkholderia, Bacillus, Clostridium and Corynebacterium. Interestingly, the nearest neighbor spacing distributions or detrended distance distributions are very similar for species within the same genus but they are very different for species from different genera. This difference can be attributed to the difference in the codon usage.

  16. Heterogeneous autoregressive model with structural break using nearest neighbor truncation volatility estimators for DAX.

    Science.gov (United States)

    Chin, Wen Cheong; Lee, Min Cherng; Yap, Grace Lee Ching

    2016-01-01

    High frequency financial data modelling has become one of the important research areas in the field of financial econometrics. However, possible structural breaks in volatile financial time series often trigger inconsistency issues in volatility estimation. In this study, we propose a structural break heavy-tailed heterogeneous autoregressive (HAR) volatility econometric model with the enhancement of jump-robust estimators. The breakpoints in the volatility are captured by dummy variables after detection by the Bai-Perron sequential multiple breakpoint procedure. In order to further deal with possible abrupt jumps in the volatility, the jump-robust volatility estimators are composed by using the nearest neighbor truncation approach, namely the minimum and median realized volatility. Under the structural break improvements in both the models and volatility estimators, the empirical findings show that the modified HAR model provides the best performing in-sample and out-of-sample forecast evaluations as compared with the standard HAR models. Accurate volatility forecasts have a direct influence on the application of risk management and investment portfolio analysis.
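
The "minimum and median realized volatility" estimators named in this abstract are usually written in the following standard nearest-neighbor-truncation form (MinRV and MedRV). The sketch below assumes those standard definitions and an input list of intraday returns for one day; it is not taken from the paper's own code, and the function names are illustrative.

```python
import math
from statistics import median


def realized_variance(returns):
    """Plain realized variance: sum of squared intraday returns."""
    return sum(r * r for r in returns)


def min_rv(returns):
    """MinRV: each squared return is replaced by the squared minimum of two
    adjacent absolute returns, which truncates an isolated jump."""
    n = len(returns)
    a = [abs(r) for r in returns]
    s = sum(min(a[i], a[i + 1]) ** 2 for i in range(n - 1))
    return (math.pi / (math.pi - 2)) * (n / (n - 1)) * s


def med_rv(returns):
    """MedRV: uses the squared median of three adjacent absolute returns."""
    n = len(returns)
    a = [abs(r) for r in returns]
    s = sum(median(a[i - 1:i + 2]) ** 2 for i in range(1, n - 1))
    return (math.pi / (6 - 4 * math.sqrt(3) + math.pi)) * (n / (n - 2)) * s
```

On simulated Gaussian returns with unit integrated variance, all three estimators are close to 1; adding one large jump roughly doubles the plain realized variance while MinRV and MedRV stay near 1, which is the jump-robustness the abstract relies on.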

  17. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    Science.gov (United States)

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.

  18. A Comparison of the Spatial Linear Model to Nearest Neighbor (k-NN) Methods for Forestry Applications

    Science.gov (United States)

    Jay M. Ver Hoef; Hailemariam Temesgen; Sergio Gómez

    2013-01-01

    Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of aerial totals or averages are required. In this paper, two approaches for mapping and estimating totals; the spatial linear model (SLM) and k-NN (k-Nearest Neighbor) are compared, theoretically,...

  19. Mapping wildland fuels and forest structure for land management: a comparison of nearest neighbor imputation and other methods

    Science.gov (United States)

    Kenneth B. Pierce; Janet L. Ohmann; Michael C. Wimberly; Matthew J. Gregory; Jeremy S. Fried

    2009-01-01

    Land managers need consistent information about the geographic distribution of wildland fuels and forest structure over large areas to evaluate fire risk and plan fuel treatments. We compared spatial predictions for 12 fuel and forest structure variables across three regions in the western United States using gradient nearest neighbor (GNN) imputation, linear models (...

  20. Antiferromagnetic geometric frustration under the influence of the next-nearest-neighbor interaction. An exactly solvable model

    Science.gov (United States)

    Jurčišinová, E.; Jurčišin, M.

    2018-02-01

    The influence of the next-nearest-neighbor interaction on the properties of the geometrically frustrated antiferromagnetic systems is investigated in the framework of the exactly solvable antiferromagnetic spin-1/2 Ising model in the external magnetic field on the square-kagome recursive lattice, where the next-nearest-neighbor interaction is supposed between sites within each elementary square of the lattice. The thermodynamic properties of the model are investigated in detail and it is shown that the competition between the nearest-neighbor antiferromagnetic interaction and the next-nearest-neighbor ferromagnetic interaction changes properties of the single-point ground states but does not change the frustrated character of the basic model. On the other hand, the presence of the antiferromagnetic next-nearest-neighbor interaction leads to the enhancement of the frustration effects with the formation of additional plateau and single-point ground states at low temperatures. Exact expressions for magnetizations and residual entropies of all ground states of the model are found. It is shown that the model exhibits various ground states with the same value of magnetization but different macroscopic degeneracies as well as the ground states with different values of magnetization but the same value of the residual entropy. The specific heat capacity is investigated and it is shown that the model exhibits the Schottky-type anomaly behavior in the vicinity of each single-point ground state value of the magnetic field. The formation of the field-induced double-peak structure of the specific heat capacity at low temperatures is demonstrated and it is shown that its very existence is directly related to the presence of highly macroscopically degenerated single-point ground states in the model.

  1. Feature selection and nearest centroid classification for protein mass spectrometry

    Directory of Open Access Journals (Sweden)

    Levner Ilya

    2005-03-01

    Background: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. Results: This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. Student-t test, Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting-based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross-validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms does not hold across all data sets.
    Conclusion: This study tested a number of popular feature
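
The simplest pipeline in this abstract, a univariate filter followed by a nearest centroid classifier, can be sketched as below. It is a minimal two-class illustration that assumes a pooled-variance Student t statistic for ranking and class labels 0 and 1; the function names are hypothetical and none of the wrapper, embedded, or boosting variants are shown.

```python
import math
from statistics import mean, variance


def t_score(a, b):
    """Pooled-variance two-sample t statistic for one feature."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = mean(a) - mean(b)
    if pooled == 0:
        # both groups constant: perfectly separating or useless
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def select_features(X, y, n_keep):
    """Rank features by |t| between class 0 and class 1; keep the top n_keep."""
    scores = []
    for f in range(len(X[0])):
        a = [row[f] for row, label in zip(X, y) if label == 0]
        b = [row[f] for row, label in zip(X, y) if label == 1]
        scores.append((abs(t_score(a, b)), f))
    return [f for _, f in sorted(scores, reverse=True)[:n_keep]]


def fit_centroids(X, y, features):
    """Per-class mean vector restricted to the selected features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    return centroids


def predict(x, centroids, features):
    """Assign x to the class whose centroid is nearest (Euclidean)."""
    point = [x[f] for f in features]
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))
```

As the abstract stresses, the selection step belongs inside the cross-validation loop: calling `select_features` on the full data set before splitting would leak test information into the feature ranking.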

  2. Nearest-neighbor Kitaev exchange blocked by charge order in electron-doped α-RuCl3

    Science.gov (United States)

    Koitzsch, A.; Habenicht, C.; Müller, E.; Knupfer, M.; Büchner, B.; Kretschmer, S.; Richter, M.; van den Brink, J.; Börrnert, F.; Nowak, D.; Isaeva, A.; Doert, Th.

    2017-10-01

    A quantum spin liquid might be realized in α-RuCl3, a honeycomb-lattice magnetic material with substantial spin-orbit coupling. Moreover, α-RuCl3 is a Mott insulator, which implies the possibility that novel exotic phases occur upon doping. Here, we study the electronic structure of this material when intercalated with potassium by photoemission spectroscopy, electron energy loss spectroscopy, and density functional theory calculations. We obtain a stable stoichiometry at K0.5RuCl3. This gives rise to a peculiar charge disproportionation into formally Ru2+ (4d6) and Ru3+ (4d5). Every Ru 4d5 site with one hole in the t2g shell is surrounded by nearest neighbors of 4d6 character, where the t2g level is full and magnetically inert. Thus, each type of Ru site forms a triangular lattice, and nearest-neighbor interactions of the original honeycomb are blocked.

  3. Nearest neighbor imputation using spatial-temporal correlations in wireless sensor networks.

    Science.gov (United States)

    Li, YuanYuan; Parker, Lynne E

    2014-01-01

    Missing data is common in Wireless Sensor Networks (WSNs), especially with multi-hop communications. There are many reasons for this phenomenon, such as unstable wireless communications, synchronization issues, and unreliable sensors. Unfortunately, missing data creates a number of problems for WSNs. First, since most sensor nodes in the network are battery-powered, it is too expensive to have the nodes retransmit missing data across the network. Data re-transmission may also cause time delays when detecting abnormal changes in an environment. Furthermore, localized reasoning techniques on sensor nodes (such as machine learning algorithms to classify states of the environment) are generally not robust enough to handle missing data. Since sensor data collected by a WSN is generally correlated in time and space, we illustrate how replacing missing sensor values with spatially and temporally correlated sensor values can significantly improve the network's performance. However, our studies show that it is important to determine which nodes are spatially and temporally correlated with each other. Simple techniques based on Euclidean distance are not sufficient for complex environmental deployments. Thus, we have developed a novel Nearest Neighbor (NN) imputation method that estimates missing data in WSNs by learning spatial and temporal correlations between sensor nodes. To improve the search time, we utilize a kd-tree data structure, which is a non-parametric, data-driven binary search tree. Instead of using traditional mean and variance of each dimension for kd-tree construction, and Euclidean distance for kd-tree search, we use weighted variances and weighted Euclidean distances based on measured percentages of missing data. We have evaluated this approach through experiments on sensor data from a volcano dataset collected by a network of Crossbow motes, as well as experiments using sensor data from a highway traffic monitoring application. Our experimental

  4. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space

    KAUST Repository

    Tao, Yufei

    2010-07-01

    Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms an LSB-forest that has strong quality guarantees, but dramatically improves the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, which most of the existing solutions fail to do. We show that, using an LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, yet still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial

  5. Phase Transition and Critical Values of a Nearest-Neighbor System with Uncountable Local State Space on Cayley Trees

    International Nuclear Information System (INIS)

    Jahnel, Benedikt; Külske, Christof; Botirov, Golibjon I.

    2014-01-01

    We consider a ferromagnetic nearest-neighbor model on a Cayley tree of degree k ⩾ 2 with uncountable local state space [0,1] where the energy function depends on a parameter θ ∈ [0, 1). We show that for 0 ⩽ θ ⩽ 5/(3k) the model has a unique translation-invariant Gibbs measure. If 5/(3k) < θ < 1, there is a phase transition; in particular, there are three translation-invariant Gibbs measures.

  6. Influence of geometry on light harvesting in dendrimeric systems. II. nth-nearest neighbor effects and the onset of percolation

    International Nuclear Information System (INIS)

    Bentz, Jonathan L.; Kozak, John J.

    2006-01-01

    We explore the effect of imposing different constraints (biases, boundary conditions) on the mean time to trapping (or mean walklength) for a particle (excitation) migrating on a finite dendrimer lattice with a centrally positioned trap. By mobilizing the theory of finite Markov processes, we are able to obtain exact analytic expressions for site-specific walklengths as well as the overall walklength for both nearest-neighbor and second-nearest-neighbor displacements. This allows the comparison with and generalization of earlier results [A. Bar-Haim, J. Klafter, J. Phys. Chem. B 102 (1998) 1662; A. Bar-Haim, J. Klafter, J. Lumin. 76, 77 (1998) 197; O. Flomenbom, R.J. Amir, D. Shabat, J. Klafter, J. Lumin. 111 (2005) 315; J.L. Bentz, F.N. Hosseini, J.J. Kozak, Chem. Phys. Lett. 370 (2003) 319]. A novel feature of this work is the establishment of a connection between the random walk models studied here and percolation theory. The full dynamical behavior was also determined via solution of the stochastic master equation, and the results obtained compared with recent spectroscopic experiments

  7. Influence of geometry on light harvesting in dendrimeric systems. II. nth-nearest neighbor effects and the onset of percolation

    Energy Technology Data Exchange (ETDEWEB)

    Bentz, Jonathan L. [Department of Chemistry, Iowa State University, Ames, IA, 50011 (United States)]. E-mail: jnbntz@iastate.edu; Kozak, John J. [Beckman Institute, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125-7400 (United States)]

    2006-11-15

    We explore the effect of imposing different constraints (biases, boundary conditions) on the mean time to trapping (or mean walklength) for a particle (excitation) migrating on a finite dendrimer lattice with a centrally positioned trap. By mobilizing the theory of finite Markov processes, we are able to obtain exact analytic expressions for site-specific walklengths as well as the overall walklength for both nearest-neighbor and second-nearest-neighbor displacements. This allows the comparison with and generalization of earlier results [A. Bar-Haim, J. Klafter, J. Phys. Chem. B 102 (1998) 1662; A. Bar-Haim, J. Klafter, J. Lumin. 76, 77 (1998) 197; O. Flomenbom, R.J. Amir, D. Shabat, J. Klafter, J. Lumin. 111 (2005) 315; J.L. Bentz, F.N. Hosseini, J.J. Kozak, Chem. Phys. Lett. 370 (2003) 319]. A novel feature of this work is the establishment of a connection between the random walk models studied here and percolation theory. The full dynamical behavior was also determined via solution of the stochastic master equation, and the results obtained compared with recent spectroscopic experiments.

  8. Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Cobaugh Christian W

    2004-08-01

    Background: A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1. Results: The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases. Conclusion: Our analysis indicates that the current set of nearest-neighbor energy parameters, in conjunction with the Mfold folding algorithm, is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.

  9. Comparison of K-Nearest Neighbor and Naive Bayes for Classifying Land Suitable for Planting Teak Trees

    Directory of Open Access Journals (Sweden)

    Didik Srianto

    2016-10-01

    Data mining is the process of analyzing data from different perspectives and summarizing it into important information that can be used to increase profit, reduce costs, or both. Technically, data mining can be described as the process of finding correlations or patterns among hundreds or thousands of fields in a large relational database. Perum Perhutani KPH Semarang currently still uses a manual method to determine the plant type (teak / non-teak). K-Nearest Neighbour, or k-NN, is a data mining algorithm that can be used for classification and regression. The Naive Bayes classifier is a technique that can be used for classification. In this study, k-NN and Naive Bayes are used to classify teak tree data from Perum Perhutani KPH Semarang, and the classification results of the two methods are compared. Testing was performed using the RapidMiner software. After testing, k-NN was judged to be better than Naive Bayes, with accuracies of 96.66% and 82.63%, respectively. Keywords: k-NN, classification, Naive Bayes, teak tree planting
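
    The k-NN classifier compared in this record can be illustrated with a minimal sketch. It is not the study's RapidMiner workflow: the function name `knn_classify`, the two numeric site features, and all data values are hypothetical and chosen only to show the majority vote among the k nearest training samples.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data (NOT from the study): two made-up numeric site features.
train = [
    ((1.0, 1.2), "teak"), ((0.9, 1.0), "teak"), ((1.1, 0.8), "teak"),
    ((3.0, 3.1), "non-teak"), ((3.2, 2.9), "non-teak"), ((2.8, 3.3), "non-teak"),
]
label_near_first_cluster = knn_classify(train, (1.0, 1.0), k=3)
label_near_second_cluster = knn_classify(train, (3.0, 3.0), k=3)
```

    In practice the features would usually be scaled before computing distances, since k-NN is sensitive to differing feature ranges.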

  10. Feature Selection Experiments on Project Parameters for Software Effort Estimation with K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    Fachruddin Fachruddin

    2017-07-01

    Software effort estimation is the process of estimating software cost, an important step in carrying out software projects. Earlier studies have estimated software effort with a range of methods, both machine learning and non-machine-learning. This study conducts a set of attribute selection experiments on project parameters, using the k-nearest neighbours technique as the estimator and selecting attributes with information gain and mutual information, and examines how to find the most representative project parameters for software effort estimation. The software effort estimation datasets used in the experiments are albrecht, china, kemerer and mizayaki94, which can be obtained from a dedicated Software Effort Estimation data repository at the URL http://openscience.us/repo/effort/. The researchers then built an attribute selection application to select project parameters. This system produces ARFF datasets that have been filtered. The application was built in Java using the NetBeans IDE. The generated datasets contain the selected parameters, which are then compared when performing software effort estimation with the WEKA tool. Feature selection succeeded in reducing the estimation error (represented by the RAE and RMSE values). That is, the lower the error values (RAE and RMSE), the more accurate the resulting estimate. Estimation improved after feature selection with either information gain or mutual information. From the resulting error values it can be concluded that the datasets produced by feature selection with information gain are better than those produced with mutual information, although the difference between the two is not very significant.
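
    As an illustration of attribute ranking by information gain, the following sketch scores discrete attributes against a discretized target. It is not the authors' Java application: the attribute names, the binning of effort into "low" and "high", and the data values are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: effort binned into two classes, two made-up attributes.
effort = ["low", "low", "high", "high"]
attributes = {
    "team_size": ["small", "small", "large", "large"],  # perfectly informative
    "language": ["x", "y", "x", "y"],                   # uninformative
}
gains = {name: information_gain(values, effort) for name, values in attributes.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
```

    Attributes at the top of `ranking` would be kept and the rest dropped before running the k-NN estimator.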

  11. Novel qsar combination forecast model for insect repellent coupling support vector regression and k-nearest-neighbor

    International Nuclear Information System (INIS)

    Wang, L.F.; Bai, L.Y.

    2013-01-01

    To improve the precision of quantitative structure-activity relationship (QSAR) modeling for aromatic carboxylic acid derivative insect repellents, a novel nonlinear combination forecast model integrating support vector regression (SVR) and K-nearest neighbor (KNN) was proposed. First, the optimal kernel function is searched and molecular descriptors are selected nonlinearly by the minimum-MSE rule using SVR. Second, the effects of all descriptors on biological activity are elucidated by multi-round enforcement resistance-selection. Third, sub-models are constructed from the predicted values of different KNN models. Then, the optimal kernel and the corresponding retained sub-models are obtained through subtle selection. Finally, predictions are made with the leave-one-out (LOO) method on the basis of the retained sub-models. Compared with previous widely used models, our work shows significant improvement in modeling performance, which demonstrates the superiority of the present combination forecast model. (author)

  12. A Diagnosis Method for Rotation Machinery Faults Based on Dimensionless Indexes Combined with K-Nearest Neighbor Algorithm

    Directory of Open Access Journals (Sweden)

    Jianbin Xiong

    2015-01-01

    It is difficult to clearly distinguish the dimensionless indexes of normal petrochemical rotating machinery from those of equipment with complex faults. When the conflict of evidence is too large, the diagnosis becomes uncertain. This paper presents a diagnosis method for rotating machinery faults based on dimensionless indexes combined with the K-nearest neighbor (KNN) algorithm. The method uses a KNN algorithm and an evidence fusion formula to process fuzzy, incomplete, and accurate data. It converts the signals from the petrochemical rotating machinery sensors into reliability measures using dimensionless indexes and the KNN algorithm. The input information is further integrated by an evidence synthesis formula to obtain the final data, on which the fault type is decided. The experimental results show that the proposed method can integrate data to provide a more reliable and reasonable result, thereby reducing the decision risk.

  13. Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier

    Science.gov (United States)

    Allen, Victoria W; Shirasu-Hiza, Mimi

    2018-01-01

    Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual control model in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila. PMID:29485401

  14. Spin canting in a Dy-based single-chain magnet with dominant next-nearest-neighbor antiferromagnetic interactions

    Science.gov (United States)

    Bernot, K.; Luzon, J.; Caneschi, A.; Gatteschi, D.; Sessoli, R.; Bogani, L.; Vindigni, A.; Rettori, A.; Pini, M. G.

    2009-04-01

    We investigate theoretically and experimentally the static magnetic properties of single crystals of the molecular-based single-chain magnet of formula [Dy(hfac)3NIT(C6H4OPh)]∞ comprising alternating Dy3+ and organic radicals. The magnetic molar susceptibility χM displays a strong angular variation for sample rotations around two directions perpendicular to the chain axis. A peculiar inversion between maxima and minima in the angular dependence of χM occurs on increasing temperature. Using information regarding the monomeric building block as well as an ab initio estimation of the magnetic anisotropy of the Dy3+ ion, this “anisotropy-inversion” phenomenon can be assigned to weak one-dimensional ferromagnetism along the chain axis. This indicates that antiferromagnetic next-nearest-neighbor interactions between Dy3+ ions dominate, despite the large Dy-Dy separation, over the nearest-neighbor interactions between the radicals and the Dy3+ ions. Measurements of the field dependence of the magnetization, both along and perpendicularly to the chain, and of the angular dependence of χM in a strong magnetic field confirm such an interpretation. Transfer-matrix simulations of the experimental measurements are performed using a classical one-dimensional spin model with antiferromagnetic Heisenberg exchange interaction and noncollinear uniaxial single-ion anisotropies favoring a canted antiferromagnetic spin arrangement, with a net magnetic moment along the chain axis. The fine agreement obtained with experimental data provides estimates of the Hamiltonian parameters, essential for further study of the dynamics of rare-earth-based molecular chains.

  15. Microscopic theory of the nearest-neighbor valence bond sector of the spin-1/2 kagome antiferromagnet

    Science.gov (United States)

    Ralko, Arnaud; Mila, Frédéric; Rousochatzakis, Ioannis

    2018-03-01

    The spin-1/2 Heisenberg model on the kagome lattice, which is closely realized in layered Mott insulators such as ZnCu3(OH)6Cl2, is one of the oldest and most enigmatic spin-1/2 lattice models. While the numerical evidence has accumulated in favor of a quantum spin liquid, the debate is still open as to whether it is a Z2 spin liquid with very short-range correlations (some kind of resonating valence bond spin liquid), or an algebraic spin liquid with power-law correlations. To address this issue, we have pushed the program started by Rokhsar and Kivelson in their derivation of the effective quantum dimer model description of Heisenberg models to unprecedented accuracy for the spin-1/2 kagome, by including all the most important virtual singlet contributions on top of the orthogonalization of the nearest-neighbor valence bond singlet basis. Quite remarkably, the resulting picture is a competition between a Z2 spin liquid and a diamond valence bond crystal with a 12-site unit cell, as in the density-matrix renormalization group simulations of Yan et al. Furthermore, we found that, on cylinders of finite diameter d, there is a transition between the Z2 spin liquid at small d and the diamond valence bond crystal at large d, the prediction of the present microscopic description for the two-dimensional lattice. These results show that, if the ground state of the spin-1/2 kagome antiferromagnet can be described by nearest-neighbor singlet dimers, it is a diamond valence bond crystal, and, a contrario, that, if the system is a quantum spin liquid, it has to involve long-range singlets, consistent with the algebraic spin liquid scenario.

  16. Hole motion in the t-J and Hubbard models: Effect of a next-nearest-neighbor hopping

    International Nuclear Information System (INIS)

    Gagliano, E.; Bacci, S.; Dagotto, E.

    1990-01-01

    Using exact diagonalization techniques, we study one dynamical hole in the two-dimensional t-J and Hubbard models on a square lattice including a next-nearest-neighbor hopping t'. We present the phase diagram in the parameter space (J/t,t'/t), discussing the ground-state properties of the hole. At J=0, a crossing of levels exists at some value of t' separating a ferromagnetic from an antiferromagnetic ground state. For nonzero J, at least four different regions appear where the system behaves like an antiferromagnet or a (not fully saturated) ferromagnet. We study the quasiparticle behavior of the hole, showing that for small values of |t'| the previously presented string picture is still valid. We also find that, for a realistic set of parameters derived from the Cu-O Hamiltonian, the hole has momentum (π/2,π/2), suggesting an enhancement of the p-wave superconducting mode due to the second-neighbor interactions in the spin-bag picture. Results for the t-t'-U model are also discussed with conclusions similar to those of the t-t'-J model. In general we found that t'=0 is not a singular point of these models

  17. The influence of As/III pressure ratio on nitrogen nearest-neighbor environments in as-grown GaInNAs quantum wells

    International Nuclear Information System (INIS)

    Kudrawiec, R.; Poloczek, P.; Misiewicz, J.; Korpijaervi, V.-M.; Laukkanen, P.; Pakarinen, J.; Dumitrescu, M.; Guina, M.; Pessa, M.

    2009-01-01

    The energy fine structure, corresponding to different nitrogen nearest-neighbor environments, was observed in contactless electroreflectance (CER) spectra of as-grown GaInNAs quantum wells (QWs) obtained at various As/III pressure ratios. In the spectral range of the fundamental transition, two CER resonances were detected for samples grown at low As pressures, whereas only one CER resonance was observed for samples obtained at higher As pressures. This resonance corresponds to the most favorable nitrogen nearest-neighbor environment in terms of the total crystal energy. This means that the nitrogen nearest-neighbor environment in GaInNAs QWs can be controlled in the molecular beam epitaxy process by the As/III pressure ratio.

  18. Analysis and Identification of Aptamer-Compound Interactions with a Maximum Relevance Minimum Redundancy and Nearest Neighbor Algorithm.

    Science.gov (United States)

    Wang, ShaoPeng; Zhang, Yu-Hang; Lu, Jing; Cui, Weiren; Hu, Jerry; Cai, Yu-Dong

    2016-01-01

    The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, the aptamer-compound interaction is attracting increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need to design effective computational methods for finding effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, such as Maximum Relevance Minimum Redundancy, as well as incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. In addition, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.

  19. Randomized Approaches for Nearest Neighbor Search in Metric Space When Computing the Pairwise Distance Is Extremely Expensive

    Science.gov (United States)

    Wang, Lusheng; Yang, Yong; Lin, Guohui

    Finding the closest object for a query in a database is a classical problem in computer science. For some modern biological applications, computing the similarity between two objects can be very time-consuming. For example, it takes a long time to compute the edit distance between two whole chromosomes or the alignment cost of two 3D protein structures. In this paper, we study the nearest neighbor search problem in metric space, where the pairwise distances between objects in the database are known and we want to minimize the number of distances computed online between the query and objects in the database in order to find the closest object. We have designed two randomized approaches for indexing metric space databases, where objects are described purely by their distances to each other. Analysis and experiments show that our approaches need to compute distances to only O(log n) objects in order to find the closest object, where n is the total number of objects in the database.
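
    The setting described here (known pairwise distances, expensive query distances) can be illustrated with a generic pivot-elimination sketch based on the triangle inequality. This is a swapped-in textbook technique, not the authors' two randomized approaches, and it carries no O(log n) guarantee; the function name `nn_search` and the toy points on a line are assumptions for the example.

```python
import random

def nn_search(n, pair_dist, query_dist, seed=0):
    """Return (index of the nearest object, number of query distances computed).

    pair_dist(i, j): precomputed distance between database objects i and j.
    query_dist(i):   expensive distance between the query and object i.
    """
    rng = random.Random(seed)
    lower = [0.0] * n  # lower bounds on d(query, i) from the triangle inequality
    candidates = set(range(n))
    best, best_d, computed = None, float("inf"), 0
    while candidates:
        pivot = rng.choice(sorted(candidates))  # random pivot among survivors
        candidates.discard(pivot)
        d_qp = query_dist(pivot)
        computed += 1
        if d_qp < best_d:
            best, best_d = pivot, d_qp
        for x in list(candidates):
            # |d(q, p) - d(p, x)| <= d(q, x), so x cannot beat best if the bound is too large
            lower[x] = max(lower[x], abs(d_qp - pair_dist(pivot, x)))
            if lower[x] >= best_d:
                candidates.discard(x)
    return best, computed

# Hypothetical toy metric space: 100 points on a line, query at 37.4.
nearest, n_computed = nn_search(
    100,
    pair_dist=lambda i, j: abs(i - j),
    query_dist=lambda i: abs(i - 37.4),
)
```

    Every object discarded without a call to `query_dist` saves one expensive computation, which is the quantity the abstract aims to minimize.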

  20. ESTIMATING PHOTOMETRIC REDSHIFTS OF QUASARS VIA THE k-NEAREST NEIGHBOR APPROACH BASED ON LARGE SURVEY DATABASES

    International Nuclear Information System (INIS)

    Zhang Yanxia; Ma He; Peng Nanbo; Zhao Yongheng; Wu Xuebing

    2013-01-01

    We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. The k value at which kNN performs best differs with the input pattern and the data set. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results generally show that the more bands that contribute information, the better the performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is encountered by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.
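
    A minimal sketch of kNN regression, as used for redshift estimation here, averages the known redshifts of the k training objects closest to the query in feature space. The sketch uses a brute-force search rather than the KD-tree mentioned in the abstract (the neighbors found are the same, only the search is slower); the two "color" features and all values are hypothetical.

```python
import math

def knn_regress(train_x, train_y, query, k):
    """Predict the mean target value of the k training points nearest to query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical toy data: two made-up color indices per object and a known redshift.
colors = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (1.0, 1.1), (1.1, 0.9)]
redshift = [0.5, 0.6, 0.55, 2.0, 2.2]

z_estimate = knn_regress(colors, redshift, (0.12, 0.18), k=3)
```

    Adding bands in this picture simply lengthens each feature vector, which is how the multiband samples in the abstract enter the distance computation.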

  1. ESTIMATING PHOTOMETRIC REDSHIFTS OF QUASARS VIA THE k-NEAREST NEIGHBOR APPROACH BASED ON LARGE SURVEY DATABASES

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Yanxia; Ma He; Peng Nanbo; Zhao Yongheng [Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, 100012 Beijing (China)]; Wu Xuebing, E-mail: zyx@bao.ac.cn [Department of Astronomy, Peking University, 100871 Beijing (China)]

    2013-08-01

    We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. The k value at which kNN performs best differs with the input pattern and the data set. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results generally show that the more bands that contribute information, the better the performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is encountered by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.

  2. Highway Travel Time Prediction Using Sparse Tensor Completion Tactics and K-Nearest Neighbor Pattern Matching Method

    Directory of Open Access Journals (Sweden)

    Jiandong Zhao

    2018-01-01

    Remote transportation microwave sensor (RTMS) technology is being promoted for China’s highways. The distance between RTMSs is about 2 to 5 km, which leads to missing data and data sparseness problems. These two problems seriously restrict the accuracy of travel time prediction. To address the missing-data problem, a tensor completion method based on traffic multimode characteristics is proposed to recover the lost RTMS speed and volume data. To address the data sparseness problem, virtual sensor nodes are set up between real RTMS nodes, and two-dimensional linear interpolation and a piecewise method are applied to estimate the average travel time between two nodes. Next, an optimized KNN method is proposed for travel time prediction, improving on the traditional K-nearest neighbor method in three respects. First, the three original state vectors, that is, speed, volume, and time of the day, are subdivided into seven periods. Second, the traffic congestion level is added as a new state vector. Third, cross-validation is used to calibrate the K value to improve the adaptability of the KNN algorithm. All the algorithms are validated on data collected from the Jinggangao highway. The results show that the proposed method can improve data quality and the precision of travel time prediction.
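
    Calibrating K by cross-validation can be sketched as follows. This is a simplified leave-one-out version with a single hypothetical state variable (speed) and made-up travel times; it does not reproduce the paper's state vectors, congestion levels, or tensor completion step.

```python
def knn_predict(xs, ys, query, k):
    """Average the targets of the k samples whose state value is closest to query."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))
    return sum(ys[i] for i in order[:k]) / k

def loo_error(xs, ys, k):
    """Leave-one-out mean absolute error of the kNN predictor for a given k."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += abs(knn_predict(rest_x, rest_y, xs[i], k) - ys[i])
    return total / len(xs)

def calibrate_k(xs, ys, candidates):
    """Pick the candidate k with the lowest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(xs, ys, k))

# Hypothetical toy data: speed (km/h) and observed travel time (min) on one segment.
speed = [30, 35, 40, 60, 65, 70, 90, 95, 100]
travel_time = [20, 18, 16, 11, 10, 9, 7, 6.5, 6]

best_k = calibrate_k(speed, travel_time, candidates=[1, 2, 3, 8])
```

    With a large k the predictor averages over dissimilar traffic states, which is why the error curve typically has a minimum at a moderate k.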

  3. An improved coupled-states approximation including the nearest neighbor Coriolis couplings for diatom-diatom inelastic collision

    Science.gov (United States)

    Yang, Dongzheng; Hu, Xixi; Zhang, Dong H.; Xie, Daiqian

    2018-02-01

    Solving the time-independent close coupling equations of a diatom-diatom inelastic collision system by using the rigorous close-coupling approach is numerically difficult because of its expensive matrix manipulation. The coupled-states approximation decouples the centrifugal matrix by neglecting the important Coriolis couplings completely. In this work, a new approximation method based on the coupled-states approximation is presented and applied to time-independent quantum dynamic calculations. This approach only considers the most important Coriolis coupling with the nearest neighbors and ignores weaker Coriolis couplings with farther K channels. As a result, it reduces the computational costs without a significant loss of accuracy. Numerical tests for para-H2+ortho-H2 and para-H2+HD inelastic collision were carried out and the results showed that the improved method dramatically reduces the errors due to the neglect of the Coriolis couplings in the coupled-states approximation. This strategy should be useful in quantum dynamics of other systems.

  4. A Novel Hybrid Model Based on Extreme Learning Machine, k-Nearest Neighbor Regression and Wavelet Denoising Applied to Short-Term Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-05-01

    Electric load forecasting plays an important role in electricity markets and power systems. Because electric load time series are complicated and nonlinear, it is very difficult to achieve satisfactory forecasting accuracy. In this paper, a hybrid model, Wavelet Denoising-Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EWKM), which combines k-Nearest Neighbor (KNN) and Extreme Learning Machine (ELM) based on a wavelet denoising technique, is proposed for short-term load forecasting. The proposed hybrid model first decomposes the time series into a main signal associated with low frequencies and several detail signals associated with high frequencies, then uses KNN to determine the independent and dependent variables from the low-frequency signal. Finally, the ELM is used to capture the nonlinear relationship between these variables and produce the final prediction of the electric load. Compared with three other models, Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EKM), Wavelet Denoising-Extreme Learning Machine (WKM), and Wavelet Denoising-Back Propagation Neural Network optimized by k-Nearest Neighbor Regression (WNNM), the model proposed in this paper improves forecasting accuracy. New South Wales is the economic powerhouse of Australia, so we use the proposed model to predict electric demand for that region. Accurate prediction of this demand is of practical significance.

  5. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    Science.gov (United States)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
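
    The reported performance measures can be made concrete with a short sketch that computes accuracy and one-vs-rest sensitivity and specificity from true and predicted labels. The label lists below are hypothetical and are not the study's results; in a 10-fold cross-validation these quantities would be computed on each held-out fold and then averaged.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def class_metrics(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class named positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp)}

# Hypothetical labels for six spectra (NOT data from the paper).
y_true = ["ham", "ham", "loin", "loin", "tenderloin", "tenderloin"]
y_pred = ["ham", "loin", "loin", "loin", "tenderloin", "ham"]

overall_accuracy = accuracy(y_true, y_pred)
ham_metrics = class_metrics(y_true, y_pred, "ham")
```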

  6. Comparison of Two Classifiers; K-Nearest Neighbor and Artificial Neural Network, for Fault Diagnosis on a Main Engine Journal-Bearing

    Directory of Open Access Journals (Sweden)

    A. Moosavian

    2013-01-01

    Vibration analysis is an accepted method in condition monitoring of machines, since it can provide useful and reliable information about machine working condition. This paper surveys a new scheme for fault diagnosis of main journal-bearings of an internal combustion (IC) engine based on the power spectral density (PSD) technique and two classifiers, namely, K-nearest neighbor (KNN) and artificial neural network (ANN). Vibration signals for three different journal-bearing conditions (normal, oil starvation and extreme wear fault) were acquired from an IC engine. PSD was applied to process the vibration signals. Thirty features were extracted from the PSD values of the signals as a feature source for fault diagnosis. KNN and ANN were trained on the training data set and then used as diagnostic classifiers. The K value and the hidden neuron count (N) were varied in the range of 1 to 20, with a step size of 1, for KNN and ANN respectively to gain the best classification results. The roles of the PSD, KNN and ANN techniques were studied. The results show that ANN performs better than KNN. The experimental results demonstrate that the proposed diagnostic method can reliably separate different fault conditions in main journal-bearings of an IC engine.
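
    The sweep over K described above can be sketched as follows. This is an illustration only: it assumes Euclidean distance, majority voting and leave-one-out scoring, none of which the abstract specifies, and the names `knn_predict`, `loo_accuracy` and `best_k` are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Majority vote among the k training items nearest to `query`.

    train: list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy of kNN with a given k."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)


def best_k(data, k_values):
    """Return the first k in k_values with the highest leave-one-out accuracy."""
    return max(k_values, key=lambda k: loo_accuracy(data, k))


# Toy two-class data standing in for the extracted PSD feature vectors.
toy = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
       ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b")]

if __name__ == "__main__":
    print(best_k(toy, range(1, 8)))
```

    For the range used in the paper, the call would be `best_k(data, range(1, 21))`, given a data set with more than 20 samples.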

  7. Improved Multiscale Entropy Technique with Nearest-Neighbor Moving-Average Kernel for Nonlinear and Nonstationary Short-Time Biomedical Signal Analysis

    Directory of Open Access Journals (Sweden)

    S. P. Arunachalam

    2018-01-01

    Analysis of biomedical signals can yield invaluable information for prognosis, diagnosis, therapy evaluation, risk assessment, and disease prevention which is often recorded as short time series data that challenges existing complexity classification algorithms such as Shannon entropy (SE) and other techniques. The purpose of this study was to improve the previously developed multiscale entropy (MSE) technique by incorporating a nearest-neighbor moving-average kernel, which can be used for analysis of nonlinear and non-stationary short time series physiological data. The approach was tested for robustness with respect to noise analysis using simulated sinusoidal and ECG waveforms. Feasibility of MSE to discriminate between normal sinus rhythm (NSR) and atrial fibrillation (AF) was tested on a single-lead ECG. In addition, the MSE algorithm was applied to identify pivot points of rotors that were induced in ex vivo isolated rabbit hearts. The improved MSE technique robustly estimated the complexity of the signal compared to that of SE with various noises, discriminated NSR and AF on single-lead ECG, and precisely identified the pivot points of ex vivo rotors by providing better contrast between the rotor core and the peripheral region. The improved MSE technique can provide efficient complexity analysis of a variety of nonlinear and nonstationary short-time biomedical signals.

  8. Studying nearest neighbor correlations by atom probe tomography (APT) in metallic glasses as exemplified for Fe40Ni40B20 glassy ribbons

    KAUST Repository

    Shariq, Ahmed

    2012-01-01

    A next nearest neighbor evaluation procedure of atom probe tomography data provides distributions of the distances between atoms. The width of these distributions for metallic glasses studied so far is a few Angstroms, reflecting the spatial resolution of the analytical technique. However, fitting Gaussian distributions to the distribution of atomic distances yields average distances with statistical uncertainties of 2 to 3 hundredths of an Angstrom. Fe40Ni40B20 metallic glass ribbons are characterized this way in the as quenched state and for a state heat treated at 350 °C for 1 h, revealing a change in the structure on the sub-nanometer scale. By applying the statistical tool of the χ2 test, a slight deviation from a random distribution of B-atoms in the as quenched sample is perceived, whereas a pronounced elemental inhomogeneity of boron is detected for the annealed state. In addition, the distance distribution of the first fifteen atomic neighbors is determined by using this algorithm for both annealed and as quenched states. The next neighbor evaluation algorithm evinces a steric periodicity of the atoms when the next neighbor distances are normalized by the first next neighbor distance. A comparison of the nearest neighbor atomic distribution for as quenched and annealed state shows accumulation of Ni and B. Moreover, it also reveals the tendency of Fe and B to move slightly away from each other, an incipient step to Ni rich boride formation. © 2011 Elsevier B.V.

  9. Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines.

    Science.gov (United States)

    Majid, Abdul; Ali, Safdar; Iqbal, Mubashar; Kausar, Nabeela

    2014-03-01

    This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. The MTD-SVM model has provided the best values of accuracy, G-mean and Matthews correlation coefficient of 96.71%, 96.70% and 71.98% for cancer/non-cancer dataset, breast/non-breast cancer dataset and colon/non-colon cancer dataset, respectively. We found that hybrid MTD-SVM is the best with respect to prediction performance and computational cost. The MTD-KNN model has achieved moderately better prediction as compared to hybrid MTD-NB (Naïve Bayes) but at the expense of higher computing cost. The MTD-KNN model is faster than MTD-RF (random forest) but its prediction is not better than MTD-RF. To the best of our knowledge, the reported results are the best results, so far, for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer. This scheme may be useful for study of any sequential information such as protein sequence or any nucleic acid sequence. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  10. [Student Course Class Assignment System Using the K-Means and K-Nearest Neighbors Methods to Improve Learning Quality]

    Directory of Open Access Journals (Sweden)

    Gede Aditra Pradnyana

    2018-01-01

    A problem that arises when forming or dividing student classes is the difference in ability among the students in each class, which can make the learning process ineffective. Grouping students of similar ability is very important for improving the quality of teaching and learning. With appropriate grouping, students can help one another in the learning process. In addition, dividing classes according to student ability makes it easier for instructors to choose a suitable teaching method or strategy. Using the right teaching methods and strategies increases the effectiveness of teaching and learning. This study designs a new method for assigning students to course classes by combining the K-Means and K-Nearest Neighbors (KNN) methods. K-Means is used to divide students into classes based on the assessment components of the prerequisite course. The features used for grouping are assignment scores, midterm exam scores, final exam scores, and the cumulative grade point average (IPK). KNN is used to predict whether a student will pass a course based on previous data. The prediction is then used as an additional feature when forming classes with K-Means. The approach used in this study is the Software Development Life Cycle (SDLC) with the waterfall model. Testing showed that the number of clusters (classes) and the amount of data used affect the quality of the clusters formed by the K-Means and KNN methods. The highest Silhouette Index, 0.534, was obtained using 100 data points with 10 clusters, which falls in the medium structure quality class.
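
    The Silhouette Index reported in this record scores how well each point sits inside its assigned cluster. A minimal sketch of the standard definition follows, assuming Euclidean distance and at least two clusters; the function name and the toy points are hypothetical, and the record does not state which implementation was used.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette score s = (b - a) / max(a, b) over all points.

    a: mean distance from a point to the other members of its own cluster.
    b: smallest mean distance from the point to the members of another cluster.
    Points in singleton clusters score 0 by convention.
    """
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The point itself contributes distance 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy one-dimensional scores forming two obvious groups.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]

if __name__ == "__main__":
    print(silhouette_index(toy_points, [0, 0, 1, 1]))
```

    Values near 1 indicate compact, well separated clusters, while negative values suggest points assigned to the wrong cluster.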

  11. A Local Weighted Nearest Neighbor Algorithm and a Weighted and Constrained Least-Squared Method for Mixed Odor Analysis by Electronic Nose Systems

    Directory of Open Access Journals (Sweden)

    Jyuo-Min Shyu

    2010-11-01

    A great deal of work has been done to develop techniques for odor analysis by electronic nose systems. These analyses mostly focus on identifying a particular odor by comparing with a known odor dataset. However, in many situations, it would be more practical if each individual odorant could be determined directly. This paper proposes two methods for such odor components analysis for electronic nose systems. First, a K-nearest neighbor (KNN)-based local weighted nearest neighbor (LWNN) algorithm is proposed to determine the components of an odor. According to the component analysis, the odor training data is first categorized into several groups, each of which is represented by its centroid. The examined odor is then classified as the class of the nearest centroid. The distance between the examined odor and the centroid is calculated based on a weighting scheme, which captures the local structure of each predefined group. To further determine the concentration of each component, odor models are built by regressions. Then, a weighted and constrained least-squares (WCLS) method is proposed to estimate the component concentrations. Experiments were carried out to assess the effectiveness of the proposed methods. The LWNN algorithm is able to classify mixed odors with different mixing ratios, while the WCLS method can provide good estimates on component concentrations.
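
    The nearest-centroid step with a group-specific weighting can be sketched as follows. The abstract does not give the LWNN weighting scheme, so this sketch substitutes inverse per-feature variance within each group as a stand-in; that choice, and all names below, are assumptions made for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def inverse_variance_weights(points, center, eps=1e-9):
    """Assumed weighting: features that vary little within a group count more."""
    weights = []
    for d, c in enumerate(center):
        var = sum((p[d] - c) ** 2 for p in points) / len(points)
        weights.append(1.0 / (var + eps))
    return weights


def fit(groups):
    """groups: dict mapping label -> list of training vectors."""
    model = {}
    for label, points in groups.items():
        c = centroid(points)
        model[label] = (c, inverse_variance_weights(points, c))
    return model


def classify(model, x):
    """Assign x to the label whose centroid is nearest under that group's weights."""
    def weighted_distance(label):
        c, w = model[label]
        return math.sqrt(sum(wi * (xi - ci) ** 2
                             for wi, xi, ci in zip(w, x, c)))
    return min(model, key=weighted_distance)


# Hypothetical two-sensor readings for two odor groups.
toy_groups = {
    "A": [[0, 0], [0, 2], [2, 0], [2, 2]],
    "B": [[10, 10], [10, 12], [12, 10], [12, 12]],
}

if __name__ == "__main__":
    print(classify(fit(toy_groups), [1, 3]))
```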

  12. Weak doping dependence of the antiferromagnetic coupling between nearest-neighbor Mn2+ spins in (Ba1-xKx)(Zn1-yMny)2As2

    Science.gov (United States)

    Surmach, M. A.; Chen, B. J.; Deng, Z.; Jin, C. Q.; Glasbrenner, J. K.; Mazin, I. I.; Ivanov, A.; Inosov, D. S.

    2018-03-01

    Dilute magnetic semiconductors (DMS) are nonmagnetic semiconductors doped with magnetic transition metals. The recently discovered DMS material (Ba1-xKx)(Zn1-yMny)2As2 offers a unique and versatile control of the Curie temperature TC by decoupling the spin (Mn2+, S = 5/2) and charge (K+) doping in different crystallographic layers. In an attempt to describe from first-principles calculations the role of hole doping in stabilizing ferromagnetic order, it was recently suggested that the antiferromagnetic exchange coupling J between the nearest-neighbor Mn ions would experience a nearly twofold suppression upon doping 20% of holes by potassium substitution. At the same time, further-neighbor interactions become increasingly ferromagnetic upon doping, leading to a rapid increase of TC. Using inelastic neutron scattering, we have observed a localized magnetic excitation at about 13 meV associated with the destruction of the nearest-neighbor Mn-Mn singlet ground state. Hole doping results in a notable broadening of this peak, evidencing significant particle-hole damping, but with only a minor change in the peak position. We argue that this unexpected result can be explained by a combined effect of superexchange and double-exchange interactions.

  13. Dynamical correlation functions of the S=1/2 nearest-neighbor and Haldane-Shastry Heisenberg antiferromagnetic chains in zero and applied fields

    DEFF Research Database (Denmark)

    Lefmann, K.; Rischel, C.

    1996-01-01

    We present a numerical diagonalization study of two one-dimensional S=1/2 antiferromagnetic Heisenberg chains, having nearest-neighbor and Haldane-Shastry (1/r(2)) interactions, respectively. We have obtained the T=0 dynamical correlation function, S-alpha alpha(q,omega), for chains of length N=8......-28. We have studied S-zz(q,omega) for the Heisenberg chain in zero field, and from finite-size scaling we have obtained a limiting behavior that for large omega deviates from the conjecture proposed earlier by Muller et al. For both chains we describe the behavior of S-zz(q,omega) and S...

  14. The spectrum and the quantum Hall effect on the square lattice with next-nearest-neighbor hopping: Statistics of holons and spinons in the t-J model

    International Nuclear Information System (INIS)

    Hatsugai, Y.; Kohmoto, M.

    1992-01-01

    We investigate the energy spectrum and the Hall effect of electrons on the square lattice with next-nearest-neighbor (NNN) hopping as well as nearest-neighbor hopping. General rational values of magnetic flux per unit cell φ=p/q are considered. In the absence of NNN hopping, the two bands at the center touch for q even, thus the Hall conductance is not well defined at half filling. An energy gap opens there by introducing NNN hopping. When φ=1/2, the NNN model coincides with the mean field Hamiltonian for the chiral spin state proposed by Wen, Wilczek and Zee (WWZ). The Hall conductance is calculated from the Diophantine equation and the E-φ diagram. We find that gaps close for other fillings at certain values of NNN hopping strength. The quantized value of the Hall conductance changes once this phenomenon occurs. In a mean field treatment of the t-J model, the effective Hamiltonian is the same as our NNN model. From this point of view, the statistics of the quasi-particles is not always semion and depends on the filling and the strength of the mean field. (orig.)

  15. Time series classification using k-Nearest neighbours, Multilayer Perceptron and Learning Vector Quantization algorithms

    Directory of Open Access Journals (Sweden)

    Jiří Fejfar

    2012-01-01

    In this paper we present a comparison of the results of three artificial intelligence algorithms in the classification of time series derived from musical excerpts. The algorithms were chosen to represent different principles of classification: the statistical approach, neural networks and competitive learning. The first algorithm is the classical k-Nearest neighbours algorithm, the second is the Multilayer Perceptron (MLP), an example of an artificial neural network, and the third is the Learning Vector Quantization (LVQ) algorithm, which represents a supervised counterpart to the unsupervised Self Organizing Map (SOM). After our own earlier experiments with unlabelled data, we moved on to using data labels, which generally led to better classification accuracy. Because we need a huge data set of labelled time series (a priori knowledge of the correct class to which each time series instance belongs), we used musical excerpts as a source of real-world time series, having had good experience with them in earlier studies. We use the standard deviation of the sound signal as a descriptor of the volume level of a musical excerpt. We briefly describe the principle of each algorithm as well as its implementation, giving links for further research. The classification results of each algorithm are presented in a confusion matrix that shows the numbers of misclassifications and allows the overall accuracy of the algorithm to be evaluated. The results are compared and particular misclassifications are discussed for each algorithm. Finally, the best solution is chosen and further research goals are given.

  16. Recognition Number of The Vehicle Plate Using Otsu Method and K-Nearest Neighbour Classification

    Directory of Open Access Journals (Sweden)

    Maulidia Rahmah Hidayah

    2017-05-01

    License Plate Recognition (LPR) is a topic of current interest as a way to improve public services for vehicles, but LPR methods still need further research. Previous studies showed that K-Nearest Neighbour (KNN) succeeded in car license plate recognition. The objective of this research was to determine the implementation and accuracy of the Otsu method for license plate recognition. The research used the Otsu method to extract the plate characteristics and convert the plate image into a binary image, and KNN as the classification method for recognizing each character. The license plate recognition program built with the Otsu method and KNN classification follows the steps of pattern recognition: input and sensing, pre-processing, feature extraction by Otsu binarization, segmentation, KNN classification, and post-processing by calculating the level of accuracy. The study showed that the program recognized 82% of 100 test plates, with 93.75% number recognition accuracy and 91.92% letter recognition accuracy.

  17. DichroMatch at the protein circular dichroism data bank (DM@PCDDB): A web-based tool for identifying protein nearest neighbors using circular dichroism spectroscopy.

    Science.gov (United States)

    Whitmore, Lee; Mavridis, Lazaros; Wallace, B A; Janes, Robert W

    2018-01-01

    Circular dichroism spectroscopy is a well-used, but simple method in structural biology for providing information on the secondary structure and folds of proteins. DichroMatch (DM@PCDDB) is an online tool that is newly available in the Protein Circular Dichroism Data Bank (PCDDB), which takes advantage of the wealth of spectral and metadata deposited therein, to enable identification of spectral nearest neighbors of a query protein based on four different methods of spectral matching. DM@PCDDB can potentially provide novel information about structural relationships between proteins and can be used in comparison studies of protein homologs and orthologs. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  18. Correction of dental artifacts within the anatomical surface in PET/MRI using active shape models and k-nearest-neighbors

    DEFF Research Database (Denmark)

    Ladefoged, Claes N.; Andersen, Flemming L.; Keller, Sune H.

    2014-01-01

    In combined PET/MR, attenuation correction (AC) is performed indirectly based on the available MR image information. Metal implant-induced susceptibility artifacts and subsequent signal voids challenge MR-based AC. Several papers acknowledge the problem in PET attenuation correction when dental...... artifacts are ignored, but none of them attempts to solve the problem. We propose a clinically feasible correction method which combines Active Shape Models (ASM) and k-Nearest-Neighbors (kNN) into a simple approach which finds and corrects the dental artifacts within the surface boundaries of the patient...... anatomy. ASM is used to locate a number of landmarks in the T1-weighted MR-image of a new patient. We calculate a vector of offsets from each voxel within a signal void to each of the landmarks. We then use kNN to classify each voxel as belonging to an artifact or an actual signal void using this offset...

  19. Two tree-formation methods for fast pattern search using nearest-neighbour and nearest-centroid matching

    NARCIS (Netherlands)

    Schomaker, Lambertus; Mangalagiu, D.; Vuurpijl, Louis; Weinfeld, M.; Schomaker, Lambert; Vuurpijl, Louis

    2000-01-01

    This paper describes tree-based classification of character images, comparing two methods of tree formation and two methods of matching: nearest neighbor and nearest centroid. The first method, Preprocess Using Relative Distances (PURD), is a tree-based reorganization of a flat list of patterns,

  20. [Identification of Herbal Medicinal Plants Based on Leaf Images Using the Gray Level Co-occurrence Matrix and K-Nearest Neighbor Algorithms]

    Directory of Open Access Journals (Sweden)

    Fittria Shofrotun Ni'mah

    2018-03-01

    Medicinal plants can be used as a natural alternative to chemical drugs. However, because there are so many types of plants and knowledge of them is limited, these herbs are difficult to identify. Computer assistance can be used to facilitate their identification. This research proposes the identification of herbal plants based on leaf images using texture analysis. There are 10 types of herbal medicinal plants used in this study. The texture analysis used was the Gray Level Co-occurrence Matrix (GLCM), extracting contrast, correlation, energy, and homogeneity. Classification is done by KNN. The experimental results showed that the identification accuracy using 9-fold cross-validation (9 subsets) was 83.33%.

  1. Error minimizing algorithms for nearest neighbor classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory]; Hush, Don [Los Alamos National Laboratory]; Zimmer, G. Beate [Texas A&M]

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  2. Predicting persistence in the sediment compartment with a new automatic software based on the k-Nearest Neighbor (k-NN) algorithm.

    Science.gov (United States)

    Manganaro, Alberto; Pizzo, Fabiola; Lombardo, Anna; Pogliaghi, Alberto; Benfenati, Emilio

    2016-02-01

    The ability of a substance to resist degradation and persist in the environment needs to be readily identified in order to protect the environment and human health. Many regulations require the assessment of persistence for substances commonly manufactured and marketed. Besides laboratory-based testing methods, in silico tools may be used to obtain a computational prediction of persistence. We present a new program to develop k-Nearest Neighbor (k-NN) models. The k-NN algorithm is a similarity-based approach that predicts the property of a substance in relation to the experimental data for its most similar compounds. We employed this software to identify persistence in the sediment compartment. Data on half-life (HL) in sediment were obtained from different sources and, after careful data pruning, the final dataset, containing 297 organic compounds, was divided into four experimental classes. We developed several models giving satisfactory performances, considering that both the training and test set accuracy ranged between 0.90 and 0.96. We finally selected one model which will be made available in the near future in the freely available software platform VEGA. This model offers a valuable in silico tool that may be useful for fast and inexpensive screening. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. A novel method for the detection of R-peaks in ECG based on K-Nearest Neighbors and Particle Swarm Optimization

    Science.gov (United States)

    He, Runnan; Wang, Kuanquan; Li, Qince; Yuan, Yongfeng; Zhao, Na; Liu, Yang; Zhang, Henggui

    2017-12-01

    Cardiovascular diseases are associated with high morbidity and mortality. However, it is still a challenge to diagnose them accurately and efficiently. Electrocardiogram (ECG), a bioelectrical signal of the heart, provides crucial information about the dynamical functions of the heart, playing an important role in cardiac diagnosis. Because the QRS complex in ECG is associated with ventricular depolarization, accurate QRS detection is vital for interpreting ECG features. In this paper, we proposed a real-time, accurate, and effective algorithm for QRS detection. In the algorithm, a proposed preprocessor with a band-pass filter was first applied to remove baseline wander and power-line interference from the signal. After denoising, a method combining K-Nearest Neighbor (KNN) and Particle Swarm Optimization (PSO) was used for accurate QRS detection in ECGs with different morphologies. The proposed algorithm was tested and validated using 48 ECG records from the MIT-BIH arrhythmia database (MITDB) and achieved high average detection accuracy, sensitivity and positive predictivity of 99.43, 99.69, and 99.72%, respectively, indicating a notable improvement over existing algorithms reported in the literature.

  4. Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation

    Directory of Open Access Journals (Sweden)

    CARLOS ALBERTO SILVA

    Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and the non- k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape-levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and a root mean squared difference (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against the conventional inventory procedures.

  5. Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation.

    Science.gov (United States)

    Silva, Carlos Alberto; Klauberg, Carine; Hudak, Andrew T; Vierling, Lee A; Liesenberg, Veraldo; Bernett, Luiz G; Scheraiber, Clewerson F; Schoeninger, Emerson R

    2018-01-01

    Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and the non- k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape-levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and a root mean squared difference (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against the conventional inventory procedures.
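
    The general idea of k-NN imputation and of scoring it with a root mean squared difference can be sketched as follows. This is a simplified illustration, assuming Euclidean distance in the predictor space and an unweighted mean of the k nearest reference plots; the study's actual imputation variant and distance metric are not given in the abstract, and all names and numbers below are hypothetical.

```python
import math


def knn_impute(reference, query, k=3):
    """Impute a value for `query` as the mean over its k nearest reference plots.

    reference: list of (predictor_vector, observed_value) pairs,
    e.g. LiDAR metrics paired with a field-measured stand height.
    """
    nearest = sorted(reference, key=lambda r: math.dist(r[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)


def rmsd(observed, predicted):
    """Root mean squared difference between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmsd_percent(observed, predicted):
    """RMSD expressed as a percentage of the observed mean."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmsd(observed, predicted) / mean_observed


# Hypothetical reference plots: one LiDAR height metric -> field-measured height (m).
toy_reference = [((0.0,), 10.0), ((1.0,), 12.0), ((2.0,), 14.0), ((10.0,), 30.0)]

if __name__ == "__main__":
    print(knn_impute(toy_reference, (1.0,), k=3))
```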

  6. Exotic lagomorph may influence eagle abundances and breeding spatial aggregations: a field study and meta-analysis on the nearest neighbor distance

    Directory of Open Access Journals (Sweden)

    Facundo Barbar

    2018-05-01

    The introduction of alien species could be changing food source composition, ultimately restructuring demography and spatial distribution of native communities. In Argentine Patagonia, the exotic European hare has one of the highest numbers recorded worldwide and is now a widely consumed prey for many predators. We examine the potential relationship between abundance of this relatively new prey and the abundance and breeding spacing of one of its main consumers, the Black-chested Buzzard-Eagle (Geranoaetus melanoleucus). First we analyzed the abundance of individuals of a raptor guild in relation to hare abundance through a correspondence analysis. We then estimated the Nearest Neighbor Distance (NND) of the Black-chested Buzzard-Eagle in the two areas with high hare abundances. Finally, we performed a meta-regression between the NND and the body masses of Accipitridae raptors, to evaluate whether the Black-chested Buzzard-Eagle NND deviates from the value expected according to their mass. We found that eagle abundance was highly associated with hare abundance, more than that of any other raptor species in the study area. Their NND was significantly lower than expected for a raptor species of this size in two areas with high hare abundance. Our results support the hypothesis that high local abundance of prey leads to a reduction of the breeding spacing of its main predator, which could potentially alter other interspecific interactions, and thus the entire community.
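
    The Nearest Neighbor Distance used here is, for each nest, the distance to the closest other nest. A minimal sketch follows, assuming planar coordinates in a projected system and a brute-force search; the function names and coordinates are hypothetical.

```python
import math


def nearest_neighbor_distances(points):
    """For each point, the Euclidean distance to its closest other point."""
    distances = []
    for i, p in enumerate(points):
        distances.append(min(math.dist(p, q)
                             for j, q in enumerate(points) if j != i))
    return distances


def mean_nnd(points):
    """Mean nearest neighbor distance over all points."""
    distances = nearest_neighbor_distances(points)
    return sum(distances) / len(distances)


# Hypothetical nest coordinates in kilometres.
toy_nests = [(0, 0), (3, 4), (3, 0)]

if __name__ == "__main__":
    print(nearest_neighbor_distances(toy_nests))
    print(mean_nnd(toy_nests))
```

    The brute-force search is quadratic in the number of nests, which is usually acceptable for field data sets of this kind.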

  7. α-K2AgF4: Ferromagnetism induced by the weak superexchange of different eg orbitals from the nearest neighbor Ag ions

    Science.gov (United States)

    Zhang, Xiaoli; Zhang, Guoren; Jia, Ting; Zeng, Zhi; Lin, H. Q.

    2016-05-01

    We study the abnormal ferromagnetism in α-K2AgF4, which is very similar to high-TC parent material La2CuO4 in structure. We find out that the electron correlation is very important in determining the insulating property of α-K2AgF4. The Ag(II) 4d9 in the octahedron crystal field has the t2g6 eg3 electron occupation with eg x2-y2 orbital fully occupied and 3z2-r2 orbital partially occupied. The two eg orbitals are very extended, indicating both of them are active in superexchange. Using the Hubbard model combined with Nth-order muffin-tin orbital (NMTO) downfolding technique, it is concluded that the exchange interaction between eg 3z2-r2 and x2-y2 from the first nearest neighbor Ag ions leads to the anomalous ferromagnetism in α-K2AgF4.

  8. α-K2AgF4: Ferromagnetism induced by the weak superexchange of different eg orbitals from the nearest neighbor Ag ions

    Directory of Open Access Journals (Sweden)

    Xiaoli Zhang

    2016-05-01

    Full Text Available We study the abnormal ferromagnetism in α-K2AgF4, which is very similar in structure to the high-Tc parent material La2CuO4. We find that the electron correlation is very important in determining the insulating property of α-K2AgF4. The Ag(II) 4d9 ion in the octahedral crystal field has the t2g6 eg3 electron occupation, with the eg x2-y2 orbital fully occupied and the 3z2-r2 orbital partially occupied. The two eg orbitals are very extended, indicating that both of them are active in superexchange. Using the Hubbard model combined with the Nth-order muffin-tin orbital (NMTO) downfolding technique, it is concluded that the exchange interaction between the eg 3z2-r2 and x2-y2 orbitals of first nearest neighbor Ag ions leads to the anomalous ferromagnetism in α-K2AgF4.

  9. Large-Scale Mapping of Carbon Stocks in Riparian Forests with Self-Organizing Maps and the k-Nearest-Neighbor Algorithm

    Directory of Open Access Journals (Sweden)

    Leonhard Suchenwirth

    2014-07-01

    Full Text Available Among the machine learning tools being used in recent years for environmental applications such as forestry, self-organizing maps (SOM) and the k-nearest neighbor (kNN) algorithm have been used successfully. We applied both methods for the mapping of organic carbon (Corg) in riparian forests due to their considerably high carbon storage capacity. Despite the importance of floodplains for carbon sequestration, a sufficient scientific foundation for creating large-scale maps showing the spatial Corg distribution is still missing. We estimated organic carbon in a test site in the Danube Floodplain based on RapidEye remote sensing data and additional geodata. Accordingly, carbon distribution maps of vegetation, soil, and total Corg stocks were derived. Results were compared and statistically evaluated with terrestrial survey data for outcomes with pure remote sensing data and for the combination with additional geodata using bias and the Root Mean Square Error (RMSE). Results show that SOM and kNN approaches enable us to reproduce spatial patterns of riparian forest Corg stocks. While vegetation Corg has very high RMSEs, outcomes for soil and total Corg stocks are less biased with a lower RMSE, especially when remote sensing and additional geodata are conjointly applied. SOMs show similar percentages of RMSE to kNN estimations.

  10. Data-driven method based on particle swarm optimization and k-nearest neighbor regression for estimating capacity of lithium-ion battery

    International Nuclear Information System (INIS)

    Hu, Chao; Jain, Gaurav; Zhang, Puqiang; Schmidt, Craig; Gomadam, Parthasarathy; Gorka, Tom

    2014-01-01

    Highlights: • We develop a data-driven method for the battery capacity estimation. • Five charge-related features that are indicative of the capacity are defined. • The kNN regression model captures the dependency of the capacity on the features. • Results with 10 years’ continuous cycling data verify the effectiveness of the method. - Abstract: Reliability of lithium-ion (Li-ion) rechargeable batteries used in implantable medical devices has been recognized as of high importance from a broad range of stakeholders, including medical device manufacturers, regulatory agencies, physicians, and patients. To ensure Li-ion batteries in these devices operate reliably, it is important to be able to assess the battery health condition by estimating the battery capacity over the life-time. This paper presents a data-driven method for estimating the capacity of Li-ion battery based on the charge voltage and current curves. The contributions of this paper are three-fold: (i) the definition of five characteristic features of the charge curves that are indicative of the capacity, (ii) the development of a non-linear kernel regression model, based on the k-nearest neighbor (kNN) regression, that captures the complex dependency of the capacity on the five features, and (iii) the adaptation of particle swarm optimization (PSO) to finding the optimal combination of feature weights for creating a kNN regression model that minimizes the cross validation (CV) error in the capacity estimation. Verification with 10 years’ continuous cycling data suggests that the proposed method is able to accurately estimate the capacity of Li-ion battery throughout the whole life-time
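
    The kNN regression with feature weights described in this record can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-distance kernel, the function name `weighted_knn_regress`, and the idea of passing the feature weights in directly (rather than finding them with PSO and cross validation) are assumptions made for illustration only.

```python
import numpy as np

def weighted_knn_regress(X_train, y_train, x_query, feature_weights, k=3):
    """Predict a target as the inverse-distance-weighted mean of the
    k nearest training targets, using a feature-weighted Euclidean distance.

    feature_weights is assumed to be given; in the record above it would be
    tuned by an optimizer such as PSO, which is not implemented here.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    w = np.asarray(feature_weights, dtype=float)

    # Feature-weighted Euclidean distance from the query to every training row.
    diff = X_train - np.asarray(x_query, dtype=float)
    dist = np.sqrt(np.sum(w * diff ** 2, axis=1))

    # Indices of the k closest training samples.
    nearest = np.argsort(dist)[:k]

    # Inverse-distance kernel (an assumed choice); epsilon avoids division by zero.
    kernel = 1.0 / (dist[nearest] + 1e-12)
    return float(np.sum(kernel * y_train[nearest]) / np.sum(kernel))
```

    Setting a feature weight to zero removes that feature from the distance, which is one way to see why the choice of weights changes which neighbors are selected.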

  11. Experimental Validation of an Efficient Fan-Beam Calibration Procedure for k-Nearest Neighbor Position Estimation in Monolithic Scintillator Detectors

    Science.gov (United States)

    Borghi, Giacomo; Tabacchini, Valerio; Seifert, Stefan; Schaart, Dennis R.

    2015-02-01

    Monolithic scintillator detectors can achieve excellent spatial resolution and coincidence resolving time. However, their practical use for positron emission tomography (PET) and other applications in the medical imaging field is still limited due to drawbacks of the different methods used to estimate the position of interaction. Common statistical methods, for example, require the collection of an extensive dataset of reference events with a narrow pencil beam aimed at a fine grid of reference positions. Such procedures are time consuming and not straightforwardly implemented in systems composed of many detectors. Here, we experimentally demonstrate for the first time a new calibration procedure for k-nearest neighbor (k-NN) position estimation that utilizes reference data acquired with a fan beam. The procedure is tested on two detectors consisting of 16 mm × 16 mm × 10 mm and 16 mm × 16 mm × 20 mm monolithic, Ca-codoped LSO:Ce crystals and digital photon counter (DPC) arrays. For both detectors, the spatial resolution and the bias obtained with the new method are found to be practically the same as those obtained with the previously used method based on pencil-beam irradiation, while the calibration time is reduced by a factor of 20. Specifically, a FWHM of 1.1 mm and a FWTM of 2.7 mm were obtained using the fan-beam method with the 10 mm crystal, whereas a FWHM of 1.5 mm and a FWTM of 6 mm were achieved with the 20 mm crystal. Using a fan beam made with a 4.5 MBq 22Na point source and a tungsten slit collimator with 0.5 mm aperture, the total measurement time needed to acquire the reference dataset was 3 hours for the thinner crystal and 2 hours for the thicker one.

  12. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Science.gov (United States)

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed. Among them, the support vector machine (SVM) might be the best classifier. However, SVM is a two-class classifier, and when it is used for multi-class recognition in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computational cost of SVM. Experiments on NC-SVM classification for OPCCR show that the proposed NC-SVM can effectively reduce the computation time in OPCCR.

  13. Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

    DEFF Research Database (Denmark)

    Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan

    2009-01-01

    The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify t...... other previously applied intelligent approaches....

  14. Remaining Useful Life Estimation of Insulated Gate Bipolar Transistors (IGBTs) Based on a Novel Volterra k-Nearest Neighbor Optimally Pruned Extreme Learning Machine (VKOPP) Model Using Degradation Data

    Directory of Open Access Journals (Sweden)

    Zhen Liu

    2017-11-01

    Full Text Available The insulated gate bipolar transistor (IGBT) is a high-performance switching device used widely in power electronic systems. How to estimate the remaining useful life (RUL) of an IGBT to ensure the safety and reliability of the power electronics system is currently a challenging issue in the field of IGBT reliability. The aim of this paper is to develop a prognostic technique for estimating IGBTs' RUL. There is a need for an efficient prognostic algorithm that is able to support in-situ decision-making. In this paper, a novel prediction model with a complete structure based on the optimally pruned extreme learning machine (OPELM) and Volterra series is proposed to track the IGBT's degradation trace and estimate its RUL; we refer to this model as the Volterra k-nearest neighbor OPELM prediction (VKOPP) model. This model uses the minimum entropy rate method and Volterra series to reconstruct the phase space for IGBTs' ageing samples, and a new weight update algorithm, which can effectively reduce the influence of outliers and noise, is utilized to establish the VKOPP network; then a combination of the k-nearest neighbor (KNN) method and the least squares estimation (LSE) method is used to calculate the output weights of OPELM and predict the RUL of the IGBT. The prognostic results show that the proposed approach can predict the RUL of IGBT modules with small error and achieve higher prediction precision and lower time cost than some classic prediction approaches.

  15. Improving the family orientation process in Cuban Special Schools through Nearest Prototype classification

    Directory of Open Access Journals (Sweden)

    Caballero-Mota, Y.

    2013-03-01

    Full Text Available Cuban Schools for children with Affective – Behavioral Maladies (SABM) aim to accomplish a major change in children's behavior, so as to insert them effectively into society. One of the key elements of this objective is to give adequate orientation to the children's families, because the family is one of the most important educational contexts in which children develop their personality. The family orientation process in SABM involves clustering and classification of mixed-type data with non-symmetric similarity functions. To improve this process, this paper includes some novel characteristics in clustering and prototype selection. The proposed approach uses a hierarchical clustering based on compact sets, making it suitable for dealing with non-symmetric similarity functions, as well as with mixed and incomplete data. The proposal obtains very good results on the SABM data and on repository databases.

  16. Nearest patch matching for color image segmentation supporting neural network classification in pulmonary tuberculosis identification

    Science.gov (United States)

    Rulaningtyas, Riries; Suksmono, Andriyan B.; Mengko, Tati L. R.; Saptawati, Putri

    2016-03-01

    Pulmonary tuberculosis is a deadly infectious disease that occurs in many countries in Asia and Africa. In Indonesia, many people with tuberculosis are examined at community health centers. Pulmonary tuberculosis is examined through sputum smears with Ziehl-Neelsen staining under a conventional light microscope. Ziehl-Neelsen staining makes tuberculosis (TB) bacteria appear red against a blue sputum background. The examination first detects the presence of TB bacteria from their color and then from their morphology. Because the stained sputum smears produce complex color images, manual slide examination is difficult for clinicians: it is time consuming and requires extensive training to detect the presence of TB bacteria accurately. Clinicians also carry a heavy workload, examining many sputum smear slides from patients. To assist clinicians in reading sputum smear slides, this research built a computer-aided diagnosis system with color image segmentation, feature extraction, and classification. K-means clustering with a patch technique was used to segment digital sputum smear images, separating the TB bacteria images from the background images. This segmentation method achieved an accuracy of 97.68%. Feature extraction based on the geometrical shape of TB bacteria was then applied. Finally, a neural network trained with back propagation was used to classify TB bacteria and non-TB bacteria images in sputum slides. The neural network results were a learning time of (42.69±0.02) seconds, 5000 epochs, a learning error rate of 15%, a learning accuracy of (98.58±0.01)%, and a test accuracy of (96.54±0.02)%.

  17. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Iwan Setyawan

    2012-12-01

    Full Text Available This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The pre-processing methods are based on the combinations of several image processing operations, namely edge detection, low pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken with various background conditions. Our experiments showed that the best result is achieved when the pre-processing method consists of only a desaturation operation, achieving a classification accuracy of up to 83.15%.

  18. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Regina Lionnie

    2013-09-01

    Full Text Available This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The pre-processing methods are based on the combinations of several image processing operations, namely edge detection, low pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken with various background conditions. Our experiments showed that the best result is achieved when the pre-processing method consists of only a desaturation operation, achieving a classification accuracy of up to 83.15%.

  19. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data.

    Directory of Open Access Journals (Sweden)

    Brett A McKinney

    Full Text Available Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to
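
    The nearest-neighbor importance scores mentioned in this record can be sketched with a simplified two-class Relief-style score. This is an assumption-laden illustration, not the ReliefSeq or gwak-Relief-F implementation: the function name `relief_scores`, the Manhattan distance on min-max scaled features, and the single fixed k shared by all features are choices made here, whereas the record describes adapting k per gene.

```python
import numpy as np

def relief_scores(X, y, k=1):
    """Simplified Relief-style feature scores for a two-class problem.

    For every sample, a feature is rewarded when it differs from the k nearest
    samples of the other class (misses) and penalized when it differs from the
    k nearest samples of the same class (hits).
    Assumes each class has more than k samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Min-max scale each feature so per-feature differences lie in [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    scores = np.zeros(p)
    for i in range(n):
        diffs = np.abs(Xs - Xs[i])      # per-feature differences to sample i
        dist = diffs.sum(axis=1)        # Manhattan distance to sample i
        dist[i] = np.inf                # never pick the sample itself
        same = (y == y[i])
        hits = np.argsort(np.where(same, dist, np.inf))[:k]
        misses = np.argsort(np.where(~same, dist, np.inf))[:k]
        scores += diffs[misses].mean(axis=0) - diffs[hits].mean(axis=0)
    return scores / n
```

    A feature that separates the classes tends to receive a higher score than one that varies within classes, which is the ranking behavior the record relies on.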

  20. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    in road networks. Such queries may be of use in many services. Specifically, we present an easily implementable data model that serves well as a foundation for such queries. We also present the design of a prototype system that implements the queries based on the data model. The algorithm used...

  1. Norrie disease and MAO genes: nearest neighbors.

    Science.gov (United States)

    Chen, Z Y; Denney, R M; Breakefield, X O

    1995-01-01

    The Norrie disease and MAO genes are tandemly arranged in the p11.4-p11.3 region of the human X chromosome in the order tel-MAOA-MAOB-NDP-cent. This relationship is conserved in the mouse in the order tel-MAOB-MAOA-NDP-cent. The MAO genes appear to have arisen by tandem duplication of an ancestral MAO gene, but their positional relationship to NDP appears to be random. Distinctive X-linked syndromes have been described for mutations in the MAOA and NDP genes, and in addition, individuals have been identified with contiguous gene syndromes due to chromosomal deletions which encompass two or three of these genes. Loss of function of the NDP gene causes a syndrome of congenital blindness and progressive hearing loss, sometimes accompanied by signs of CNS dysfunction, including variable mental retardation and psychiatric symptoms. Other mutations in the NDP gene have been found to underlie another X-linked eye disease, exudative vitreo-retinopathy. An MAOA deficiency state has been described in one family to date, with features of altered amine and amine metabolite levels, low normal intelligence, apparent difficulty in impulse control and cardiovascular difficulty in affected males. A contiguous gene syndrome in which all three genes are lacking, as well as other as yet unidentified flanking genes, results in severe mental retardation, small stature, seizures and congenital blindness, as well as altered amine and amine metabolites. Issues that remain to be resolved are the function of the NDP gene product, the frequency and phenotype of the MAOA deficiency state, and the possible occurrence and phenotype of an MAOB deficiency state.

  2. Colorectal Cancer and Colitis Diagnosis Using Fourier Transform Infrared Spectroscopy and an Improved K-Nearest-Neighbour Classifier.

    Science.gov (United States)

    Li, Qingbo; Hao, Can; Kang, Xue; Zhang, Jialin; Sun, Xuejun; Wang, Wenbo; Zeng, Haishan

    2017-11-27

    Combining Fourier transform infrared spectroscopy (FTIR) with endoscopy, it is expected that noninvasive, rapid detection of colorectal cancer can be performed in vivo in the future. In this study, Fourier transform infrared spectra were collected from 88 endoscopic biopsy colorectal tissue samples (41 colitis and 47 cancers). A new method, viz., entropy weight local-hyperplane k-nearest-neighbor (EWHK), which is an improved version of K-local hyperplane distance nearest-neighbor (HKNN), is proposed for tissue classification. In order to avoid limiting high dimensions and small values of the nearest neighbor, the new EWHK method calculates feature weights based on information entropy. The average results of the random classification showed that the EWHK classifier for differentiating cancer from colitis samples produced a sensitivity of 81.38% and a specificity of 92.69%.

  3. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    Science.gov (United States)

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
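
    The two reported metrics, correct classification rate (CCR) and Kappa coefficient, can both be computed from a confusion matrix. The sketch below uses the standard Cohen's kappa formula; the function name `ccr_and_kappa` and the example counts in the usage note are hypothetical and are not taken from the study.

```python
def ccr_and_kappa(confusion):
    """Return (CCR, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal (this is the CCR).
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column totals.
    row_tot = [sum(confusion[i]) for i in range(k)]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

    For a hypothetical balanced two-class matrix `[[40, 10], [10, 40]]`, the CCR is 0.8 and kappa is 0.6, because chance agreement for balanced classes and balanced predictions is 0.5.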

  4. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    Directory of Open Access Journals (Sweden)

    Mohammadmehdi Saberioon

    2018-03-01

    Full Text Available The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.

  5. The classification of hunger behaviour of Lates Calcarifer through the integration of image processing technique and k-Nearest Neighbour learning algorithm

    Science.gov (United States)

    Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.

    2018-04-01

    Fish hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead to the death of the fish and consequently reduce the quantity of fish produced. Moreover, excess food that is not consumed by the fish dissolves in the water and reduces the water quality by lowering the oxygen content. This problem can also lead to the death of the fish or even spur fish diseases. In the present study, a correlation between Barramundi fish-school behaviour and hunger condition is established through hybrid data integration with an image processing technique. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through the k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm, namely cosine, cubic and weighted, are assessed on their ability to classify the aforementioned fish hunger behaviour. The study found that the weighted k-NN variation provides the best classification, with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining the fish feeding routine.

  6. Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost

    NARCIS (Netherlands)

    Mensink, T.; Verbeek, J.; Perronnin, F.; Csurka, G.

    2013-01-01

    We study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers, the k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers, and introduce a new
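
    The nearest class mean (NCM) classifier named in this record can be sketched in a few lines, which also shows why adding a class is cheap: only one new mean has to be stored. The class name `NearestClassMean` and the plain Euclidean distance are assumptions for illustration; the record does not state which distance the authors use.

```python
import numpy as np

class NearestClassMean:
    """Minimal nearest class mean classifier with plain Euclidean distance."""

    def __init__(self):
        self.means = {}  # maps class label -> mean feature vector

    def add_class(self, label, samples):
        # Adding a class only requires computing and storing its mean;
        # the means of the existing classes are left untouched.
        self.means[label] = np.asarray(samples, dtype=float).mean(axis=0)

    def predict(self, x):
        # Assign the label whose stored mean is closest to x.
        x = np.asarray(x, dtype=float)
        return min(self.means, key=lambda c: np.linalg.norm(x - self.means[c]))
```

    In this sketch a new class becomes available to `predict` immediately after `add_class`, without revisiting earlier training data.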

  7. Predicción de fracaso en empresas latinoamericanas utilizando el método del vecino más cercano para predecir efectos aleatorios en modelos mixtos || Prediction of Failure in Latin-American Companies Using the Nearest-Neighbor Method to Predict Random Effects in Mixed Models

    Directory of Open Access Journals (Sweden)

    Caro, Norma Patricia

    2017-12-01

    Full Text Available In the present decade, in emerging economies such as those of Latin America, mixed logistic models have begun to be applied to predict the financial failure of companies. However, the methodology has underlying limitations, linked to the feasibility of predicting the state of new companies that were not part of the training sample with which the model was estimated. Various methods have been proposed in the literature for predicting the random effects that form part of mixed models, among them the nearest-neighbor method. This method is applied in a second stage, after estimating a model that explains the financial situation (in crisis or healthy) of companies by considering the behavior of their accounting ratios. In the present work, companies from Argentina, Chile and Peru were considered, estimating the random effects that proved significant in the estimation of the mixed model. It is thus concluded that applying this method makes it possible to identify companies with financial problems with a correct classification rate above 80%, which is relevant for modeling and predicting this type of risk.

  8. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) have been very successful, but have recently suffered from slow updating because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest splitting some large superfamilies into smaller clusters. © 2016 Elsevier B.V.

  9. The eighth TNM classification system for lung cancer: A consideration based on the degree of pleural invasion and involved neighboring structures.

    Science.gov (United States)

    Sakakura, Noriaki; Mizuno, Tetsuya; Kuroda, Hiroaki; Arimura, Takaaki; Yatabe, Yasushi; Yoshimura, Kenichi; Sakao, Yukinori

    2018-04-01

    The eighth tumor-node-metastasis (TNM) classification system for lung cancer has been used since January 2017 and must be applied to an individual institution's database. We analyzed pathological stage data of 2756 patients who underwent resection of non-small-cell lung cancer, particularly in terms of the degree of visceral pleural invasion and involved neighboring structures. Few patients had stage IIA disease (103, 4%); stratification between stages IB and IIA was insufficient (p = 0.129). When T2a tumors were divided into PL1 and PL2 subgroups based on the degree of pleural invasion, there was a significant prognostic difference between the subgroups (p consideration. Copyright © 2018 Elsevier B.V. All rights reserved.

  10. MOST OBSERVATIONS OF OUR NEAREST NEIGHBOR: FLARES ON PROXIMA CENTAURI

    Energy Technology Data Exchange (ETDEWEB)

    Davenport, James R. A. [Department of Physics and Astronomy, Western Washington University, 516 High Street, Bellingham, WA 98225 (United States); Kipping, David M. [Department of Astronomy, Columbia University, 550 West 120th Street, New York, NY 10027 (United States); Sasselov, Dimitar [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Matthews, Jaymie M. [Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1 (Canada); Cameron, Chris [Department of Mathematics, Physics and Geology, Cape Breton University, 1250 Grand Lake Road, Sydney, NS B1P 6L2 (Canada)

    2016-10-01

    We present a study of white-light flares from the active M5.5 dwarf Proxima Centauri using the Canadian microsatellite Microvariability and Oscillations of STars. Using 37.6 days of monitoring data from 2014 to 2015, we have detected 66 individual flare events, the largest number of white-light flares observed to date on Proxima Cen. Flare energies in our sample range from 10^29 to 10^31.5 erg. The flare rate is lower than that of other classic flare stars of a similar spectral type, such as UV Ceti, which may indicate Proxima Cen had a higher flare rate in its youth. Proxima Cen does have an unusually high flare rate given its slow rotation period, however. Extending the observed power-law occurrence distribution down to 10^28 erg, we show that flares with flux amplitudes of 0.5% occur 63 times per day, while superflares with energies of 10^33 erg occur ∼8 times per year. Small flares may therefore pose a great difficulty in searches for transits from the recently announced 1.27 M⊕ Proxima b, while frequent large flares could have significant impact on the planetary atmosphere.

  11. Analytical approach for collective diffusion: one-dimensional lattice with the nearest neighbor and the next nearest neighbor lateral interactions

    Czech Academy of Sciences Publication Activity Database

    Tarasenko, Alexander

    2018-01-01

    Roč. 95, Jan (2018), s. 37-40 ISSN 1386-9477 R&D Projects: GA MŠk LO1409; GA MŠk LM2015088 Institutional support: RVO:68378271 Keywords : lattice gas systems * kinetic Monte Carlo simulations * diffusion and migration Subject RIV: BE - Theoretical Physics OBOR OECD: Atomic, molecular and chemical physics (physics of atoms and molecules including collision, interaction with radiation, magnetic resonances, Mössbauer effect) Impact factor: 2.221, year: 2016

  12. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor

    Directory of Open Access Journals (Sweden)

    Chang Xu

    2018-05-01

    Full Text Available This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.

  13. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    Science.gov (United States)

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
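
    The oversampling idea named in the abstract above can be illustrated with a minimal sketch. It assumes the usual SMOTE construction (a synthetic point is placed on the segment between a minority sample and one of its nearest minority-class neighbors); the function name `smote_like`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbors.
    Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of `base`, excluding `base` itself
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Toy minority class with two features per sample (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
```

    Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class; a KNN classifier trained afterwards sees a less skewed class ratio.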

  14. A Literature Survey of Early Time Series Classification and Deep Learning

    OpenAIRE

    Santos, Tiago; Kern, Roman

    2017-01-01

    This paper provides an overview of current literature on time series classification approaches, in particular of early time series classification. A very common and effective time series classification approach is the 1-Nearest Neighbor classifier, with different distance measures such as the Euclidean or dynamic time warping distances. This paper starts by reviewing these baseline methods. More recently, with the gain in popularity in the application of deep neural networks to the field of...
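
    The baseline mentioned in this record, a 1-Nearest Neighbor classifier with either Euclidean or dynamic time warping (DTW) distance, can be sketched as follows. This is a minimal illustration under assumptions: the textbook quadratic-time DTW recurrence with absolute-difference local cost and no warping window; the function names and the toy series are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def predict_1nn(train, query, distance):
    """Return the label of the training series closest to `query`."""
    _, label = min(train, key=lambda item: distance(item[0], query))
    return label

# Toy labeled series (made-up values).
train = [
    ([0, 0, 1, 2, 1, 0, 0], "peak"),
    ([0, 0, 0, 0, 0, 0, 0], "flat"),
]
query = [0, 1, 2, 1, 0, 0, 0]  # same peak, shifted one step earlier
```

    On this toy input the DTW distance between the query and the "peak" series is 0, because warping absorbs the one-step shift, while the Euclidean distance is 2. Both distances still pick "peak" as the nearest neighbor here.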

  15. Efficient Fingercode Classification

    Science.gov (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems e. g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and the potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.

  16. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  17. Nearest Cosmic Mirage

    Science.gov (United States)

    2003-07-01

    = 0.66. The bottom panel shows the spectrum of the lensing, elliptical galaxy at redshift z=0.3. The team of astronomers [1] then used the ESO 3.5-m New Technology Telescope (NTT) at La Silla to obtain spectra of the individual image components of this lensing system. This is imperative because, like human fingerprints, the spectra allow unambiguous identification of the observed objects. Nevertheless, this is not an easy task because the different images of the cosmic mirage are located very close to each other in the sky and the best possible conditions are needed to obtain clean and well separated spectra. However, the excellent optical quality of the NTT combined with reasonably good seeing conditions (about 0.7 arcsecond) enabled the astronomers to detect the "spectral fingerprints" of both the source and the object acting as a lens, cf. ESO PR Photo 20b/03. The evaluation of the spectra showed that the background source is a quasar with a redshift of z = 0.66 [3], corresponding to a distance of about 6,300 million light-years. The light from this quasar is lensed by a massive elliptical galaxy with a redshift z=0.3, i.e. at a distance of 3,500 million light-years or about halfway between the quasar and us. It is the nearest gravitationally lensed quasar known to date . Because of the specific geometry of the lens and the position of the lensing galaxy, it is possible to show that the light from the extended galaxy in which the quasar is located should also be lensed and become visible as a ring-shaped image. That this is indeed the case is demonstrated by PR Photo 20a/03 which clearly shows the presence of such an "Einstein ring", surrounding the image of the more nearby lensing galaxy. Micro lensing within macro lensing ? The particular configuration of the individual lensed images observed in this system has enabled the astronomers to produce a detailed model of the system. 
From this, they can then make predictions about the relative brightness of the various

  18. KNN BASED CLASSIFICATION OF DIGITAL MODULATED SIGNALS

    Directory of Open Access Journals (Sweden)

    Sajjad Ahmed Ghauri

    2016-11-01

    Full Text Available Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC has an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management, and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set to the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.

  19. Near Neighbor Distribution in Sets of Fractal Nature

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel

    2013-01-01

    Roč. 5, č. 1 (2013), s. 159-166 ISSN 2150-7988 R&D Projects: GA MŠk(CZ) LG12020 Institutional support: RVO:67985807 Keywords : nearest neighbor * fractal set * multifractal * Erlang distribution Subject RIV: BB - Applied Statistics, Operational Research http://www.mirlabs.org/ijcisim/regular_papers_2013/Paper91.pdf

  20. Hyperplane distance neighbor clustering based on local discriminant analysis for complex chemical processes monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Chunhong; Xiao, Shaoqing; Gu, Xiaofeng [Jiangnan University, Wuxi (China)

    2014-11-15

    The collected training data often include both normal and faulty samples for complex chemical processes. However, some monitoring methods, such as partial least squares (PLS), principal component analysis (PCA), independent component analysis (ICA) and Fisher discriminant analysis (FDA), require fault-free data to build the normal operation model. These techniques are applicable after the preliminary step of data clustering is applied. We here propose a novel hyperplane distance neighbor clustering (HDNC) based on the local discriminant analysis (LDA) for chemical process monitoring. First, faulty samples are separated from normal ones using the HDNC method. Then, the optimal subspace for fault detection and classification can be obtained using the LDA approach. The proposed method takes the multimodality within the faulty data into account, and thus improves the capability of process monitoring significantly. The HDNC-LDA monitoring approach is applied to two simulation processes and then compared with the conventional FDA based on the K-nearest neighbor (KNN-FDA) method. The results obtained in two different scenarios demonstrate the superiority of the HDNC-LDA approach in terms of fault detection and classification accuracy.

  1. Hyperplane distance neighbor clustering based on local discriminant analysis for complex chemical processes monitoring

    International Nuclear Information System (INIS)

    Lu, Chunhong; Xiao, Shaoqing; Gu, Xiaofeng

    2014-01-01

    The collected training data often include both normal and faulty samples for complex chemical processes. However, some monitoring methods, such as partial least squares (PLS), principal component analysis (PCA), independent component analysis (ICA) and Fisher discriminant analysis (FDA), require fault-free data to build the normal operation model. These techniques are applicable after the preliminary step of data clustering is applied. We here propose a novel hyperplane distance neighbor clustering (HDNC) based on the local discriminant analysis (LDA) for chemical process monitoring. First, faulty samples are separated from normal ones using the HDNC method. Then, the optimal subspace for fault detection and classification can be obtained using the LDA approach. The proposed method takes the multimodality within the faulty data into account, and thus improves the capability of process monitoring significantly. The HDNC-LDA monitoring approach is applied to two simulation processes and then compared with the conventional FDA based on the K-nearest neighbor (KNN-FDA) method. The results obtained in two different scenarios demonstrate the superiority of the HDNC-LDA approach in terms of fault detection and classification accuracy

  2. Transportation Modes Classification Using Sensors on Smartphones

    Directory of Open Access Journals (Sweden)

    Shih-Hau Fang

    2016-08-01

    Full Text Available This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms, namely decision trees, K-nearest neighbor, and support vector machine, to classify the user's transportation and vehicular modes. In the experiments, we discuss and compare the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes.

  3. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases: according to WHO data, cancer caused 8.8 million deaths in 2015, and this number will increase every year if the problem is not addressed early. Microarray data has become one of the most popular subjects of cancer-identification studies in the health field, since microarray data can be used to examine levels of gene expression in particular cell samples, allowing thousands of genes to be analyzed simultaneously. Using data mining techniques, we can classify microarray samples and thus identify whether or not they indicate cancer. In this paper we discuss research that applies data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, together with a simulation of the Random Forest algorithm with dimension reduction using Relief. The paper reports the performance measure (accuracy) of each classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest). The results show that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique on microarray data.

  4. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  5. NeighborHood

    OpenAIRE

    Corominola Ocaña, Víctor

    2015-01-01

    NeighborHood is a cloud-based application that adapts to any device (mobile, tablet, desktop). The goal of this application is to let users add the people in their immediate surroundings and to make those people visible to the rest of the users...

  6. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  7. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  8. Neighbors United for Health

    Science.gov (United States)

    Westhoff, Wayne W.; Corvin, Jaime; Virella, Irmarie

    2009-01-01

    Modeled upon the ecclesiastic community group concept of Latin America to unite and strengthen the bond between the Church and neighborhoods, a community-based organization created Vecinos Unidos por la Salud (Neighbors United for Health) to bring health messages into urban Latino neighborhoods. The model is based on five tenets, and incorporates…

  9. Classification of Polarimetric SAR Data Using Dictionary Learning

    DEFF Research Database (Denmark)

    Vestergaard, Jacob Schack; Nielsen, Allan Aasbjerg; Dahl, Anders Lindbjerg

    2012-01-01

    This contribution deals with classification of multilook fully polarimetric synthetic aperture radar (SAR) data by learning a dictionary of crop types present in the Foulum test site. The Foulum test site contains a large number of agricultural fields, as well as lakes, forests, natural vegetation......, grasslands and urban areas, which make it ideally suited for evaluation of classification algorithms. Dictionary learning centers around building a collection of image patches typical for the classification problem at hand. This requires initial manual labeling of the classes present in the data and is thus...... a method for supervised classification. Sparse coding of these image patches aims to maintain a proficient number of typical patches and associated labels. Data is consecutively classified by a nearest neighbor search of the dictionary elements and labeled with probabilities of each class. Each dictionary...

  10. A Comparative Analysis of Classification Algorithms on Diverse Datasets

    Directory of Open Access Journals (Sweden)

    M. Alghobiri

    2018-04-01

    Full Text Available Data mining involves the computational process to find patterns from large data sets. Classification, one of the main domains of data mining, involves known structure generalizing to apply to a new dataset and predict its class. There are various classification algorithms being used to classify various data sets. They are based on different methods such as probability, decision tree, neural network, nearest neighbor, boolean and fuzzy logic, kernel-based etc. In this paper, we apply three diverse classification algorithms on ten datasets. The datasets have been selected based on their size and/or number and nature of attributes. Results have been discussed using some performance evaluation measures like precision, accuracy, F-measure, Kappa statistics, mean absolute error, relative absolute error, ROC Area etc. Comparative analysis has been carried out using the performance evaluation measures of accuracy, precision, and F-measure. We specify features and limitations of the classification algorithms for the diverse nature datasets.

  11. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists used to be based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has made it possible to monitor thousands of gene expressions concurrently on a single chip, which has stimulated progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique that combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which is a boon to the medical community. This work further reduces the misclassification of cancers, which is unacceptable in cancer detection.

  12. Performance Evaluation of Downscaling Sentinel-2 Imagery for Land Use and Land Cover Classification by Spectral-Spatial Features

    Directory of Open Access Journals (Sweden)

    Hongrui Zheng

    2017-12-01

    Full Text Available Land Use and Land Cover (LULC) classification is vital for environmental and ecological applications. Sentinel-2 is a new generation land monitoring satellite with the advantages of novel spectral capabilities, wide coverage and fine spatial and temporal resolutions. The effects of different spatial resolution unification schemes and methods on LULC classification have been scarcely investigated for Sentinel-2. This paper bridged this gap by comparing the differences between upscaling and downscaling as well as different downscaling algorithms from the point of view of LULC classification accuracy. The studied downscaling algorithms include nearest neighbor resampling and five popular pansharpening methods, namely, Gram-Schmidt (GS), nearest neighbor diffusion (NNDiffusion), the PANSHARP algorithm proposed by Y. Zhang, wavelet transformation fusion (WTF) and high-pass filter fusion (HPF). Two spatial features, textural metrics derived from the Grey-Level Co-occurrence Matrix (GLCM) and extended attribute profiles (EAPs), are investigated to make up for the shortcoming of pixel-based spectral classification. Random forest (RF) is adopted as the classifier. The experiment was conducted in Xitiaoxi watershed, China. The results demonstrated that downscaling obviously outperforms upscaling in terms of classification accuracy. For downscaling, image sharpening has no obvious advantage over spatial interpolation. Different image sharpening algorithms have distinct effects. Two multiresolution analysis (MRA)-based methods, i.e., WTF and HPF, achieve the best performance. GS achieved an accuracy similar to that of NNDiffusion and PANSHARP. Compared to image sharpening, the introduction of spatial features, both GLCM and EAPs, can greatly improve the classification accuracy for Sentinel-2 imagery. Their effects on overall accuracy are similar but differ significantly for specific classes. 
In general, using the spectral bands downscaled by nearest neighbor interpolation can meet

  13. Study of parameters of the nearest neighbour shared algorithm on clustering documents

    Science.gov (United States)

    Mustika Rukmi, Alvida; Budi Utomo, Daryono; Imro’atus Sholikhah, Neni

    2018-03-01

    Document clustering is one way of automatically managing documents, extracting document topics and quickly filtering information. Preprocessing of the documents by text mining consists of keyword extraction using Rapid Automatic Keyphrase Extraction (RAKE) and representing each document as a concept vector using Latent Semantic Analysis (LSA). Clustering is then performed on the preprocessed documents so that documents with similar topics fall in the same cluster. The Shared Nearest Neighbour (SNN) algorithm is a clustering method based on the number of "nearest neighbors" shared. The parameters of the SNN algorithm are: k, the number of nearest neighbor documents; ε, the number of shared nearest neighbor documents; and MinT, the minimum number of similar documents that can form a cluster. The SNN algorithm is characterized by shared ‘neighbor’ properties: each cluster is formed by keywords that are shared by the documents. The SNN algorithm allows a cluster to be built from more than one keyword if the frequency of those keywords in the documents is also high. The choice of parameter values in the SNN algorithm affects the document clustering results. A higher value of k increases the number of neighbor documents of each document, which lowers the similarity among neighboring documents; the accuracy of each cluster is then also low. A higher value of ε causes each document to keep only neighbor documents with high similarity when building a cluster, which also leaves more unclassified documents (noise). A higher value of MinT decreases the number of clusters, since groups of similar documents smaller than MinT cannot form clusters. The parameters of the SNN algorithm thus determine the quality of the clustering result and the amount of noise (unclustered documents). The Silhouette coefficient shows almost the same result in many experiments, above 0.9, which means that the SNN algorithm works well.
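
    The role of the three parameters described in this record can be made concrete with a small sketch. It is a simplified, Jarvis-Patrick-style reading of shared-nearest-neighbor clustering and not the authors' implementation: two points are linked when each appears in the other's k-nearest-neighbor list and the two lists share at least `eps` members, clusters are the connected components of those links, and components with fewer than `min_t` members are marked as noise. The mutual-neighbor condition, the function names and the 2-D toy points (standing in for LSA document vectors) are all assumptions.

```python
import math

def knn_sets(points, k):
    """For each point index, the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        result.append(set(order[:k]))
    return result

def snn_clusters(points, k, eps, min_t):
    """Simplified SNN clustering; returns one label per point, -1 for noise."""
    nn = knn_sets(points, k)
    n = len(points)

    def linked(i, j):
        # mutual k-NN membership plus at least eps shared neighbors (assumption)
        return i in nn[j] and j in nn[i] and len(nn[i] & nn[j]) >= eps

    labels = [-1] * n
    seen = [False] * n
    next_id = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        component, stack = [], [start]
        while stack:  # depth-first search over the link graph
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and linked(i, j):
                    seen[j] = True
                    stack.append(j)
        if len(component) >= min_t:  # smaller groups stay labeled as noise
            for i in component:
                labels[i] = next_id
            next_id += 1
    return labels

# Two tight groups and one isolated point (made-up coordinates).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
labels = snn_clusters(points, k=3, eps=2, min_t=3)
```

    With `k=3, eps=2, min_t=3` the two groups are recovered and the isolated point is labeled as noise. Raising `eps` to 3 removes every link in this toy set, so all points become noise, which matches the abstract's remark that a higher ε leaves more unclustered documents.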

  14. A Pan-STARRS1 Proper-Motion Survey for Young Brown Dwarfs in the Nearest Star-Forming Regions and a Reddening-Free Classification Method for Ultracool Dwarfs

    Science.gov (United States)

    Zhang, Zhoujian; Liu, Michael C.; Best, William M. J.; Magnier, Eugene; Aller, Kimberly

    2018-01-01

    Young brown dwarfs are of prime importance to investigate the universality of the initial mass function (IMF). Based on photometry and proper motions from the Pan-STARRS1 (PS1) 3π survey, we are conducting the widest and deepest brown dwarf survey in the nearby star-forming regions, Taurus–Auriga (Taurus) and Upper Scorpius (USco). Our work is the first to measure proper motions, a robust proxy of membership, for brown dwarf candidates in Taurus and USco over such a large area and long time baseline (≈ 15 year) with such high precision (≈ 4 mas yr^-1). Since extinction complicates spectral classification, we have developed a new approach to quantitatively determine reddening-free spectral types, extinctions, and gravity classifications for mid-M to late-L ultracool dwarfs (≈ 100–5 M_Jup), using low-resolution near-infrared spectra. So far, our IRTF/SpeX spectroscopic follow-up has increased the substellar and planetary-mass census of Taurus by ≈ 50% and almost doubled the substellar census of USco, constituting the largest single increases of brown dwarfs and free-floating planets found in both regions to date. Most notably, our new discoveries reveal an older (> 10 Myr) low-mass population in Taurus, in accord with recent studies of the higher-mass stellar members. In addition, the mass function appears to differ between the younger and older Taurus populations, possibly due to incompleteness of the older stellar members or different star formation processes. Upon completion, our survey will establish the most complete substellar and planetary-mass census in both Taurus and USco associations, make a significant addition to the low-mass IMF in both regions, and deliver more comprehensive pictures of star formation histories.

  15. Predictive Manufacturing: A Classification Strategy to Predict Product Failures

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Kulahci, Murat

    2018-01-01

    manufacturing analytics model that employs a big data approach to predicting product failures; third, we illustrate the issue of high dimensionality, along with statistically redundant information; and, finally, our proposed method will be compared against the well-known classification methods (SVM, K......-nearest neighbor, artificial neural networks). The results from real data show that our predictive manufacturing analytics approach, using genetic algorithms and Voronoi tessellations, is capable of predicting product failure with reasonable accuracy. The potential application of this method contributes...... to accurately predicting product failures, which would enable manufacturers to reduce production costs without compromising product quality....

  16. Fast Most Similar Neighbor (MSN) classifiers for Mixed Data

    OpenAIRE

    Hernández Rodríguez, Selene

    2010-01-01

    The k nearest neighbor (k-NN) classifier has been extensively used in Pattern Recognition because of its simplicity and its good performance. However, in applications with large datasets, the exhaustive k-NN classifier becomes impractical. Therefore, many fast k-NN classifiers have been developed; most of them rely on metric properties (usually the triangle inequality) to reduce the number of prototype comparisons. Hence, the existing fast k-NN classifiers are applicable only when the comparison f...

  17. Search techniques in intelligent classification systems

    CERN Document Server

    Savchenko, Andrey V

    2016-01-01

    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined, including special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases, is featured. Rough set theory, sequential analysis and granular computing are used to improve the performance of the hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated command recognition in voice control systems and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  18. Analytic nearest neighbour model for FCC metals

    International Nuclear Information System (INIS)

    Idiodi, J.O.A.; Garba, E.J.D.; Akinlade, O.

    1991-06-01

    A recently proposed analytic nearest-neighbour model for fcc metals is criticised and two alternative nearest-neighbour models derived from the separable potential method (SPM) are recommended. Results for copper and aluminium illustrate the utility of the recommended models. (author). 20 refs, 5 tabs

  19. Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic

    Directory of Open Access Journals (Sweden)

    Fawaz S. Al-Anzi

    2017-04-01

    Cosine similarity is one of the most popular distance measures in text classification problems. In this paper, we used this important measure to investigate the performance of Arabic language text classification. For textual features, the vector space model (VSM) is generally used as a model to represent textual information as numerical vectors. However, Latent Semantic Indexing (LSI) is a better textual representation technique as it maintains semantic information between the words. Hence, we used the singular value decomposition (SVD) method to extract textual features based on LSI. In our experiments, we compared some of the well-known classification methods such as Naïve Bayes, k-Nearest Neighbors, Neural Network, Random Forest, Support Vector Machine, and classification tree. We used a corpus that contains 4,000 documents of ten topics (400 documents for each topic). The corpus contains 2,127,197 words with about 139,168 unique words. The testing set contains 400 documents, 40 documents for each topic. As a weighting scheme, we used Term Frequency.Inverse Document Frequency (TF.IDF). This study reveals that the classification methods that use LSI features significantly outperform the TF.IDF-based methods. It also reveals that k-Nearest Neighbors (based on cosine measure) and support vector machine are the best performing classifiers.
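
The k-Nearest Neighbors classifier with a cosine measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes a toy English corpus and plain TF-IDF weights; it does not reproduce the paper's Arabic corpus or its SVD-based LSI features. The function names (`tfidf_vectors`, `cosine`, `knn_predict`) and the sample documents are hypothetical and exist only for this illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF dicts (term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # The +1 keeps terms that occur in every document from getting zero weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(query_tokens, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k training documents most similar to the query."""
    # Terms never seen in training carry no weight and are dropped.
    q = {t: c * idf[t] for t, c in Counter(query_tokens).items() if t in idf}
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus: three "sport" and three "finance" documents.
train_docs = [
    ["goal", "match", "team"], ["team", "coach", "match"], ["league", "goal", "score"],
    ["market", "stock", "price"], ["bank", "price", "rate"], ["stock", "rate", "trade"],
]
train_labels = ["sport"] * 3 + ["finance"] * 3
train_vecs, idf = tfidf_vectors(train_docs)

print(knn_predict(["team", "goal"], train_vecs, train_labels, idf))
print(knn_predict(["stock", "price"], train_vecs, train_labels, idf))
```

Because cosine similarity ignores vector length, long and short documents on the same topic can still rank as close neighbors, which is one common reason it is preferred over Euclidean distance for text.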

  20. Comparison of feature selection and classification for MALDI-MS data

    Directory of Open Access Journals (Sweden)

    Yang Mary

    2009-07-01

    Introduction: In the classification of Mass Spectrometry (MS) proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix-Assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. Results: We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE), and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS), which effectively performs microarray data analysis. We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC), Naïve Bayes Classifier (NBC), Nearest Mean Scaled Classifier (NMSC), uncorrelated normal based quadratic Bayes Classifier (recorded as UDC), Support Vector Machines (SVMs), and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN) based on Mahalanobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion: Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing

  1. Some Observations about the Nearest-Neighbor Model of the Error Threshold

    International Nuclear Information System (INIS)

    Gerrish, Philip J.

    2009-01-01

    I explore some aspects of the 'error threshold' - a critical mutation rate above which a population is nonviable. The phase transition that occurs as mutation rate crosses this threshold has been shown to be mathematically equivalent to the loss of ferromagnetism that occurs as temperature exceeds the Curie point. I will describe some refinements and new results based on the simplest of these mutation models, will discuss the commonly unperceived robustness of this simple model, and I will show some preliminary results comparing qualitative predictions with simulations of finite populations adapting at high mutation rates. I will talk about how these qualitative predictions are relevant to biomedical science and will discuss how my colleagues and I are looking for phase-transition signatures in real populations of Escherichia coli that go extinct as a result of excessive mutation.

  2. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents.

    OpenAIRE

    Glickman, B W; Horsfall, M J; Gordon, A J; Burns, P A

    1987-01-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference ...

  3. Renormalization-group studies of antiferromagnetic chains. I. Nearest-neighbor interactions

    International Nuclear Information System (INIS)

    Rabin, J.M.

    1980-01-01

    The real-space renormalization-group method introduced by workers at the Stanford Linear Accelerator Center (SLAC) is used to study one-dimensional antiferromagnetic chains at zero temperature. Calculations using three-site blocks (for the Heisenberg-Ising model) and two-site blocks (for the isotropic Heisenberg model) are compared with exact results. In connection with the two-site calculation, a duality transformation is introduced under which the isotropic Heisenberg model is self-dual. Such duality transformations can be defined for models other than those considered here, and may be useful in various block-spin calculations.

  4. Approximate and exact hybrid algorithms for private nearest-neighbor queries with database protection

    KAUST Repository

    Ghinita, Gabriel; Kalnis, Panos; Kantarcıoğlu, Murat; Bertino, Elisa

    2010-01-01

    Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. To protect user privacy, it is important not to disclose exact user coordinates to un-trusted entities that provide location-based services. Currently, there are two main approaches to protect the location privacy of users: (i) hiding locations inside cloaking regions (CRs) and (ii) encrypting location data using private information retrieval (PIR) protocols. Previous work focused on finding good trade-offs between privacy and performance of user protection techniques, but disregarded the important issue of protecting the POI dataset D. For instance, location cloaking requires large-sized CRs, leading to excessive disclosure of POIs (O(|D|) in the worst case). PIR, on the other hand, reduces this bound to O(√|D|), but at the expense of high processing and communication overhead. We propose hybrid, two-step approaches for private location-based queries which provide protection for both the users and the database. In the first step, user locations are generalized to coarse-grained CRs which provide strong privacy. Next, a PIR protocol is applied with respect to the obtained query CR. To protect against excessive disclosure of POI locations, we devise two cryptographic protocols that privately evaluate whether a point is enclosed inside a rectangular region or a convex polygon. We also introduce algorithms to efficiently support PIR on dynamic POI sub-sets. We provide solutions for both approximate and exact NN queries. In the approximate case, our method discloses O(1) POI, orders of magnitude fewer than CR- or PIR-based techniques. For the exact case, we obtain optimal disclosure of a single POI, although with slightly higher computational overhead. Experimental results show that the hybrid approaches are scalable in practice, and outperform the pure-PIR approach in terms of computational and communication overhead. © 2010 Springer Science+Business Media, LLC.

  5. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.

    Science.gov (United States)

    Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

    2015-12-01

    Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data

    OpenAIRE

    Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

    2015-01-01

    Most clinical and biomedical data contain missing values. A patient’s record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across tim...

  7. PENINGKATAN KECERDASAN COMPUTER PLAYER PADA GAME PERTARUNGAN BERBASIS K-NEAREST NEIGHBOR BERBOBOT [Improving computer player intelligence in fighting games using weighted k-nearest neighbor]

    Directory of Open Access Journals (Sweden)

    M Ihsan Alfani Putera

    2018-02-01

    Games are among the computer technologies that are developing and changing most rapidly. Games are made as a means of entertainment and to give pleasure to their users. One important element in game design is a balanced challenge appropriate to each level. In this respect, artificial intelligence (AI) is one of the components needed to build a game. An AI that does not adapt to the opponent's strategy is easy to predict and repetitive. If the AI is too smart, the player will have difficulty playing the game. Such conditions reduce the player's level of enjoyment. Therefore, an AI method is needed that can adapt to the ability of the player, so that the difficulty level follows the player's ability and the enjoyment of playing is maintained. In previous studies, the AI method often used in fighting games is K-NN. However, that method treats all attributes in the game as equal, which makes the AI's learning results less than optimal. This study proposes an AI method that uses weighted K-NN in fighting games, where the weighting gives each attribute its own influence, with weights adjusted to the player's actions. In an evaluation over 50 matches in 3 test scenarios, the proposed weighted K-NN method produced an AI intelligence level with an accuracy of 51%. In contrast, the previous method, unweighted K-NN, produced an AI intelligence level of only 38%, and the random method produced an AI intelligence level of 25%.
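
The contrast between unweighted and attribute-weighted K-NN described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the two game-state attributes, the action labels, the sample data, and the weight values are all hypothetical, and the abstract does not say how the authors derive their weights from player actions.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute is scaled by its own weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def weighted_knn(query, samples, labels, weights, k=3):
    """Majority vote among the k samples closest to `query` under the weighted distance."""
    order = sorted(range(len(samples)),
                   key=lambda i: weighted_distance(query, samples[i], weights))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical game states: (distance to opponent, own health), both scaled to [0, 1].
# In this made-up data the action depends on distance only; health is a distractor.
samples = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.2), (0.9, 0.85), (0.8, 0.3), (0.85, 0.1)]
labels = ["attack", "attack", "attack", "approach", "approach", "approach"]
state = (0.3, 0.1)

unweighted = weighted_knn(state, samples, labels, weights=(1.0, 1.0))
weighted = weighted_knn(state, samples, labels, weights=(1.0, 0.05))
print(unweighted, weighted)
```

With equal weights the low health value pulls the query toward two "approach" samples; down-weighting health lets the distance attribute dominate and the vote switches to "attack". Setting all weights to 1 recovers ordinary K-NN.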

  8. PENINGKATAN KECERDASAN COMPUTER PLAYER PADA GAME PERTARUNGAN BERBASIS K-NEAREST NEIGHBOR BERBOBOT [Improving computer player intelligence in fighting games using weighted k-nearest neighbor]

    OpenAIRE

    M Ihsan Alfani Putera; Darlis Heru Murti

    2018-01-01

    Games are among the computer technologies that are developing and changing most rapidly. Games are made as a means of entertainment and to give pleasure to their users. One important element in game design is a balanced challenge appropriate to each level. In this respect, artificial intelligence (AI) is one of the components needed to build a game. An AI that does not adapt to the opponent's strategy is easy to predict and repetitive. If ...

  9. Approximate and exact hybrid algorithms for private nearest-neighbor queries with database protection

    KAUST Repository

    Ghinita, Gabriel

    2010-12-15

    Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. To protect user privacy, it is important not to disclose exact user coordinates to un-trusted entities that provide location-based services. Currently, there are two main approaches to protect the location privacy of users: (i) hiding locations inside cloaking regions (CRs) and (ii) encrypting location data using private information retrieval (PIR) protocols. Previous work focused on finding good trade-offs between privacy and performance of user protection techniques, but disregarded the important issue of protecting the POI dataset D. For instance, location cloaking requires large-sized CRs, leading to excessive disclosure of POIs (O(|D|) in the worst case). PIR, on the other hand, reduces this bound to O(√|D|), but at the expense of high processing and communication overhead. We propose hybrid, two-step approaches for private location-based queries which provide protection for both the users and the database. In the first step, user locations are generalized to coarse-grained CRs which provide strong privacy. Next, a PIR protocol is applied with respect to the obtained query CR. To protect against excessive disclosure of POI locations, we devise two cryptographic protocols that privately evaluate whether a point is enclosed inside a rectangular region or a convex polygon. We also introduce algorithms to efficiently support PIR on dynamic POI sub-sets. We provide solutions for both approximate and exact NN queries. In the approximate case, our method discloses O(1) POI, orders of magnitude fewer than CR- or PIR-based techniques. For the exact case, we obtain optimal disclosure of a single POI, although with slightly higher computational overhead. Experimental results show that the hybrid approaches are scalable in practice, and outperform the pure-PIR approach in terms of computational and communication overhead. © 2010 Springer Science+Business Media, LLC.

  10. Quasi-phases and pseudo-transitions in one-dimensional models with nearest neighbor interactions

    Science.gov (United States)

    de Souza, S. M.; Rojas, Onofre

    2018-01-01

    There are some particular one-dimensional models, such as the Ising-Heisenberg spin models with a variety of chain structures, which exhibit unexpected behaviors quite similar to the first and second order phase transition, which could be confused naively with an authentic phase transition. Through the analysis of the first derivative of free energy, such as entropy, magnetization, and internal energy, a "sudden" jump that closely resembles a first-order phase transition at finite temperature occurs. However, by analyzing the second derivative of free energy, such as specific heat and magnetic susceptibility at finite temperature, it behaves quite similarly to a second-order phase transition exhibiting an astonishingly sharp and fine peak. The correlation length also confirms the evidence of this pseudo-transition temperature, where a sharp peak occurs at the pseudo-critical temperature. We also present the necessary conditions for the emergence of these quasi-phases and pseudo-transitions.

  11. Nearest-Neighbor Interactions and Their Influence on the Structural Aspects of Dipeptides

    Directory of Open Access Journals (Sweden)

    Gunajyoti Das

    2013-01-01

    In this theoretical study, the role of the side chain moiety of the C-terminal residue in influencing the structural and molecular properties of dipeptides is analyzed by considering a series of seven dipeptides. The C-terminal positions of the dipeptides are varied with seven different amino acid residues, namely Val, Leu, Asp, Ser, Gln, His, and Pyl, while their N-terminal positions are kept constant with Sec residues. Full geometry optimization and vibrational frequency calculations are carried out at the B3LYP/6-311++G(d,p) level in gas and aqueous phase. The stereo-electronic effects of the side chain moieties of C-terminal residues are found to influence the values of Φ and Ω dihedrals, planarity of the peptide planes, and geometry around the C7 α-carbon atoms of the dipeptides. The gas phase intramolecular H-bond combinations of the dipeptides are similar to those in aqueous phase. The theoretical vibrational spectra of the dipeptides reflect the nature of intramolecular H-bonds existing in the dipeptide structures. Solvation effects of the aqueous environment are evident on the geometrical parameters related to the amide planes, dipole moments, HOMO-LUMO energy gaps as well as thermodynamic stability of the dipeptides.

  12. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents.

    Science.gov (United States)

    Glickman, B W; Horsfall, M J; Gordon, A J; Burns, P A

    1987-01-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference was not seen following treatment with ethylmethanesulfonate. The observed preference for 5'-PuG-3' sites over 5'-PyG-3' sites corresponds well with alterations observed in the Ha-ras oncogene recovered after treatment with NMU. This indicates that the mutations recovered in the oncogenes are likely the direct consequence of the alkylation treatment and that the local sequence effects seen in E. coli also appear to occur in mammalian cells. PMID:3329097

  13. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents

    Energy Technology Data Exchange (ETDEWEB)

    Glickman, B.W.; Horsfall, M.J.; Gordon, A.J.E.; Burns, P.A.

    1987-12-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference was not seen following treatment with ethylmethanesulfonate. The observed preference for 5'-PuG-3' sites over 5'-PyG-3' sites corresponds well with alterations observed in the Ha-ras oncogene recovered after treatment with NMU. This indicates that the mutations recovered in the oncogenes are likely the direct consequence of the alkylation treatment and that the local sequence effects seen in E. coli also appear to occur in mammalian cells.

  14. THE SOLAR NEIGHBORHOOD XXIX: THE HABITABLE REAL ESTATE OF OUR NEAREST STELLAR NEIGHBORS

    Energy Technology Data Exchange (ETDEWEB)

    Cantrell, Justin R.; Henry, Todd J.; White, Russel J. [Georgia State University, Atlanta, GA 30302-4106 (United States)]

    2013-10-01

    We use the sample of known stars and brown dwarfs within 5 pc of the Sun, supplemented with AFGK stars within 10 pc, to determine which stellar spectral types provide the most habitable real estate, defined as locations where liquid water could be present on Earth-like planets. Stellar temperatures and radii are determined by fitting model spectra to spatially resolved broadband photometric energy distributions for stars in the sample. Using these values, the locations of the habitable zones are calculated using an empirical formula for planetary surface temperature and assuming the condition of liquid water, called here the empirical habitable zone (EHZ). Systems that have dynamically disruptive companions are considered not habitable. We consider companions to be disruptive if the separation ratio of the companion to the habitable zone is less than 5:1. We use the results of these calculations to derive a simple formula for predicting the location of the EHZ for main sequence stars based on V – K color. We consider EHZ widths as more useful measures of the habitable real estate around stars than areas because multiple planets are not expected to orbit stars at identical stellar distances. This EHZ provides a qualitative guide on where to expect the largest population of planets in the habitable zones of main sequence stars. Because of their large numbers and lower frequency of short-period companions, M stars provide more EHZ real estate than other spectral types, possessing 36.5% of the habitable real estate en masse. K stars are second with 21.5%, while A, F, and G stars offer 18.5%, 6.9%, and 16.6%, respectively. Our calculations show that three M dwarfs within 10 pc harbor planets in their EHZs: GJ 581 may have two planets (d with m sin i = 6.1 M_⊕; g with m sin i = 3.1 M_⊕), GJ 667 C has one (c with m sin i = 4.5 M_⊕), and GJ 876 has two (b with m sin i = 1.89 M_Jup and c with m sin i = 0.56 M_Jup). If Earth-like planets are as common around low-mass stars as recent Kepler results suggest, M stars will harbor more Earth-like planets in habitable zones than any other stellar spectral type.

  15. THE SOLAR NEIGHBORHOOD XXIX: THE HABITABLE REAL ESTATE OF OUR NEAREST STELLAR NEIGHBORS

    International Nuclear Information System (INIS)

    Cantrell, Justin R.; Henry, Todd J.; White, Russel J.

    2013-01-01

    We use the sample of known stars and brown dwarfs within 5 pc of the Sun, supplemented with AFGK stars within 10 pc, to determine which stellar spectral types provide the most habitable real estate, defined as locations where liquid water could be present on Earth-like planets. Stellar temperatures and radii are determined by fitting model spectra to spatially resolved broadband photometric energy distributions for stars in the sample. Using these values, the locations of the habitable zones are calculated using an empirical formula for planetary surface temperature and assuming the condition of liquid water, called here the empirical habitable zone (EHZ). Systems that have dynamically disruptive companions are considered not habitable. We consider companions to be disruptive if the separation ratio of the companion to the habitable zone is less than 5:1. We use the results of these calculations to derive a simple formula for predicting the location of the EHZ for main sequence stars based on V – K color. We consider EHZ widths as more useful measures of the habitable real estate around stars than areas because multiple planets are not expected to orbit stars at identical stellar distances. This EHZ provides a qualitative guide on where to expect the largest population of planets in the habitable zones of main sequence stars. Because of their large numbers and lower frequency of short-period companions, M stars provide more EHZ real estate than other spectral types, possessing 36.5% of the habitable real estate en masse. K stars are second with 21.5%, while A, F, and G stars offer 18.5%, 6.9%, and 16.6%, respectively. Our calculations show that three M dwarfs within 10 pc harbor planets in their EHZs: GJ 581 may have two planets (d with m sin i = 6.1 M_⊕; g with m sin i = 3.1 M_⊕), GJ 667 C has one (c with m sin i = 4.5 M_⊕), and GJ 876 has two (b with m sin i = 1.89 M_Jup and c with m sin i = 0.56 M_Jup). If Earth-like planets are as common around low-mass stars as recent Kepler results suggest, M stars will harbor more Earth-like planets in habitable zones than any other stellar spectral type.

  16. Haussdorff and hellinger for colorimetric sensor array classification

    DEFF Research Database (Denmark)

    Alstrøm, Tommy Sonne; Jensen, Bjørn Sand; Schmidt, Mikkel Nørgaard

    2012-01-01

    Development of sensors and systems for detection of chemical compounds is an important challenge with applications in areas such as anti-terrorism, demining, and environmental monitoring. A newly developed colorimetric sensor array is able to detect explosives and volatile organic compounds; however, each sensor reading consists of hundreds of pixel values, and methods for combining these readings from multiple sensors must be developed to make a classification system. In this work we examine two distance based classification methods, K-Nearest Neighbor (KNN) and Gaussian process (GP) classification, which both rely on a suitable distance metric. We evaluate a range of different distance measures and propose a method for sensor fusion in the GP classifier. Our results indicate that the best choice of distance measure depends on the sensor and the chemical of interest.
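
The Hellinger distance named in this record's title is one example of a distance measure that a nearest-neighbor classifier can use on histogram-like sensor readings. The sketch below uses the standard definition for discrete distributions; the three-bin histograms, the compound names, and the helper `nearest_label` are hypothetical, and the abstract does not specify how the authors build their pixel-value representations.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two non-negative histograms of equal length.

    Both inputs are normalized to sum to 1 first; the result lies in [0, 1].
    """
    ps, qs = sum(p), sum(q)
    return math.sqrt(0.5 * sum((math.sqrt(a / ps) - math.sqrt(b / qs)) ** 2
                               for a, b in zip(p, q)))

def nearest_label(query, references):
    """1-nearest-neighbor: return the label whose reference histogram is closest."""
    return min(references, key=lambda label: hellinger(query, references[label]))

# Hypothetical pixel-value histograms for two reference compounds.
references = {"acetone": [8, 1, 1], "ethanol": [1, 1, 8]}
print(nearest_label([6, 2, 2], references))
```

The distance is 0 for identical distributions and 1 for distributions with no overlapping bins, so values from different sensors are on a common scale.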

  17. Classification of Pulse Waveforms Using Edit Distance with Real Penalty

    Directory of Open Access Journals (Sweden)

    Zhang Dongyu

    2010-01-01

    Advances in sensor and signal processing techniques have provided effective tools for quantitative research in traditional Chinese pulse diagnosis (TCPD). Because of the inevitable intraclass variation of pulse patterns, the automatic classification of pulse waveforms has remained a difficult problem. In this paper, by referring to the edit distance with real penalty (ERP) and the recent progress in k-nearest neighbors (KNN) classifiers, we propose two novel ERP-based KNN classifiers. Taking advantage of the metric property of ERP, we first develop an ERP-induced inner product and a Gaussian ERP kernel, then embed them into difference-weighted KNN classifiers, and finally develop two novel classifiers for pulse waveform classification. The experimental results show that the proposed classifiers are effective for accurate classification of pulse waveforms.
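
The edit distance with real penalty (ERP) that these classifiers build on can be computed with a short dynamic program. The sketch below follows the usual recursive definition of ERP, with a constant gap value `g` (0 by default) and absolute differences as costs. It covers the plain distance only: the ERP-induced inner product, the Gaussian ERP kernel, and the difference-weighted KNN classifiers from the abstract are not reproduced, and the function name `erp` is chosen here for illustration.

```python
def erp(r, s, g=0.0):
    """Edit distance with real penalty between two numeric sequences.

    d[i][j] holds the distance between the first i elements of r and the
    first j elements of s. An element left unmatched is compared with the
    gap value g instead of with an element of the other sequence.
    """
    n, m = len(r), len(s)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):            # s exhausted: every r element meets a gap
        d[i][0] = d[i - 1][0] + abs(r[i - 1] - g)
    for j in range(1, m + 1):            # r exhausted: every s element meets a gap
        d[0][j] = d[0][j - 1] + abs(s[j - 1] - g)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(r[i - 1] - s[j - 1]),  # match r[i-1] with s[j-1]
                d[i - 1][j] + abs(r[i - 1] - g),             # gap in s
                d[i][j - 1] + abs(s[j - 1] - g),             # gap in r
            )
    return d[n][m]

print(erp([1, 2, 3], [1, 3]))
```

Because gaps are penalized against a fixed reference value rather than against a neighboring sample, ERP satisfies the triangle inequality, which is the metric property the abstract relies on. Sequences of different lengths can be compared directly, which suits pulse waveforms whose periods vary.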

  18. Consistency Analysis of Nearest Subspace Classifier

    OpenAIRE

    Wang, Yi

    2015-01-01

    The Nearest subspace classifier (NSS) finds an estimation of the underlying subspace within each class and assigns data points to the class that corresponds to its nearest subspace. This paper mainly studies how well NSS can be generalized to new samples. It is proved that NSS is strongly consistent under certain assumptions. For completeness, NSS is evaluated through experiments on various simulated and real data sets, in comparison with some other linear model based classifiers. It is also ...

  19. Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use

    Science.gov (United States)

    Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil

    2013-01-01

    The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648

  20. Learning Euclidean Embeddings for Indexing and Classification

    National Research Council Canada - National Science Library

    Athitsos, Vassilis; Alon, Joni; Sclaroff, Stan; Kollios, George

    2004-01-01

    BoostMap is a recently proposed method for efficient approximate nearest neighbor retrieval in arbitrary non-Euclidean spaces with computationally expensive and possibly non-metric distance measures...

  1. Comparison of four approaches to a rock facies classification problem

    Science.gov (United States)

    Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  2. Nearest neighbour classification of Indian sign language gestures ...

    Indian Academy of Sciences (India)

    In the ideal case, a gesture recognition ... Every geographical region has developed its own sys- ... et al [10] present a study on vision-based static hand shape .... tures, and neural networks for recognition. ..... We used the city-block dis-.

  3. Hybrid RGSA and Support Vector Machine Framework for Three-Dimensional Magnetic Resonance Brain Tumor Classification

    Directory of Open Access Journals (Sweden)

    R. Rajesh Sharma

    2015-01-01

    algorithm (RGSA). Support vector machines, over backpropagation network, and k-nearest neighbor are used to evaluate the goodness of classifier approach. The preliminary evaluation of the system is performed using 320 real-time brain MRI images. The system is trained and tested by using a leave-one-case-out method. The performance of the classifier is tested using the receiver operating characteristic curve of 0.986 (±002). The experimental results demonstrate the systematic and efficient feature extraction and feature selection algorithm to the performance of state-of-the-art feature classification methods.

  4. Emotional State Classification in Virtual Reality Using Wearable Electroencephalography

    Science.gov (United States)

    Suhaimi, N. S.; Teo, J.; Mountstephens, J.

    2018-03-01

    This paper presents the classification of emotions from EEG signals. One of the key issues in this research is the lack of mental classification using VR as the medium to stimulate emotion. The approach uses K-nearest neighbor (KNN) and Support Vector Machine (SVM) classifiers. Firstly, each participant is required to wear an EEG headset, and their brainwaves are recorded while they are immersed in VR. The data points are then marked if the participant showed any physical signs of emotion or by observing the brainwave pattern. Secondly, the data are used to train and test the KNN and SVM algorithms. The accuracy achieved by both methods was approximately 82% throughout the brainwave spectrum (α, β, γ, δ, θ). These methods showed promising results and will be further enhanced using other machine learning approaches in VR stimulus.

  5. Functional Basis of Microorganism Classification.

    Science.gov (United States)

    Zhu, Chengsheng; Delmont, Tom O; Vogel, Timothy M; Bromberg, Yana

    2015-08-01

    Correctly identifying nearest "neighbors" of a given microorganism is important in industrial and clinical applications where close relationships imply similar treatment. Microbial classification based on similarity of physiological and genetic organism traits (polyphasic similarity) is experimentally difficult and, arguably, subjective. Evolutionary relatedness, inferred from phylogenetic markers, facilitates classification but does not guarantee functional identity between members of the same taxon or lack of similarity between different taxa. Using over thirteen hundred sequenced bacterial genomes, we built a novel function-based microorganism classification scheme, functional-repertoire similarity-based organism network (FuSiON; flattened to fusion). Our scheme is phenetic, based on a network of quantitatively defined organism relationships across the known prokaryotic space. It correlates significantly with the current taxonomy, but the observed discrepancies reveal both (1) the inconsistency of functional diversity levels among different taxa and (2) an (unsurprising) bias towards prioritizing, for classification purposes, relatively minor traits of particular interest to humans. Our dynamic network-based organism classification is independent of the arbitrary pairwise organism similarity cut-offs traditionally applied to establish taxonomic identity. Instead, it reveals natural, functionally defined organism groupings and is thus robust in handling organism diversity. Additionally, fusion can use organism meta-data to highlight the specific environmental factors that drive microbial diversification. Our approach provides a complementary view to cladistic assignments and holds important clues for further exploration of microbial lifestyles. Fusion is a more practical fit for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment and are less concerned with

  6. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

    Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.

  7. Fast-HPLC Fingerprinting to Discriminate Olive Oil from Other Edible Vegetable Oils by Multivariate Classification Methods.

    Science.gov (United States)

    Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis

    2017-03-01

    A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.

  8. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  9. Nonlocal synchronization in nearest neighbour coupled oscillators

    International Nuclear Information System (INIS)

    El-Nashar, H.F.; Elgazzar, A.S.; Cerdeira, H.A.

    2002-02-01

    We investigate a system of nearest neighbour coupled oscillators. We show that the nonlocal frequency synchronization that might appear in such a system occurs as a consequence of the nearest neighbour coupling. The power spectra of nonadjacent oscillators show that there is no complete coincidence between all frequency peaks of the oscillators in the nonlocal cluster, while the peaks for neighbouring oscillators approximately coincide even if they are not yet in a cluster. It is shown that nonadjacent oscillators closer in frequency share slow modes with their adjacent oscillators, which are neighbours in space. It is also shown that when a direct coupling between non-neighbouring oscillators is introduced explicitly, the peaks of the frequency spectra of those non-neighbours coincide. (author)

  10. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    Science.gov (United States)

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.

  11. Diagnostic radiology in the nearest future

    International Nuclear Information System (INIS)

    Lindenbraten, L.D.

    1984-01-01

    Basic trends in the development of diagnostic radiology (DR) in the near future are formulated. Prospective ways and means of DR studies are described. The problems of strategy, tactics, and organization of the diagnostic radiological service are considered. An attempt has been made to outline the professional image of a specialist in the DR of the future. It is shown that predicting the future development of DR is a planning stage of the present, the choice of the right way of development.

  12. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  13. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  14. Exact Cross-Validation for kNN and applications to passive and active learning in classification

    OpenAIRE

    Célisse, Alain; Mary-Huard, Tristan

    2011-01-01

    In the binary classification framework, a closed form expression of the cross-validation Leave-p-Out (LpO) risk estimator for the k Nearest Neighbor algorithm (kNN) is derived. It is first used to study the LpO risk minimization strategy for choosing k in the passive learning setting. The impact of p on the choice of k and the LpO estimation of the risk are inferred. In the active learning setting, a procedure is proposed that selects new examples using a LpO committee of kNN classifiers. The...
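
    As an illustration of the quantity being estimated, the sketch below computes the leave-one-out risk (the p = 1 case of LpO) of a kNN classifier by brute force and compares two values of k. It does not use the closed-form expression derived in the paper; the function names, the one-dimensional toy data, and the absolute-difference distance are assumptions made for this example.

```python
from collections import Counter


def knn_label(train, labels, query, k):
    """Majority label among the k training points closest to `query` (1-D data)."""
    order = sorted(range(len(train)), key=lambda i: abs(train[i] - query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_risk(points, labels, k):
    """Brute-force leave-one-out misclassification rate of kNN (LpO with p = 1)."""
    errors = 0
    for i in range(len(points)):
        # Hold out point i and classify it from all remaining points.
        rest = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        errors += knn_label(rest, rest_labels, points[i], k) != labels[i]
    return errors / len(points)


# Hypothetical toy data: two well-separated groups plus one noisy label at 2.5.
points = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0, 2.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

# Choose k by minimizing the estimated risk over odd candidates (avoids vote ties).
best_k = min((1, 3), key=lambda k: loo_risk(points, labels, k))
```

    On this toy data the noisy point misleads its two neighbors when k = 1, so the larger k has the lower estimated risk. A brute-force LpO with p > 1 would need to enumerate every subset of size p, which is the cost a closed-form expression avoids.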

  15. Neighboring and Urbanism: Commonality versus Friendship.

    Science.gov (United States)

    Silverman, Carol J.

    1986-01-01

    Examines a dimension of neighboring that need not assume friendship as the role model. When the model assumes only a sense of connectedness as defining neighboring, then the residential correlation, shown in many studies between urbanism and neighboring, disappears. Theories of neighboring, study variables, methods, and analysis are discussed.…

  16. Random ensemble learning for EEG classification.

    Science.gov (United States)

    Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid

    2018-01-01

    Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. ALIGNMENTS OF GROUP GALAXIES WITH NEIGHBORING GROUPS

    International Nuclear Information System (INIS)

    Wang Yougang; Chen Xuelei; Park, Changbom; Yang Xiaohu; Choi, Yun-Young

    2009-01-01

    Using a sample of galaxy groups found in the Sloan Digital Sky Survey Data Release 4, we measure the following four types of alignment signals: (1) the alignment between the distributions of the satellites of each group relative to the direction of the nearest neighbor group (NNG); (2) the alignment between the major axis direction of the central galaxy of the host group (HG) and the direction of the NNG; (3) the alignment between the major axes of the central galaxies of the HG and the NNG; and (4) the alignment between the major axes of the satellites of the HG and the direction of the NNG. We find a strong signal of alignment between the satellite distribution and the orientation of the central galaxy relative to the direction of the NNG, even when the NNG is located beyond 3r_vir of the host group. The major axis of the central galaxy of the HG is aligned with the direction of the NNG. The alignment signals are more prominent for groups that are more massive and with early-type central galaxies. We also find that there is a preference for the two major axes of the central galaxies of the HG and NNG to be parallel for systems with both early-type central galaxies, but not for systems with both late-type central galaxies. For the orientation of satellite galaxies, we do not find any significant alignment signals relative to the direction of the NNG. From these four types of alignment measurements, we conclude that the large-scale environment traced by the nearby group affects primarily the shape of the host dark matter halo, and hence also affects the distribution of satellite galaxies and the orientation of central galaxies. In addition, the NNG directly affects the distribution of the satellite galaxies by inducing asymmetric alignment signals, and the NNG at very small separation may also contribute a second-order impact on the orientation of the central galaxy in the HG.

  18. Velocity correlations and spatial dependencies between neighbors in a unidirectional flow of pedestrians

    Science.gov (United States)

    Porzycki, Jakub; Wąs, Jarosław; Hedayatifar, Leila; Hassanibesheli, Forough; Kułakowski, Krzysztof

    2017-08-01

    The aim of the paper is an analysis of self-organization patterns observed in the unidirectional flow of pedestrians. On the basis of experimental data from Zhang et al. [J. Zhang et al., J. Stat. Mech. (2011) P06004, 10.1088/1742-5468/2011/06/P06004], we analyze the mutual positions and velocity correlations between pedestrians when walking along a corridor. The angular and spatial dependencies of the mutual positions reveal a spatial structure that remains stable during the crowd motion. This structure differs depending on the value of n, for the consecutive nth-nearest-neighbor position set. The preferred position for the first-nearest neighbor is on the side of the pedestrian, while for further neighbors, this preference shifts to the axis of movement. The velocity correlations vary with the angle formed by the pair of neighboring pedestrians and the direction of motion and with the time delay between pedestrians' movements. The delay dependence of the correlations shows characteristic oscillations, produced by the velocity oscillations when striding; however, a filtering of the main frequency of individual striding out reduces the oscillations only partially. We conclude that pedestrians select their path directions so as to evade the necessity of continuously adjusting their speed to their neighbors'. They try to keep a given distance, but follow the person in front of them, as well as accepting and observing pedestrians on their sides. Additionally, we show an empirical example that illustrates the shape of a pedestrian's personal space during movement.

  19. Classification in medical images using adaptive metric k-NN

    Science.gov (United States)

    Chen, C.; Chernoff, K.; Karemore, G.; Lo, P.; Nielsen, M.; Lauze, F.

    2010-03-01

    The performance of the k-nearest neighborhoods (k-NN) classifier is highly dependent on the distance metric used to identify the k nearest neighbors of the query points. The standard Euclidean distance is commonly used in practice. This paper investigates the performance of the k-NN classifier with respect to different adaptive metrics in the context of medical imaging. We propose using adaptive metrics such that the structure of the data is better described, introducing some unsupervised learning knowledge in k-NN. Four different metrics are investigated: a theoretical metric based on the assumption that images are drawn from the Brownian Image Model (BIM), a normalized metric based on the variance of the data, an empirical metric based on the empirical covariance matrix of the unlabeled data, and an optimized metric obtained by minimizing the classification error. The spectral structure of the empirical covariance also leads to Principal Component Analysis (PCA) performed on it, which results in the subspace metrics. The metrics are evaluated on two data sets: lateral X-rays of the lumbar aortic/spine region, where we use k-NN for performing abdominal aorta calcification detection; and mammograms, where we use k-NN for breast cancer risk assessment. The results show that an appropriate choice of metric can improve classification.
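
    To make the dependence on the metric concrete, the sketch below compares plain Euclidean k-NN with a variance-normalized variant in which each squared feature difference is divided by that feature's variance. This is one plausible reading of a "normalized metric based on variance", not the paper's implementation; the function names and the toy data are assumptions.

```python
from collections import Counter
from statistics import pvariance


def variance_weights(train):
    """One weight per feature: the inverse of its population variance.

    Assumes every feature varies (a constant feature would divide by zero).
    """
    return [1.0 / pvariance(column) for column in zip(*train)]


def knn_predict(train, labels, query, k=1, weights=None):
    """k-NN vote under a per-feature weighted squared Euclidean distance."""
    if weights is None:
        weights = [1.0] * len(query)  # plain Euclidean metric

    def sq_dist(x):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query))

    nearest = sorted(range(len(train)), key=lambda i: sq_dist(train[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes, while feature 1 is a
# large-scale nuisance feature that dominates the unweighted distance.
train = [(1.0, 1000.0), (1.2, 5000.0), (3.0, 1100.0), (3.2, 4900.0)]
labels = ["a", "a", "b", "b"]
query = (1.1, 1090.0)

plain = knn_predict(train, labels, query, k=1)
normalized = knn_predict(train, labels, query, k=1,
                         weights=variance_weights(train))
```

    With the unweighted metric the query is matched on the nuisance feature and labeled "b"; after variance normalization the informative feature decides and the label is "a".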

  20. Nearest Neighbour Corner Points Matching Detection Algorithm

    Directory of Open Access Journals (Sweden)

    Zhang Changlong

    2015-01-01

    Full Text Available Accurate corner detection plays an important part in camera calibration. To deal with the instability and inaccuracy of present corner detection algorithms, a nearest neighbour corner matching detection algorithm is proposed. First, it dilates the binary image of the photographed picture, then searches for and retains the quadrilateral outlines in the image. Second, the blocks that accord with chessboard corners are grouped into a class. If a class contains too many blocks, it is deleted; if not, it is added, and the midpoint of the two vertex coordinates is taken as the rough position of the corner. Finally, it precisely locates the position of the corners. The experimental results show that the algorithm has obvious advantages in accuracy and validity in corner detection, and that it can provide a reliable basis for camera calibration in traffic accident measurement.

  1. Electronic Nose Odor Classification with Advanced Decision Tree Structures

    Directory of Open Access Journals (Sweden)

    S. Guney

    2013-09-01

    Full Text Available Electronic nose (e-nose) is an electronic device which can measure chemical compounds in air and consequently classify different odors. In this paper, an e-nose device consisting of 8 different gas sensors was designed and constructed. Using this device, 104 different experiments involving 11 different odor classes (moth, angelica root, rose, mint, polis, lemon, rotten egg, egg, garlic, grass, and acetone) were performed. The main contribution of this paper is the finding that using the chemical domain knowledge it is possible to train an accurate odor classification system. The domain knowledge about chemical compounds is represented by a decision tree whose nodes are composed of classifiers such as Support Vector Machines and k-Nearest Neighbor. The overall accuracy achieved with the proposed algorithm and the constructed e-nose device was 97.18%. Training and testing data sets used in this paper are published online.

  2. Voting-based Classification for E-mail Spam Detection

    Directory of Open Access Journals (Sweden)

    Bashar Awad Al-Shboul

    2016-06-01

    Full Text Available The problem of spam e-mail has gained a tremendous amount of attention. Although entities tend to use e-mail spam filter applications to filter out received spam e-mails, marketing companies still tend to send unsolicited e-mails in bulk and users still receive a reasonable amount of spam e-mail despite those filtering applications. This work proposes a new method for classifying e-mails into spam and non-spam. First, several e-mail content features are extracted and then those features are used for classifying each e-mail individually. The classification results of three different classifiers (i.e. Decision Trees, Random Forests and k-Nearest Neighbor) are combined in various voting schemes (i.e. majority vote, average probability, product of probabilities, minimum probability and maximum probability) for making the final decision. To validate our method, two different spam e-mail collections were used.
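
    The five voting schemes named above can be sketched as follows, assuming each classifier outputs a probability that an e-mail is spam. The function name, the tie-breaking towards non-spam, and the example probabilities are assumptions for illustration and are not taken from the paper.

```python
from math import prod


def combine(spam_probs, scheme="majority"):
    """Fuse per-classifier P(spam) values into a single label.

    spam_probs: one probability per classifier (hypothetical inputs).
    scheme: "majority", "average", "product", "minimum" or "maximum".
    Ties are resolved as "non-spam" (an arbitrary choice for this sketch).
    """
    ham_probs = [1.0 - p for p in spam_probs]
    if scheme == "majority":
        # Each classifier casts one vote for the class it finds more likely.
        spam_score = sum(p > 0.5 for p in spam_probs)
        ham_score = len(spam_probs) - spam_score
    else:
        rules = {
            "average": lambda values: sum(values) / len(values),
            "product": prod,
            "minimum": min,
            "maximum": max,
        }
        rule = rules[scheme]
        # Apply the same rule to both classes and compare the fused scores.
        spam_score = rule(spam_probs)
        ham_score = rule(ham_probs)
    return "spam" if spam_score > ham_score else "non-spam"


# One confident classifier against two mildly dissenting ones.
example = [0.95, 0.4, 0.4]
decisions = {s: combine(example, s)
             for s in ("majority", "average", "product", "minimum", "maximum")}
```

    In this example the majority vote returns non-spam because two of the three classifiers lean that way, while the four probability-based rules return spam because they take the confidence of the first classifier into account.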

  3. Automatic music genres classification as a pattern recognition problem

    Science.gov (United States)

    Ul Haq, Ihtisham; Khan, Fauzia; Sharif, Sana; Shaukat, Arsalan

    2013-12-01

    Music genres are the simplest and most effective descriptors for searching music libraries, stores, or catalogues. The paper compares the results of two automatic music genre classification systems implemented using two different yet simple classifiers (K-Nearest Neighbor and Naïve Bayes). First a 10-12 second sample is selected and features are extracted from it; then, based on those features, the results of both classifiers are presented in the form of an accuracy table and a confusion matrix. An experiment carried out on 60 test samples indicates that a sample taken from the middle of a song represents the true essence of its genre better than samples taken from the beginning or ending of a song. The novel techniques achieved accuracies of 91% and 78% using the Naïve Bayes and KNN classifiers, respectively.

  4. Sistem Klasifikasi Kualitas Kopra Berdasarkan Warna dan Tekstur Menggunakan Metode Nearest Mean Classifier (NMC)

    Directory of Open Access Journals (Sweden)

    Abdullah Abdullah

    2017-12-01

    Computer-assisted classification of copra quality using image processing can help to speed up human work. Data mining techniques can be used to classify copra quality based on RGB color (red, green, blue) and texture (energy, contrast, correlation, homogeneity). The problem is the difficulty of predicting the quality of copra as grade A (80-85%), grade B (70-75%), or grade C (60-65%). The purpose of this study is to develop an application for classifying copra quality based on color and texture. The method used is the nearest mean classifier (NMC). Before classification, preprocessing performs background subtraction using the pixel subtraction method to separate the image of the object from the background. The benefits of this research are that it can save time in classifying the quality of copra and can facilitate the determination of the copra price. Evaluation using cross-validation gave an average accuracy of 80.67% with a standard deviation of 1.17%.  Keywords: classification, image, copra, nearest mean classifier, pixel subtraction, RGB color, texture
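
    A nearest mean classifier of the kind named above can be sketched in a few lines: each class is summarized by the mean of its training feature vectors, and a new sample receives the label of the closest mean. The two-element feature vectors, grade labels, and function names below are hypothetical; the paper's own features are RGB color and texture statistics.

```python
from math import dist


def fit_class_means(samples, labels):
    """Return {label: mean feature vector} computed from the training data."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(means, x):
    """Assign x to the class whose mean is nearest in Euclidean distance."""
    return min(means, key=lambda y: dist(means[y], x))


# Hypothetical feature vectors for two quality grades.
samples = [[0, 0], [2, 0], [10, 10], [12, 10]]
labels = ["A", "A", "B", "B"]
means = fit_class_means(samples, labels)
grade = predict(means, [3, 1])
```

    Because only one mean per class is stored, prediction cost grows with the number of classes rather than the number of training samples, which is the main practical difference from k-nearest neighbor classification.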

  5. Comparison of models of automatic classification of textural patterns of mineral presents in Colombian coals

    International Nuclear Information System (INIS)

    Lopez Carvajal, Jaime; Branch Bedoya, John Willian

    2005-01-01

    The automatic classification of objects is a very interesting approach in several problem domains. This paper outlines some results obtained with different classification models to categorize textural patterns of minerals using real digital images. The data set used was characterized by its small size and the presence of noise. The implemented models were the Bayesian classifier, neural network (2-5-1), support vector machine, decision tree and 3-nearest neighbors. The results after applying cross-validation show that the Bayesian model (84%) had better predictive capacity than the others, mainly due to its robustness to noise. The neural network (68%) and the SVM (67%) gave promising results, because they could be improved by increasing the amount of data used, while the decision tree (55%) and K-NN (54%) did not seem to be adequate for this problem, because of their sensitivity to noise.
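
    For orientation, a 3-nearest-neighbors rule like the one compared in this abstract can be sketched as a majority vote over the three closest training samples. The feature values and class names below are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional texture features for two mineral classes.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.85)]
train_y = ["class_a", "class_a", "class_a",
           "class_b", "class_b", "class_b"]

label = knn_predict(train_x, train_y, (0.2, 0.2), k=3)
```

    With a small k, a single mislabeled or noisy neighbor can flip the vote, which is consistent with the noise sensitivity the abstract reports for K-NN.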

  6. Staining pattern classification of antinuclear autoantibodies based on block segmentation in indirect immunofluorescence images.

    Directory of Open Access Journals (Sweden)

    Jiaqian Li

    Indirect immunofluorescence based on HEp-2 cell substrate is the most commonly used staining method for antinuclear autoantibodies associated with different types of autoimmune pathologies. The aim of this paper is to design an automatic system to identify the staining patterns based on block segmentation, in contrast to the cell segmentation most often used in previous research. Various feature descriptors and classifiers are tested and compared in the classification of the staining pattern of blocks, and it is found that the combination of the local binary pattern and the k-nearest neighbor algorithm achieves the best performance. Relying on the results of block pattern classification, experiments on the whole images show that classifier fusion rules are able to identify the staining patterns of the whole well (specimen image) with a total accuracy of about 94.62%.
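
    The local binary pattern descriptor mentioned here can be illustrated with its basic 3x3 form. The abstract does not state which LBP variant the authors used, so the following is only a sketch of the textbook version; the neighbor ordering and the tiny example block are assumptions.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbor local binary pattern code for the pixel at (r, c)."""
    center = img[r][c]
    # Neighbors visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Hypothetical 3x3 grayscale block with a single interior pixel.
block = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
hist = lbp_histogram(block)
```

    A histogram like this, computed per image block, is the kind of feature vector that could then be passed to a k-nearest neighbor classifier.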

  7. Measurement of near neighbor separations of surface atoms

    International Nuclear Information System (INIS)

    Cohen, P.I.

    Two techniques are being developed to measure the nearest neighbor distances of atoms at the surfaces of solids. Both measure extended fine structure in the excitation probability of core level electrons which are excited by an incident electron beam. This is an important problem because the structures of most surface systems are as yet unknown, even though the location of surface atoms is the basis for any quantitative understanding of the chemistry and physics of surfaces and interfaces. These methods would allow any laboratory to make in situ determinations of surface structure in conjunction with most other laboratory probes of surfaces. Each of these two techniques has different advantages; further, the combination of the two will increase confidence in the results by reducing systematic error in the data analysis.

  8. Supervised Classification of Agricultural Land Cover Using a Modified k-NN Technique (MNN) and Landsat Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Karsten Schulz

    2009-11-01

    Nearest neighbor techniques are commonly used in remote sensing, pattern recognition and statistics to classify objects into a predefined number of categories based on a given set of predictors. These techniques are especially useful for highly nonlinear relationships between the variables. In most studies the distance measure is adopted a priori. In contrast, we propose a general procedure to find an adaptive metric that combines a local variance reducing technique and a linear embedding of the observation space into an appropriate Euclidean space. To illustrate the application of this technique, two agricultural land cover classifications using mono-temporal and multi-temporal Landsat scenes are presented. The results of the study, compared with standard approaches used in remote sensing such as maximum likelihood (ML) or k-Nearest Neighbor (k-NN), indicate substantial improvement with regard to the overall accuracy and the cardinality of the calibration data set. Also, using MNN in a soft/fuzzy classification framework proved to be a very useful tool for deriving critical areas that need further attention and investment concerning additional calibration data.

  9. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to confront the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model that we propose, this paper presents a novel high accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.

  10. Image classification independent of orientation and scale

    Science.gov (United States)

    Arsenault, Henri H.; Parent, Sebastien; Moisan, Sylvain

    1998-04-01

    The recognition of targets independently of orientation has become fairly well developed in recent years for in-plane rotation. The out-of-plane rotation problem is much less advanced. When both out-of-plane rotations and changes of scale are present, the problem becomes very difficult. In this paper we describe our research on the combined out-of-plane rotation problem and the scale invariance problem. The rotations were limited to rotations about an axis perpendicular to the line of sight. The objects to be classified were three kinds of military vehicles. The inputs used were infrared imagery and photographs. We used a variation of a method proposed by Neiberg and Casasent, where a neural network is trained with a subset of the database and minimum distances from lines in feature space are used for classification instead of nearest neighbors. Each line in the feature space corresponds to one class of objects, and points on one line correspond to different orientations of the same target. We found that the training samples needed to be closer for some orientations than for others, and that the most difficult orientations are where the target is head-on to the observer. By means of some additional training of the neural network, we were able to achieve 100% correct classification for 360 degree rotation and a range of scales over a factor of five.

  11. Analysis and implementation of cross lingual short message service spam filtering using graph-based k-nearest neighbor

    Science.gov (United States)

    Ayu Cyntya Dewi, Dyah; Shaufiah; Asror, Ibnu

    2018-03-01

    SMS (Short Message Service) is one of the communication services that remains a main choice, even though phones now offer various applications. Along with the development of various other communication media, some countries lowered SMS rates to keep the interest of mobile users. This resulted in an increase in spam SMS, which is used by several parties for purposes such as advertising. Given the multilingual documents found in SMS messages, on the Web, and elsewhere, effective multilingual or cross-lingual processing techniques are becoming increasingly important. In this research, the data/messages are first preprocessed and then represented as a graph model, which is then scored using the GKNN method. The maximum accuracy obtained is 98.86, with training data in the Indonesian language and testing data in the Indonesian language, with K = 10 and a threshold of 0.001.

  12. Nonseparable dynamic nearest neighbor Gaussian process models for large spatio-temporal data with an application to particulate matter analysis

    NARCIS (Netherlands)

    Datta, A.; Banerjee, S.; Finley, A.O.; Hamm, N.A.S.; Schaap, M.

    2016-01-01

    Particulate matter (PM) is a class of malicious environmental pollutants known to be detrimental to human health. Regulatory efforts aimed at curbing PM levels in different countries often require high resolution space–time maps that can identify red-flag regions exceeding statutory concentration

  13. Nearest-neighbor Kitaev exchange blocked by charge order in electron doped $\alpha$-RuCl$_{3}$

    OpenAIRE

    Koitzsch, A.; Habenicht, C.; Mueller, E.; Knupfer, M.; Buechner, B.; Kretschmer, S.; Richter, M.; Brink, J. van den; Boerrnert, F.; Nowak, D.; Isaeva, A.; Doert, Th.

    2017-01-01

    A quantum spin-liquid might be realized in $\alpha$-RuCl$_{3}$, a honeycomb-lattice magnetic material with substantial spin-orbit coupling. Moreover, $\alpha$-RuCl$_{3}$ is a Mott insulator, which implies the possibility that novel exotic phases occur upon doping. Here, we study the electronic structure of this material when intercalated with potassium by photoemission spectroscopy, electron energy loss spectroscopy, and density functional theory calculations. We obtain a stable stoichiometry...

  14. Global 30m Height Above the Nearest Drainage

    Science.gov (United States)

    Donchyts, Gennadii; Winsemius, Hessel; Schellekens, Jaap; Erickson, Tyler; Gao, Hongkai; Savenije, Hubert; van de Giesen, Nick

    2016-04-01

    Variability of the Earth's surface is the primary characteristic affecting the flow of surface and subsurface water. Digital elevation models, usually represented as height maps above some well-defined vertical datum, are widely used to compute hydrologic parameters such as local flow directions, drainage area, drainage network pattern, and many others. Usually, it requires a significant effort to derive these parameters at a global scale. One hydrological characteristic introduced in the last decade is Height Above the Nearest Drainage (HAND): a digital elevation model normalized using nearest drainage. This parameter has been shown to be useful for many hydrological and more general purpose applications, such as landscape hazard mapping, landform classification, remote sensing and rainfall-runoff modeling. One of the essential characteristics of HAND is its ability to capture heterogeneities in local environments that are difficult to measure or model otherwise. While many applications of HAND have been published in the academic literature, no studies analyze its variability on a global scale, especially using higher resolution DEMs, such as the new, one arc-second (approximately 30m) resolution version of SRTM. In this work, we will present the first global version of HAND computed using a mosaic of two DEMs: 30m SRTM and Viewfinderpanorama DEM (90m). The lower resolution DEM was used to cover latitudes above 60 degrees north and below 56 degrees south where SRTM is not available. We compute HAND using the unmodified version of the input DEMs to ensure consistency with the original elevation model. We have parallelized processing by generating a homogenized, equal-area version of HydroBASINS catchments. The resulting catchment boundaries were used to perform processing using 30m resolution DEM. To compute HAND, a new version of D8 local drainage directions as well as flow accumulation were calculated. The latter was used to estimate river head by incorporating fixed and

  15. Optimization of Classification Strategies of Acetowhite Temporal Patterns towards Improving Diagnostic Performance of Colposcopy

    Directory of Open Access Journals (Sweden)

    Karina Gutiérrez-Fragoso

    2017-01-01

    Efforts have been made to improve the diagnostic performance of colposcopy, trying to help better diagnose cervical cancer, particularly in developing countries. However, improvements in a number of areas are still necessary, such as the time it takes to process the full digital image of the cervix, the performance of the computing systems used to identify different kinds of tissues, and biopsy sampling. In this paper, we explore three different, well-known automatic classification methods (k-Nearest Neighbors, Naïve Bayes, and C4.5), in addition to different data models that take full advantage of this information and improve the diagnostic performance of colposcopy based on acetowhite temporal patterns. Based on the ROC and PRC area scores, the k-Nearest Neighbors and discrete PLA representation performed better than other methods. The values of sensitivity, specificity, and accuracy reached using this method were 60% (95% CI 50–70), 79% (95% CI 71–86), and 70% (95% CI 60–80), respectively. The acetowhitening phenomenon is not exclusive to high-grade lesions, and we have found acetowhite temporal patterns of epithelial changes that are not precancerous lesions but that are similar to positive ones. These findings need to be considered when developing more robust computing systems in the future.
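
    Sensitivity, specificity, and accuracy of the kind reported here are computed from the four cells of a binary confusion matrix. A minimal sketch follows; the counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of true positives that were detected
    specificity = tn / (tn + fp)   # share of true negatives that were rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec, acc = binary_metrics(tp=30, fn=20, tn=40, fp=10)
```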

  16. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in data that are of high dimensionality with a small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  17. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in data that are of high dimensionality with a small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  18. Automatic target classification of man-made objects in synthetic aperture radar images using Gabor wavelet and neural network

    Science.gov (United States)

    Vasuki, Perumal; Roomi, S. Mohamed Mansoor

    2013-01-01

    Processing of synthetic aperture radar (SAR) images has led to the development of automatic target classification approaches. These approaches help to classify individual and mass military ground vehicles. This work aims to develop an automatic target classification technique to classify military targets such as trucks, tanks, armored cars, cannons, and bulldozers. The proposed method consists of three stages: preprocessing, feature extraction, and neural network (NN) classification. The first stage removes speckle noise in a SAR image by the identified Frost filter and enhances the image by histogram equalization. The second stage uses a Gabor wavelet to extract the image features. The third stage classifies the target by an NN classifier using image features. The proposed work performs better than its counterparts, such as K-nearest neighbor (KNN). On databases such as moving and stationary target acquisition and recognition, the proposed work performs better than the earlier KNN-based methods.

  19. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    Science.gov (United States)

    Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy in classifying basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159

  20. A comparative study of PCA, SIMCA and Cole model for classification of bioimpedance spectroscopy measurements.

    Science.gov (United States)

    Nejadgholi, Isar; Bolic, Miodrag

    2015-08-01

    Due to safety and low cost of bioimpedance spectroscopy (BIS), classification of BIS can be potentially a preferred way of detecting changes in living tissues. However, for longitudinal datasets linear classifiers fail to classify conventional Cole parameters extracted from BIS measurements because of their high variability. In some applications, linear classification based on Principal Component Analysis (PCA) has shown more accurate results. Yet, these methods have not been established for BIS classification, since PCA features have neither been investigated in combination with other classifiers nor have been compared to conventional Cole features in benchmark classification tasks. In this work, PCA and Cole features are compared in three synthesized benchmark classification tasks which are expected to be detected by BIS. These three tasks are classification of before and after geometry change, relative composition change and blood perfusion in a cylindrical organ. Our results show that in all tasks the features extracted by PCA are more discriminant than Cole parameters. Moreover, a pilot study was done on a longitudinal arm BIS dataset including eight subjects and three arm positions. The goal of the study was to compare different methods in arm position classification which includes all three synthesized changes mentioned above. Our comparative study on various classification methods shows that the best classification accuracy is obtained when PCA features are classified by a K-Nearest Neighbors (KNN) classifier. The results of this work suggest that PCA+KNN is a promising method to be considered for classification of BIS datasets that deal with subject and time variability. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Applying cost-sensitive classification for financial fraud detection under high class-imbalance

    CSIR Research Space (South Africa)

    Moepya, SO

    2014-12-01

    ... sensitivity, specificity, recall and precision using PCA and Factor Analysis. Weighted Support Vector Machines (SVM) were shown to be superior to the cost-sensitive Naive Bayes (NB) and K-Nearest Neighbors classifiers....

  2. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    Directory of Open Access Journals (Sweden)

    Jiangyuan Mei

    2014-01-01

    the basis of classifiers, for example, the k-nearest neighbors classifier. Experiments on benchmark data sets demonstrate that the proposed algorithm compares favorably with the state-of-the-art methods.

  3. Discovery of Nearest Known Brown Dwarf

    Science.gov (United States)

    2003-01-01

    Bright Southern Star Epsilon Indi Has Cool, Substellar Companion [1]. Summary: A team of European astronomers [2] has discovered a Brown Dwarf object (a 'failed' star) less than 12 light-years from the Sun. It is the nearest yet known. Now designated Epsilon Indi B, it is a companion to a well-known bright star in the southern sky, Epsilon Indi (now "Epsilon Indi A"), previously thought to be single. The binary system is one of the twenty nearest stellar systems to the Sun. The brown dwarf was discovered from the comparatively rapid motion across the sky which it shares with its brighter companion: the pair move a full lunar diameter in less than 400 years. It was first identified using digitised archival photographic plates from the SuperCOSMOS Sky Surveys (SSS) and confirmed using data from the Two Micron All Sky Survey (2MASS). Follow-up observations with the near-infrared sensitive SOFI instrument on the ESO 3.5-m New Technology Telescope (NTT) at the La Silla Observatory confirmed its nature and have allowed measurements of its physical properties. Epsilon Indi B has a mass just 45 times that of Jupiter, the largest planet in the Solar System, and a surface temperature of only 1000 °C. It belongs to the so-called 'T dwarf' category of objects which straddle the domain between stars and giant planets. Epsilon Indi B is the nearest and brightest T dwarf known. Future studies of the new object promise to provide astronomers with important new clues as to the formation and evolution of these exotic celestial bodies, at the same time yielding interesting insights into the border zone between planets and stars. Tiny Moving Needles in Giant Haystacks. Caption: PR Photo 03a/03 shows Epsilon Indi A (the bright star at far right) and its newly discovered brown dwarf companion Epsilon Indi B (circled). The upper image comes from one of the SuperCOSMOS Sky

  4. Machine Learning Classification of Buildings for Map Generalization

    Directory of Open Access Journals (Sweden)

    Jaeeun Lee

    2017-10-01

    A critical problem in mapping data is the frequent updating of large data sets. To solve this problem, the updating of small-scale data based on large-scale data is very effective. Various map generalization techniques, such as simplification, displacement, typification, elimination, and aggregation, must therefore be applied. In this study, we focused on the elimination and aggregation of the building layer, for which each building in a large scale was classified as “0-eliminated,” “1-retained,” or “2-aggregated.” Machine-learning classification algorithms were then used for classifying the buildings. The data of 1:1000 scale and 1:25,000 scale digital maps obtained from the National Geographic Information Institute were used. We applied to these data various machine-learning classification algorithms, including naive Bayes (NB), decision tree (DT), k-nearest neighbor (k-NN), and support vector machine (SVM). The overall accuracies of each algorithm were satisfactory: DT, 88.96%; k-NN, 88.27%; SVM, 87.57%; and NB, 79.50%. Although elimination is a direct part of the proposed process, generalization operations, such as simplification and aggregation of polygons, must still be performed for buildings classified as retained and aggregated. Thus, these algorithms can be used for building classification and can serve as preparatory steps for building generalization.

  5. Application of texture analysis method for mammogram density classification

    Science.gov (United States)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammograms. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. The extracted features are then selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram in two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed, namely Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than the LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  6. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to use a new texton based texture classification method in the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. In contrast to conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper the dictionary of textons is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture representations. Finally, classification is performed by using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of the state-of-the-art method based on the basic rotation invariant local binary pattern histograms and that of the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.

  7. Application of In-Segment Multiple Sampling in Object-Based Classification

    Directory of Open Access Journals (Sweden)

    Nataša Đurić

    2014-12-01

    When object-based analysis is applied to very high-resolution imagery, pixels within the segments reveal large spectral inhomogeneity; their distribution can be considered complex rather than normal. When normality is violated, the classification methods that rely on the assumption of normally distributed data are not as successful or accurate. It is hard to detect normality violations in small samples. The segmentation process produces segments that vary highly in size; samples can be very big or very small. This paper investigates whether the complexity within the segment can be addressed using multiple random sampling of segment pixels and multiple calculations of similarity measures. In order to analyze the effect sampling has on classification results, statistics and probability value equations of non-parametric two-sample Kolmogorov-Smirnov test and parametric Student’s t-test are selected as similarity measures in the classification process. The performance of both classifiers was assessed on a WorldView-2 image for four land cover classes (roads, buildings, grass and trees) and compared to two commonly used object-based classifiers: k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM). Both proposed classifiers showed a slight improvement in the overall classification accuracies and produced more accurate classification maps when compared to the ground truth image.
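
    The idea of repeatedly sampling segment pixels and scoring each draw with a two-sample Kolmogorov-Smirnov statistic can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: it averages the KS statistic only (the paper also uses probability values and a t-test), and the function names, draw counts, and pixel values are hypothetical.

```python
import random
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in a + b:
        fa = bisect_right(a, v) / len(a)
        fb = bisect_right(b, v) / len(b)
        d = max(d, abs(fa - fb))
    return d

def classify_segment(segment_pixels, class_samples,
                     n_draws=20, sample_size=30, seed=0):
    """Pick the class with the smallest summed KS statistic over random draws."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in class_samples}
    for _ in range(n_draws):
        draw = rng.sample(segment_pixels,
                          min(sample_size, len(segment_pixels)))
        for name, reference in class_samples.items():
            totals[name] += ks_statistic(draw, reference)
    return min(totals, key=totals.get)

# Hypothetical single-band pixel values for two reference classes.
class_samples = {"grass": list(range(40, 60)),
                 "roads": list(range(150, 170))}
segment = list(range(45, 65))
label = classify_segment(segment, class_samples)
```

    A smaller KS statistic means the two samples have more similar distributions, so the class with the lowest total is treated as the best match.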

  8. Supervised Self-Organizing Classification of Superresolution ISAR Images: An Anechoic Chamber Experiment

    Directory of Open Access Journals (Sweden)

    Radoi Emanuel

    2006-01-01

    The problem of the automatic classification of superresolution ISAR images is addressed in the paper. We describe an anechoic chamber experiment involving ten scale-reduced aircraft models. The radar images of these targets are reconstructed using the MUSIC-2D (multiple signal classification) method coupled with two additional processing steps: phase unwrapping and symmetry enhancement. A feature vector is then proposed including Fourier descriptors and moment invariants, which are calculated from the target shape and the scattering center distribution extracted from each reconstructed image. The classification is finally performed by a new self-organizing neural network called SART (supervised ART), which is compared to two standard classifiers, MLP (multilayer perceptron) and fuzzy KNN (k-nearest neighbors). While the classification accuracy is similar, SART is shown to outperform the two other classifiers in terms of training speed and classification speed, especially for large databases. It is also easier to use since it does not require any input parameter related to its structure.

  9. Building an asynchronous web-based tool for machine learning classification.

    Science.gov (United States)

    Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila

    2002-01-01

    Various unsupervised and supervised learning methods including support vector machines, classification trees, linear discriminant analysis and nearest neighbor classifiers have been used to classify high-throughput gene expression data. Simpler and more widely accepted statistical tools have not yet been used for this purpose, hence proper comparisons between classification methods have not been conducted. We developed free software that implements logistic regression with stepwise variable selection as a quick and simple method for initial exploration of important genetic markers in disease classification. To implement the algorithm and allow our collaborators in remote locations to evaluate and compare its results against those of other methods, we developed a user-friendly asynchronous web-based application with a minimal amount of programming using free, downloadable software tools. With this program, we show that classification using logistic regression can perform as well as other more sophisticated algorithms, and it has the advantages of being easy to interpret and reproduce. By making the tool freely and easily available, we hope to promote the comparison of classification methods. In addition, we believe our web application can be used as a model for other bioinformatics laboratories that need to develop web-based analysis tools in a short amount of time and on a limited budget.

  10. Feasibility Study of Land Cover Classification Based on Normalized Difference Vegetation Index for Landslide Risk Assessment

    Directory of Open Access Journals (Sweden)

    Thilanki Dahigamuwa

    2016-10-01

    Full Text Available Unfavorable land cover leads to excessive damage from landslides and other natural hazards, whereas the presence of vegetation is expected to mitigate rainfall-induced landslide potential. Hence, unexpected and rapid changes in land cover due to deforestation would be detrimental in landslide-prone areas. Also, vegetation cover is subject to phenological variations and therefore, timely classification of land cover is an essential step in effective evaluation of landslide hazard potential. The work presented here investigates methods that can be used for land cover classification based on the Normalized Difference Vegetation Index (NDVI), derived from up-to-date satellite images, and the feasibility of application in landslide risk prediction. A major benefit of this method would be the eventual ability to employ NDVI as a stand-alone parameter for accurate assessment of the impact of land cover in landslide hazard evaluation. An added benefit would be the timely detection of undesirable practices such as deforestation using satellite imagery. A landslide-prone region in Oregon, USA is used as a model for the application of the classification method. Five selected classification techniques (k-nearest neighbor, Gaussian support vector machine (GSVM), artificial neural network, decision tree and quadratic discriminant analysis) support the viability of the NDVI-based land cover classification. Finally, its application in landslide risk evaluation is demonstrated.
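
    For orientation, NDVI is computed per pixel from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below pairs that formula with a one-dimensional k-nearest neighbor vote. The training values and class names in `TRAINING` are hypothetical placeholders, not data from the study, and the study's own feature set and classifier settings are not given in the abstract.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


# Hypothetical labelled NDVI values, for illustration only.
TRAINING = [
    (0.05, "bare"), (0.10, "bare"), (0.15, "bare"),
    (0.60, "forest"), (0.70, "forest"), (0.80, "forest"),
]


def knn_label(value, training=TRAINING, k=3):
    """Majority vote among the k training NDVI values closest to `value`."""
    nearest = sorted(training, key=lambda pair: abs(pair[0] - value))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

    A pixel with NIR reflectance 0.5 and red reflectance 0.1 has an NDVI of about 0.67 and falls in the hypothetical "forest" class here.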

  11. A semi-supervised classification algorithm using the TAD-derived background as training data

    Science.gov (United States)

    Fan, Lei; Ambeau, Brittany; Messinger, David W.

    2013-05-01

    In general, spectral image classification algorithms fall into one of two categories: supervised and unsupervised. In unsupervised approaches, the algorithm automatically identifies clusters in the data without a priori information about those clusters (except perhaps the expected number of them). Supervised approaches require an analyst to identify training data to learn the characteristics of the clusters such that they can then classify all other pixels into one of the pre-defined groups. The classification algorithm presented here is a semi-supervised approach based on the Topological Anomaly Detection (TAD) algorithm. The TAD algorithm defines background components based on a mutual k-Nearest Neighbor graph model of the data, along with a spectral connected components analysis. Here, the largest components produced by TAD are used as regions of interest (ROIs), or training data, for a supervised classification scheme. By combining those ROIs with a Gaussian Maximum Likelihood (GML) or a Minimum Distance to the Mean (MDM) algorithm, we are able to achieve a semi-supervised classification method. We test this classification algorithm against data collected by the HyMAP sensor over the Cooke City, MT area and the University of Pavia scene.
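
    A mutual k-nearest neighbor graph keeps an edge between two points only when each is among the other's k nearest neighbors; connected components of that graph then group dense regions. The sketch below shows only that generic construction on small point lists. It is not the TAD algorithm itself, and the function names and brute-force neighbor search are assumptions made for illustration.

```python
from math import dist


def knn_sets(points, k):
    """For each point index, the set of indices of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def mutual_knn_components(points, k):
    """Connected components of the mutual k-NN graph, largest component first."""
    nn = knn_sets(points, k)
    # Edge i-j exists only if j is a neighbor of i AND i is a neighbor of j.
    adjacency = {i: {j for j in nn[i] if i in nn[j]} for i in range(len(points))}
    seen, components = set(), []
    for start in range(len(points)):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            comp.append(node)
            for nb in adjacency[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(comp))
    return sorted(components, key=len, reverse=True)
```

    In this construction an isolated point ends up as its own single-node component, while the largest components are the natural candidates for background or training regions.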

  12. Classification in medical image analysis using adaptive metric k-NN

    DEFF Research Database (Denmark)

    Chen, Chen; Chernoff, Konstantin; Karemore, Gopal

    2010-01-01

    The performance of the k-nearest neighbors (k-NN) classifier is highly dependent on the distance metric used to identify the k nearest neighbors of the query points. The standard Euclidean distance is commonly used in practice. This paper investigates the performance of k-NN classifier...
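
    The dependence of k-NN on the distance metric can be seen with a classifier that takes the metric as a parameter. The sketch below is a generic illustration, not the adaptive metric studied in the paper: `weighted_euclidean` and the `SAMPLES` data are hypothetical, chosen so that a noisy second feature dominates the plain Euclidean distance.

```python
from collections import Counter
from math import sqrt


def euclidean(a, b):
    """Standard Euclidean distance between two feature tuples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_euclidean(weights):
    """Build a metric that scales each feature's squared difference by a weight."""
    def metric(a, b):
        return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return metric


def knn_classify(query, samples, k, metric=euclidean):
    """samples: list of (feature_tuple, label); majority vote over the k nearest."""
    nearest = sorted(samples, key=lambda s: metric(query, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes, feature 1 is large-scale noise.
SAMPLES = [
    ((1.0, 100.0), "A"), ((1.2, 0.0), "A"), ((1.1, 50.0), "A"),
    ((3.0, 10.0), "B"), ((3.1, 12.0), "B"), ((2.9, 8.0), "B"),
]
```

    With the query `(1.0, 10.0)` and `k=3`, the Euclidean metric votes "B" because the noisy feature dominates, while `weighted_euclidean((1.0, 0.0))` votes "A".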

  13. IMPROVING NEAREST NEIGHBOUR SEARCH IN 3D SPATIAL ACCESS METHOD

    Directory of Open Access Journals (Sweden)

    A. Suhaibaha

    2016-10-01

    Full Text Available Nearest Neighbour (NN) search is one of the important queries and analyses for spatial applications. In normal practice, a spatial access method structure is used during Nearest Neighbour query execution to retrieve information from the database. However, most spatial access method structures still face unresolved issues such as overlap among nodes and repetitive data entry. These issues cause excessive Input/Output (IO) operations, which makes data retrieval inefficient. The problem becomes more critical when dealing with 3D data, which is usually large due to its detailed geometry and other attached information. In this research, a clustered 3D hierarchical structure is introduced as a 3D spatial access method structure. The structure is expected to improve the retrieval of Nearest Neighbour information for 3D objects. Several tests are performed for Single Nearest Neighbour search and k Nearest Neighbour (kNN) search. The tests indicate that the clustered hierarchical structure handles Nearest Neighbour queries more efficiently than its competitor. The clustered hierarchical structure reduced repetitive data entry and the number of accessed pages, produced minimal Input/Output operations, and achieved a better query response time than the competitor. Several possible future applications of this research are discussed and summarized.

  14. Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2017-12-01

    Full Text Available Accurate solar photovoltaic (PV) power forecasting is an essential tool for mitigating the negative effects caused by the uncertainty of PV output power in systems with high penetration levels of solar PV generation. Weather classification based modeling is an effective way to increase the accuracy of day-ahead short-term (DAST) solar PV power forecasting because PV output power is strongly dependent on the specific weather conditions in a given time period. However, the accuracy of daily weather classification relies on both the applied classifiers and the training data. This paper aims to reveal how these two factors impact the classification performance and to delineate the relation between classification accuracy and sample dataset scale. Two commonly used classification methods, K-nearest neighbors (KNN) and support vector machines (SVM), are applied to classify the daily local weather types for DAST solar PV power forecasting using the operation data from a grid-connected PV plant in Hohhot, Inner Mongolia, China. We assessed the performance of SVM and KNN approaches, and then investigated the influences of sample scale, the number of categories, and the data distribution in different categories on the daily weather classification results. The simulation results illustrate that SVM performs well with small sample scale, while KNN is more sensitive to the length of the training dataset and can achieve higher accuracy than SVM with sufficient samples.

  15. The square Ising model with second-neighbor interactions and the Ising chain in a transverse field

    International Nuclear Information System (INIS)

    Grynberg, M.D.; Tanatar, B.

    1991-06-01

    We consider the thermal and critical behaviour of the square Ising lattice with frustrated first- and second-neighbor interactions. A low-temperature domain wall analysis including kinks and dislocations shows that there is a close relation between this classical model and the Hamiltonian of an Ising chain in a transverse field provided that the ratio of the next-nearest to nearest-neighbor coupling is close to 1/2. Due to the field inversion symmetry of the Ising chain Hamiltonian, the thermal properties of the classical system are symmetrical with respect to this coupling ratio. In the neighborhood of this regime critical exponents of the model turn out to belong to the Ising universality class. Our results are compared with previous Monte Carlo simulations. (author). 23 refs, 6 figs
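
    Why the coupling ratio 1/2 is special can be checked with a short energy calculation. The sketch below assumes one common sign convention that the abstract does not state: a ferromagnetic nearest-neighbor coupling `j1` and a competing next-nearest-neighbor coupling `j2`, with H = -j1 Σ s_i s_j (nearest) + j2 Σ s_i s_j (next-nearest) on a periodic square lattice. Under that assumption the uniform and striped states have equal energy exactly at j2/j1 = 1/2.

```python
def energy_per_site(spins, j1, j2):
    """Energy per site on a periodic n x n lattice; each bond is counted once."""
    n = len(spins)
    total = 0.0
    for r in range(n):
        for c in range(n):
            s = spins[r][c]
            right = spins[r][(c + 1) % n]
            down = spins[(r + 1) % n][c]
            diag_right = spins[(r + 1) % n][(c + 1) % n]
            diag_left = spins[(r + 1) % n][(c - 1) % n]
            total += -j1 * s * (right + down) + j2 * s * (diag_right + diag_left)
    return total / (n * n)


def ferromagnet(n):
    """All spins up."""
    return [[1] * n for _ in range(n)]


def stripes(n):
    """Rows alternate between up and down (n should be even)."""
    return [[1 if r % 2 == 0 else -1 for _ in range(n)] for r in range(n)]
```

    For example, with `j1 = 1` the two states have the same energy per site at `j2 = 0.5`, the uniform state is lower at `j2 = 0.3`, and the striped state is lower at `j2 = 0.7`.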

  16. Structure of the first- and second-neighbor shells of simulated water: Quantitative relation to translational and orientational order

    Science.gov (United States)

    Yan, Zhenyu; Buldyrev, Sergey V.; Kumar, Pradeep; Giovambattista, Nicolas; Debenedetti, Pablo G.; Stanley, H. Eugene

    2007-11-01

    We perform molecular dynamics simulations of water using the five-site transferable interaction potential (TIP5P) model to quantify structural order in both the first shell (defined by four nearest neighbors) and second shell (defined by twelve next-nearest neighbors) of a central water molecule. We find that the anomalous decrease of orientational order upon compression occurs in both shells, but the anomalous decrease of translational order upon compression occurs mainly in the second shell. The decreases of translational order and orientational order upon compression (called the “structural anomaly”) are thus correlated only in the second shell. Our findings quantitatively confirm the qualitative idea that the thermodynamic, structural, and hence dynamic anomalies of water are related to changes upon compression in the second shell.

  17. Identifying influential neighbors in animal flocking.

    Directory of Open Access Journals (Sweden)

    Li Jiang

    2017-11-01

    Full Text Available Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.

  18. Identifying influential neighbors in animal flocking.

    Science.gov (United States)

    Jiang, Li; Giuggioli, Luca; Perna, Andrea; Escobedo, Ramón; Lecheval, Valentin; Sire, Clément; Han, Zhangang; Theraulaz, Guy

    2017-11-01

    Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.

  19. A localized navigation algorithm for Radiation Evasion for nuclear facilities. Part II: Optimizing the “Nearest Exit” Criterion

    Energy Technology Data Exchange (ETDEWEB)

    Khasawneh, Mohammed A., E-mail: mkha@ieee.org [Department of Electrical Engineering, Jordan University of Science and Technology (Jordan); Al-Shboul, Zeina Aman M., E-mail: xeinaaman@gmail.com [Department of Electrical Engineering, Jordan University of Science and Technology (Jordan); Jaradat, Mohammad A., E-mail: majaradat@just.edu.jo [Department of Mechanical Engineering, Jordan University of Science and Technology (Jordan); Malkawi, Mohammad I., E-mail: mmalkawi@aimws.com [College of Engineering, Jadara University, Irbid 221 10 (Jordan)

    2013-06-15

    Highlights: ► A new navigation algorithm for Radiation Evasion around nuclear facilities. ► An optimization criterion minimized under algorithm operation. ► A man-borne device guiding the occupational worker towards paths that warrant least radiation × time products. ► Benefits of using localized navigation as opposed to global navigation schemas. ► A path discrimination function for finding the navigational paths exhibiting the least amounts of radiation. -- Abstract: In this extension from part I (Khasawneh et al., in press), we modify the navigation algorithm which was presented with the objective of optimizing the “Radiation Evasion” Criterion so that navigation would optimize the criterion of “Nearest Exit”. Under this modification, the algorithm would yield navigation paths that would guide occupational workers towards Nearest Exit points. Again, under this optimization criterion, the algorithm leverages the use of localized information acquired through a well-designed and distributed wireless sensor network, as it averts the need for any long-haul communication links or centralized decision and monitoring facility, thereby achieving a more reliable performance under dynamic environments. As was done in part I, the proposed algorithm under the “Nearest Exit” Criterion is designed to leverage nearest neighbor information coming in through the sensory network overhead, in computing successful navigational paths from one point to another. For comparison purposes, the proposed algorithm is tested under the two optimization criteria: “Radiation Evasion” and “Nearest Exit”, for different numbers of step look-ahead. We verify the performance of the algorithm by means of simulations, whereby navigational paths are calculated for different radiation fields. We also verify, via simulations, the performance of the algorithm in comparison with a well-known global navigation algorithm, upon which we draw our conclusions.

  20. A localized navigation algorithm for Radiation Evasion for nuclear facilities. Part II: Optimizing the “Nearest Exit” Criterion

    International Nuclear Information System (INIS)

    Khasawneh, Mohammed A.; Al-Shboul, Zeina Aman M.; Jaradat, Mohammad A.; Malkawi, Mohammad I.

    2013-01-01

    Highlights: ► A new navigation algorithm for Radiation Evasion around nuclear facilities. ► An optimization criterion minimized under algorithm operation. ► A man-borne device guiding the occupational worker towards paths that warrant least radiation × time products. ► Benefits of using localized navigation as opposed to global navigation schemas. ► A path discrimination function for finding the navigational paths exhibiting the least amounts of radiation. -- Abstract: In this extension from part I (Khasawneh et al., in press), we modify the navigation algorithm which was presented with the objective of optimizing the “Radiation Evasion” Criterion so that navigation would optimize the criterion of “Nearest Exit”. Under this modification, the algorithm would yield navigation paths that would guide occupational workers towards Nearest Exit points. Again, under this optimization criterion, the algorithm leverages the use of localized information acquired through a well-designed and distributed wireless sensor network, as it averts the need for any long-haul communication links or centralized decision and monitoring facility, thereby achieving a more reliable performance under dynamic environments. As was done in part I, the proposed algorithm under the “Nearest Exit” Criterion is designed to leverage nearest neighbor information coming in through the sensory network overhead, in computing successful navigational paths from one point to another. For comparison purposes, the proposed algorithm is tested under the two optimization criteria: “Radiation Evasion” and “Nearest Exit”, for different numbers of step look-ahead. We verify the performance of the algorithm by means of simulations, whereby navigational paths are calculated for different radiation fields. We also verify, via simulations, the performance of the algorithm in comparison with a well-known global navigation algorithm, upon which we draw our conclusions.

  1. Effective Feature Selection for Classification of Promoter Sequences.

    Directory of Open Access Journals (Sweden)

    Kouser K

    Full Text Available Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short-regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way of functional analysis of promoters and help discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM) features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and the ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (a feature selection method) but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expressions in complex eukaryotic species.

  2. Comparisons of likelihood and machine learning methods of individual classification

    Science.gov (United States)

    Guinand, B.; Topchy, A.; Page, K.S.; Burnham-Curtis, M. K.; Punch, W.F.; Scribner, K.T.

    2002-01-01

    Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin (“assignment tests”). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high FST), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0–2.8% lower error rates). The relative performance of each machine learning classifier improved relative to likelihood estimators for empirical data sets, suggesting an ability to “learn” and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks. In recent years, characterization of highly polymorphic molecular markers such as mini- and microsatellites and development of novel methods of analysis have enabled researchers to extend investigations of ecological and evolutionary processes below the population level to the level of

  3. Feature Extraction and Classification on Esophageal X-Ray Images of Xinjiang Kazak Nationality

    Directory of Open Access Journals (Sweden)

    Fang Yang

    2017-01-01

    Full Text Available Esophageal cancer is one of the fastest rising types of cancers in China. The Kazak nationality is the highest-risk group in Xinjiang. In this work, an effective computer-aided diagnostic system is developed to assist physicians in interpreting digital X-ray image features and improving the quality of diagnosis. The modules of the proposed system include image preprocessing, feature extraction, feature selection, image classification, and performance evaluation. 300 original esophageal X-ray images were resized to a region of interest and then enhanced by the median filter and histogram equalization method. 37 features from textural, frequency, and complexity domains were extracted. Both sequential forward selection and principal component analysis methods were employed to select the discriminative features for classification. Then, support vector machine and K-nearest neighbors were applied to classify the esophageal cancer images with respect to their specific types. The classification performance was evaluated in terms of the area under the receiver operating characteristic curve, accuracy, precision, and recall, respectively. Experimental results show that the classification performance of the proposed system outperforms the conventional visual inspection approaches in terms of diagnostic quality and processing time. Therefore, the proposed computer-aided diagnostic system is promising for the diagnostics of esophageal cancer.
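
    The accuracy, precision, and recall figures named in the evaluation step follow directly from confusion-matrix counts. The sketch below shows the standard binary definitions; the function name and the choice to return 0.0 when a denominator is empty are assumptions for the example, and the study's multi-class setup and area-under-curve computation are not reproduced here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

    Precision answers how many predicted positives were correct, while recall answers how many actual positives were found; reporting both guards against a classifier that favors one class.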

  4. An ant colony optimization based feature selection for web page classification.

    Science.gov (United States)

    Saraç, Esra; Özel, Selma Ayşe

    2014-01-01

    The increased popularity of the web has brought a huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features used, in order to improve the runtime and accuracy of web page classification. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.

  5. Seismic Target Classification Using a Wavelet Packet Manifold in Unattended Ground Sensors Systems

    Directory of Open Access Journals (Sweden)

    Enliang Song

    2013-07-01

    Full Text Available One of the most challenging problems in target classification is the extraction of a robust feature, which can effectively represent a specific type of target. The use of seismic signals in unattended ground sensor (UGS) systems makes this problem more complicated, because the seismic target signal is non-stationary, geology-dependent and has a high-dimensional feature space. This paper proposes a new feature extraction algorithm, called wavelet packet manifold (WPM), by applying the neighborhood preserving embedding (NPE) algorithm of manifold learning to the wavelet packet node energy (WPNE) of seismic signals. By combining non-stationary information and low-dimensional manifold information, WPM provides a more robust representation for seismic target classification. By using a K nearest neighbors classifier on the WPM signature, the algorithm of wavelet packet manifold classification (WPMC) is proposed. Experimental results show that the proposed WPMC can not only reduce feature dimensionality, but also improve the classification accuracy up to 95.03%. Moreover, compared with state-of-the-art methods, WPMC is more suitable for UGS in terms of recognition ratio and computational complexity.

  6. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    Science.gov (United States)

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. Learning-based classifiers achieve state-of-the-art accuracies, but have been criticized for computational complexity that grows linearly with the number of classes. Nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies, with significantly lower computation cost and memory requirement.

  7. Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.

    Directory of Open Access Journals (Sweden)

    Daniel Ting

    2010-04-01

    Full Text Available Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: (1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); (2) filtering of suspect conformations and outliers using B-factors or other features; (3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); (4) the method used for determining probability densities, ranging from simple histograms to modern nonparametric density estimation; and (5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.

  8. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates two variations, clothing (wearing coats) and bag carrying, in addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different sets of features are proposed in this method. The first is the spatio-temporal distance, which captures the distances between different parts of the human body (such as feet, knees, hands, height and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two sets of features, we divided the human body into two parts, upper and lower, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize its discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach covers a more realistic scenario and performs relatively better than existing approaches.
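
    Fisher-score feature selection ranks each feature by the ratio of between-class scatter to within-class scatter and keeps the top-ranked ones. The sketch below uses the common textbook form of the score; the function names are illustrative, and the paper's exact variant and the number of retained features are not stated in the abstract.

```python
def fisher_scores(X, y):
    """X: list of feature rows, y: class labels. Returns one score per feature."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2  # class separation
            within += len(vals) * var_c                          # class spread
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_features(X, y, m):
    """Indices of the m features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:m]
```

    A feature whose class means coincide scores zero regardless of its spread, so it is dropped first when the vector is shortened before the k-NN step.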

  9. Mapping forested wetlands in the Great Zhan River Basin through integrating optical, radar, and topographical data classification techniques.

    Science.gov (United States)

    Na, X D; Zang, S Y; Wu, C S; Li, W L

    2015-11-01

    Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. Despite some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference between their kappa coefficients is statistically significant (p [...] wetlands based on the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p [...] wetlands and omissions for agriculture land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in the forestry area.

  10. Large margin classification with indefinite similarities

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2016-01-07

    Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer condition is not satisfied, or the Mercer condition is difficult to verify. Examples of such indefinite similarities in machine learning applications are ample, including, for instance, the BLAST similarity score between protein sequences, human-judged similarities between concepts and words, and the tangent distance or the shape matching distance in computer vision. Nevertheless, previous works on classification with indefinite similarities are not fully satisfactory. They have either introduced sources of inconsistency in handling past and future examples using kernel approximation, settled for local-minimum solutions using non-convex optimization, or produced non-sparse solutions by learning in Krein spaces. Despite the large volume of research devoted to this subject lately, we demonstrate in this paper how an old idea, namely the 1-norm support vector machine (SVM) proposed more than 15 years ago, has several advantages over more recent work. In particular, the 1-norm SVM method is conceptually simpler, which makes it easier to implement and maintain. It is competitive with, if not superior to, all other methods in terms of predictive accuracy. Moreover, it produces solutions that are often sparser than those of more recent methods by several orders of magnitude. In addition, we provide various theoretical justifications by relating 1-norm SVM to well-established learning algorithms such as neural networks, SVM, and nearest neighbor classifiers. Finally, we conduct a thorough experimental evaluation, which reveals that the evidence in favor of 1-norm SVM is statistically significant.

  11. Recrafting the Neighbor-Joining Method

    DEFF Research Database (Denmark)

    Mailund; Brodal, Gerth Stølting; Fagerberg, Rolf

    2006-01-01

    Background: The neighbor-joining method by Saitou and Nei is a widely used method for constructing phylogenetic trees. The formulation of the method gives rise to a canonical Θ(n^3) algorithm upon which all existing implementations are based. Methods: In this paper we present techniques for speeding up the canonical neighbor-joining method. Our algorithms construct the same phylogenetic trees as the canonical neighbor-joining method. The best-case running time of our algorithms is O(n^2) but the worst-case remains O(n^3). We empirically evaluate the performance of our algorithms on distance matrices obtained from the Pfam collection of alignments. Results: The experiments indicate that the running time of our algorithms evolves as Θ(n^2) on the examined instance collection. We also compare the running time with that of the QuickTree tool, a widely used efficient implementation of the canonical...

  12. The clinic as a good corporate neighbor.

    Science.gov (United States)

    Sass, Hans-Martin

    2013-02-01

    Clinics today specialize in health repair services similar to car repair shops; procedures and prices are standardized, regulated, and inflexibly uniform. Clinics of the future have to become Health Care Centers in order to be more respected and more effective corporate neighbors in offering outreach services in health education and preventive health care. The traditional concept of care for health is much broader than repair management and includes the promotion of lay health competence and responsibility in healthy social and natural environments. The corporate profile and ethics of the clinic as a good and competitive local neighbor will have to focus on [a] better personalized care, [b] education and services in preventive care, [c] direct or web-based information and advice for general, seasonal, or age related health risks, and on developing and improving trustworthy character traits of the clinic as a corporate person and a good neighbor.

  13. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

    KAUST Repository

    Abusamra, Heba

    2013-05-01

    Microarray technology has enriched the study of gene expression in such a way that scientists are now able to measure the expression levels of thousands of genes in a single experiment. Microarray gene expression data have gained great importance in recent years due to their role in disease diagnosis and prognosis, which helps in choosing the appropriate treatment plan for patients. Although this technology has ushered in a new era in molecular classification, interpreting gene expression data remains a difficult problem and an active research area due to its inherent “high dimensional low sample size” nature. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed to help correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improved treatment strategies. This thesis presents a comparative study of state-of-the-art feature selection methods, classification methods, and their combination, based on gene expression data. We compared the efficiency of three different classification methods (support vector machines, k-nearest neighbor, and random forest) and eight different feature selection methods (information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine). Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used for this study. Different experiments were conducted to compare the performance of the classification methods with and without feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in

  14. A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.

    Science.gov (United States)

    Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho

    2014-10-01

    Many studies based on microRNA (miRNA) expression profiles have shown a new aspect of cancer classification. Because miRNA expression data are high-dimensional, feature selection methods have been used to facilitate dimensionality reduction. These methods have one shortcoming thus far: they consider only problems where the feature-to-class relationship is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, human miRNAs tend to be ranked low by traditional feature selection methods and are removed most of the time. In view of the limited number of miRNAs, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all problems (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose the support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifier to build the cancer classification models. Then, we chose the Chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs to determine cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy compared with using only the high-ranking miRNAs from traditional feature selection methods. Our results demonstrate that the m:n feature subset reveals a positive effect of low-ranking miRNAs on cancer classification.

  15. Prediction of cause of death from forensic autopsy reports using text classification techniques: A comparative study.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa

    2018-07-01

    Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free text forensic autopsy reports by comparing various schemes for feature extraction, term weighting or feature value representation, text classification, and feature reduction. For experiments, the autopsy reports belonging to eight different causes of death were collected, preprocessed and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures, i.e. overall accuracy, macro precision, macro-F-measure, and macro recall. From experiments, it was found that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Furthermore, among feature representation schemes, term frequency and term frequency with inverse document frequency obtained similar results, both better than binary frequency and normalized term frequency with inverse document frequency. The chi-square feature reduction approach outperformed the Pearson correlation and information gain approaches. Finally, among text classification algorithms, the support vector machine classifier outperformed random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifiers. Our results and comparisons hold practical importance and serve as references for future works. Moreover, the comparison outputs will act as state-of-art techniques to compare future proposals with existing automated text classification techniques. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
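
    The chi-square feature reduction mentioned above can be illustrated with a short sketch. It uses the standard 2x2 contingency form of the statistic for a term and a class; the function names, the neutral toy tokens, and the choice to score each term by its maximum over classes are assumptions made for illustration and are not taken from the study.

```python
def chi_square(term, label, docs):
    """Chi-square statistic of a term/class pair from a 2x2 contingency table.

    docs is a list of (set_of_tokens, class_label) pairs.
    """
    a = b = c = d = 0
    for tokens, doc_label in docs:
        has_term = term in tokens
        in_class = doc_label == label
        if has_term and in_class:
            a += 1      # term present, document in class
        elif has_term:
            b += 1      # term present, document in another class
        elif in_class:
            c += 1      # term absent, document in class
        else:
            d += 1      # term absent, document in another class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def rank_terms(docs, n_keep):
    """Rank unigram features by their best chi-square score over all classes."""
    labels = sorted({label for _, label in docs})
    vocabulary = sorted(set().union(*(tokens for tokens, _ in docs)))
    best = {t: max(chi_square(t, lab, docs) for lab in labels) for t in vocabulary}
    return sorted(vocabulary, key=lambda t: (-best[t], t))[:n_keep]


# Hypothetical toy corpus: "alpha" and "beta" track the class, "common" does not.
toy_docs = [
    ({"alpha", "common"}, "A"),
    ({"alpha"}, "A"),
    ({"beta", "common"}, "B"),
    ({"beta"}, "B"),
]
selected = rank_terms(toy_docs, n_keep=2)
```

    Terms that occur equally often inside and outside a class score zero and are dropped first, which is why this kind of filter shrinks a unigram vocabulary without consulting the classifier.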

  16. New Sliding Puzzle with Neighbors Swap Motion

    OpenAIRE

    Prihardono, Ariyanto; Kawagoe, Kenichi

    2015-01-01

    The sliding puzzles (15-puzzle, 8-puzzle, 5-puzzle) are known to come in two kinds: solvable and unsolvable. In this thesis, we make a new puzzle with only one kind, the solvable puzzle. This new puzzle is made by adopting the sliding puzzle with several additional rules from the M13 puzzle, the puzzle that is formed from the Mathieu group M13. This puzzle has a movement called a neighbors swap motion, a rule of movement that enables any neighboring points to swap. This extr...

  17. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization.

    Science.gov (United States)

    Vafaee Sharbaf, Fatemeh; Mosafer, Sara; Moattar, Mohammad Hossein

    2016-06-01

    This paper proposes an approach for gene selection in microarray data. The proposed approach consists of a primary filter approach using the Fisher criterion, which reduces the initial genes and hence the search space and time complexity. Then, a wrapper approach based on cellular learning automata (CLA) optimized with the ant colony method (ACO) is used to find the set of features which improves the classification accuracy. CLA is applied due to its capability to learn and model complicated relationships. The selected features from the last phase are evaluated using the ROC curve, and the most effective yet smallest feature subset is determined. The classifiers evaluated in the proposed framework are K-nearest neighbor, support vector machine, and naïve Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Recrafting the neighbor-joining method

    Directory of Open Access Journals (Sweden)

    Pedersen Christian NS

    2006-01-01

    Background: The neighbor-joining method by Saitou and Nei is a widely used method for constructing phylogenetic trees. The formulation of the method gives rise to a canonical Θ(n^3) algorithm upon which all existing implementations are based. Results: In this paper we present techniques for speeding up the canonical neighbor-joining method. Our algorithms construct the same phylogenetic trees as the canonical neighbor-joining method. The best-case running time of our algorithms is O(n^2) but the worst-case remains O(n^3). We empirically evaluate the performance of our algorithms on distance matrices obtained from the Pfam collection of alignments. The experiments indicate that the running time of our algorithms evolves as Θ(n^2) on the examined instance collection. We also compare the running time with that of the QuickTree tool, a widely used efficient implementation of the canonical neighbor-joining method. Conclusion: The experiments show that our algorithms also yield a significant speed-up, already for medium sized instances.
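
    For orientation, the canonical cubic-time procedure that the abstract refers to can be sketched as below. This is the textbook Saitou-Nei formulation, not the accelerated algorithms of the paper: each of the n - 2 rounds scans all pairs for the minimum of Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r is the row sum of the distance matrix. The function name, the node-naming scheme, and the five-taxon example matrix are illustrative assumptions, and branch lengths are omitted for brevity.

```python
def neighbor_joining(labels, matrix):
    """Canonical neighbor joining; returns the pairs joined, in order."""
    # Distances stored as a dict of dicts so joined nodes can be added by name.
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(labels)}
         for i, a in enumerate(labels)}
    nodes = list(labels)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        row_sum = {a: sum(d[a][b] for b in nodes) for a in nodes}
        best = None
        # Scanning all pairs in every round is what makes this Theta(n^3).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - row_sum[a] - row_sum[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = f"({a},{b})"
        d[new] = {new: 0.0}
        for k in nodes:
            if k not in (a, b):
                # Distance from the new internal node to every remaining node.
                d[new][k] = d[k][new] = (d[a][k] + d[b][k] - d[a][b]) / 2
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        joins.append((a, b))
    joins.append((nodes[0], nodes[1]))
    return joins


# Illustrative five-taxon distance matrix (symmetric, zero diagonal).
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
join_order = neighbor_joining(taxa, distances)
```

    Any speed-up that preserves the output tree has to avoid the full pair scan in most rounds while still finding the same minimum of Q, which is consistent with the best-case O(n^2) and worst-case O(n^3) bounds stated in the abstract.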

  19. Embedded vision equipment of industrial robot for inline detection of product errors by clustering–classification algorithms

    Directory of Open Access Journals (Sweden)

    Kamil Zidek

    2016-10-01

    The article deals with the design of embedded vision equipment of industrial robots for inline diagnosis of product errors during the manipulation process. The vision equipment can be attached to the end effector of robots or manipulators, and it provides an image snapshot of the part surface before grasp, searches for errors during manipulation, and separates products with errors from the next manufacturing operation. The new approach is a methodology based on machine teaching for the automated identification, localization, and diagnosis of systematic errors in products of high-volume production. To achieve this, we used two main data mining algorithms: clustering for accumulation of similar errors and classification methods for assigning any new error to a proposed class. The presented methodology consists of three separate processing levels: image acquisition for fail parameterization, data clustering for categorizing errors into separate classes, and new pattern prediction with a proposed class model. We chose the main representatives of clustering algorithms, for example, K-means from vector quantization, fast library for approximate nearest neighbor from hierarchical clustering, and density-based spatial clustering of applications with noise from density-based algorithms. For machine learning, we selected six major classification algorithms: support vector machines, normal Bayesian classifier, K-nearest neighbor, gradient boosted trees, random trees, and neural networks. The selected algorithms were compared for speed and reliability and tested on two platforms: a desktop-based computer system and an embedded system based on System on Chip (SoC) with vision equipment.

  20. Rapid differentiation of Ghana cocoa beans by FT-NIR spectroscopy coupled with multivariate classification

    Science.gov (United States)

    Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng

    2013-10-01

    A quick, accurate and reliable technique for discrimination of cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of Near Infrared Spectroscopy and multivariate classification for the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods was compared: Linear discriminant analysis (LDA), K-nearest neighbors (KNN), Back propagation artificial neural network (BPANN) and Support vector machine (SVM). The performances of the models were optimized by cross validation. The results revealed that the SVM model was superior to all the other mathematical methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with Mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set, while the LDA model had 96.15% and 90.63% for the training and prediction sets, respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR Spectroscopy coupled with an SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.

  1. Chromatographic profiles of Phyllanthus aqueous extracts samples: a proposition of classification using chemometric models.

    Science.gov (United States)

    Martins, Lucia Regina Rocha; Pereira-Filho, Edenir Rodrigues; Cass, Quezia Bezerra

    2011-04-01

    Taking into consideration the global analysis of complex samples proposed by the metabolomic approach, the chromatographic fingerprint offers an attractive chemical characterization of herbal medicines. Thus, it can be used as a tool in quality control analysis of phytomedicines. The generated multivariate data are better evaluated by chemometric analyses, and they can be modeled by classification methods. "Stone breaker" is a popular Brazilian plant of the Phyllanthus genus, used worldwide to treat renal calculus, hepatitis, and many other diseases. In this study, gradient elution under reversed-phase conditions with detection in the ultraviolet region was used to obtain chemical profiles (fingerprints) of botanically identified samples of six Phyllanthus species. The obtained chromatograms, at 275 nm, were organized in data matrices, and the time shifts of peaks were adjusted using the Correlation Optimized Warping algorithm. Principal Component Analyses were performed to evaluate similarities among cultivated and uncultivated samples and the discrimination among the species and, after that, the samples were used to compose three classification models using Soft Independent Modeling of Class Analogy, K-Nearest Neighbor, and Partial Least Squares for Discriminant Analysis. The ability of the classification models was discussed after their successful application for authenticity evaluation of 25 commercial samples of "stone breaker."

  2. Comparisons and Selections of Features and Classifiers for Short Text Classification

    Science.gov (United States)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
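
    One combination from the comparison above, tf-idf weighting followed by a K-nearest Neighbor classifier, can be sketched as follows. This is a minimal illustration under stated assumptions: the idf variant log(N / df) + 1, the cosine similarity measure, the function names, and the toy tokens and labels are all hypothetical and are not drawn from the paper's stock-exchange data set.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse tf-idf vectors (dicts) for tokenized documents, plus the idf table."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # One common idf variant; other smoothing choices are equally valid.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [{t: count * idf[t] for t, count in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_classify(query_tokens, vectors, labels, idf, k=3):
    """Label a query by majority vote among its k most similar documents."""
    # Terms unseen during training have no idf weight and are ignored.
    query = {t: c * idf[t] for t, c in Counter(query_tokens).items() if t in idf}
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]), reverse=True)[:k]
    return Counter(labels[i] for i in ranked).most_common(1)[0][0]


# Hypothetical tokenized short texts with made-up labels.
train_docs = [
    ["dividend", "payout", "announcement"],
    ["dividend", "cash", "payout"],
    ["board", "meeting", "notice"],
    ["shareholder", "meeting", "notice"],
]
train_labels = ["dividend", "dividend", "meeting", "meeting"]

train_vectors, idf_table = tfidf_vectors(train_docs)
predicted = knn_classify(["cash", "dividend"], train_vectors, train_labels, idf_table, k=3)
```

    Because short texts share few terms, most pairwise similarities are exactly zero, so the choice of k and of the weighting scheme tends to have a larger effect than it would on long documents.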

  3. Laser-induced breakdown spectroscopy and chemometrics for classification of toys relying on toxic elements

    International Nuclear Information System (INIS)

    Godoi, Quienly; Leme, Flavio O.; Trevizan, Lilian C.; Pereira Filho, Edenir R.; Rufini, Iolanda A.; Santos, Dario; Krug, Francisco J.

    2011-01-01

    Quality control of toys to avoid children's exposure to potentially toxic elements is of utmost relevance, and it is a common requirement in national and/or international norms for health and safety reasons. Laser-induced breakdown spectroscopy (LIBS) was recently evaluated at the authors' laboratory for direct analysis of plastic toys, and one of the main difficulties for the determination of Cd, Cr and Pb was the variety of mixtures and types of polymers. As most norms rely on migration (lixiviation) protocols, chemometric classification models from LIBS spectra were tested for sampling toys that present potential risk of Cd, Cr and Pb contamination. The classification models were generated from the emission spectra of 51 polymeric toys by using Partial Least Squares - Discriminant Analysis (PLS-DA), Soft Independent Modeling of Class Analogy (SIMCA) and K-Nearest Neighbor (KNN). The classification models and validations were carried out with 40 and 11 test samples, respectively. The best results were obtained when KNN was used, with correct predictions varying from 95% for Cd to 100% for Cr and Pb.

  4. Application of Musical Information Retrieval (MIR) Techniques to Seismic Facies Classification. Examples in Hydrocarbon Exploration

    Directory of Open Access Journals (Sweden)

    Paolo Dell’Aversana

    2016-12-01

    In this paper, we introduce a novel approach for automatic pattern recognition and classification of geophysical data based on digital music technology. We import and apply in the geophysical domain the same approaches commonly used for Musical Information Retrieval (MIR). After accurate conversion from geophysical formats (example: SEG-Y) to musical formats (example: Musical Instrument Digital Interface, or briefly MIDI), we extract musical features from the converted data. These can be single-valued attributes, such as pitch and sound intensity, or multi-valued attributes, such as pitch histograms, melodic, harmonic and rhythmic paths. Using a real data set, we show that these musical features can be diagnostic for seismic facies classification in a complex exploration area. They can be complementary with respect to “conventional” seismic attributes. Using a supervised machine learning approach based on the k-Nearest Neighbors algorithm and on Automatic Neural Networks, we classify three gas-bearing channels. The good performance of our classification approach is confirmed by borehole data available in the same area.

  5. Classification of schizophrenia patients based on resting-state functional network connectivity

    Directory of Open Access Journals (Sweden)

    Mohammad Reza Arbabshirani

    2013-07-01

    There is a growing interest in automatic classification of mental disorders based on neuroimaging data. Small training data sets (subjects) and a very large amount of high dimensional data make it a challenging task to design robust and accurate classifiers for heterogeneous disorders such as schizophrenia. Most previous studies considered structural MRI, diffusion tensor imaging and task-based fMRI for this purpose. However, resting-state data have rarely been used in discrimination of schizophrenia patients from healthy controls. Resting data are of great interest, since they are relatively easy to collect, and not confounded by behavioral performance on a task. Several linear and non-linear classification methods were trained using a training dataset and evaluated with a separate testing dataset. Results show that classification with high accuracy is achievable using simple non-linear discriminative methods such as k-nearest neighbors, which is very promising. We compare and report detailed results of each classifier as well as statistical analysis and evaluation of each single feature. To our knowledge, our efforts represent the first use of resting-state functional network connectivity features to classify schizophrenia.

  6. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Safieva, Ravilya Z.; Lomakina, Ekaterina I.

    2010-01-01

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP) - for gasoline classification. Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8000 cm^-1 NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. Artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.

  7. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    Energy Technology Data Exchange (ETDEWEB)

    Balabin, Roman M., E-mail: balabin@org.chem.ethz.ch [Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich (Switzerland); Safieva, Ravilya Z. [Gubkin Russian State University of Oil and Gas, 119991 Moscow (Russian Federation); Lomakina, Ekaterina I. [Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119992 Moscow (Russian Federation)

    2010-06-25

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP) - for gasoline classification. Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8000 cm^-1 NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. Artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.

  8. Texture-based classification of different gastric tumors at contrast-enhanced CT

    Energy Technology Data Exchange (ETDEWEB)

    Ba-Ssalamah, Ahmed, E-mail: ahmed.ba-ssalamah@meduniwien.ac.at [Department of Radiology, Medical University of Vienna (Austria); Muin, Dina; Schernthaner, Ruediger; Kulinna-Cosentini, Christiana; Bastati, Nina [Department of Radiology, Medical University of Vienna (Austria); Stift, Judith [Department of Pathology, Medical University of Vienna (Austria); Gore, Richard [Department of Radiology, University of Chicago Pritzker School of Medicine, Chicago, IL (United States); Mayerhoefer, Marius E. [Department of Radiology, Medical University of Vienna (Austria)

    2013-10-01

    Purpose: To determine the feasibility of texture analysis for the classification of gastric adenocarcinoma, lymphoma, and gastrointestinal stromal tumors on contrast-enhanced hydrodynamic-MDCT images. Materials and methods: The arterial phase scans of 47 patients with adenocarcinoma (AC) and a histologic tumor grade of [AC-G1, n = 4; AC-G2, n = 7; AC-G3, n = 16]; GIST, n = 15; and lymphoma, n = 5, and the venous phase scans of 48 patients with AC-G1, n = 3; AC-G2, n = 6; AC-G3, n = 14; GIST, n = 17; lymphoma, n = 8, were retrospectively reviewed. Based on regions of interest, texture analysis was performed, and features derived from the gray-level histogram, run-length and co-occurrence matrix, absolute gradient, autoregressive model, and wavelet transform were calculated. Fisher coefficients, probability of classification error, average correlation coefficients, and mutual information coefficients were used to create combinations of texture features that were optimized for tumor differentiation. Linear discriminant analysis in combination with a k-nearest neighbor classifier was used for tumor classification. Results: On arterial-phase scans, texture-based lesion classification was highly successful in differentiating between AC and lymphoma, and GIST and lymphoma, with misclassification rates of 3.1% and 0%, respectively. On venous-phase scans, texture-based classification was slightly less successful for AC vs. lymphoma (9.7% misclassification) and GIST vs. lymphoma (8% misclassification), but enabled the differentiation between AC and GIST (10% misclassification), and between the different grades of AC (4.4% misclassification). No texture feature combination was able to adequately distinguish between all three tumor types. Conclusion: Classification of different gastric tumors based on textural information may aid radiologists in establishing the correct diagnosis, at least in cases where the differential diagnosis can be narrowed down to two

  9. Texture-based classification of different gastric tumors at contrast-enhanced CT

    International Nuclear Information System (INIS)

    Ba-Ssalamah, Ahmed; Muin, Dina; Schernthaner, Ruediger; Kulinna-Cosentini, Christiana; Bastati, Nina; Stift, Judith; Gore, Richard; Mayerhoefer, Marius E.

    2013-01-01

    Purpose: To determine the feasibility of texture analysis for the classification of gastric adenocarcinoma, lymphoma, and gastrointestinal stromal tumors on contrast-enhanced hydrodynamic-MDCT images. Materials and methods: The arterial phase scans of 47 patients with adenocarcinoma (AC) and a histologic tumor grade of [AC-G1, n = 4; AC-G2, n = 7; AC-G3, n = 16]; GIST, n = 15; and lymphoma, n = 5, and the venous phase scans of 48 patients with AC-G1, n = 3; AC-G2, n = 6; AC-G3, n = 14; GIST, n = 17; lymphoma, n = 8, were retrospectively reviewed. Based on regions of interest, texture analysis was performed, and features derived from the gray-level histogram, run-length and co-occurrence matrix, absolute gradient, autoregressive model, and wavelet transform were calculated. Fisher coefficients, probability of classification error, average correlation coefficients, and mutual information coefficients were used to create combinations of texture features that were optimized for tumor differentiation. Linear discriminant analysis in combination with a k-nearest neighbor classifier was used for tumor classification. Results: On arterial-phase scans, texture-based lesion classification was highly successful in differentiating between AC and lymphoma, and GIST and lymphoma, with misclassification rates of 3.1% and 0%, respectively. On venous-phase scans, texture-based classification was slightly less successful for AC vs. lymphoma (9.7% misclassification) and GIST vs. lymphoma (8% misclassification), but enabled the differentiation between AC and GIST (10% misclassification), and between the different grades of AC (4.4% misclassification). No texture feature combination was able to adequately distinguish between all three tumor types. Conclusion: Classification of different gastric tumors based on textural information may aid radiologists in establishing the correct diagnosis, at least in cases where the differential diagnosis can be narrowed down to two

  10. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

    KAUST Repository

    Abusamra, Heba

    2013-11-01

    Microarray gene expression data have gained great importance in recent years due to their role in disease diagnosis and prognosis, which helps in choosing the appropriate treatment plan for patients. This technology has ushered in a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area because of their inherent “high dimensional low sample size” nature. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed to help classify different tumor types correctly and consequently lead to a better understanding of genetic signatures as well as improved treatment strategies. This paper presents a comparative study of state-of-the-art feature selection methods, classification methods, and combinations of them, based on gene expression data. We compared the efficiency of three classification methods (support vector machines, k-nearest neighbor and random forest) and eight feature selection methods (information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine). Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted while using a small number of genes. The relationship between the features selected by different feature selection methods is investigated, and the most frequent features selected in each fold among all methods for both datasets are evaluated.
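
    The five-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It pairs the fold loop with a plain k-nearest neighbor vote; the function names, the toy data and the choice of k = 3 are assumptions made for illustration and do not come from the study.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to query (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, n_folds=5, k=3, seed=0):
    """Shuffle once, split into n_folds parts, and average the held-out accuracy."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        hits = sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / n_folds


# Toy data (invented): two well-separated groups of two-feature samples.
xs = [(0.1 * i, 0.0) for i in range(10)] + [(5.0 + 0.1 * i, 5.0) for i in range(10)]
ys = ["low"] * 10 + ["high"] * 10
print(cross_val_accuracy(xs, ys))
```

    In a pipeline that also selects features, the selection step would normally be repeated inside each training fold so that the held-out samples do not influence which genes are kept.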

  12. Texture Classification in Lung CT Using Local Binary Patterns

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Shaker, Saher B.; de Bruijne, Marleen

    2008-01-01

    the k nearest neighbor classifier with histogram similarity as distance measure. The proposed method is evaluated on a set of 168 regions of interest comprising normal tissue and different emphysema patterns, and compared to a filter bank based on Gaussian derivatives. The joint LBP and intensity...

  13. Neighbor Rupture Degree of Some Middle Graphs

    Directory of Open Access Journals (Sweden)

    Gökşen BACAK-TURAN

    2017-12-01

    Networks have an important place in our daily lives. Internet networks, electricity networks, water networks, transportation networks, social networks and biological networks are some of the networks we run into in every aspect of our lives. A network consists of centers connected by links, and it is represented as a graph when its centers and connections are modelled by vertices and edges, respectively. The vulnerability of a network is a measure of its resistance to the interruption of communication after the failure of some centers or connection lines. In this study, the neighbor rupture degree, a parameter that measures the vulnerability of the graph that results when some centers of a communication network fail and their neighboring centers become nonfunctional, was applied to some middle graphs, and the neighbor rupture degrees of $M(C_{n})$, $M(P_{n})$, $M(K_{1,n})$, $M(W_{n})$, $M(P_{n}\times K_{2})$ and $M(C_{n}\times K_{2})$ were found.

  14. Classification of interstitial lung disease patterns with topological texture features

    Science.gov (United States)

    Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel

    2010-03-01

    Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative of the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images was acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue was identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
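
    The Bonferroni correction named in this abstract divides the nominal significance level by the number of comparisons. A minimal sketch follows; the function names and the p-values are invented for illustration, since the abstract does not report the number of comparisons or the individual p-values.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons


def significant(p_values, alpha=0.05):
    """Flag each p-value that falls below the adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from three pairwise comparisons of accuracy distributions.
print(significant([0.001, 0.02, 0.04]))
```

    With three comparisons the threshold drops from 0.05 to about 0.0167, so only the first of the three invented p-values remains significant.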

  15. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

    Directory of Open Access Journals (Sweden)

    Ali Dashti

    This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multiple levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
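
    A brute-force exact k-NN graph compares every point with every other point and keeps the k closest. The sketch below shows that idea serially on the CPU; it is not the GPU implementation described in the abstract, and the function name and sample points are illustrative assumptions.

```python
import math


def knn_graph(points, k):
    """Return {i: [indices of the k nearest other points]} by exhaustive search."""
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point: O(n) work per point.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
print(knn_graph(points, 2))
```

    The outer loop iterations are independent of each other, which is the property a parallel implementation can exploit by assigning blocks of query points to different workers.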

  16. Improving Fraudster Detection in Online Auctions by Using Neighbor-Driven Attributes

    Directory of Open Access Journals (Sweden)

    Jun-Lin Lin

    2015-12-01

    Online auction websites use a simple reputation system to help their users evaluate the trustworthiness of sellers and buyers. However, fraudulent users can easily deceive the reputation system and inflate their reputation by creating fake transactions. This inflated-reputation fraud poses a major problem for online auction websites because it can lead legitimate users into scams. Numerous approaches have been proposed in the literature to address this problem, most of which involve using social network analysis (SNA) to derive critical features (e.g., k-core, center weight, and neighbor diversity) for distinguishing fraudsters from legitimate users. This paper discusses the limitations of these SNA features and proposes a class of SNA features referred to as neighbor-driven attributes (NDAs). The NDAs of users are calculated from the features of their neighbors. Because fraudsters require collusive neighbors to provide them with positive ratings in the reputation system, using NDAs can be helpful for detecting fraudsters. Although the idea of NDAs is not entirely new, experimental results on a real-world dataset showed that using NDAs improves classification accuracy compared with state-of-the-art methods that use the k-core, center weight, and neighbor diversity.
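
    The abstract says only that NDAs are calculated from the features of a user's neighbors. One simple way to do that, shown here as a sketch, is to average a per-user feature over the neighbors in the transaction graph. The use of the mean, the function name and the toy graph are assumptions, not the paper's definition.

```python
def neighbor_driven_attribute(graph, feature):
    """Average a per-user feature over each user's neighbors (0.0 if isolated)."""
    nda = {}
    for user, neighbors in graph.items():
        values = [feature[n] for n in neighbors]
        nda[user] = sum(values) / len(values) if values else 0.0
    return nda


# Hypothetical transaction graph and a per-user feature (e.g. a k-core value).
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
k_core = {"a": 4, "b": 1, "c": 3, "d": 0}
print(neighbor_driven_attribute(graph, k_core))
```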

  17. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchise and chain stores in urban areas operate within high-rise multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify the surrounding information, such as the level at which a franchise unit will be located or whether a franchise unit is located at the best level for visibility. One of the analyses commonly used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information become more complex and their efficiency more crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchise units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  18. Comparative study of PCA in classification of multichannel EMG signals.

    Science.gov (United States)

    Geethanjali, P

    2015-06-01

    Electromyographic (EMG) signals are widely used in rehabilitation engineering to control prosthetic devices, and a fast and accurate EMG pattern recognition system is essential to avoid intrusive delay. The main objective of this paper is to study the influence of principal component analysis (PCA), a transformation technique, on pattern recognition of six hand movements using four channel surface EMG signals from ten healthy subjects. For this reason, time domain (TD) statistical features as well as auto regression (AR) coefficients are extracted from the four channel EMG signals. The extracted statistical features as well as the AR coefficients are transformed using PCA to 25, 50 and 75 % of the corresponding original feature vector space. The classification accuracy of PCA transformed and non-PCA transformed TD statistical features as well as AR coefficients is studied with simple logistic regression (SLR), decision tree (DT) with J48 algorithm, logistic model tree (LMT), k nearest neighbor (kNN) and neural network (NN) classifiers in the identification of six different movements. The Kruskal-Wallis (KW) statistical test shows that there is a significant reduction (P [...]) with PCA transformed features compared to non-PCA transformed features. SLR with non-PCA transformed time domain (TD) statistical features performs better in accuracy and computational power compared to the other features considered in this study. In addition, the motion control of three drives for six movements of the hand is implemented off-line with SLR using TD statistical features on a TMSLF2407 digital signal controller (DSC).

  19. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and with non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
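
    The per-label precision, recall and F1 score reported in this abstract follow the standard one-versus-rest definitions. A minimal sketch is given below; the label names and predictions are invented and are not taken from the OSHA dataset.

```python
def per_label_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one label, treating all other labels as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical true and predicted labels for six narratives.
y_true = ["fall", "fall", "fall", "struck", "struck", "other"]
y_pred = ["fall", "fall", "struck", "struck", "fall", "other"]
print(per_label_scores(y_true, y_pred, "fall"))
```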

  20. Chaotic particle swarm optimization with mutation for classification.

    Science.gov (United States)

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence to a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. To demonstrate the effectiveness of the proposed classifier, mutation-based classifier particle swarm optimization, it is tested on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.
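
    The four evaluation measures named in this abstract can all be computed from a binary confusion matrix. The sketch below uses the standard definitions; the counts are invented for illustration and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc


# Hypothetical confusion-matrix counts for a two-class problem.
print(binary_metrics(tp=40, fp=5, tn=50, fn=5))
```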

  1. Chaotic Particle Swarm Optimization with Mutation for Classification

    Science.gov (United States)

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence to a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. To demonstrate the effectiveness of the proposed classifier, mutation-based classifier particle swarm optimization, it is tested on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms. PMID:25709937

  2. ACTION RECOGNITION USING SALIENT NEIGHBORING HISTOGRAMS

    DEFF Research Database (Denmark)

    Ren, Huamin; Moeslund, Thomas B.

    2013-01-01

    Combining spatio-temporal interest points with Bag-of-Words models achieves state-of-the-art performance in action recognition. However, existing methods based on “bag-of-words” models either are too local to capture the variance in space/time or fail to solve the ambiguity problem in spatial...... and temporal dimensions. Instead, we propose a salient vocabulary construction algorithm to select visual words from a global point of view, and form compact descriptors to represent discriminative histograms in the neighborhoods. Those salient neighboring histograms are then trained to model different actions...

  3. A dumbed-down approach to unite Fermilab, its neighbors

    CERN Multimedia

    Constable, B

    2004-01-01

    "...Fermilab is reaching out to its suburban neighbors...With the nation on orange alert, Fermilab scientists no longer can sit on the front porch and invite neighbors in for coffee and quasars" (1 page).

  4. Parallel exploitation of a spatial-spectral classification approach for hyperspectral images on RVC-CAL

    Science.gov (United States)

    Lazcano, R.; Madroñal, D.; Fabelo, H.; Ortega, S.; Salvador, R.; Callicó, G. M.; Juárez, E.; Sanz, C.

    2017-10-01

    Hyperspectral Imaging (HI) assembles high resolution spectral information from hundreds of narrow bands across the electromagnetic spectrum, thus generating 3D data cubes in which each spatial pixel holds its full reflectance spectrum. As a result, each image is composed of large volumes of data, which makes its processing a challenge, as performance requirements have been continuously tightened. For instance, new HI applications demand real-time responses. Hence, parallel processing becomes a necessity to meet this requirement, so the intrinsic parallelism of the algorithms must be exploited. In this paper, a spatial-spectral classification approach has been implemented using a dataflow language known as RVC-CAL. This language represents a system as a set of functional units, and its main advantage is that it simplifies the parallelization process by mapping the different blocks over different processing units. The spatial-spectral classification approach aims at refining previously obtained classification results by using a K-Nearest Neighbors (KNN) filtering process, in which both the pixel spectral value and the spatial coordinates are considered. To do so, KNN needs two inputs: a one-band representation of the hyperspectral image and the classification results provided by a pixel-wise classifier. Thus, the spatial-spectral classification algorithm is divided into three stages: a Principal Component Analysis (PCA) algorithm for computing the one-band representation of the image, a Support Vector Machine (SVM) classifier, and the KNN-based filtering algorithm. The parallelization of these algorithms shows promising results in terms of computational time, as mapping them over different cores yields a speedup of 2.69x when using 3 cores. Consequently, experimental results demonstrate that real-time processing of hyperspectral images is achievable.
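
    The KNN filtering stage described above can be sketched in simplified form. The version below takes the two inputs named in the abstract (a one-band image and a pixel-wise label map), searches for each pixel's K nearest pixels in a joint space of band value and weighted spatial coordinates, and relabels the pixel by majority vote. The majority vote, the `spatial_weight` parameter and the function name are assumptions for illustration; the paper's actual filter may combine the neighbors differently.

```python
import math
from collections import Counter


def knn_filter(band, labels, k=5, spatial_weight=1.0):
    """Relabel each pixel by majority vote over its k nearest pixels (itself included)."""
    rows, cols = len(band), len(band[0])
    # Feature vector per pixel: (band value, weighted row, weighted column), plus its label.
    feats = [
        ((band[r][c], spatial_weight * r, spatial_weight * c), labels[r][c])
        for r in range(rows)
        for c in range(cols)
    ]
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            query = (band[r][c], spatial_weight * r, spatial_weight * c)
            nearest = sorted(feats, key=lambda f: math.dist(query, f[0]))[:k]
            out[r][c] = Counter(lab for _, lab in nearest).most_common(1)[0][0]
    return out


# Toy 3x3 image with a uniform band value and one isolated label in the center.
band = [[0.5] * 3 for _ in range(3)]
labels = [["A", "A", "A"], ["A", "B", "A"], ["A", "A", "A"]]
print(knn_filter(band, labels))
```

    Each pixel's search is independent of the others, so the double loop is the natural unit to distribute across cores.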

  5. Chirality dependence of dipole matrix element of carbon nanotubes in axial magnetic field: A third neighbor tight binding approach

    Science.gov (United States)

    Chegel, Raad; Behzad, Somayeh

    2014-02-01

    We have studied the electronic structure and dipole matrix element, D, of carbon nanotubes (CNTs) under a magnetic field, using the third nearest neighbor tight binding model. It is shown that the 1NN and 3NN-TB band structures show differences such as the spacing and mixing of neighbor subbands. Applying the magnetic field breaks the degeneracy in the D transitions and creates new allowed transitions corresponding to the band modifications. It is found that |D| is proportional to the inverse tube radius and chiral angle. Our numerical results show that the amount of field-induced splitting for the first optical peak is proportional to the magnetic field through the splitting rate ν11. It is shown that ν11 changes linearly with the chiral angle and parabolically with the radius.

  6. Cryptosporidiosis in Saudi Arabia and neighboring countries

    International Nuclear Information System (INIS)

    Areeshi, Mohammed Y.; Hart, C.A.; Beeching, N.J.

    2007-01-01

    Cryptosporidium is a coccidian protozoan parasite of the intestinal tract that causes severe and sometimes fatal watery diarrhea in immunocompromised patients and self-limiting but prolonged diarrheal disease in immunocompetent individuals. It exists naturally in animals and can be zoonotic. Although cryptosporidiosis is a significant cause of diarrheal disease in both developing and developed countries, it is more prevalent in developing countries and in tropical environments. We examined the epidemiology and disease burden of Cryptosporidium in Saudi Arabia and neighboring countries by reviewing 23 studies of Cryptosporidium and the etiology of diarrhea published between 1986 and 2006. The prevalence of Cryptosporidium infection in humans ranged from 1% to 37% with a median of 4%, while in animals it varied with the species of animal and the geographic location of the study. Most cases of cryptosporidiosis occurred among children less than 7 years of age and particularly in the first two years of life. The seasonality of Cryptosporidium varied depending on the geographic locations of the studies, but it was generally most prevalent in the rainy season. The most commonly identified species was Cryptosporidium parvum, while C. hominis was detected in only one study, from Kuwait. The cumulative experience from Saudi Arabia and four neighboring countries (Kuwait, Oman, Jordan and Iraq) suggests that Cryptosporidium is an important cause of diarrhea in humans and cattle. However, the findings of this review also demonstrate the limitations of the available data regarding Cryptosporidium species and strains in circulation in these countries. (author)

  7. New Results on the Nearest OB Association: Sco-Cen (Sco OB2)

    Science.gov (United States)

    Mamajek, Eric E.

    2013-01-01

    The Scorpius-Centaurus OB association (Sco OB2) is the nearest site of recent massive star formation to the Sun. The primary stellar groups in the Sco-Cen complex (including OB subgroups Upper Sco, Upper Cen Lup, and Lower Cen Cru, the neighboring molecular cloud complexes Lup, Cha, CrA, Oph, and dispersed young groups Eta Cha, Epsilon Cha, TW Hya, and Beta Pic) have been participants in a complex episode of stellar birth (and some stellar death) over the past ~20 Myr. Here I summarize some recent results on the Sco-Cen complex from the U. Rochester group: (1) isochronal analysis of the HR diagram positions for >1 Msun stars in the Upper Scorpius subgroup shows it to be twice as old as previously thought (11 Myr vs. 5 Myr), (2) analysis of high resolution optical echelle spectra show that the subgroups are approximately solar in composition, (3) surveys for lower mass members are showing that the complex shows more substructure than previously recognized, including at least one new subgroup ("Lower Sco"), and the velocity and age data for the nearest OB subgroup Lower Cen Cru argue for a bifurcation into a younger 10 Myr) southern part ("Crux") and an older 20 Myr) northern part ("Lower Centaurus"), (4) an eclipsing, multi-ring dust disk system was serendipitously discovered in the SuperWASP and ASAS light curve for the newly discovered K5-type Sco-Cen member 1SWASP J140747.93-394542.6. With regard to some recent results by other investigators, we find that (1) attempts by some authors to subsume the Sco-Cen subgroups into a single sample of a single age are unnecessarily mixing samples with a wide range in ages, and (2) I have been unable to replicate the expansion age determinations claimed by some investigators for the TW Hya and Beta Pic groups (both purported to have expansion ages of 8 and 12 Myr, respectively), which have been used by some investigators to independently age-date the Sco-Cen subgroups. 
We acknowledge support from NSF grant AST-1008908 and the

  8. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng; Gao, Xin

    2016-01-01

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated

  9. Multispectral and Panchromatic used Enhancement Resolution and Study Effective Enhancement on Supervised and Unsupervised Classification Land – Cover

    Science.gov (United States)

    Salman, S. S.; Abbas, W. A.

    2018-05-01

    The goal of this study is to enhance the resolution of Landsat 8 imagery and to examine the effect of that enhancement on classification methods that use spectral band information. We introduce a method that enhances resolution by combining the 30-meter spectral bands with the 15-meter panchromatic band 8, because multispectral imagery is important for extracting land cover. Classification methods are used to classify several land covers recorded in the Landsat 8 OLI imagery. Data mining methods can be either supervised or unsupervised. In supervised methods there is a predefined target, which means the algorithm learns which values of the target are associated with which values of the predictor sample. The K-nearest neighbors and maximum likelihood algorithms are examined in this work as supervised methods. In unsupervised methods, on the other hand, no sample is identified as a target, and the algorithm searches for structure and patterns among all the variables; this class is represented here by the fuzzy C-means clustering method. The NDVI vegetation index is used to compare the results of the classification methods, and the percentage of dense vegetation obtained with the maximum likelihood method gives the best results.
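
    The NDVI used above to compare the classification results is the normalized difference of near-infrared and red reflectance, (NIR - Red) / (NIR + Red). For Landsat 8 OLI these are bands 5 and 4. The sketch below computes it per pixel; the function name and reflectance values are illustrative, and returning 0.0 when both bands are zero is an assumption made to avoid division by zero.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return (nir - red) / total if total else 0.0


# Hypothetical surface reflectance values: (NIR, red) per pixel.
pixels = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
print([round(ndvi(n, r), 3) for n, r in pixels])
```

    Values near +1 indicate dense green vegetation, while values near zero or below indicate bare soil, built-up surfaces or water.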

  10. Gender classification from face images by using local binary pattern and gray-level co-occurrence matrix

    Science.gov (United States)

    Uzbaş, Betül; Arslan, Ahmet

    2018-04-01

    Gender recognition is an important step for human computer interactive processes and identification. The human face image is one of the important sources for determining gender. In the present study, gender classification is performed automatically from facial images. In order to classify gender, we propose a combination of features that have been extracted from face, eye and lip regions by using a hybrid method of Local Binary Pattern and Gray-Level Co-Occurrence Matrix. The features have been extracted from automatically obtained face, eye and lip regions. All of the extracted features have been combined and given as input parameters to classification methods (Support Vector Machine, Artificial Neural Networks, Naive Bayes and k-Nearest Neighbor methods) for gender classification. The Nottingham Scan face database, which consists of the frontal face images of 100 people (50 male and 50 female), is used for this purpose. In the experimental studies, the highest success rate, 98%, was achieved by using Support Vector Machine. The experimental results illustrate the efficacy of our proposed method.
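
    The basic Local Binary Pattern operator compares each of the eight neighbors of a pixel with the center value and packs the results into an 8-bit code. The sketch below computes that code for a single 3x3 patch. The clockwise bit order starting at the top-left neighbor and the use of `>=` are conventions chosen here for illustration; the abstract does not say which variant the authors used.

```python
def lbp_code(patch):
    """8-bit LBP code for the center pixel of a 3x3 patch of gray levels."""
    center = patch[1][1]
    # Neighbor positions, clockwise from the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))
```

    A texture descriptor is then typically the histogram of these codes over a region such as the face, eye or lip area.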

  11. A Novel Feature Level Fusion for Heart Rate Variability Classification Using Correntropy and Cauchy-Schwarz Divergence.

    Science.gov (United States)

    Goshvarpour, Ateke; Goshvarpour, Atefeh

    2018-04-30

    Heart rate variability (HRV) analysis has become a widely used tool for monitoring pathological and psychological states in medical applications. In a typical classification problem, information fusion is a process whereby the effective combination of the data can achieve a more accurate system. The purpose of this article was to provide an accurate algorithm for classifying HRV signals in various psychological states. Therefore, a novel feature level fusion approach was proposed. First, using information theory, two similarity indicators of the signal were extracted: correntropy and Cauchy-Schwarz divergence. Applying a probabilistic neural network (PNN) and k-nearest neighbor (kNN), the performance of each index in classifying the HRV signals of meditators and non-meditators was appraised. Then, three fusion rules (division, product, and weighted sum) were used to combine the information of both similarity measures. For the first time, we propose an algorithm to define the weights of each feature based on the statistical p-values. The performance of HRV classification using combined features was compared with that using the non-combined features. Overall, an accuracy of 100% was obtained for discriminating all states. The results showed the strong ability and proficiency of the division and weighted sum rules in improving the classifier accuracies.
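    The three fusion rules named in this abstract can be sketched as follows. The function name and the default weights are hypothetical; in particular, the abstract does not give the formula that derives weights from p-values, so the weights are left as plain parameters here.

```python
def fuse(a, b, rule, w_a=0.5, w_b=0.5):
    """Combine two scalar feature values with one of three fusion rules."""
    if rule == "division":
        return a / b
    if rule == "product":
        return a * b
    if rule == "weighted_sum":
        # w_a and w_b are placeholders; the study derives them from p-values
        return w_a * a + w_b * b
    raise ValueError(f"unknown fusion rule: {rule}")
```

    The fused value would then replace the two separate features as input to a classifier such as kNN.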

  12. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    Science.gov (United States)

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Unlike the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  13. Model of directed lines for square ice with second-neighbor and third-neighbor interactions

    Science.gov (United States)

    Kirov, Mikhail V.

    2018-02-01

    The investigation of the properties of nanoconfined systems is one of the most rapidly developing scientific fields. Recently it has been established that water monolayer between two graphene sheets forms square ice. Because of the energetic disadvantage, in the structure of the square ice there are no longitudinally arranged molecules. The result is that the structure is formed by unidirectional straight-lines of hydrogen bonds only. A simple but accurate discrete model of square ice with second-neighbor and third-neighbor interactions is proposed. According to this model, the ground state includes all configurations which do not contain three neighboring unidirectional chains of hydrogen bonds. Each triplet increases the energy by the same value. This new model differs from an analogous model with long-range interactions where in the ground state all neighboring chains are antiparallel. The new model is suitable for the corresponding system of point electric (and magnetic) dipoles on the square lattice. It allows separately estimating the different contributions to the total binding energy and helps to understand the properties of infinite monolayers and finite nanostructures. Calculations of the binding energy for square ice and for point dipole system are performed using the packages TINKER and LAMMPS.

  14. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    Science.gov (United States)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care
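    The evaluation protocol described in this abstract, k-fold cross validation of a K-nearest neighbors classifier, can be sketched in plain Python. This is an illustrative sketch only: the function names, the Euclidean distance, the interleaved fold assignment, and the default of three neighbors are assumptions, not details reported by the study.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k_neighbors=3, folds=10, seed=0):
    """Mean accuracy over `folds` disjoint test folds of (features, label) pairs."""
    items = list(data)
    random.Random(seed).shuffle(items)
    accuracies = []
    for f in range(folds):
        # every folds-th item (offset f) is held out; the rest is training data
        test = items[f::folds]
        train = [x for i, x in enumerate(items) if i % folds != f]
        correct = sum(knn_predict(train, x, k_neighbors) == y for x, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / folds
```

    Repeating the call with different seeds and averaging would mirror the repeated testing mentioned in the abstract.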

  15. Arabic Text Categorization Using Improved k-Nearest neighbour Algorithm

    Directory of Open Access Journals (Sweden)

    Wail Hamood KHALED

    2014-10-01

    The quantity of text information published in the Arabic language on the net requires the implementation of effective techniques for extracting and classifying the relevant information contained in large corpora of texts. In this paper we present an implementation of an enhanced k-NN Arabic text classifier. We apply the traditional k-NN and Naive Bayes from the Weka Toolkit for comparison purposes. Our proposed modified k-NN algorithm features an improved decision rule that skips the classes that are less similar and identifies the right class from the k nearest neighbours, which increases the accuracy. The study evaluates the improved decision rule technique using the standard measures of recall, precision and f-measure as the basis of comparison. We conclude that the effectiveness of the proposed classifier is promising and that it outperforms the classical k-NN classifier.

  16. The surprising power of neighborly advice.

    Science.gov (United States)

    Gilbert, Daniel T; Killingsworth, Matthew A; Eyre, Rebecca N; Wilson, Timothy D

    2009-03-20

    Two experiments revealed that (i) people can more accurately predict their affective reactions to a future event when they know how a neighbor in their social network reacted to the event than when they know about the event itself and (ii) people do not believe this. Undergraduates made more accurate predictions about their affective reactions to a 5-minute speed date (n = 25) and to a peer evaluation (n = 88) when they knew only how another undergraduate had reacted to these events than when they had information about the events themselves. Both participants and independent judges mistakenly believed that predictions based on information about the event would be more accurate than predictions based on information about how another person had reacted to it.

  17. Observing Literacy Practices in Neighbor Institutions

    DEFF Research Database (Denmark)

    Reusch, Charlotte

    ’procedures on language and literacy. Based on this material, we developed an observation scheme and a guide for preschool teachers to follow, inspired by an action learning concept.During fall 2015, a pilot project is carried out. Preschool teachers from one institution visit a neighbor institution one by one during...... work hours, in order to observe and register how language and literacy events look like there. Afterwards, they share their registrations at a team meeting, and discuss and decide which procedures to test in their own institution. Thus, they form a professional learning network. In the pilot project......The Danish National Centre for Reading and a municipality in southern Denmark cooperate to develop a program to improve preschool children’s early literacy skills. The project aims to support preschool teachers’ ability to create a rich literacy environment for children age 3‒6. Recent research...

  18. Giant Planets: Good Neighbors for Habitable Worlds?

    Science.gov (United States)

    Georgakarakos, Nikolaos; Eggl, Siegfried; Dobbs-Dixon, Ian

    2018-04-01

    The presence of giant planets influences potentially habitable worlds in numerous ways. Massive celestial neighbors can facilitate the formation of planetary cores and modify the influx of asteroids and comets toward Earth analogs later on. Furthermore, giant planets can indirectly change the climate of terrestrial worlds by gravitationally altering their orbits. Investigating 147 well-characterized exoplanetary systems known to date that host a main-sequence star and a giant planet, we show that the presence of “giant neighbors” can reduce a terrestrial planet’s chances to remain habitable, even if both planets have stable orbits. In a small fraction of systems, however, giant planets slightly increase the extent of habitable zones provided that the terrestrial world has a high climate inertia. In providing constraints on where giant planets cease to affect the habitable zone size in a detrimental fashion, we identify prime targets in the search for habitable worlds.

  19. Object Classification in Semi Structured Enviroment Using Forward-Looking Sonar

    Directory of Open Access Journals (Sweden)

    Matheus dos Santos

    2017-09-01

    Submarine exploration using robots has been increasing in recent years. The automation of tasks such as monitoring, inspection, and underwater maintenance requires an understanding of the robot’s environment. Object recognition in the scene is becoming a critical issue for these systems. In this work, an underwater object classification pipeline applied to acoustic images acquired by Forward-Looking Sonar (FLS) is studied. The object segmentation combines thresholding, connected pixel search and intensity peak analysis techniques. The object descriptor extracts intensity and geometric features of the detected objects. A comparison between the Support Vector Machine, K-Nearest Neighbors, and Random Trees classifiers is presented. An open-source tool was developed to annotate and classify the objects and evaluate their classification performance. The proposed method efficiently segments and classifies the structures in the scene using a real dataset acquired by an underwater vehicle in a harbor area. Experimental results demonstrate the robustness and accuracy of the method described in this paper.

  20. Raman scattering mediated by neighboring molecules

    Science.gov (United States)

    Williams, Mathew D.; Bradshaw, David S.; Andrews, David L.

    2016-05-01

    Raman scattering is most commonly associated with a change in vibrational state within individual molecules, the corresponding frequency shift in the scattered light affording a key way of identifying material structures. In theories where both matter and light are treated quantum mechanically, the fundamental scattering process is represented as the concurrent annihilation of a photon from one radiation mode and creation of another in a different mode. Developing this quantum electrodynamical formulation, the focus of the present work is on the spectroscopic consequences of electrodynamic coupling between neighboring molecules or other kinds of optical center. To encompass these nanoscale interactions, through which the molecular states evolve under the dual influence of the input light and local fields, this work identifies and determines two major mechanisms for each of which different selection rules apply. The constituent optical centers are considered to be chemically different and held in a fixed orientation with respect to each other, either as two components of a larger molecule or a molecular assembly that can undergo free rotation in a fluid medium or as parts of a larger, solid material. The two centers are considered to be separated beyond wavefunction overlap but close enough together to fall within an optical near-field limit, which leads to high inverse power dependences on their local separation. In this investigation, individual centers undergo a Stokes transition, whilst each neighbor of a different species remains in its original electronic and vibrational state. Analogous principles are applicable for the anti-Stokes case. The analysis concludes by considering the experimental consequences of applying this spectroscopic interpretation to fluid media; explicitly, the selection rules and the impact of pressure on the radiant intensity of this process.

  1. Raman scattering mediated by neighboring molecules

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Mathew D.; Bradshaw, David S.; Andrews, David L., E-mail: david.andrews@physics.org [School of Chemistry, University of East Anglia, Norwich NR4 7TJ (United Kingdom)

    2016-05-07

    Raman scattering is most commonly associated with a change in vibrational state within individual molecules, the corresponding frequency shift in the scattered light affording a key way of identifying material structures. In theories where both matter and light are treated quantum mechanically, the fundamental scattering process is represented as the concurrent annihilation of a photon from one radiation mode and creation of another in a different mode. Developing this quantum electrodynamical formulation, the focus of the present work is on the spectroscopic consequences of electrodynamic coupling between neighboring molecules or other kinds of optical center. To encompass these nanoscale interactions, through which the molecular states evolve under the dual influence of the input light and local fields, this work identifies and determines two major mechanisms for each of which different selection rules apply. The constituent optical centers are considered to be chemically different and held in a fixed orientation with respect to each other, either as two components of a larger molecule or a molecular assembly that can undergo free rotation in a fluid medium or as parts of a larger, solid material. The two centers are considered to be separated beyond wavefunction overlap but close enough together to fall within an optical near-field limit, which leads to high inverse power dependences on their local separation. In this investigation, individual centers undergo a Stokes transition, whilst each neighbor of a different species remains in its original electronic and vibrational state. Analogous principles are applicable for the anti-Stokes case. The analysis concludes by considering the experimental consequences of applying this spectroscopic interpretation to fluid media; explicitly, the selection rules and the impact of pressure on the radiant intensity of this process.

  2. Empirical mode decomposition and k-nearest embedding vectors for timely analyses of antibiotic resistance trends.

    Science.gov (United States)

    Teodoro, Douglas; Lovis, Christian

    2013-01-01

    Antibiotic resistance is a major worldwide public health concern. In clinical settings, timely antibiotic resistance information is key for care providers as it allows appropriate targeted treatment or improved empirical treatment when the specific results of the patient are not yet available. This work aims to improve antibiotic resistance trend analysis algorithms by building a novel, fully data-driven forecasting method from the combination of trend extraction and machine learning models for enhanced biosurveillance systems. We investigate a robust model for extraction and forecasting of antibiotic resistance trends using a decade of microbiology data. Our method consists of breaking down the resistance time series into independent oscillatory components via the empirical mode decomposition technique. The resulting waveforms describing intrinsic resistance trends serve as the input for the forecasting algorithm. The algorithm applies the delay coordinate embedding theorem together with the k-nearest neighbor framework to project mappings from past events into the future dimension and estimate the resistance levels. The algorithms that decompose the resistance time series and filter out high frequency components showed statistically significant performance improvements in comparison with a benchmark random walk model. We present further qualitative use-cases of antibiotic resistance trend extraction, where empirical mode decomposition was applied to highlight the specificities of the resistance trends. The decomposition of the raw signal was found not only to yield valuable insight into the resistance evolution, but also to produce novel models of resistance forecasters with boosted prediction performance, which could be utilized as a complementary method in the analysis of antibiotic resistance trends.
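    The forecasting step described in this abstract, delay coordinate embedding combined with a k-nearest neighbor lookup, can be sketched as below. The sketch deliberately omits the empirical mode decomposition stage and operates on a raw series; the function names, the Euclidean distance, and the default embedding dimension, lag, and k are assumptions for illustration rather than the authors' settings.

```python
import math


def delay_embed(series, dim, lag=1):
    """Return (state_vector, next_value) pairs built from lagged samples."""
    pairs = []
    span = (dim - 1) * lag
    for t in range(span, len(series) - 1):
        vector = tuple(series[t - j * lag] for j in range(dim))
        pairs.append((vector, series[t + 1]))
    return pairs


def knn_forecast(series, dim=3, lag=1, k=2):
    """Predict the next value as the mean successor of the k most similar past states."""
    history = delay_embed(series, dim, lag)
    last = len(series) - 1
    current = tuple(series[last - j * lag] for j in range(dim))
    nearest = sorted(history, key=lambda p: math.dist(p[0], current))[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)
```

    On a strictly periodic series the most similar past states repeat exactly, so the forecast reproduces the next value of the cycle; on real data the quality depends heavily on the chosen dimension, lag, and k.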

  3. Fusion of Airborne Discrete-Return LiDAR and Hyperspectral Data for Land Cover Classification

    Directory of Open Access Journals (Sweden)

    Shezhou Luo

    2015-12-01

    Accurate land cover classification information is a critical variable for many applications. This study presents a method to classify land cover using the fusion data of airborne discrete return LiDAR (Light Detection and Ranging) and CASI (Compact Airborne Spectrographic Imager) hyperspectral data. Four LiDAR-derived images (DTM, DSM, nDSM, and intensity) and CASI data (48 bands) with 1 m spatial resolution were spatially resampled to 2, 4, 8, 10, 20 and 30 m resolutions using the nearest neighbor resampling method. These data were thereafter fused using the layer stacking and principal components analysis (PCA) methods. Land cover was classified by supervised classifiers commonly used on remote sensing images, i.e., the support vector machine (SVM) and maximum likelihood (MLC) classifiers. Each classifier was applied to four types of datasets (at seven different spatial resolutions): (1) the layer stacking fusion data; (2) the PCA fusion data; (3) the LiDAR data alone; and (4) the CASI data alone. In this study, the land cover category was classified into seven classes, i.e., buildings, road, water bodies, forests, grassland, cropland and barren land. A total of 56 classification results were produced, and the classification accuracies were assessed and compared. The results show that the classification accuracies produced from the two fused datasets were higher than those of the single LiDAR and CASI data at all seven spatial resolutions. Moreover, we find that the layer stacking method produced higher overall classification accuracies than the PCA fusion method using both the SVM and MLC classifiers. The highest classification accuracy (OA = 97.8%, kappa = 0.964) was obtained using the SVM classifier on the layer stacking fusion data at 1 m spatial resolution. Compared with the best classification results of the CASI and LiDAR data alone, the overall classification accuracies improved by 9.1% and 19.6%, respectively. Our findings also demonstrated that the
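    Nearest neighbor resampling, used in this abstract to coarsen the 1 m rasters, copies an existing source value into each output cell instead of averaging, so original pixel values are never blended. A minimal sketch for an integer downsampling factor follows; the function name and the choice of sampling the source pixel that contains each output cell's center are assumptions, and real GIS packages may use a different pixel-alignment convention.

```python
def resample_nearest(grid, factor):
    """Downsample a 2-D grid (list of rows) by an integer factor."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows // factor):
        # source row containing the center of output row r
        src_r = min(rows - 1, int((r + 0.5) * factor))
        out_row = []
        for c in range(cols // factor):
            src_c = min(cols - 1, int((c + 0.5) * factor))
            out_row.append(grid[src_r][src_c])
        out.append(out_row)
    return out
```

    For example, a 4 x 4 grid resampled with a factor of 2 yields a 2 x 2 grid whose entries are all values already present in the source.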

  4. Carbon-hydrogen defects with a neighboring oxygen atom in n-type Si

    Science.gov (United States)

    Gwozdz, K.; Stübner, R.; Kolkovsky, Vl.; Weber, J.

    2017-07-01

    We report on the electrical activation of neutral carbon-oxygen complexes in Si by wet-chemical etching at room temperature. Two deep levels, E65 and E75, are observed by deep level transient spectroscopy in n-type Czochralski Si. The activation enthalpies of E65 and E75 are obtained as EC-0.11 eV (E65) and EC-0.13 eV (E75). The electric field dependence of their emission rates relates both levels to single acceptor states. From the analysis of the depth profiles, we conclude that the levels belong to two different defects, which contain only one hydrogen atom. A configuration is proposed, where the CH1BC defect, with hydrogen in the bond-centered position between neighboring C and Si atoms, is disturbed by interstitial oxygen in the second nearest neighbor position to substitutional carbon. The significant reduction of the CH1BC concentration in samples with high oxygen concentrations limits the use of this defect for the determination of low concentrations of substitutional carbon in Si samples.

  5. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et...... al. 2013). The analysis aims at three goals: The classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors...... into the topic of game classifications....

  6. Detection of nearest neighbors to specific fluorescently tagged ligands in rod outer segment and lymphocyte plasma membranes by photosensitization of 5-iodonaphthyl 1-azide

    International Nuclear Information System (INIS)

    Raviv, Y.; Bercovici, T.; Gitler, C.; Salomon, Y.

    1989-01-01

    Lima bean agglutinin-fluorescein 5-isothiocyanate conjugate (FluNCS-lima bean lectin) interacts with specific receptor molecules on membranes both from the rod outer segment (ROS) of the frog retina and from S49 mouse lymphoma cells. When [125I]-5-iodonaphthyl 1-azide (125I-INA), which freely and randomly partitions into the lipid bilayer, is added to membranes and the suspension is irradiated at 480 nm, the FluNCS-conjugated lectin photosensitizes the [125I]INA but only at discrete sites. This results in the selective labeling of specific proteins: an 88-kDa protein on ROS membranes and a 56-kDa protein on S49 plasma membranes. Labeling is dependent upon the interaction of the FluNCS-lectin with glycosylated receptor sites, since N-acetylgalactosamine, but not methyl alpha-mannoside, blocked labeling of the 56-kDa protein on S49 membranes. In contrast, a random labeling pattern of membrane proteins was observed upon irradiation at 480 nm using other fluorescein conjugates, such as FluNCS-bovine serum albumin (FluNCS-BSA) or FluNCS-soybean trypsin inhibitor (FluNCS-STI), which interact with cell membranes in a nonselective manner, or with N-(fluorescein-5-thiocarbamoyl)-n-undecyclamine (FluNCS-NHC11), which is freely miscible in the membrane lipid. Random labeling was also obtained by direct photoexcitation of [125I]INA at 314 nm, with no distinct labeling of the 88- and 56-kDa proteins in the respective membranes. These results suggest that protein ligands can be used to guide sensitizers to discrete receptor sites and lead to their selective labeling by photosensitized activation of [125I]INA

  7. A systematic molecular dynamics study of nearest-neighbor effects on base pair and base pair step conformations and fluctuations in B-DNA

    Czech Academy of Sciences Publication Activity Database

    Lavery, R.; Zakrzewska, K.; Beveridge, D.; Bishop, T. C.; Case, D. A.; Cheatham III, T. E.; Dixit, S.; Jayaram, B.; Lankaš, Filip; Laughton, Ch.; Maddocks, J. H.; Michon, A.; Osman, R.; Orozco, M.; Pérez, A.; Singh, T.; Špačková, Naďa; Šponer, Jiří

    Vol. 38, No. 1 (2010), pp. 299-313 ISSN 0305-1048 R&D Projects: GA MŠk(CZ) LC06030; GA AV ČR(CZ) IAA400040802; GA ČR GA203/09/1476; GA MŠk LC512 Institutional research plan: CEZ:AV0Z40550506; CEZ:AV0Z50040507; CEZ:AV0Z50040702 Keywords: B-DNA * molecular dynamics * sequence dependent structure and dynamics Subject RIV: CF - Physical; Theoretical Chemistry Impact factor: 7.836, year: 2010

  8. Studying nearest neighbor correlations by atom probe tomography (APT) in metallic glasses as exemplified for Fe40Ni40B20 glassy ribbons

    KAUST Repository

    Shariq, Ahmed; Al-Kassab, Talaat; Kirchheim, Reiner

    2012-01-01

    resolution of the analytical technique. However, fitting Gaussian distributions to the distribution of atomic distances yields average distances with statistical uncertainties of 2 to 3 hundredth of an Angstrom. Fe 40Ni40B20 metallic glass ribbons

  9. Lithological Classification Using Sentinel-2A Data in the Shibanjing Ophiolite Complex in Inner Mongolia, China

    Directory of Open Access Journals (Sweden)

    Wenyan Ge

    2018-04-01

    As a source of data continuity between Landsat and SPOT, Sentinel-2 is an Earth observation mission developed by the European Space Agency (ESA), which acquires 13 bands in the visible and near-infrared (VNIR) to shortwave infrared (SWIR) range. In this study, Sentinel-2A imagery was utilized to assess its ability to perform lithological classification in the Shibanjing ophiolite complex in Inner Mongolia, China. Five conventional machine learning methods, including artificial neural network (ANN), k-nearest neighbor (k-NN), maximum likelihood classification (MLC), random forest classifier (RFC), and support vector machine (SVM), were compared in order to find an optimal classifier for lithological mapping. The experiment revealed that the MLC method offered the highest overall accuracy. After that, the Sentinel-2A image was compared with the common multispectral data ASTER and Landsat-8 OLI (operational land imager) for lithological mapping using the MLC method. The comparison results showed that the Sentinel-2A imagery yielded a classification accuracy of 74.5%, which was 2.5% and 5.08% higher than those of the ASTER and OLI imagery, respectively, indicating that Sentinel-2A imagery is adequate for lithological discrimination, due to its high spectral resolution in the VNIR to SWIR range. Moreover, different data combinations of Sentinel-2A + ASTER + DEM (digital elevation model) and OLI + ASTER + DEM data were tested on lithological mapping using the MLC method. The best mapping result was obtained from the Sentinel-2A + ASTER + DEM dataset, demonstrating that OLI can be replaced by Sentinel-2A, which, when combined with ASTER, can achieve sufficient bandpasses for lithological classification.

  10. Biodiesel classification by base stock type (vegetable oil) using near infrared spectroscopy data

    Energy Technology Data Exchange (ETDEWEB)

    Balabin, Roman M., E-mail: balabin@org.chem.ethz.ch [Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich (Switzerland); Safieva, Ravilya Z. [Gubkin Russian State University of Oil and Gas, 119991 Moscow (Russian Federation)

    2011-03-18

    The use of biofuels, such as bioethanol or biodiesel, has rapidly increased in the last few years. Near infrared (near-IR, NIR, or NIRS) spectroscopy (>4000 cm⁻¹) has previously been reported as a cheap and fast alternative for biodiesel quality control when compared with infrared, Raman, or nuclear magnetic resonance (NMR) methods; in addition, NIR can easily be done in real time (on-line). In this proof-of-principle paper, we attempt to find a correlation between the near infrared spectrum of a biodiesel sample and its base stock. This correlation is used to classify fuel samples into 10 groups according to their origin (vegetable oil): sunflower, coconut, palm, soy/soya, cottonseed, castor, Jatropha, etc. Principal component analysis (PCA) is used for outlier detection and dimensionality reduction of the NIR spectral data. Four different multivariate data analysis techniques are used to solve the classification problem, including regularized discriminant analysis (RDA), partial least squares method/projection on latent structures (PLS-DA), K-nearest neighbors (KNN) technique, and support vector machines (SVMs). Classifying biodiesel by feedstock (base stock) type can be successfully solved with modern machine learning techniques and NIR spectroscopy data. KNN and SVM methods were found to be highly effective for biodiesel classification by feedstock oil type. A classification error (E) of less than 5% can be reached using an SVM-based approach. If computational time is an important consideration, the KNN technique (E = 6.2%) can be recommended for practical (industrial) implementation. Comparison with gasoline and motor oil data shows the relative simplicity of this methodology for biodiesel classification.

  11. Automatic classification and detection of clinically relevant images for diabetic retinopathy

    Science.gov (United States)

    Xu, Xinyu; Li, Baoxin

    2008-03-01

    We propose a novel approach to automatic classification of Diabetic Retinopathy (DR) images and retrieval of clinically relevant DR images from a database. Given a query image, our approach first classifies the image into one of three categories: microaneurysm (MA), neovascularization (NV) and normal, and then it retrieves DR images that are clinically relevant to the query image from an archival image database. In the classification stage, the query DR images are classified by the Multi-class Multiple-Instance Learning (McMIL) approach, where images are viewed as bags, each of which contains a number of instances corresponding to non-overlapping blocks, and each block is characterized by low-level features including color, texture, histogram of edge directions, and shape. McMIL first learns a collection of instance prototypes for each class that maximizes the Diverse Density function using the Expectation-Maximization algorithm. A nonlinear mapping is then defined using the instance prototypes and maps every bag to a point in a new multi-class bag feature space. Finally, a multi-class Support Vector Machine is trained in the multi-class bag feature space. In the retrieval stage, we retrieve images from the archival database that bear the same label as the query image and that are the top K nearest neighbors of the query image in terms of similarity in the multi-class bag feature space. The classification approach achieves high classification accuracy, and the retrieval of clinically relevant images not only facilitates utilization of the vast amount of hidden diagnostic knowledge in the database, but also improves the efficiency and accuracy of DR lesion diagnosis and assessment.
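    The retrieval stage described in this abstract (keep database images with the query's predicted label, then return the top K nearest neighbors in feature space) can be sketched as follows. The record layout, the function name, and the use of Euclidean distance as the similarity measure are assumptions for illustration; the abstract does not specify the similarity function.

```python
import math


def retrieve_similar(query_vec, query_label, database, k):
    """Return ids of the k closest database entries sharing the query's label.

    Each database entry is a dict with hypothetical keys
    'id', 'label', and 'features'.
    """
    same_label = [e for e in database if e["label"] == query_label]
    ranked = sorted(same_label, key=lambda e: math.dist(e["features"], query_vec))
    return [e["id"] for e in ranked[:k]]
```

    Filtering by label before ranking means an entry of a different class is never returned, even when it is the closest point overall.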

  12. Biodiesel classification by base stock type (vegetable oil) using near infrared spectroscopy data

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Safieva, Ravilya Z.

    2011-01-01

    The use of biofuels, such as bioethanol or biodiesel, has rapidly increased in the last few years. Near infrared (near-IR, NIR, or NIRS) spectroscopy (>4000 cm⁻¹) has previously been reported as a cheap and fast alternative for biodiesel quality control when compared with infrared, Raman, or nuclear magnetic resonance (NMR) methods; in addition, NIR can easily be done in real time (on-line). In this proof-of-principle paper, we attempt to find a correlation between the near infrared spectrum of a biodiesel sample and its base stock. This correlation is used to classify fuel samples into 10 groups according to their origin (vegetable oil): sunflower, coconut, palm, soy/soya, cottonseed, castor, Jatropha, etc. Principal component analysis (PCA) is used for outlier detection and dimensionality reduction of the NIR spectral data. Four different multivariate data analysis techniques are used to solve the classification problem, including regularized discriminant analysis (RDA), partial least squares method/projection on latent structures (PLS-DA), K-nearest neighbors (KNN) technique, and support vector machines (SVMs). Classifying biodiesel by feedstock (base stock) type can be successfully solved with modern machine learning techniques and NIR spectroscopy data. KNN and SVM methods were found to be highly effective for biodiesel classification by feedstock oil type. A classification error (E) of less than 5% can be reached using an SVM-based approach. If computational time is an important consideration, the KNN technique (E = 6.2%) can be recommended for practical (industrial) implementation. Comparison with gasoline and motor oil data shows the relative simplicity of this methodology for biodiesel classification.

  13. Classification of gene expression data: A hubness-aware semi-supervised approach.

    Science.gov (United States)

    Buza, Krisztian

    2016-04-01

    Classification of gene expression data is the common denominator of various biomedical recognition tasks. However, obtaining class labels for large training samples may be difficult or even impossible in many cases. Therefore, semi-supervised classification techniques are required as semi-supervised classifiers take advantage of unlabeled data. Gene expression data is high-dimensional which gives rise to the phenomena known under the umbrella of the curse of dimensionality, one of its recently explored aspects being the presence of hubs or hubness for short. Therefore, hubness-aware classifiers have been developed recently, such as Naive Hubness-Bayesian k-Nearest Neighbor (NHBNN). In this paper, we propose a semi-supervised extension of NHBNN which follows the self-training schema. As one of the core components of self-training is the certainty score, we propose a new hubness-aware certainty score. We performed experiments on publicly available gene expression data. These experiments show that the proposed classifier outperforms its competitors. We investigated the impact of each of the components (classification algorithm, semi-supervised technique, hubness-aware certainty score) separately and showed that each of these components are relevant to the performance of the proposed approach. Our results imply that our approach may increase classification accuracy and reduce computational costs (i.e., runtime). Based on the promising results presented in the paper, we envision that hubness-aware techniques will be used in various other biomedical machine learning tasks. In order to accelerate this process, we made an implementation of hubness-aware machine learning techniques publicly available in the PyHubs software package (http://www.biointelligence.hu/pyhubs) implemented in Python, one of the most popular programming languages of data science. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
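
The self-training schema mentioned in this record can be illustrated with a generic sketch. It deliberately swaps in a plain k-NN classifier and a simple vote-fraction certainty score; it is not NHBNN and not the hubness-aware certainty score proposed in the paper, and all names below are hypothetical.

```python
import math
from collections import Counter

def knn_vote(X, y, x, k):
    """Return (majority label, fraction of the k nearest neighbours voting for it)."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))[:k]
    label, count = Counter(y[i] for i in nearest).most_common(1)[0]
    return label, count / len(nearest)

def self_train(X_lab, y_lab, X_unlab, k=3):
    """Self-training: repeatedly label the unlabeled instance the current
    classifier is most certain about and add it to the training set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    while pool:
        scored = [(knn_vote(X_lab, y_lab, x, k), idx) for idx, x in enumerate(pool)]
        # Vote fraction is the stand-in certainty score (an assumption).
        (label, _certainty), best = max(scored, key=lambda s: s[0][1])
        X_lab.append(pool.pop(best))
        y_lab.append(label)
    return X_lab, y_lab

# Toy data: two well-separated groups plus three unlabeled points.
X_new, y_new = self_train(
    [[0, 0], [0, 1], [5, 5], [5, 6]], ["a", "a", "b", "b"],
    [[0.5, 0.5], [5.5, 5.5], [1, 1]], k=3,
)
```

Each pass re-scores the whole pool, so points labeled early influence the certainty of later ones; that feedback is the essence of self-training.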

  14. Dynamics of Nearest-Neighbour Competitions on Graphs

    Science.gov (United States)

    Rador, Tonguç

    2017-10-01

    Considering a collection of agents representing the vertices of a graph endowed with integer points, we study the asymptotic dynamics of the rate of increase of their points according to a very simple rule: we randomly pick an edge from the graph, which unambiguously defines two agents; we give a point to the agent with more points with probability p and to the lagger with probability q, such that p+q=1. The model we present is the most general version of the nearest-neighbour competition model introduced by Ben-Naim, Vazquez and Redner. We show that the model combines aspects of hyperbolic partial differential equations (such as conservation laws), graph colouring and hyperplane arrangements. We discuss the properties of the model for general graphs but confine the in-depth study to d-dimensional tori. We present a detailed study for the ring graph, which includes a chemical potential approximation to calculate all its statistics that gives rather accurate results. The two-dimensional torus, not studied in as much depth as the ring, is shown to possess critical behaviour in that the asymptotic speeds arrange themselves in two-coloured islands separated by borders of three other colours, and the sizes of the islands obey a power-law distribution. We also show that in the large d limit the d-dimensional torus shows an inverse sine law for the distribution of asymptotic speeds.
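
The update rule stated in this record is simple enough to simulate directly. The sketch below runs it on a ring graph; the function name, the parameter values, and in particular the tie-breaking rule for equal scores are assumptions, since the abstract does not specify them.

```python
import random

def simulate_ring(n_agents, n_steps, p, seed=0):
    """Simulate the edge-based competition rule on a ring of n_agents vertices."""
    rng = random.Random(seed)
    points = [0] * n_agents
    for _ in range(n_steps):
        i = rng.randrange(n_agents)        # pick edge (i, i+1) uniformly at random
        j = (i + 1) % n_agents
        if points[i] == points[j]:
            # Assumption: on a tie, the "leader" is chosen by a fair coin flip.
            leader, lagger = (i, j) if rng.random() < 0.5 else (j, i)
        elif points[i] > points[j]:
            leader, lagger = i, j
        else:
            leader, lagger = j, i
        # The leader gains the point with probability p, the lagger with q = 1 - p.
        winner = leader if rng.random() < p else lagger
        points[winner] += 1
    return points

n_steps = 10_000
points = simulate_ring(n_agents=20, n_steps=n_steps, p=0.7)
speeds = [x / n_steps for x in points]     # empirical rate of increase per step
```

Exactly one point is awarded per step, so the speeds always sum to one; how they are distributed across agents is what the paper studies asymptotically.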

  15. Supergalactic studies. I. Supergalactic distribution of the nearest galaxies

    International Nuclear Information System (INIS)

    de Vaucouleurs, G.

    1975-01-01

    The supergalactic distribution of the nearest galaxies is investigated to test the nature of the Local Supercluster and to determine whether the Local Group is inside or outside its boundaries. Objectively selected samples of galaxies generally nearer than 10 Mpc are defined by members of the Local Group, the largest and/or brightest galaxies (mag 10', with V 0 -1 ), low-velocity galaxies (V 0 -1 ), the DDO dwarfs. The great majority of these objects are distributed in a broad belt well populated in both hemispheres and inclined 14° to the supergalactic equator. This belt, including the Local Group, the Sculptor ring, the Centaurus chain, the M51, M81, M101, and IC 342 groups, and several others as well as isolated, nearby field galaxies, is a supergalactic analog to the Gould belt in galactic structure. Its north pole is at L=172°, B=+76°, and there is a small dip of about -4° indicating that the Galaxy is approximately 0.3 Mpc to the north of the equatorial plane of this Local Cloud of galaxies. The nearby intergalactic H I clouds, and in particular the Magellanic Stream, are also close to the same plane. The probability that the observed distributions could arise by chance if nearby groups and galaxies were randomly distributed is in the range P -3 to P -5 for the various classes of objects. It is concluded that the Local Supercluster is a disklike physical and dynamical system, and the Local Group is well within the borders of the system. The alternative hypothesis that it is an appearance resulting from a random clumping accident has negligibly small probability.

  16. Constrained parameter estimation for semi-supervised learning : The case of the nearest mean classifier

    NARCIS (Netherlands)

    Loog, M.

    2011-01-01

    A rather simple semi-supervised version of the equally simple nearest mean classifier is presented. However simple, the proposed approach is of practical interest as the nearest mean classifier remains a relevant tool in biomedical applications or other areas dealing with relatively high-dimensional

  17. Performance modeling of neighbor discovery in proactive routing protocols

    Directory of Open Access Journals (Sweden)

    Andres Medina

    2011-07-01

    It is well known that neighbor discovery is a critical component of proactive routing protocols in wireless ad hoc networks. However, there is no formal study on the performance of proposed neighbor discovery mechanisms. This paper provides a detailed model of key performance metrics of neighbor discovery algorithms, such as node degree and the distribution of the distance to symmetric neighbors. The model accounts for the dynamics of neighbor discovery as well as node density, mobility, radio and interference. The paper demonstrates a method for applying these models to the evaluation of global network metrics. In particular, it describes a model of network connectivity. Validation of the models shows that the degree estimate agrees, within 5% error, with simulations for the considered scenarios. The work presented in this paper serves as a basis for the performance evaluation of remaining performance metrics of routing protocols, vital for large scale deployment of ad hoc networks.

  18. Topological side-chain classification of beta-turns: ideal motifs for peptidomimetic development.

    Science.gov (United States)

    Tran, Tran Trung; McKie, Jim; Meutermans, Wim D F; Bourne, Gregory T; Andrews, Peter R; Smythe, Mark L

    2005-08-01

    Beta-turns are important topological motifs for biological recognition of proteins and peptides. Organic molecules that sample the side chain positions of beta-turns have shown broad binding capacity to multiple different receptors, for example benzodiazepines. Beta-turns have traditionally been classified into various types based on the backbone dihedral angles (phi2, psi2, phi3 and psi3). Indeed, 57-68% of beta-turns are currently classified into 8 different backbone families (Type I, Type II, Type I', Type II', Type VIII, Type VIa1, Type VIa2 and Type VIb and Type IV which represents unclassified beta-turns). Although this classification of beta-turns has been useful, the resulting beta-turn types are not ideal for the design of beta-turn mimetics as they do not reflect topological features of the recognition elements, the side chains. To overcome this, we have extracted beta-turns from a data set of non-homologous and high-resolution protein crystal structures. The side chain positions, as defined by C(alpha)-C(beta) vectors, of these turns have been clustered using the kth nearest neighbor clustering and filtered nearest centroid sorting algorithms. Nine clusters were obtained that cluster 90% of the data, and the average intra-cluster RMSD of the four C(alpha)-C(beta) vectors is 0.36. The nine clusters therefore represent the topology of the side chain scaffold architecture of the vast majority of beta-turns. The mean structures of the nine clusters are useful for the development of beta-turn mimetics and as biological descriptors for focusing combinatorial chemistry towards biologically relevant topological space.

  19. Classification and global distribution of ocean precipitation types based on satellite passive microwave signatures

    Science.gov (United States)

    Gautam, Nitin

    The main objectives of this thesis are to develop a robust statistical method for the classification of ocean precipitation based on physical properties to which the SSM/I is sensitive and to examine how these properties vary globally and seasonally. A two-step approach is adopted for the classification of oceanic precipitation classes from multispectral SSM/I data: (1) we subjectively define precipitation classes using a priori information about the precipitating system and its possible distinct signature on SSM/I data, such as scattering by ice particles aloft in the precipitating cloud, emission by liquid rain water below freezing level, the difference of polarization at 19 GHz (an indirect measure of optical depth), etc.; (2) we then develop an objective classification scheme which is found to reproduce the subjective classification with high accuracy. This hybrid strategy allows us to use the characteristics of the data to define and encode classes and helps retain the physical interpretation of classes. Classification methods based on k-nearest neighbor and neural network are developed to objectively classify six precipitation classes. It is found that the classification method based on a neural network yields high accuracy for all precipitation classes. An inversion method based on a minimum variance approach was used to retrieve gross microphysical properties of these precipitation classes such as column integrated liquid water path, column integrated ice water path, and column integrated min water path. This classification method is then applied to 2 years (1991-92) of SSM/I data to examine and document the seasonal and global distribution of precipitation frequency corresponding to each of these six objectively defined classes. The characteristics of the distribution are found to be consistent with assumptions used in defining these six precipitation classes and also with well known climatological patterns of precipitation regions. The seasonal and global

  20. Object-based vegetation classification with high resolution remote sensing imagery

    Science.gov (United States)

    Yu, Qian

    Vegetation species are valuable indicators to understand the earth system. Information from mapping of vegetation species and community distribution at large scales provides important insight for studying the phenological (growth) cycles of vegetation and plant physiology. Such information plays an important role in land process modeling including climate, ecosystem and hydrological models. The rapidly growing remote sensing technology has increased its potential in vegetation species mapping. However, extracting information at a species level is still a challenging research topic. I proposed an effective method for extracting vegetation species distribution from remotely sensed data and investigated some ways for accuracy improvement. The study consists of three phases. Firstly, a statistical analysis was conducted to explore the spatial variation and class separability of vegetation as a function of image scale. This analysis aimed to confirm that high resolution imagery contains the information on spatial vegetation variation and these species classes can be potentially separable. The second phase was a major effort in advancing classification by proposing a method for extracting vegetation species from high spatial resolution remote sensing data. The proposed classification employs an object-based approach that integrates GIS and remote sensing data and explores the usefulness of ancillary information. The whole process includes image segmentation, feature generation and selection, and nearest neighbor classification. The third phase introduces a spatial regression model for evaluating the mapping quality from the above vegetation classification results. The effects of six categories of sample characteristics on the classification uncertainty are examined: topography, sample membership, sample density, spatial composition characteristics, training reliability and sample object features. This evaluation analysis answered several interesting scientific questions

  1. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, Saira; Bissell, Mina J

    2004-12-17

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double
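
The evaluation criterion used in this record, the leave-one-out cross-validation (LOOCV) error of a k-NN classifier, can be sketched in a few lines. This is a generic illustration on toy data, assuming Euclidean distance and majority voting; the function names are hypothetical and no normalization method from the paper is reproduced.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples nearest to x (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def loocv_error(X, y, k=3):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]        # every sample except the i-th
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) != y[i]:
            errors += 1
    return errors / len(X)

# Toy data set with two well-separated classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
y = ["a", "a", "a", "b", "b", "b"]
err_k1 = loocv_error(X, y, k=1)
err_k3 = loocv_error(X, y, k=3)
```

To compare preprocessing choices in this style, one would apply each candidate transformation to `X` and rank them by the resulting LOOCV error.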

  2. Our Galactic Neighbor Hosts Complex Organic Molecules

    Science.gov (United States)

    Hensley, Kerry

    2018-03-01

    For the first time, data from the Atacama Large Millimeter/submillimeter Array (ALMA) reveal the presence of methyl formate and dimethyl ether in a star-forming region outside our galaxy. This discovery has important implications for the formation and survival of complex organic compounds important for the formation of life in low-metallicity galaxies both young and old. No Simple Picture of Complex Molecule Formation: [Figure: ALMA, pictured with the Magellanic Clouds above, has observed organic molecules in our Milky Way Galaxy and beyond. Credit: ESO/C. Malin] Complex organic molecules (those with at least six atoms, one or more of which must be carbon) are the precursors to the building blocks of life. Knowing how and where complex organic molecules can form is a key part of understanding how life came to be on Earth and how it might arise elsewhere in the universe. From exoplanet atmospheres to interstellar space, complex organic molecules are ubiquitous in the Milky Way. In our galaxy, complex organic molecules are often found in the intense environments of hot cores, clumps of dense molecular gas surrounding the sites of star formation. However, it's not yet fully understood how the complex organic molecules found in hot cores come to be. One possibility is that the compounds condense onto cold dust grains long before the young stars begin heating their natal shrouds. Alternatively, they might assemble themselves from the hot, dense gas surrounding the blazing protostars. [Figure: Composite infrared and optical image of the N 113 star-forming region in the LMC. The ALMA coverage is indicated by the gray line. Credit: Sewiło et al. 2018] Detecting Complexity, a Galaxy Away: Using ALMA, a team of researchers led by Marta Sewiło (NASA Goddard Space Flight Center) recently detected two complex organic molecules, methyl formate and dimethyl ether, for the first time in our neighboring galaxy, the Large Magellanic Cloud (LMC). Previous searches for organic molecules in the LMC detected

  3. Estimating local scaling properties for the classification of interstitial lung disease patterns

    Science.gov (United States)

    Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel

    2011-03-01

    Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.

  4. Development of classification models to detect Salmonella Enteritidis and Salmonella Typhimurium found in poultry carcass rinses by visible-near infrared hyperspectral imaging

    Science.gov (United States)

    Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.

    2013-05-01

    Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor-intensive and time-consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification

  5. Hubble Space Telescope Trigonometric Parallax of Polaris B, Companion of the Nearest Cepheid

    Science.gov (United States)

    Bond, Howard E.; Nelan, Edmund P.; Remage Evans, Nancy; Schaefer, Gail H.; Harmer, Dianne

    2018-01-01

    Polaris, the nearest and brightest Cepheid, is a potential anchor point for the Leavitt period–luminosity relation. However, its distance is a matter of contention, with recent advocacy for a parallax of ∼10 mas, in contrast with the Hipparcos measurement of 7.54 ± 0.11 mas. We report an independent trigonometric parallax determination, using the Fine Guidance Sensors (FGS) on the Hubble Space Telescope. Polaris itself is too bright for FGS, so we measured its eighth-magnitude companion Polaris B, relative to a network of background reference stars. We converted the FGS relative parallax to absolute, using estimated distances to the reference stars from ground-based photometry and spectral classification. Our result, 6.26 ± 0.24 mas, is even smaller than that found by Hipparcos. We note other objects for which Hipparcos appears to have overestimated parallaxes, including the well-established case of the Pleiades. We consider possible sources of systematic error in the FGS parallax, but find no evidence they are significant. If our “long” distance is correct, the high luminosity of Polaris indicates that it is pulsating in the second overtone of its fundamental mode. Our results raise several puzzles, including a long pulsation period for Polaris compared to second-overtone pulsators in the Magellanic Clouds, and a conflict between the isochrone age of Polaris B (∼2.1 Gyr) and the much younger age of Polaris A. We discuss possibilities that B is not a physical companion of A, in spite of the strong evidence that it is, or that one of the stars is a merger remnant. These issues may be resolved when Gaia provides parallaxes for both stars. Based in part on observations made with the NASA/ESA Hubble Space Telescope, obtained by the Space Telescope Science Institute. STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555.

  6. A LITERATURE SURVEY ON VARIOUS ILLUMINATION NORMALIZATION TECHNIQUES FOR FACE RECOGNITION WITH FUZZY K NEAREST NEIGHBOUR CLASSIFIER

    Directory of Open Access Journals (Sweden)

    A. Thamizharasi

    2015-05-01

    Face recognition is now widely used in video surveillance, social networks and criminal identification. The performance of face recognition is affected by variations in illumination, pose, aging and partial occlusion of the face by hats, scarves, glasses, etc. Illumination variation remains a challenging problem in face recognition. The aim is to compare various illumination normalization techniques. The illumination normalization techniques include: Log transformations, Power Law transformations, Histogram equalization, Adaptive histogram equalization, Contrast stretching, Retinex, Multi scale Retinex, Difference of Gaussian, DCT, DCT Normalization, DWT, Gradient face, Self Quotient, Multi scale Self Quotient and Homomorphic filter. The proposed work consists of three steps. The first step is to preprocess the face image with the above illumination normalization techniques; the second step is to create the train and test databases from the preprocessed face images; and the third step is to recognize the face images using a Fuzzy K nearest neighbor classifier. The face recognition accuracy of all preprocessing techniques is compared using the AR face database of color images.

  7. Color and neighbor edge directional difference feature for image retrieval

    Institute of Scientific and Technical Information of China (English)

    Chaobing Huang; Shengsheng Yu; Jingli Zhou; Hongwei Lu

    2005-01-01

    A novel image feature termed the neighbor edge directional difference unit histogram is proposed, in which the neighbor edge directional difference unit is defined and computed for every pixel in the image and is used to generate the neighbor edge directional difference unit histogram. This histogram and the color histogram are used as feature indexes to retrieve color images. The feature is invariant to image scaling and translation and has greater descriptive power for natural color images. Experimental results show that the feature can achieve better retrieval performance than other color-spatial features.

  8. Automatic multi-modal MR tissue classification for the assessment of response to bevacizumab in patients with glioblastoma

    International Nuclear Information System (INIS)

    Liberman, Gilad; Louzoun, Yoram; Aizenstein, Orna; Blumenthal, Deborah T.; Bokstein, Felix; Palmon, Mika; Corn, Benjamin W.; Ben Bashat, Dafna

    2013-01-01

    Background: Current methods for evaluation of treatment response in glioblastoma are inaccurate, limited and time-consuming. This study aimed to develop a multi-modal MRI automatic classification method to improve accuracy and efficiency of treatment response assessment in patients with recurrent glioblastoma (GB). Materials and methods: A modification of the k-Nearest-Neighbors (kNN) classification method was developed and applied to 59 longitudinal MR data sets of 13 patients with recurrent GB undergoing bevacizumab (anti-angiogenic) therapy. Changes in the enhancing tumor volume were assessed using the proposed method and compared with Macdonald's criteria and with manual volumetric measurements. The edema-like area was further subclassified into peri- and non-peri-tumoral edema, using both the kNN method and an unsupervised method, to monitor longitudinal changes. Results: Automatic classification using the modified kNN method was applicable in all scans, even when the tumors were infiltrative with unclear borders. The enhancing tumor volume obtained using the automatic method was highly correlated with manual measurements (N = 33, r = 0.96, p < 0.0001), while standard radiographic assessment based on Macdonald's criteria matched manual delineation and automatic results in only 68% of cases. A graded pattern of tumor infiltration within the edema-like area was revealed by both automatic methods, showing high agreement. All classification results were confirmed by a senior neuro-radiologist and validated using MR spectroscopy. Conclusion: This study emphasizes the important role of automatic tools based on a multi-modal view of the tissue in monitoring therapy response in patients with high grade gliomas specifically under anti-angiogenic therapy

  9. Science and Technology Text Mining Basic Concepts

    National Research Council Canada - National Science Library

    Losiewicz, Paul

    2003-01-01

    ...). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods, such as nearest neighbor, relational learning models, and genetic...

  10. Estimation and Mapping Forest Attributes Using “k Nearest Neighbor” Method on IRS-P6 LISS III Satellite Image Data

    Directory of Open Access Journals (Sweden)

    Amir Eslam Bonyad

    2015-06-01

    In this study, we explored the utility of the k Nearest Neighbor (kNN) algorithm to integrate IRS-P6 LISS III satellite imagery data and ground inventory data for application in forest attribute (DBH, tree height, volume, basal area, density and forest cover type) estimation and mapping. The ground inventory data were based on a systematic-random sampling grid, and the sample comprised 408 circular plots in a plantation in Guilan province, north of Iran. We concluded that the kNN method was a useful tool for mapping, with accuracy between 80% and 93.94%. Values of k between 5 and 8 seemed appropriate. The best distance metrics were found to be Euclidean, Fuzzy and Mahalanobis. Results showed that kNN was accurate enough for practical application in mapping forest areas.
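
The basic idea behind kNN estimation of a continuous forest attribute can be sketched as follows: a pixel's value is estimated from the field plots whose spectral features are closest to it. This is a minimal sketch under stated assumptions (Euclidean distance, an unweighted mean, made-up feature and volume values); the study's actual weighting scheme and its Fuzzy and Mahalanobis metrics are not reproduced, and all names are hypothetical.

```python
import math

def knn_estimate(plot_features, plot_values, pixel, k=5):
    """Estimate an attribute for `pixel` as the mean over the k field plots
    nearest to it in spectral feature space (Euclidean distance)."""
    nearest = sorted(
        zip(plot_features, plot_values),
        key=lambda fv: math.dist(fv[0], pixel),
    )[:k]
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical plots: two spectral band values each, with measured volume.
features = [[0.10, 0.20], [0.15, 0.25], [0.80, 0.90], [0.85, 0.80]]
volumes = [100.0, 120.0, 300.0, 320.0]
estimate = knn_estimate(features, volumes, [0.12, 0.22], k=2)
```

Applying `knn_estimate` to every pixel of an image yields a map of the attribute; the choice of k trades off noise against local detail.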

  11. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community can now communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. It is becoming clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use domain-specific language, so building a classifier that performs well across different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain on sentiment classification when using machine learning techniques. In our study, three popular machine learning techniques, Support Vector Machines (SVM), Naive Bayes and K nearest neighbors (KNN), were applied to datasets collected from different domains. Experimental results show that Support Vector Machines outperform the other classifiers in all domains, achieving at least 74.75% accuracy with a standard deviation of 4.08.

  12. Machine learning for radioxenon event classification for the Comprehensive Nuclear-Test-Ban Treaty

    Energy Technology Data Exchange (ETDEWEB)

    Stocki, Trevor J., E-mail: trevor_stocki@hc-sc.gc.c [Radiation Protection Bureau, 775 Brookfield Road, A.L. 6302D1, Ottawa, ON, K1A 1C1 (Canada); Li, Guichong; Japkowicz, Nathalie [School of Information Technology and Engineering, University of Ottawa, 800 King Edward Avenue, Ottawa, ON, K1N 6N5 (Canada); Ungar, R. Kurt [Radiation Protection Bureau, 775 Brookfield Road, A.L. 6302D1, Ottawa, ON, K1A 1C1 (Canada)

    2010-01-15

    A method of weapon detection for the Comprehensive Nuclear-Test-Ban Treaty (CTBT) consists of monitoring the amount of radioxenon in the atmosphere by measuring and sampling the activity concentration of Xe-131m, Xe-133, Xe-133m, and Xe-135 by radionuclide monitoring. Several explosion samples were simulated based on real data because measured data of this type are quite rare. These data sets covered different circumstances of a nuclear explosion and were used as training data sets to establish an effective classification model employing state-of-the-art technologies in machine learning. A study involving classic induction algorithms in machine learning, including Naive Bayes, Neural Networks, Decision Trees, k-Nearest Neighbors, and Support Vector Machines, revealed that they can successfully be used in this practical application. In particular, our studies show that many induction algorithms in machine learning outperform a simple linear discriminator when a signal is found in a high radioxenon background environment.

  13. Automated segmentation of geographic atrophy in fundus autofluorescence images using supervised pixel classification.

    Science.gov (United States)

    Hu, Zhihong; Medioni, Gerard G; Hernandez, Matthias; Sadda, Srinivas R

    2015-01-01

    Geographic atrophy (GA) is a manifestation of the advanced or late stage of age-related macular degeneration (AMD). AMD is the leading cause of blindness in people over the age of 65 in the western world. The purpose of this study is to develop a fully automated supervised pixel classification approach for segmenting GA, including uni- and multifocal patches, in fundus autofluorescence (FAF) images. The image features include region-wise intensity measures, gray-level co-occurrence matrix measures, and Gaussian filter banks. A k-nearest-neighbor pixel classifier is applied to obtain a GA probability map, representing the likelihood that the image pixel belongs to GA. Sixteen randomly chosen FAF images were obtained from 16 subjects with GA. The algorithm-defined GA regions are compared with manual delineation performed by a certified image reading center grader. Eight-fold cross-validation is applied to evaluate the algorithm performance. The mean overlap ratio (OR), area correlation (Pearson's [Formula: see text]), accuracy (ACC), true positive rate (TPR), specificity (SPC), positive predictive value (PPV), and false discovery rate (FDR) between the algorithm- and manually defined GA regions are [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], respectively.
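
    The evaluation measures named in the record above can all be derived from pixel-wise counts of agreement between two binary masks. The following is a minimal sketch under stated assumptions: masks are flat sequences of 0/1, and the overlap ratio is taken to be intersection over union, which the abstract does not define. The function name `segmentation_metrics` and the toy masks are hypothetical.

```python
def segmentation_metrics(predicted, reference):
    """Compare two binary masks given as equal-length sequences of 0/1."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # both mark GA
    fp = sum(1 for p, r in pairs if p and not r)      # algorithm only
    fn = sum(1 for p, r in pairs if not p and r)      # grader only
    tn = sum(1 for p, r in pairs if not p and not r)  # neither

    def ratio(numerator, denominator):
        # Undefined ratios are reported as NaN rather than raising.
        return numerator / denominator if denominator else float("nan")

    return {
        "OR": ratio(tp, tp + fp + fn),    # assumed: intersection over union
        "ACC": ratio(tp + tn, len(pairs)),
        "TPR": ratio(tp, tp + fn),
        "SPC": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "FDR": ratio(fp, tp + fp),
    }


# Toy six-pixel masks: algorithm output vs. manual delineation.
metrics = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```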

  14. In silico prediction of ROCK II inhibitors by different classification approaches.

    Science.gov (United States)

    Cai, Chuipu; Wu, Qihui; Luo, Yunxia; Ma, Huili; Shen, Jiangang; Zhang, Yongbin; Yang, Lei; Chen, Yunbo; Wen, Zehuai; Wang, Qi

    2017-11-01

    ROCK II is an important pharmacological target linked to central nervous system disorders such as Alzheimer's disease. The purpose of this research is to generate ROCK II inhibitor prediction models by machine learning approaches. Firstly, four sets of descriptors were calculated with MOE 2010 and PaDEL-Descriptor, and optimized by F-score and linear forward selection methods. In addition, four classification algorithms (k-nearest neighbors [Formula: see text], naïve Bayes, random forest, and support vector machine) were used to initially build 16 classifiers. Furthermore, three sets of structural fingerprint descriptors were introduced to enhance the predictive capacity of the classifiers, which were assessed with fivefold cross-validation, test set validation and external test set validation. The best two models, MFK + MACCS and MLR + SubFP, both have MCC values of 0.925 on the external test set. After that, a privileged substructure analysis was performed to reveal common chemical features of ROCK II inhibitors. Finally, binding modes were analyzed to identify relationships between molecular descriptors and activity, while the main interactions were revealed by comparing the docking interactions of the most potent and the weakest ROCK II inhibitors. To the best of our knowledge, this is the first report on ROCK II inhibitors utilizing machine learning approaches, and it provides a new method for discovering novel ROCK II inhibitors.
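
    The MCC reported in the record above is the Matthews correlation coefficient, which summarizes a binary confusion matrix in one value between -1 and 1. A minimal sketch follows; the function name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions, and the counts in the example are invented.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


perfect = mcc(tp=5, tn=5, fp=0, fn=0)   # every compound classified correctly
inverted = mcc(tp=0, tn=0, fp=5, fn=5)  # every compound classified wrongly
chance = mcc(tp=2, tn=2, fp=2, fn=2)    # no better than guessing
```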

  15. Machine learning for radioxenon event classification for the Comprehensive Nuclear-Test-Ban Treaty

    International Nuclear Information System (INIS)

    Stocki, Trevor J.; Li, Guichong; Japkowicz, Nathalie; Ungar, R. Kurt

    2010-01-01

    A method of weapon detection for the Comprehensive Nuclear-Test-Ban Treaty (CTBT) consists of monitoring the amount of radioxenon in the atmosphere by measuring and sampling the activity concentration of Xe-131m, Xe-133, Xe-133m, and Xe-135 by radionuclide monitoring. Several explosion samples were simulated based on real data because measured data of this type are quite rare. These data sets covered different circumstances of a nuclear explosion and were used as training data sets to establish an effective classification model employing state-of-the-art technologies in machine learning. A study involving classic induction algorithms in machine learning, including Naive Bayes, Neural Networks, Decision Trees, k-Nearest Neighbors, and Support Vector Machines, revealed that they can successfully be used in this practical application. In particular, our studies show that many induction algorithms in machine learning outperform a simple linear discriminator when a signal is found in a high radioxenon background environment.

  16. Classification of Carotid Plaque Echogenicity by Combining Texture Features and Morphologic Characteristics.

    Science.gov (United States)

    Huang, Xiaowei; Zhang, Yanling; Qian, Ming; Meng, Long; Xiao, Yang; Niu, Lili; Zheng, Rongqin; Zheng, Hairong

    2016-10-01

    Anechoic carotid plaques on sonography have been used to predict future cardiovascular or cerebrovascular events. The purpose of this study was to investigate whether carotid plaque echogenicity could be assessed objectively by combining texture features extracted by MaZda software (Institute of Electronics, Technical University of Lodz, Lodz, Poland) and morphologic characteristics, which may provide a promising method for early prediction of acute cardiovascular disease. A total of 268 plaque images were collected from 136 volunteers and classified into 85 hyperechoic, 83 intermediate, and 100 anechoic plaques. About 300 texture features were extracted from histogram, absolute gradient, run-length matrix, gray-level co-occurrence matrix, autoregressive model, and wavelet transform algorithms by MaZda. The morphologic characteristics, including degree of stenosis, maximum plaque intima-media thickness, and maximum plaque length, were measured by B-mode sonography. Statistically significant features were selected by analysis of covariance. The most discriminative features were obtained from statistically significant features by linear discriminant analysis. The K-nearest neighbor classifier was used to classify plaque echogenicity based on statistically significant and most discriminative features. A total of 30 statistically significant features were selected among the plaques, and 2 most discriminative features were obtained from the statistically significant features. The classification accuracy rates for 3 types of plaques based on statistically significant and most discriminative features were 72.03% (κ= 0.571; P MaZda and morphologic characteristics.

  17. Feature extraction and classification of clouds in high resolution panchromatic satellite imagery

    Science.gov (United States)

    Sharghi, Elan

    The development of sophisticated remote sensing sensors is rapidly increasing, and the vast amount of satellite imagery collected is too much to be analyzed manually by a human image analyst. It has become necessary to develop a tool that automates the job of an image analyst. This tool would need to intelligently detect and classify objects of interest through computer vision algorithms. Existing software called the Rapid Image Exploitation Resource (RAPIER®) was designed by engineers at Space and Naval Warfare Systems Center Pacific (SSC PAC) to perform exactly this function. This software automatically searches for anomalies in the ocean and reports the detections as possible ship objects. However, if the image contains a high percentage of cloud coverage, the clouds trigger a high number of false positives. The focus of this thesis is to explore various feature extraction and classification methods to accurately distinguish clouds from ship objects. A texture analysis method, line detection using the Hough transform, and edge detection using wavelets are examined as possible feature extraction methods. The features are then supplied to a K-Nearest Neighbors (KNN) or Support Vector Machine (SVM) classifier. Parameter options for these classifiers are explored and the optimal parameters are determined.

  18. Rapid corn and soybean mapping in US Corn Belt and neighboring areas

    Science.gov (United States)

    Zhong, Liheng; Yu, Le; Li, Xuecao; Hu, Lina; Gong, Peng

    2016-11-01

    The goal of this study was to promptly map the extent of corn and soybeans early in the growing season. A classification experiment was conducted for the US Corn Belt and neighboring states, which is the most important production area of corn and soybeans in the world. To improve the timeliness of the classification algorithm, training was completely based on reference data and images from other years, circumventing the need to finish reference data collection in the current season. To account for interannual variability in crop development in the cross-year classification scenario, several innovative strategies were used. A random forest classifier was used in all tests, and MODIS surface reflectance products from the years 2008-2014 were used for training and cross-year validation. It is concluded that the fuzzy classification approach is necessary to achieve satisfactory results with R-squared ~0.9 (compared with the USDA Cropland Data Layer). The year of training data is an important factor, and it is recommended to select a year with similar crop phenology as the mapping year. With this phenology-based and cross-year-training method, in 2015 we mapped the cropping proportion of corn and soybeans around mid-August, when the two crops just reached peak growth.

  19. Enhanced land use/cover classification of heterogeneous tropical landscapes using support vector machines and textural homogeneity

    Science.gov (United States)

    Paneque-Gálvez, Jaime; Mas, Jean-François; Moré, Gerard; Cristóbal, Jordi; Orta-Martínez, Martí; Luz, Ana Catarina; Guèze, Maximilien; Macía, Manuel J.; Reyes-García, Victoria

    2013-08-01

    Land use/cover classification is a key research field in remote sensing and land change science as thematic maps derived from remotely sensed data have become the basis for analyzing many socio-ecological issues. However, land use/cover classification remains a difficult task and it is especially challenging in heterogeneous tropical landscapes where nonetheless such maps are of great importance. The present study aims at establishing an efficient classification approach to accurately map all broad land use/cover classes in a large, heterogeneous tropical area, as a basis for further studies (e.g., land use/cover change, deforestation and forest degradation). Specifically, we first compare the performance of parametric (maximum likelihood), non-parametric (k-nearest neighbor and four different support vector machines - SVM), and hybrid (unsupervised-supervised) classifiers, using hard and soft (fuzzy) accuracy assessments. We then assess, using the maximum likelihood algorithm, what textural indices from the gray-level co-occurrence matrix lead to greater classification improvements at the spatial resolution of Landsat imagery (30 m), and rank them accordingly. Finally, we use the textural index that provides the most accurate classification results to evaluate whether its usefulness varies significantly with the classifier used. We classified imagery corresponding to dry and wet seasons and found that SVM classifiers outperformed all the rest. We also found that the use of some textural indices, but particularly homogeneity and entropy, can significantly improve classifications. We focused on the use of the homogeneity index, which has so far been neglected in land use/cover classification efforts, and found that this index along with reflectance bands significantly increased the overall accuracy of all the classifiers, but particularly of SVM. We observed that improvements in producer's and user's accuracies through the inclusion of homogeneity were different

  20. The role of orthography in the semantic activation of neighbors.

    Science.gov (United States)

    Hino, Yasushi; Lupker, Stephen J; Taylor, Tamsen E

    2012-09-01

    There is now considerable evidence that a letter string can activate semantic information appropriate to its orthographic neighbors (e.g., Forster & Hector's, 2002, TURPLE effect). This phenomenon is the focus of the present research. Using Japanese words, we examined whether semantic activation of neighbors is driven directly by orthographic similarity alone or whether there is also a role for phonological similarity. In Experiment 1, using a relatedness judgment task in which a Kanji word-Katakana word pair was presented on each trial, an inhibitory effect was observed when the initial Kanji word was related to an orthographic and phonological neighbor of the Katakana word target but not when the initial Kanji word was related to a phonological but not orthographic neighbor of the Katakana word target. This result suggests that phonology plays little, if any, role in the activation of neighbors' semantics when reading familiar words. In Experiment 2, the targets were transcribed into Hiragana, a script they are typically not written in, requiring readers to engage in phonological coding. In that experiment, inhibitory effects were observed in both conditions. This result indicates that phonologically mediated semantic activation of neighbors will emerge when phonological processing is necessary in order to understand a written word (e.g., when that word is transcribed into an unfamiliar script). PsycINFO Database Record (c) 2012 APA, all rights reserved.

  1. IDM-PhyChm-Ens: intelligent decision-making ensemble methodology for classification of human breast cancer using physicochemical properties of amino acids.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul; Khan, Asifullah

    2014-04-01

    Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis system is one of the fast growing research areas of health sciences. Such decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. Especially, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in case of cancer classification, performed better than ensemble-SVM and ensemble-KNN. Our

  2. Air Pollution from Livestock Farms Is Associated with Airway Obstruction in Neighboring Residents.

    Science.gov (United States)

    Borlée, Floor; Yzermans, C Joris; Aalders, Bernadette; Rooijackers, Jos; Krop, Esmeralda; Maassen, Catharina B M; Schellevis, François; Brunekreef, Bert; Heederik, Dick; Smit, Lidwien A M

    2017-11-01

    Livestock farm emissions may not only affect respiratory health of farmers but also of neighboring residents. To explore associations between spatial and temporal variation in pollutant emissions from livestock farms and lung function in a general, nonfarming, rural population in the Netherlands. We conducted a cross-sectional study in 2,308 adults (age, 20-72 yr). A pulmonary function test was performed measuring prebronchodilator and post-bronchodilator FEV1, FVC, FEV1/FVC, and maximum mid-expiratory flow (MMEF). Spatial exposure was assessed as (1) number of farms within 500 m and 1,000 m of the home, (2) distance to the nearest farm, and (3) modeled annual average fine dust emissions from farms within 500 m and 1,000 m of the home address. Temporal exposure was assessed as week-average ambient particulate matter livestock farms within a 1,000-m buffer from the home address and MMEF, which was more pronounced in participants without atopy. No associations were found with other spatial exposure variables. Week-average particulate matter livestock air pollution emissions are associated with lung function deficits in nonfarming residents.

  3. Classification of Cytochrome P450 1A2 Inhibitors and Non-Inhibitors by Machine Learning Techniques

    DEFF Research Database (Denmark)

    Vasanthanathan, Poongavanam; Taboureau, Olivier; Oostenbrink, Chris

    2009-01-01

    of CYP1A2 inhibitors and non-inhibitors. Training and test sets consisted of about 400 and 7000 compounds, respectively. Various machine learning techniques, like binary QSAR, support vector machine (SVM), random forest, k-nearest neighbors (kNN), and decision tree methods were used to develop...

  4. Bamboo Classification Using WorldView-2 Imagery of Giant Panda Habitat in a Large Shaded Area in Wolong, Sichuan Province, China

    Directory of Open Access Journals (Sweden)

    Yunwei Tang

    2016-11-01

    This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large part of this area is covered by shadows in the image, and only a few of the sampled points were useful. In order to identify bamboo based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved producer’s and user’s accuracies of 82.65% and 93.10%, respectively, for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboo given limited known samples, and the resulting bamboo distribution facilitates the assessment of giant panda habitats.
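
    Producer's and user's accuracies, as quoted in the record above, are per-class ratios read from a confusion matrix. The sketch below assumes rows hold reference classes and columns hold mapped classes; the function name `producers_users_accuracy` and the counts are invented for illustration and are not the study's data.

```python
def producers_users_accuracy(confusion, cls):
    """Return (producer's, user's) accuracy for class index `cls`.

    confusion[i][j] is the number of samples whose reference class is i
    and whose mapped (predicted) class is j.
    """
    correct = confusion[cls][cls]
    reference_total = sum(confusion[cls])              # row sum
    mapped_total = sum(row[cls] for row in confusion)  # column sum
    producers = correct / reference_total if reference_total else float("nan")
    users = correct / mapped_total if mapped_total else float("nan")
    return producers, users


# Hypothetical two-class matrix: index 0 = bamboo, index 1 = other cover.
confusion = [[81, 19],
             [6, 94]]
producers, users = producers_users_accuracy(confusion, 0)
```

    Producer's accuracy answers how much of the reference bamboo was found, while user's accuracy answers how much of the mapped bamboo is really bamboo.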

  5. Beyond formal groups: neighboring acts and watershed protection in Appalachia

    Directory of Open Access Journals (Sweden)

    Heather Lukacs

    2016-09-01

    This paper explores how watershed organizations in Appalachia have persisted in addressing water quality issues in areas with a history of coal mining. We identified two watershed groups that have taken responsibility for restoring local creeks that were previously highly degraded and sporadically managed. These watershed groups represent cases of self-organized commons governance in resource-rich, economically poor Appalachian communities. We describe the extent and characteristics of links between watershed group volunteers and watershed residents who are not group members. Through surveys, participant observation, and key-informant consultation, we found that neighbors – group members as well as non-group-members – supported the group's function through informal neighboring acts. Past research has shown that local commons governance institutions benefit from being nested in supportive external structures. We found that the persistence and success of community watershed organizations depends on the informal participation of local residents, affirming the necessity of looking beyond formal, organized groups to understand the resources, expertise, and information needed to address complex water pollution at the watershed level. Our findings augment the concept of nestedness in commons governance to include that of a formal organization acting as a neighbor that exchanges informal neighboring acts with local residents. In this way, we extend the concept of neighboring to include interactions between individuals and a group operating in the same geographic area.

  6. Classification of Vessels in Single-Pol COSMO-SkyMed Images Based on Statistical and Structural Features

    Directory of Open Access Journals (Sweden)

    Fan Wu

    2015-05-01

    Vessel monitoring is one of the most important maritime applications of Synthetic Aperture Radar (SAR) data. Because of the dihedral reflections between the vessel hull and sea surface and the trihedral reflections among superstructures, vessels usually have strong backscattering in SAR images. Furthermore, in high-resolution SAR images, detailed information on vessel structures can be observed, which makes vessel classification possible. This paper focuses on the feature analysis of merchant vessels, including bulk carriers, container ships and oil tankers, in 3 m resolution COSMO-SkyMed stripmap HIMAGE mode images and proposes a method for vessel classification. After preprocessing, a feature vector is estimated by calculating the average value of the kernel density estimation, three structural features and the mean backscattering coefficient. A support vector machine (SVM) classifier is used for the vessel classification, and the results are compared with traditional methods, such as the K-nearest neighbor algorithm (K-NN) and the minimum distance classifier (MDC). In situ investigations are conducted during the SAR data acquisition. Corresponding Automatic Identification System (AIS) reports are also obtained as ground truth to evaluate the effectiveness of the classifier. The preliminary results show that the combination of the average value of the kernel density estimation and the mean backscattering coefficient discriminates well among the three types of vessels. Adding the three structural features slightly improves the results. The SVM classifier performs better than K-NN and MDC. However, the SVM requires more time when the parameters of the kernel are estimated.
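
    For readers unfamiliar with the minimum distance classifier (MDC) used as a baseline in the record above, the sketch below shows the usual textbook form: each class is represented by the mean of its training vectors, and a new vector takes the label of the nearest mean. The function names, the Euclidean metric, and the two-feature toy data are assumptions; the abstract does not describe the authors' MDC in detail.

```python
import math


def class_means(samples):
    """samples: list of (feature_vector, label). Returns {label: mean_vector}."""
    sums, counts = {}, {}
    for vector, label in samples:
        accumulator = sums.setdefault(label, [0.0] * len(vector))
        for i, x in enumerate(vector):
            accumulator[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def mdc_predict(means, vector):
    """Assign `vector` to the class whose mean is closest (Euclidean)."""
    def distance(label):
        return math.sqrt(sum((m - x) ** 2
                             for m, x in zip(means[label], vector)))
    return min(means, key=distance)


# Hypothetical two-feature training vectors for two vessel types.
training = [((1.0, 1.0), "bulk carrier"), ((3.0, 1.0), "bulk carrier"),
            ((10.0, 10.0), "oil tanker"), ((12.0, 10.0), "oil tanker")]
means = class_means(training)
label = mdc_predict(means, (3.0, 2.0))
```

    Unlike K-NN, this baseline stores only one vector per class, which makes prediction cheap but assumes each class forms a single compact cluster.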

  7. Classification of lung sounds using higher-order statistics: A divide-and-conquer approach.

    Science.gov (United States)

    Naves, Raphael; Barbosa, Bruno H G; Ferreira, Danton D

    2016-06-01

    Lung sound auscultation is one of the most commonly used methods to evaluate respiratory diseases. However, the effectiveness of this method depends on the physician's training. If the physician does not have the proper training, he/she will be unable to distinguish between normal and abnormal sounds generated by the human body. Thus, the aim of this study was to implement a pattern recognition system to classify lung sounds. We used a dataset composed of five types of lung sounds: normal, coarse crackle, fine crackle, monophonic and polyphonic wheezes. We used higher-order statistics (HOS) to extract features (second-, third- and fourth-order cumulants), Genetic Algorithms (GA) and Fisher's Discriminant Ratio (FDR) to reduce dimensionality, and k-Nearest Neighbors and Naive Bayes classifiers to recognize the lung sound events in a tree-based system. We used cross-validation to analyze the classifiers' performance and Tukey's Honestly Significant Difference criterion to compare the results. Our results showed that the Genetic Algorithms outperformed the Fisher's Discriminant Ratio for feature selection. Moreover, each lung sound class had a different signature pattern in its cumulants, showing that HOS is a promising feature extraction tool for lung sounds. In addition, the proposed divide-and-conquer approach can accurately classify different types of lung sounds. The best tree-based classifier achieved a classification accuracy of 98.1% on the training data and 94.6% on the validation data. The proposed approach achieved good results even using only one feature extraction tool (higher-order statistics). Additionally, the implementation of the proposed classifier in an embedded system is feasible. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
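
    To make the cumulant features in the record above more concrete, the sketch below computes second-, third- and fourth-order cumulants of a sampled signal at zero lag only, using the standard relations to central moments (c2 = m2, c3 = m3, c4 = m4 - 3*m2^2). Restricting to zero lag, the function name `zero_lag_cumulants`, and the toy signal are simplifying assumptions; the study may well have used cumulants at several lags.

```python
def zero_lag_cumulants(signal):
    """Return (c2, c3, c4): zero-lag cumulants of a sampled signal."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    # Central moments of order 2, 3 and 4.
    m2 = sum(x ** 2 for x in centered) / n
    m3 = sum(x ** 3 for x in centered) / n
    m4 = sum(x ** 4 for x in centered) / n
    # The fourth-order cumulant removes the Gaussian part of m4.
    return m2, m3, m4 - 3 * m2 ** 2


# Toy alternating signal standing in for one lung-sound frame.
c2, c3, c4 = zero_lag_cumulants([-1.0, 1.0, -1.0, 1.0])
```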

  8. Classification of diabetic retinopathy using fractal dimension analysis of eye fundus image

    Science.gov (United States)

    Safitri, Diah Wahyu; Juniati, Dwi

    2017-08-01

    Diabetes mellitus (DM) is a metabolic disorder in which the pancreas produces inadequate insulin or the body resists the action of insulin, so that the blood glucose level is high. One of the most common complications of diabetes mellitus is diabetic retinopathy, which can lead to vision problems. Diabetic retinopathy can be recognized by abnormalities in the eye fundus. These abnormalities are characterized by microaneurysms, hemorrhages, hard exudates, cotton wool spots, and venous changes. Diabetic retinopathy is graded according to the abnormalities present in the eye fundus: grade 1 if only microaneurysms are present; grade 2 if microaneurysms and hemorrhages are present; and grade 3 if microaneurysms, hemorrhages, and neovascularization are present. This study proposes a method for processing eye fundus images to classify diabetic retinopathy using fractal analysis and K-Nearest Neighbor (KNN). The first phase was image segmentation using the green channel, CLAHE, morphological opening, a matched filter, masking, and morphological opening of the binary image. After segmentation, the fractal dimension was calculated using the box-counting method, and the fractal dimension values were analyzed to classify diabetic retinopathy. Tests were carried out using k-fold cross-validation with k=5. Each test used 10 different values of K for KNN. The accuracy of this method was 89.17% with K=3 or K=4, the best result among the K values tested. Based on these results, it can be concluded that the classification of diabetic retinopathy using fractal analysis and KNN performed well.
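
    The box-counting step mentioned in the record above can be sketched as follows: cover the segmented binary image with square boxes of several sizes, count the boxes that contain at least one foreground pixel, and take the slope of log(count) against log(1/size) as the fractal dimension estimate. The function names, the choice of box sizes, and the least-squares fit are assumptions for illustration, not the authors' exact procedure.

```python
import math


def box_count(pixels, size):
    """Number of size-by-size boxes containing at least one foreground pixel."""
    return len({(row // size, col // size) for row, col in pixels})


def box_counting_dimension(pixels, sizes):
    """Least-squares slope of log(count) versus log(1/size)."""
    if not pixels or len(set(sizes)) < 2:
        raise ValueError("need foreground pixels and at least two box sizes")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Sanity checks on shapes with known dimension: a filled square and a line.
square = {(r, c) for r in range(16) for c in range(16)}
line = {(0, c) for c in range(16)}
dim_square = box_counting_dimension(square, [1, 2, 4, 8])
dim_line = box_counting_dimension(line, [1, 2, 4, 8])
```

    A vessel network segmented from a fundus image would be expected to fall between these two reference values, and that single number is what would be passed to the KNN classifier.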

  9. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore, feature selection should be performed prior to classification to enhance prediction performance. In general, filter methods can be considered as a principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods yield less accurate performance because they ignore the dependencies among features. Although a few publications have sought to reveal the relationships among features by multivariate-based methods, these methods describe the relationships only linearly, and simple linear combinations restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared our method with other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that our method performs better than the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  10. Real-Time Classification of Patients with Balance Disorders vs. Normal Subjects Using a Low-Cost Small Wireless Wearable Gait Sensor

    Directory of Open Access Journals (Sweden)

    Bhargava Teja Nukala

    2016-11-01

    Gait analysis using wearable wireless sensors can be an economical, convenient and effective way to provide diagnostic and clinical information for various health-related issues. In this work, our custom-designed low-cost wireless gait analysis sensor (WGAS), which contains a basic inertial measurement unit (IMU), was used to collect gait data from four patients diagnosed with balance disorders and three normal subjects, each performing the Dynamic Gait Index (DGI) tests while wearing the sensor. The small WGAS includes a tri-axial accelerometer integrated circuit (IC), two gyroscope ICs and a Texas Instruments (TI) MSP430 microcontroller, and is worn by each subject at the T4 position during the DGI tests. The raw gait data are wirelessly transmitted from the WGAS to a nearby PC for real-time gait data collection and analysis. To classify patients vs. normal subjects, we used several different classification algorithms, such as the back propagation artificial neural network (BP-ANN), support vector machine (SVM), k-nearest neighbors (KNN) and binary decision trees (BDT), based on features extracted from the raw gait data of the gyroscopes and accelerometers. When the range was used as the input feature, the overall classification accuracy obtained was 100% with BP-ANN, 98% with SVM, 96% with KNN and 94% with BDT. Similarly high classification accuracy was achieved when the standard deviation or other values were used as input features to these classifiers. These results show that gait data collected from our very low-cost wearable wireless gait sensor can effectively differentiate patients with balance disorders from normal subjects in real time using various classifiers, the success of which may eventually lead to accurate and objective diagnosis of abnormal human gaits and their underlying etiologies as more patient data are collected.

  11. Bees do not use nearest-neighbour rules for optimization of multi-location routes.

    Science.gov (United States)

    Lihoreau, Mathieu; Chittka, Lars; Le Comber, Steven C; Raine, Nigel E

    2012-02-23

    Animals collecting patchily distributed resources are faced with complex multi-location routing problems. Rather than comparing all possible routes, they often find reasonably short solutions by simply moving to the nearest unvisited resources when foraging. Here, we report the travel optimization performance of bumble-bees (Bombus terrestris) foraging in a flight cage containing six artificial flowers arranged such that movements between nearest-neighbour locations would lead to a long suboptimal route. After extensive training (80 foraging bouts and at least 640 flower visits), bees reduced their flight distances and prioritized shortest possible routes, while almost never following nearest-neighbour solutions. We discuss possible strategies used during the establishment of stable multi-location routes (or traplines), and how these could allow bees and other animals to solve complex routing problems through experience, without necessarily requiring a sophisticated cognitive representation of space.
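
The contrast between a nearest-neighbour rule and the shortest possible route can be sketched as follows. The coordinates are hypothetical (a nest plus four flowers on a line, not the six-flower array used in the study), chosen only so that the greedy rule yields a longer route than the optimum.

```python
import itertools
import math

def route_length(points, order):
    """Length of a closed route: nest (index 0), flowers in `order`, back to nest."""
    tour = [0] + list(order) + [0]
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))

def nearest_neighbour_route(points):
    """Greedy rule: always move to the closest unvisited flower."""
    unvisited = set(range(1, len(points)))
    order, current = [], 0
    while unvisited:
        nxt = min(unvisited, key=lambda j: math.dist(points[current], points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    return order

def shortest_route(points):
    """Exhaustive search over all visiting orders (feasible only for few flowers)."""
    flowers = range(1, len(points))
    return min(itertools.permutations(flowers), key=lambda o: route_length(points, o))

# Hypothetical layout: index 0 is the nest, indices 1-4 are flowers.
layout = [(0.0, 0.0), (1.0, 0.0), (-1.5, 0.0), (3.6, 0.0), (-7.0, 0.0)]
greedy = nearest_neighbour_route(layout)
best = shortest_route(layout)
```

On this layout the greedy rule zigzags across the nest, so `route_length(layout, greedy)` exceeds `route_length(layout, best)`.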

  12. Contrasting demographic histories of the neighboring bonobo and chimpanzee

    DEFF Research Database (Denmark)

    Hvilsom, Christina; Carlsen, Frands; Heller, Rasmus

    2014-01-01

    of the neighboring bonobo remained constant. The changes in population size are likely linked to changes in habitat area due to climate oscillations during the late Pleistocene. Furthermore, the timing of population expansion for the rainforest-adapted chimpanzee is concurrent with the expansion of the savanna...

  13. Local randomization in neighbor selection improves PRM roadmap quality

    KAUST Repository

    McMahon, Troy

    2012-10-01

    Probabilistic Roadmap Methods (PRMs) are one of the most used classes of motion planning methods. These sampling-based methods generate robot configurations (nodes) and then connect them to form a graph (roadmap) containing representative feasible pathways. A key step in PRM roadmap construction involves identifying a set of candidate neighbors for each node. Traditionally, these candidates are chosen to be the k-closest nodes based on a given distance metric. In this paper, we propose a new neighbor selection policy called LocalRand(k,K'), that first computes the K' closest nodes to a specified node and then selects k of those nodes at random. Intuitively, LocalRand attempts to benefit from random sampling while maintaining the higher levels of local planner success inherent to selecting more local neighbors. We provide a methodology for selecting the parameters k and K'. We perform an experimental comparison which shows that for both rigid and articulated robots, LocalRand results in roadmaps that are better connected than the traditional k-closest policy or a purely random neighbor selection policy. The cost required to achieve these results is shown to be comparable to k-closest. © 2012 IEEE.

  14. Local randomization in neighbor selection improves PRM roadmap quality

    KAUST Repository

    McMahon, Troy; Jacobs, Sam; Boyd, Bryan; Tapia, Lydia; Amato, Nancy M.

    2012-01-01

    Probabilistic Roadmap Methods (PRMs) are one of the most used classes of motion planning methods. These sampling-based methods generate robot configurations (nodes) and then connect them to form a graph (roadmap) containing representative feasible pathways. A key step in PRM roadmap construction involves identifying a set of candidate neighbors for each node. Traditionally, these candidates are chosen to be the k-closest nodes based on a given distance metric. In this paper, we propose a new neighbor selection policy called LocalRand(k,K'), that first computes the K' closest nodes to a specified node and then selects k of those nodes at random. Intuitively, LocalRand attempts to benefit from random sampling while maintaining the higher levels of local planner success inherent to selecting more local neighbors. We provide a methodology for selecting the parameters k and K'. We perform an experimental comparison which shows that for both rigid and articulated robots, LocalRand results in roadmaps that are better connected than the traditional k-closest policy or a purely random neighbor selection policy. The cost required to achieve these results is shown to be comparable to k-closest. © 2012 IEEE.
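
The LocalRand(k,K') policy described in this abstract can be sketched in a few lines. The function names, the brute-force neighbor search and the Euclidean metric are assumptions made for illustration; the paper's own data structures and distance metrics are not given in the abstract.

```python
import math
import random

def k_closest(nodes, i, k, dist=math.dist):
    """Indices of the k nodes closest to node i (brute force, excluding i itself)."""
    others = [j for j in range(len(nodes)) if j != i]
    return sorted(others, key=lambda j: dist(nodes[i], nodes[j]))[:k]

def local_rand(nodes, i, k, k_prime, rng=random):
    """LocalRand(k, K'): take the K' closest nodes, then pick k of them at random."""
    candidates = k_closest(nodes, i, k_prime)
    return rng.sample(candidates, min(k, len(candidates)))

# Hypothetical 2-D configurations; a real planner would use robot configurations.
nodes = [(0, 0), (1, 0), (0, 2), (3, 3), (5, 1), (6, 6), (9, 9)]
chosen = local_rand(nodes, 0, k=2, k_prime=4, rng=random.Random(0))
```

Setting `k_prime == k` recovers the traditional k-closest policy, and letting `k_prime` cover all nodes gives purely random selection, which is why the choice of K' matters.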

  15. Neighboring Genes Show Correlated Evolution in Gene Expression

    Science.gov (United States)

    Ghanbarian, Avazeh T.; Hurst, Laurence D.

    2015-01-01

    When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543

  16. Thermodynamic systematics of oxides of americium, curium, and neighboring elements

    International Nuclear Information System (INIS)

    Morss, L.R.

    1984-01-01

    Recently-obtained calorimetric data on the sesquioxides and dioxides of americium and curium are summarized. These data are combined with other properties of the actinide elements to elucidate the stability relationships among these oxides and to predict the behavior of neighboring actinide oxides. 45 references, 4 figures, 5 tables

  17. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data

    Directory of Open Access Journals (Sweden)

    Yousef Malik

    2016-12-01

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points with respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named the ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. The averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier achieves high performance when the distance metric is carefully chosen.
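
One plausible reading of a co-clustering distance used inside kNN is sketched below. The abstract does not give the exact formula, so the definition here (one minus the fraction of clusterings in which two points share a cluster) and all names are assumptions.

```python
from collections import Counter

def cocluster_distance(clusterings, i, j):
    """Assumed distance: 1 - fraction of clusterings placing i and j together."""
    together = sum(1 for labels in clusterings if labels[i] == labels[j])
    return 1.0 - together / len(clusterings)

def ec_knn_predict(clusterings, train_labels, query, k=3):
    """Majority vote among the k training points closest to `query`.

    `clusterings` is a list of cluster-label lists covering training and query
    points alike; `train_labels` maps training indices to class labels.
    """
    ranked = sorted(train_labels, key=lambda t: cocluster_distance(clusterings, query, t))
    votes = Counter(train_labels[t] for t in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy ensemble of three clusterings over six points; point 5 is the query.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
labels = {0: "pos", 1: "pos", 2: "pos", 3: "neg", 4: "neg"}
prediction = ec_knn_predict(ensemble, labels, query=5, k=3)
```

Because the distance depends only on cluster memberships, the unlabeled query points must be included when the ensemble clusterings are built.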

  18. Oil palm fresh fruit bunch ripeness classification based on rule- based expert system of ROI image processing technique results

    International Nuclear Information System (INIS)

    Alfatni, M S M; Shariff, A R M; Marhaban, M H; Shafie, S B; Saaed, O M B; Abdullah, M Z; BAmiruddin, M D

    2014-01-01

    There is a processing need for a fast, easy and accurate classification system for oil palm fruit ripeness. Such a system will be invaluable to farmers and plantation managers who need to sell their oil palm fresh fruit bunch (FFB) for the mill as this will avoid disputes. In this paper, a new approach was developed under the name of expert rules-based system based on the image processing techniques results of the three different oil palm FFB regions of interest (ROIs), namely: ROI1 (300x300 pixels), ROI2 (50x50 pixels) and ROI3 (100x100 pixels). The results show that the best rule-based ROIs for statistical colour feature extraction with k-nearest neighbors (KNN) classifier at 94% were chosen as well as the ROIs that indicated results higher than the rule-based outcome, such as the ROIs of statistical colour feature extraction with artificial neural network (ANN) classifier at 94%, were selected for further FFB ripeness inspection system.

  19. Incidence and Prevalence of Tuberculosis in Iran and Neighboring Countries

    Directory of Open Access Journals (Sweden)

    Arezoo Tavakoli

    2017-07-01

    Background: Tuberculosis is a major public health concern in many countries, even though effective treatment is available. Tuberculosis is typically associated with socio-economic problems such as war, malnutrition and HIV prevalence. In Iran, much progress has been made in controlling tuberculosis, but factors such as immigration from neighboring countries affect tuberculosis infection. Objectives: This paper evaluates the incidence and prevalence of tuberculosis in different regions of Iran and in neighboring countries. Methods: Data were collected from valid sources such as Scopus and PubMed, as well as reports from the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC), covering a period of 25 years (1990 - 2015) for Iran and neighboring countries. Results: This descriptive-analytical, cross-sectional study covers Iran and neighboring countries since 1990. The information was obtained from valid data in Web of Science. The countries on Iran's eastern and western borders that have faced war and immigration, namely Afghanistan, Pakistan and Iraq, are sources of tuberculosis infection that affect tuberculosis prevalence in Iran. The data were analyzed with SPSS 22 and Excel 2013. Conclusions: The incidence of tuberculosis in Iran has decreased because of control measures such as BCG vaccination, an electronic reporting system for tuberculosis and free access to tuberculosis medication. Some of Iran's neighboring countries, such as Tajikistan and Pakistan, have the highest incidence of tuberculosis, which is a challenge for tuberculosis control in Iran, while Saudi Arabia and Turkey have the lowest incidence.

  20. Local biotic adaptation of trees and shrubs to plant neighbors

    Science.gov (United States)

    Grady, Kevin C.; Wood, Troy E.; Kolb, Thomas E.; Hersch-Green, Erika; Shuster, Stephen M.; Gehring, Catherine A.; Hart, Stephen C.; Allan, Gerard J.; Whitham, Thomas G.

    2017-01-01

    Natural selection as a result of plant–plant interactions can lead to local biotic adaptation. This may occur where species frequently interact and compete intensely for resources limiting growth, survival, and reproduction. Selection is demonstrated by comparing a genotype interacting with con- or hetero-specific sympatric neighbor genotypes with a shared site-level history (derived from the same source location), to the same genotype interacting with foreign neighbor genotypes (from different sources). Better genotype performance in sympatric than allopatric neighborhoods provides evidence of local biotic adaptation. This pattern might be explained by selection to avoid competition by shifting resource niches (differentiation) or by interactions benefitting one or more members (facilitation). We tested for local biotic adaptation among two riparian trees, Populus fremontii and Salix gooddingii, and the shrub Salix exigua by transplanting replicated genotypes from multiple source locations to a 17 000 tree common garden with sympatric and allopatric treatments along the Colorado River in California. Three major patterns were observed: 1) across species, 62 of 88 genotypes grew faster with sympatric neighbors than allopatric neighbors; 2) these growth rates, on an individual tree basis, were 44, 15 and 33% higher in sympatric than allopatric treatments for P. fremontii, S. exigua and S. gooddingii, respectively, and; 3) survivorship was higher in sympatric treatments for P. fremontii and S. exigua. These results support the view that fitness of foundation species supporting diverse communities and dominating ecosystem processes is determined by adaptive interactions among multiple plant species with the outcome that performance depends on the genetic identity of plant neighbors. The occurrence of evolution in a plant-community context for trees and shrubs builds on ecological evolutionary research that has demonstrated co-evolution among herbaceous taxa, and

  1. Reasons patients leave their nearest healthcare service to attend Karen Park Clinic, Pretoria North

    Directory of Open Access Journals (Sweden)

    Agnes T. Masango- Makgobela

    2013-10-01

    Conclusion: The majority of patients who had attended their nearest clinic were adamant that they would not return. It is necessary to reduce waiting times, thus reducing long queues. This can be achieved by having adequate, satisfied healthcare providers to render a quality service and by organising training for management. Patients can thus be redirected to their nearest clinic and the health centre’s capacity can be increased by procuring adequate drugs. There is a need to follow up on patients’ complaints about staff attitudes.

  2. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Feature selection and description is a key factor in the classification of Earth observation data. In this paper, a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the "raw" data attributes. Then, the feature rasters of the LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation preserves the initial spatial structure and ensures that the neighborhood is taken into account. A k-nearest neighbor classification is then applied to a small number of component features.

  3. Raman spectroscopy combined with principal component analysis and k nearest neighbour analysis for non-invasive detection of colon cancer

    Science.gov (United States)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Wang, Deli; Song, Youtao; Zhang, Su

    2016-03-01

    This paper attempts to investigate the feasibility of using Raman spectroscopy for the diagnosis of colon cancer. Serum taken from 75 healthy volunteers, 65 colon cancer patients and 60 post-operation colon cancer patients was measured in this experiment. In the Raman spectra of all three groups, the Raman peaks at 750, 1083, 1165, 1321, 1629 and 1779 cm-1 assigned to nucleic acids, amino acids and chromophores were consistently observed. All of these six Raman peaks were observed to have statistically significant differences between groups. For quantitative analysis, the multivariate statistical techniques of principal component analysis (PCA) and k nearest neighbour analysis (KNN) were utilized to develop diagnostic algorithms for classification. In PCA, several peaks in the principal component (PC) loadings spectra were identified as the major contributors to the PC scores. Some of the peaks in the PC loadings spectra were also reported as characteristic peaks for colon tissues, which implies correlation between peaks in PC loadings spectra and those in the original Raman spectra. KNN was also performed on the obtained PCs, and a diagnostic accuracy of 91.0% and a specificity of 92.6% were achieved.

  4. Raman spectroscopy combined with principal component analysis and k nearest neighbour analysis for non-invasive detection of colon cancer

    International Nuclear Information System (INIS)

    Li, Xiaozhou; Yang, Tianyue; Wang, Deli; Li, Siqi; Song, Youtao; Zhang, Su

    2016-01-01

    This paper attempts to investigate the feasibility of using Raman spectroscopy for the diagnosis of colon cancer. Serum taken from 75 healthy volunteers, 65 colon cancer patients and 60 post-operation colon cancer patients was measured in this experiment. In the Raman spectra of all three groups, the Raman peaks at 750, 1083, 1165, 1321, 1629 and 1779 cm-1 assigned to nucleic acids, amino acids and chromophores were consistently observed. All of these six Raman peaks were observed to have statistically significant differences between groups. For quantitative analysis, the multivariate statistical techniques of principal component analysis (PCA) and k nearest neighbour analysis (KNN) were utilized to develop diagnostic algorithms for classification. In PCA, several peaks in the principal component (PC) loadings spectra were identified as the major contributors to the PC scores. Some of the peaks in the PC loadings spectra were also reported as characteristic peaks for colon tissues, which implies correlation between peaks in PC loadings spectra and those in the original Raman spectra. KNN was also performed on the obtained PCs, and a diagnostic accuracy of 91.0% and a specificity of 92.6% were achieved.

  5. Correlations in a chain of three oscillators with nearest neighbour coupling

    Science.gov (United States)

    Idrus, B.; Konstadopoulou, A.; Spiller, T.; Vourdas, A.

    2010-04-01

    A chain of three oscillators A, B, C with nearest neighbour coupling, is considered. It is shown that the correlations between A, C (which are not coupled directly) can be stronger than the correlations between A, B. Also in some cases various witnesses of entanglement show that A, C are entangled but they cannot lead to any conclusion about A, B.

  6. Segmenting Multiple Sclerosis Lesions using a Spatially Constrained K-Nearest Neighbour approach

    DEFF Research Database (Denmark)

    Lyksborg, Mark; Larsen, Rasmus; Sørensen, Per Soelberg

    2012-01-01

    We propose a method for the segmentation of Multiple Sclerosis lesions. The method is based on probability maps derived from a K-Nearest Neighbours classification. These are used as a nonparametric likelihood in a Bayesian formulation with a prior that assumes connectivity of neighbouring voxels. ...

  7. An initialization method for the k-means using the concept of useful nearest centers

    OpenAIRE

    Ismkhan, Hassan

    2017-01-01

    The aim of k-means is to minimize the sum of squared Euclidean distances from the mean (SSEDM) of each cluster. k-means can effectively optimize this function, but it is too sensitive to the initial centers (seeds). This paper proposes a method for initializing k-means using the concept of a useful nearest center for each data point.
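
The SSEDM objective named in this abstract can be written out directly. The sketch below only evaluates the objective for a given assignment; it does not implement the paper's initialization method, and the function name and toy data are assumptions.

```python
def ssedm(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total

# Toy data: two well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = ssedm(data, [0, 0, 1, 1])   # pairs grouped by proximity
poor = ssedm(data, [0, 1, 0, 1])   # pairs split across the gap
```

Comparing `good` with `poor` shows why seeds matter: k-means only descends to a local minimum of this objective, so a bad start can leave it at a much larger SSEDM.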

  8. The influence of further-neighbor spin-spin interaction on a ground state of 2D coupled spin-electron model in a magnetic field

    Science.gov (United States)

    Čenčariková, Hana; Strečka, Jozef; Gendiar, Andrej; Tomašovičová, Natália

    2018-05-01

    An exhaustive ground-state analysis of an extended two-dimensional (2D) correlated spin-electron model consisting of Ising spins localized on nodal lattice sites and mobile electrons delocalized over pairs of decorating sites is performed within the framework of rigorous analytical calculations. The investigated model, defined on an arbitrary 2D doubly decorated lattice, takes into account the kinetic energy of mobile electrons, the nearest-neighbor Ising coupling between the localized spins and mobile electrons, the further-neighbor Ising coupling between the localized spins and the Zeeman energy. The ground-state phase diagrams are examined for a wide range of model parameters for both ferromagnetic and antiferromagnetic interaction between the nodal Ising spins and a non-zero value of the external magnetic field. It is found that non-zero values of the further-neighbor interaction lead to the formation of new quantum states as a consequence of competition between all considered interaction terms. Moreover, the new quantum states are accompanied by different magnetic features, and thus several kinds of field-driven phase transitions are observed.

  9. Active Metric Learning for Supervised Classification

    OpenAIRE

    Kumaran, Krishnan; Papageorgiou, Dimitri; Chang, Yutong; Li, Minhan; Takáč, Martin

    2018-01-01

    Clustering and classification critically rely on distance metrics that provide meaningful comparisons between data points. We present mixed-integer optimization approaches to find optimal distance metrics that generalize the Mahalanobis metric extensively studied in the literature. Additionally, we generalize and improve upon leading methods by removing reliance on pre-designated "target neighbors," "triplets," and "similarity pairs." Another salient feature of our method is its ability to en...

  10. Social dilemma alleviated by sharing the gains with immediate neighbors

    Science.gov (United States)

    Wu, Zhi-Xi; Yang, Han-Xin

    2014-01-01

    We study the evolution of cooperation in the evolutionary spatial prisoner's dilemma game (PDG) and snowdrift game (SG), within which a fraction α of the payoffs of each player gained from direct game interactions is shared equally by the immediate neighbors. The magnitude of the parameter α therefore characterizes the degree of the relatedness among the neighboring players. By means of extensive Monte Carlo simulations as well as an extended mean-field approximation method, we trace the frequency of cooperation in the stationary state. We find that plugging into relatedness can significantly promote the evolution of cooperation in the context of both studied games. Unexpectedly, cooperation can be more readily established in the spatial PDG than that in the spatial SG, given that the degree of relatedness and the cost-to-benefit ratio of mutual cooperation are properly formulated. The relevance of our model with the stakeholder theory is also briefly discussed.
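
The payoff-sharing rule described in this abstract can be sketched as below: each player keeps a fraction 1 - α of its game payoff and the remaining fraction α is split equally among its immediate neighbors. The function name, the adjacency-list representation and the four-player ring are assumptions for illustration.

```python
def shared_payoffs(payoffs, neighbors, alpha):
    """Effective payoffs after each player shares a fraction alpha with neighbors.

    `neighbors[i]` lists the immediate neighbors of player i; the neighbor
    relation is assumed to be symmetric.
    """
    effective = []
    for i, own in enumerate(payoffs):
        # Share received from each neighbor j: alpha * P_j split over j's neighbors.
        received = sum(alpha * payoffs[j] / len(neighbors[j]) for j in neighbors[i])
        effective.append((1 - alpha) * own + received)
    return effective

# Hypothetical ring of four players where only player 0 earned a payoff.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
result = shared_payoffs([4.0, 0.0, 0.0, 0.0], ring, alpha=0.5)
```

The total payoff is conserved by this rule; only its distribution changes, which is how α acts as a measure of relatedness among neighboring players.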

  11. Grain price spikes and beggar-thy-neighbor policy responses

    DEFF Research Database (Denmark)

    Boysen, Ole; Jensen, Hans Grinsted

    on the agenda of various international policy fora, including the annual meetings of G20 countries in recent years. For that reason, recent studies have attempted to quantify the extent to which such policy actions contributed to the rise in food prices. A study by Jensen & Anderson (2014) uses the global AGE...... model GTAP and the corresponding database to quantify the contributions of global policy actions to the rise in food prices by modeling the changes in distortions to agricultural incentives in the period 2006 to 2008. We link the results from this global model into a national AGE model, highlighting how...... global "Beggar-thy-Neighbor Policy Responses" impacted on poor households in Uganda. More specifically we examine the following research questions: What were the Ugandan economy-wide and poverty impacts of the price spikes? What was the impact of other countries' "Beggar-thy-Neighbor Policy Responses...

  12. Crimean-Congo hemorrhagic fever in Iran and neighboring countries

    DEFF Research Database (Denmark)

    Chinikar, S; Ghiasi, Seyed Mojtaba; Hewson, R

    2010-01-01

    Crimean-Congo hemorrhagic fever (CCHF) is a zoonotic viral disease that is asymptomatic in infected livestock, but a serious threat to humans. Human infections begin with nonspecific febrile symptoms, but progress to a serious hemorrhagic syndrome with a case fatality rate of 2-50%. Although the ...... in Iran and neighboring countries and provide evidence of over 5000 confirmed cases of CCHF in a single period/season....

  13. ENTROPY CHARACTERISTICS IN MODELS FOR COORDINATION OF NEIGHBORING ROAD SECTIONS

    Directory of Open Access Journals (Sweden)

    N. I. Kulbashnaya

    2016-01-01

    The paper considers the application of entropy characteristics as criteria for coordinating traffic conditions on neighboring road sections. Entropy characteristics are shown to be widely used in methods that account for the informational influence of the environment on drivers and in mechanisms that create traffic conditions preserving an optimal level of driver emotional tension during the drive. The problem is considered in terms of coordinating traffic conditions on neighboring road sections, which in turn aims to exclude transitional processes for the driver. The methodology for coordinating traffic conditions on neighboring road sections is based on E. V. Gavrilov's concept of coordinating certain parameters of road sections, which can be expressed as entropy characteristics. The paper proposes selecting the coordination criteria according to accident rates, because traffic conditions can change drastically when moving between neighboring road sections, which can create an accident situation. The relative organization of the driver's perception field and of the driver's interaction with the traffic environment were selected as the entropy characteristics and were therefore related to the road accident rate. The investigation revealed a strong correlation between the accident rate and both the relative organization of the driver's perception field and the relative organization of the driver's interaction with the traffic environment. The results of the experiment demonstrated an influence of the accident rate on the investigated entropy characteristics.

  14. Do alcohol compliance checks decrease underage sales at neighboring establishments?

    Science.gov (United States)

    Erickson, Darin J; Smolenski, Derek J; Toomey, Traci L; Carlin, Bradley P; Wagenaar, Alexander C

    2013-11-01

    Underage alcohol compliance checks conducted by law enforcement agencies can reduce the likelihood of illegal alcohol sales at checked alcohol establishments, and theory suggests that an alcohol establishment that is checked may warn nearby establishments that compliance checks are being conducted in the area. In this study, we examined whether the effects of compliance checks diffuse to neighboring establishments. We used data from the Complying with the Minimum Drinking Age trial, which included more than 2,000 compliance checks conducted at more than 900 alcohol establishments. The primary outcome was the sale of alcohol to a pseudo-underage buyer without the need for age identification. A multilevel logistic regression was used to model the effect of a compliance check at each establishment as well as the effect of compliance checks at neighboring establishments within 500 m (stratified into four equal-radius concentric rings), after buyer, license, establishment, and community-level variables were controlled for. We observed a decrease in the likelihood of establishments selling alcohol to underage youth after they had been checked by law enforcement, but these effects quickly decayed over time. Establishments that had a close neighbor (within 125 m) checked in the past 90 days were also less likely to sell alcohol to young-appearing buyers. The spatial effect of compliance checks on other establishments decayed rapidly with increasing distance. Results confirm the hypothesis that the effects of police compliance checks do spill over to neighboring establishments. These findings have implications for the development of an optimal schedule of police compliance checks.
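
The spatial stratification mentioned in this abstract (neighbors within 500 m split into four concentric rings) can be illustrated with a small helper. It assumes that "equal-radius" means rings of equal width, 125 m each; the function name and the handling of the outer boundary are illustrative choices.

```python
def ring_index(distance_m, max_radius_m=500.0, n_rings=4):
    """Ring number (0 = innermost) for an establishment at `distance_m`.

    Returns None for neighbors at or beyond `max_radius_m`. Rings are assumed
    to have equal width, i.e. 0-125, 125-250, 250-375 and 375-500 m by default.
    """
    if distance_m < 0 or distance_m >= max_radius_m:
        return None
    width = max_radius_m / n_rings
    return int(distance_m // width)

# Hypothetical distances (in metres) from a checked establishment.
rings = [ring_index(d) for d in (40.0, 125.0, 300.0, 499.9, 650.0)]
```

In a model like the one described, an indicator for a recent check in each ring would enter the regression as a separate covariate, so that the decay of the effect with distance can be estimated.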

  15. Single cell transcriptomics of neighboring hyphae of Aspergillus niger

    Science.gov (United States)

    2011-01-01

    Single cell profiling was performed to assess differences in RNA accumulation in neighboring hyphae of the fungus Aspergillus niger. A protocol was developed to isolate and amplify RNA from single hyphae or parts thereof. Microarray analysis resulted in a present call for 4 to 7% of the A. niger genes, of which 12% showed heterogeneous RNA levels. These genes belonged to a wide range of gene categories. PMID:21816052

  16. Evidence for cultural differences between neighboring chimpanzee communities.

    Science.gov (United States)

    Luncz, Lydia V; Mundry, Roger; Boesch, Christophe

    2012-05-22

    The majority of evidence for cultural behavior in animals has come from comparisons between populations separated by large geographical distances that often inhabit different environments. The difficulty of excluding ecological and genetic variation as potential explanations for observed behaviors has led some researchers to challenge the idea of animal culture. Chimpanzees (Pan troglodytes verus) in the Taï National Park, Côte d'Ivoire, crack Coula edulis nuts using stone and wooden hammers and tree root anvils. In this study, we compare for the first time hammer selection for nut cracking across three neighboring chimpanzee communities that live in the same forest habitat, which reduces the likelihood of ecological variation. Furthermore, the study communities experience frequent dispersal of females at maturity, which eliminates significant genetic variation. We compared key ecological factors, such as hammer availability and nut hardness, between the three neighboring communities and found striking differences in group-specific hammer selection among communities despite similar ecological conditions. Differences were found in the selection of hammer material and hammer size in response to changes in nut resistance over time. Our findings highlight the subtleties of cultural differences in wild chimpanzees and illustrate how cultural knowledge is able to shape behavior, creating differences among neighboring social groups. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Functional Basis of Microorganism Classification

    Science.gov (United States)

    Zhu, Chengsheng; Delmont, Tom O.; Vogel, Timothy M.; Bromberg, Yana

    2015-01-01

    Correctly identifying nearest “neighbors” of a given microorganism is important in industrial and clinical applications where close relationships imply similar treatment. Microbial classification based on similarity of physiological and genetic organism traits (polyphasic similarity) is experimentally difficult and, arguably, subjective. Evolutionary relatedness, inferred from phylogenetic markers, facilitates classification but does not guarantee functional identity between members of the same taxon or lack of similarity between different taxa. Using over thirteen hundred sequenced bacterial genomes, we built a novel function-based microorganism classification scheme, functional-repertoire similarity-based organism network (FuSiON; flattened to fusion). Our scheme is phenetic, based on a network of quantitatively defined organism relationships across the known prokaryotic space. It correlates significantly with the current taxonomy, but the observed discrepancies reveal both (1) the inconsistency of functional diversity levels among different taxa and (2) an (unsurprising) bias towards prioritizing, for classification purposes, relatively minor traits of particular interest to humans. Our dynamic network-based organism classification is independent of the arbitrary pairwise organism similarity cut-offs traditionally applied to establish taxonomic identity. Instead, it reveals natural, functionally defined organism groupings and is thus robust in handling organism diversity. Additionally, fusion can use organism meta-data to highlight the specific environmental factors that drive microbial diversification. Our approach provides a complementary view to cladistic assignments and holds important clues for further exploration of microbial lifestyles. Fusion is a more practical fit for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment and are less concerned

  18. Personalised news filtering and recommendation system using Chi-square statistics-based K-nearest neighbour (χ2SB-KNN) model

    Science.gov (United States)

    Adeniyi, D. A.; Wei, Z.; Yang, Y.

    2017-10-01

    The recommendation problem has been extensively studied by researchers in the fields of data mining, databases and information retrieval. This study presents the design and realisation of an automated, personalised news recommendation system based on a Chi-square statistics-based K-nearest neighbour (χ2SB-KNN) model. The proposed χ2SB-KNN model has the potential to overcome computational complexity and information overloading problems; it reduces runtime and speeds up execution through the use of the critical value of the χ2 distribution. The proposed recommendation engine can alleviate scalability challenges through combined online pattern discovery and pattern matching for real-time recommendations. This work also showcases the development of a novel feature selection method, referred to as the Data Discretisation-Based feature selection method. This is used for selecting the best features for the proposed χ2SB-KNN algorithm at the preprocessing stage of the classification procedures. The proposed χ2SB-KNN model is implemented in an in-house Java program on an experimental website called OUC newsreaders' website. Finally, we compared the performance of our system with two baseline methods: traditional Euclidean distance K-nearest neighbour and Naive Bayesian techniques. The result shows a significant improvement of our method over the baseline methods studied.

  19. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  20. Heart murmur detection based on wavelet transformation and a synergy between artificial neural network and modified neighbor annealing methods.

    Science.gov (United States)

    Eslamizadeh, Gholamhossein; Barati, Ramin

    2017-05-01

    Early recognition of heart disease plays a vital role in saving lives. Heart murmurs are one of the common heart problems. In this study, an Artificial Neural Network (ANN) is trained with Modified Neighbor Annealing (MNA) to classify heart cycles into normal and murmur classes. Heart cycles are separated from heart sounds using the wavelet transform. The network takes features extracted from individual heart cycles as inputs and has two classification outputs. Classification accuracy of the proposed model is compared with five multilayer perceptrons trained with the Levenberg-Marquardt, extreme-learning-machine, back-propagation, simulated-annealing, and neighbor-annealing algorithms. It is also compared with a Self-Organizing Map (SOM) ANN. The proposed model is trained and tested using real heart sounds available in the Pascal database to show the applicability of the proposed scheme. In addition, a device to record real heart sounds has been developed and used for comparison purposes. Based on the results of this study, MNA can be used to produce considerable results as a heart cycle classifier. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Watch Out for Your Neighbor: Climbing onto Shrubs Is Related to Risk of Cannibalism in the Scorpion Buthus cf. occitanus.

    Science.gov (United States)

    Sánchez-Piñero, Francisco; Urbano-Tenorio, Fernando

    The distribution and behavior of foraging animals usually imply a balance between resource availability and predation risk. In some predators such as scorpions, cannibalism constitutes an important mortality factor determining their ecology and behavior. Climbing on vegetation by scorpions has been related both to prey availability and to predation (cannibalism) risk. We tested different hypotheses proposed to explain climbing on vegetation by scorpions. We analyzed shrub climbing in Buthus cf. occitanus with regard to the following: a) better suitability of prey size for scorpions foraging on shrubs than on the ground, b) selection of shrub species with higher prey load, c) seasonal variations in prey availability on shrubs, and d) whether or not cannibalism risk on the ground increases the frequency of shrub climbing. Prey availability on shrubs was compared by estimating prey abundance in sticky traps placed in shrubs. A prey sample from shrubs was measured to compare prey size. Scorpions were sampled in six plots (50 m x 10 m) to estimate the proportion of individuals climbing on shrubs. Size difference and distance between individuals and their closest scorpion neighbor were measured to assess cannibalism risk. The results showed that mean prey size was two-fold larger on the ground. Selection of particular shrub species was not related to prey availability. Seasonal variations in the number of scorpions on shrubs were related to the number of active scorpions, but not with fluctuations in prey availability. Size differences between a scorpion and its nearest neighbor were positively related with a higher probability for a scorpion to climb onto a shrub when at a disadvantage, but distance was not significantly related. These results do not support hypotheses explaining shrub climbing based on resource availability. By contrast, our results provide evidence that shrub climbing is related to cannibalism risk.

  2. Mountain tourism development in Serbia and neighboring countries

    Directory of Open Access Journals (Sweden)

    Krunić Nikola

    2010-01-01

    Mountain areas with their surroundings are important parts of tourism regions with potentials for all-season tourism development and complementary activities. Development possibilities are based on the size of the high mountain territory, nature protection regimes, infrastructural equipment, the conditions provided for leisure and recreation, as well as the involvement of the local population in processes of development and protection. This paper analyses the key aspects of tourism development, in particular winter tourism, in high-mountain areas of Serbia and some neighboring countries (Slovakia, Romania, Bulgaria, and Greece). Common determinants of cohesion between nature protection and mountain tourism development, national development policies, applied models and concepts and the importance of trans-border cooperation are indicated.

  3. Neighboring Structure Visualization on a Grid-based Layout.

    Science.gov (United States)

    Marcou, G; Horvath, D; Varnek, A

    2017-10-01

    Here, we describe an algorithm to visualize chemical structures on a grid-based layout in such a way that similar structures are neighboring. It is based on structure reordering with the help of the Hilbert Schmidt Independence Criterion, representing an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator. The method can be applied to any layout of bi- or three-dimensional shape. The approach is demonstrated on a set of dopamine D5 ligands visualized on squared, disk and spherical layouts. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
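
    As a rough illustration of the criterion named in this abstract, the Python sketch below computes the standard biased empirical HSIC estimate, trace(KHLH)/(n-1)^2, from two kernel matrices. The Gaussian kernel, the `gamma` value, the function names and the toy data are assumptions made for this example; the abstract does not state which kernels or implementation the authors used, and the reordering step itself is not shown.

```python
import math

def gaussian_kernel(points, gamma=1.0):
    """Kernel matrix K[i][j] = exp(-gamma * ||p_i - p_j||^2) (assumed kernel choice)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]

def center(k):
    """Double-center a kernel matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(k)
    row = [sum(r) / n for r in k]
    col = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[k[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]

def hsic(k, l):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(k)
    kc, lc = center(k), center(l)
    # Because H is idempotent, trace(K H L H) equals trace((H K H)(H L H)).
    return sum(kc[i][j] * lc[j][i] for i in range(n) for j in range(n)) / (n - 1) ** 2

# Hypothetical toy data: 1-D structure descriptors and 2-D grid positions.
descriptors = [(0.0,), (0.1,), (0.9,), (1.0,)]
grid_cells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
score = hsic(gaussian_kernel(descriptors), gaussian_kernel(grid_cells))
```

    A larger score indicates stronger statistical dependence between the two sets of objects, which is the quantity a reordering procedure of this kind would try to increase.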

  4. Boosting nearest-neighbour to long-range integrable spin chains

    International Nuclear Information System (INIS)

    Bargheer, Till; Beisert, Niklas; Loebbert, Florian

    2008-01-01

    We present an integrability-preserving recursion relation for the explicit construction of long-range spin chain Hamiltonians. These chains are generalizations of the Haldane–Shastry and Inozemtsev models and they play an important role in recent advances in string/gauge duality. The method is based on arbitrary nearest-neighbour integrable spin chains and it sheds light on the moduli space of deformation parameters. We also derive the closed chain asymptotic Bethe equations. (letter)

  5. Low-field susceptibility of classical Heisenberg chains with arbitrary and different nearest-neighbour exchange

    International Nuclear Information System (INIS)

    Cregg, P J; Murphy, K; Garcia-Palacios, J L; Svedlindh, P

    2008-01-01

    Interest in molecular magnets continues to grow, offering a link between the atomic and nanoscale properties. The classical Heisenberg model has been effective in modelling exchange interactions in such systems. In this, the magnetization and susceptibility are calculated through the partition function, where the Hamiltonian contains both Zeeman and exchange energy. For an ensemble of N spins, this requires integrals in 2N dimensions. For two, three and four spin nearest-neighbour chains these integrals reduce to sums of known functions. For the case of the three and four spin chains, the sums are equivalent to results of Joyce. Expanding these sums, the effect of the exchange on the linear susceptibility appears as Langevin functions with exchange term arguments. These expressions are generalized here to describe an N spin nearest-neighbour chain, where the exchange between each pair of nearest neighbours is different and arbitrary. For a common exchange constant, this reduces to the result of Fisher. The high-temperature expansion of the Langevin functions for the different exchange constants leads to agreement with the appropriate high-temperature quantum formula of Schmidt et al, when the spin number is large. Simulations are presented for open linear chains of three, four and five spins with up to four different exchange constants, illustrating how the exchange constants can be retrieved successfully

  6. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  7. A Study of Applications of Machine Learning Based Classification Methods for Virtual Screening of Lead Molecules.

    Science.gov (United States)

    Vyas, Renu; Bapat, Sanket; Jain, Esha; Tambe, Sanjeev S; Karthikeyan, Muthukumarasamy; Kulkarni, Bhaskar D

    2015-01-01

    The ligand-based virtual screening of combinatorial libraries employs a number of statistical modeling and machine learning methods. A comprehensive analysis of the application of these methods for the diversity oriented virtual screening of biological targets/drug classes is presented here. A number of classification models have been built using three types of inputs, namely structure based descriptors, molecular fingerprints and therapeutic category, for performing virtual screening. The activity and affinity descriptors of a set of inhibitors of four target classes, DHFR, COX, LOX and NMDA, have been utilized to train a total of six classifiers, viz. Artificial Neural Network (ANN), k nearest neighbor (k-NN), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT) and Random Forest (RF). Among these classifiers, the ANN was found to be the best classifier, with an AUC of 0.9 irrespective of the target. New molecular fingerprints based on pharmacophore, toxicophore and chemophore (PTC) were used to build the ANN models for each dataset. A good accuracy of 87.27% was obtained using 296 chemophoric binary fingerprints for the COX-LOX inhibitors, compared to pharmacophoric (67.82%) and toxicophoric (70.64%) fingerprints. The methodology was validated on the classical Ames mutagenicity dataset of 4337 molecules. To evaluate it further, selectivity and promiscuity of molecules from five drug classes, viz. anti-anginal, anti-convulsant, anti-depressant, anti-arrhythmic and anti-diabetic, were studied. The PTC fingerprints computed for each category were able to capture the drug-class specific features using the k-NN classifier. These models can be useful for selecting optimal molecules for drug design.

  8. Automatic classification of fluorescence and optical diffusion spectroscopy data in neuro-oncology

    Science.gov (United States)

    Savelieva, T. A.; Loshchenov, V. B.; Goryajnov, S. A.; Potapov, A. A.

    2018-04-01

    The complexity of spectroscopic analysis of biological tissue, caused by the overlap of biological molecules' absorption spectra, multiple scattering effects, and the measurement geometry in vivo, motivates this work. In neuro-oncology the problem of delineating tumor boundaries is especially acute and requires the development of new methods of intraoperative diagnosis. Methods of optical spectroscopy allow various diagnostically significant parameters to be detected non-invasively. 5-ALA induced protoporphyrin IX is frequently used as a fluorescent tumor marker in neuro-oncology. At the same time, analysis of the concentration and oxygenation level of haemoglobin and significant changes of light scattering in tumor tissues have a high diagnostic value. This paper presents an original method for the simultaneous registration of backward diffuse reflectance and fluorescence spectra, which allows all the parameters listed above to be determined simultaneously. The clinical studies, involving 47 patients with intracranial glial tumors of Grades II-IV, were carried out at the N.N. Burdenko National Medical Research Center of Neurosurgery. To register the spectral dependences, the spectroscopic system LESA-01-BIOSPEC was used with a specially developed w-shaped diagnostic fiber optic probe. An original algorithm for combined spectroscopic signal processing was developed. We have created software and hardware that, compared with the methods currently used in neurosurgical practice, increased the sensitivity of intraoperative demarcation of intracranial tumors from 78% to 96% and the specificity from 60% to 82%. The analysis of different automatic classification techniques shows that in our case the most appropriate is the k Nearest Neighbors algorithm with the cubic metric.
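
    To make the classifier named at the end of this abstract concrete, here is a minimal Python sketch of k-nearest-neighbour voting with a Minkowski distance of order 3. Reading "cubic metric" as Minkowski distance with exponent 3 is an assumption, as are the value of `k`, the feature vectors and the class labels, which are invented for illustration and are not data from the study.

```python
from collections import Counter

def minkowski(a, b, p=3):
    """Minkowski distance of order p; p=3 is the assumed 'cubic' metric."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_x, train_y, query, k=3, p=3):
    """Label the query by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples (e.g. a fluorescence index and a scattering index).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
prediction = knn_predict(train_x, train_y, (0.82, 0.8))
```

    Compared with the Euclidean metric, the order-3 distance gives slightly more weight to the feature with the largest difference between two samples.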

  9. Next neighbors effect along the Ca-Sr-Ba-åkermanite join: Long-range vs. short-range structural features

    Science.gov (United States)

    Dondi, Michele; Ardit, Matteo; Cruciani, Giuseppe

    2013-06-01

    An original approach has been developed herein to explore the correlations between short- and long-range structural properties of solid solutions. X-ray diffraction (XRD) and electronic absorption spectroscopy (EAS) data were combined on a (Ca,Sr,Ba)2(Mg0.7Co0.3)Si2O7 join to determine average and local distances, respectively. Instead of varying the EAS-active ion concentration along the join, as has commonly been performed in previous studies, the constant replacement of Mg2+ by a minimal fraction of a similar size cation (Co2+) has been used to assess the effects of varying second-nearest neighbor cations (Ca, Sr, Ba) on the local distances of the first shell. A comparison between doped and un-doped series has shown that, although the overall symmetry of the Co-centered T1-site was retained, greater relaxation occurs at the CoO4 tetrahedra which become increasingly large and more distorted than the MgO4 tetrahedra. This is indicated by an increase in both the quadratic elongation (λT1) and the bond angle variance (σ2T1) distortion indices, as the whole structure expands due to an increase in size in the second-nearest neighbors. This behavior highlights the effect of the different electronic configurations of Co2+ (3d7) and Mg2+ (2p6) in spite of their very similar ionic size. Furthermore, although the overall symmetry of the Co-centered T1-site is retained, relatively limited (Co2+-O occur along the solid solution series and large changes are found in molar absorption coefficients showing that EAS Co2+-bands are highly sensitive to change in the local structure.

  10. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms, Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. With this comparative model, we try to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  11. Large margin image set representation and classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-07-06

    In this paper, we propose a novel image set representation and classification method by maximizing the margin of image sets. The margin of an image set is defined as the difference between the distance to its nearest image set from different classes and the distance to its nearest image set of the same class. By modeling the image sets using both their image samples and their affine hull models, and maximizing the margins of the image sets, the image set representation parameter learning problem is formulated as a minimization problem, which is further optimized by an expectation-maximization (EM) strategy with accelerated proximal gradient (APG) optimization in an iterative algorithm. To classify a given test image set, we assign it to the class which could provide the largest margin. Experiments on two applications of video-sequence-based face recognition demonstrate that the proposed method significantly outperforms state-of-the-art image set classification methods in terms of both effectiveness and efficiency.

  12. Large margin image set representation and classification

    KAUST Repository

    Wang, Jim Jing-Yan; Alzahrani, Majed A.; Gao, Xin

    2014-01-01

    In this paper, we propose a novel image set representation and classification method by maximizing the margin of image sets. The margin of an image set is defined as the difference between the distance to its nearest image set from different classes and the distance to its nearest image set of the same class. By modeling the image sets using both their image samples and their affine hull models, and maximizing the margins of the image sets, the image set representation parameter learning problem is formulated as a minimization problem, which is further optimized by an expectation-maximization (EM) strategy with accelerated proximal gradient (APG) optimization in an iterative algorithm. To classify a given test image set, we assign it to the class which could provide the largest margin. Experiments on two applications of video-sequence-based face recognition demonstrate that the proposed method significantly outperforms state-of-the-art image set classification methods in terms of both effectiveness and efficiency.
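
    The margin definition in this abstract can be sketched in a few lines of Python. The set-to-set distance used here (the smallest Euclidean distance between any two samples) is a deliberately simple stand-in chosen for illustration; the abstract describes sample-based and affine hull models, which this sketch does not implement. The function names and toy data are likewise assumptions.

```python
import math

def set_distance(a, b):
    """Stand-in set-to-set distance: closest pair of samples (not the paper's model)."""
    return min(math.dist(x, y) for x in a for y in b)

def margin(index, sets, labels):
    """Distance to the nearest other-class set minus distance to the nearest same-class set."""
    same = [set_distance(sets[index], s) for j, s in enumerate(sets)
            if j != index and labels[j] == labels[index]]
    other = [set_distance(sets[index], s) for j, s in enumerate(sets)
             if labels[j] != labels[index]]
    return min(other) - min(same)

# Hypothetical image sets, each a list of 2-D feature vectors.
sets = [[(0.0, 0.0), (0.0, 1.0)], [(1.0, 0.0)], [(5.0, 5.0)]]
labels = [0, 0, 1]
m0 = margin(0, sets, labels)
```

    A positive margin means the set lies closer to its own class than to any other class, which is the property the learning problem in the abstract tries to enlarge.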

  13. Using Generalized Entropies and OC-SVM with Mahalanobis Kernel for Detection and Classification of Anomalies in Network Traffic

    Directory of Open Access Journals (Sweden)

    Jayro Santiago-Paz

    2015-09-01

    Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed, among them the Anomaly-Network Intrusion Detection System (A-NIDS), which monitors network traffic and compares it against an established baseline of a “normal” traffic profile. It is therefore necessary to characterize the “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on Shannon, Rényi and Tsallis entropies of selected features, and the construction of regions from entropy data employing the Mahalanobis distance (MD) and One Class Support Vector Machine (OC-SVM) with different kernels (Radial Basis Function (RBF) and Mahalanobis Kernel (MK)) for “normal” and abnormal traffic. Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while the classification is performed under the assumption that regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. In order to evaluate our approach, two different data sets were used: one set of real traffic obtained from an Academic Local Area Network (LAN), and the other a subset of the 1998 MIT-DARPA set. For these data sets, a true positive rate of up to 99.35%, a true negative rate of up to 99.83% and a false negative rate of about 0.16% were obtained. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with RBF kernel improve the detection rate in the detection stage, while the novel inclusion of MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that using the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with
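
    The three entropies named in this abstract have standard textbook definitions, sketched below in Python for a traffic feature observed as a list of values. The natural logarithm, the function names and the sample port numbers are assumptions for illustration; the abstract does not specify the authors' estimator, feature windows or q-values.

```python
import math
from collections import Counter

def probabilities(values):
    """Relative frequencies of the distinct values of one traffic feature."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]

def shannon(p):
    """Shannon entropy H = -sum p_i ln p_i."""
    return -sum(x * math.log(x) for x in p if x > 0)

def renyi(p, q):
    """Renyi entropy of order q: ln(sum p_i^q) / (1 - q); Shannon in the limit q -> 1."""
    if q == 1:
        return shannon(p)
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)

def tsallis(p, q):
    """Tsallis entropy of order q: (1 - sum p_i^q) / (q - 1); Shannon in the limit q -> 1."""
    if q == 1:
        return shannon(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

# Hypothetical window of destination ports seen in captured traffic.
ports = [80, 443, 80, 22, 443, 80, 53, 80]
p = probabilities(ports)
features = (shannon(p), renyi(p, 2.0), tsallis(p, 2.0))
```

    In a scheme like the one described, such entropy values computed per time window would form the points from which "normal" regions are built.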

  14. Radionuclide content of an exhumed canyon vessel and neighboring soil

    International Nuclear Information System (INIS)

    Holcomb, H.P.

    1976-11-01

    The long-term hazard potential associated with burial of process equipment from radiochemical separations plants is being evaluated. As part of this evaluation, a feed adjustment tank was exhumed eighteen years after burial. The tank had been in service in the fuel reprocessing plant for twenty-nine months before it was retired. Assay of the exhumed tank indicated that 7 mg (0.4 mCi) of 239Pu and 1 mCi of 137Cs remained on its surfaces; 1.1 mg (0.07 mCi) 239Pu, 0.4 mCi 137Cs, and 3.5 mCi 90Sr were found in neighboring soil. The vessel and surrounding soil have met the present guidelines (less than or equal to 10 nCi/g) of the U. S. Energy Research and Development Administration (ERDA) for nonretrievable waste

  15. Reduction of Conflicts in Mining Development Using "Good Neighbor Agreements"

    Science.gov (United States)

    Masaitis, A.

    2013-05-01

    New environmental and social challenges for the mining industry in both developed and developing countries show the obvious need to implement "responsible" mining practices that include improved community involvement. Good Neighbor Agreements (GNAs) are a relatively new mechanism for improving communication and trust between a mining company and the community. The focus of a GNA is to provide a written and enforceable agreement, negotiated between the concerned public and the respective mining company, that responds to concerns from the public and also provides a mechanism for conflict resolution when there is mutual benefit in maintaining a working relationship. The development of GNAs is a recently evolving process that promotes environmentally sound relationships between mines and the surrounding communities. Modify and apply the resulting GNA formulas to developing countries and countries with transitional economies. This is particularly important for countries that have poorly functioning regulatory systems that cannot guarantee a healthy and safe environment for the communities. The fundamental questions addressed by this research: 1. This is a three-year research project, started in August 2012 at the University of Nevada, Reno (UNR), to develop Good Neighbor Agreement standards as well as to investigate the details of mine development. 2. Identify spheres of possible cooperation between mining companies, government organizations, and Non-Governmental Organizations (NGOs). Use this cooperation to develop international standards for the GNA, to promote exchange of environmental information, and exchange of successful environmental, health, and safety practices between mining operations from different countries. Discussion: The Good Neighbor Agreement currently evolving will address the following: 1. Provide an economically viable mechanism for developing a partnership between mining operations and the local communities that will increase mining industry

  16. Building good relationships with neighbors of Japan's oldest plant, Tsuruga

    International Nuclear Information System (INIS)

    Hata, Emi

    1992-01-01

    Since its establishment in 1957 as a pioneer company of nuclear power development in Japan, the Japan Atomic Power Company (JAPC) has gained a great deal of experience with construction and operation of four nuclear power plants - one gas-cooled reactor, two boiling water reactors (BWRs), and one pressurized water reactor (PWR) - at two sites, Tsuruga and Tokai. To gain the understanding and cooperation of the local community, the Tsuruga station must keep running. Each employee is encouraged to make every possible effort not only to ensure the safe and reliable operation of the two units, but also to ensure conscientious coexistence and coprosperity within the local community. The Tsuruga office in the city and the Public Relations (PR) Pavilion (visitor's center) at the site work together as an open window of communication with the local community. Under these basic philosophies, various good neighbor activities are developed and carried out

  17. Neighboring Optimal Aircraft Guidance in a General Wind Environment

    Science.gov (United States)

    Jardin, Matthew R. (Inventor)

    2003-01-01

    Method and system for determining an optimal route for an aircraft moving between first and second waypoints in a general wind environment. A selected first wind environment is analyzed for which a nominal solution can be determined. A second wind environment is then incorporated; and a neighboring optimal control (NOC) analysis is performed to estimate an optimal route for the second wind environment. In particular examples with flight distances of 2500 and 6000 nautical miles in the presence of constant or piecewise linearly varying winds, the difference in flight time between a nominal solution and an optimal solution is 3.4 to 5 percent. Constant or variable winds and aircraft speeds can be used. Updated second wind environment information can be provided and used to obtain an updated optimal route.

  18. Radiative energy loss of neighboring subjets arXiv

    CERN Document Server

    Mehtar-Tani, Yacine

    We compute the in-medium energy loss probability distribution of two neighboring subjets at leading order, in the large-$N_c$ approximation. Our result exhibits a gradual onset of color decoherence of the system and accounts for two expected limiting cases. When the angular separation is smaller than the characteristic angle for medium-induced radiation, the two-pronged substructure loses energy coherently as a single color charge, namely that of the parent parton. At large angular separation the two subjets lose energy independently. Our result is a first step towards quantifying effects of energy loss as a result of the fluctuation of the multi-parton jet substructure and therefore goes beyond the standard approach to jet quenching based on single parton energy loss. We briefly discuss applications to jet observables in heavy-ion collisions.

  19. An interactive cooperation model for neighboring virtual power plants

    International Nuclear Information System (INIS)

    Shabanzadeh, Morteza; Sheikh-El-Eslami, Mohammad-Kazem; Haghifam, Mahmoud-Reza

    2017-01-01

    Highlights: • The trading strategies of a VPP in cooperation with its neighboring VPPs are addressed. • A portfolio of inter-regional contracts is considered to model this cooperation scheme. • A novel mathematical formulation for possible inadvertent transactions is provided. • A two-stage stochastic programming approach is applied to characterize the uncertainty. • Two efficient risk measures, SSD and CVaR, are implemented in the VPP decision-making problem. Abstract: Future distribution systems will accommodate an increasing share of distributed energy resources (DERs). Faced with this new reality, virtual power plants (VPPs) play a key role in aggregating DERs with the aim of facilitating their involvement in wholesale electricity markets. In this paper, the trading strategies of a VPP in cooperation with its neighboring VPPs are addressed. Toward this aim, a portfolio of inter-regional contracts is considered to model this cooperation and maximize the energy trade opportunities of the VPP within a medium-term horizon. To hedge against profit variability caused by market price uncertainties, two efficient risk management approaches are also implemented in the VPP decision-making problem, based on the concepts of conditional value at risk (CVaR) and second-order stochastic dominance (SSD) constraints. The resulting models are formulated as mixed-integer linear programming (MILP) problems that can be solved using off-the-shelf software packages. The efficiency of the proposed risk-hedging models is analyzed through a detailed case study, and thereby relevant conclusions are drawn.
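
    As a small illustration of the CVaR concept mentioned in this abstract, the Python sketch below evaluates the conditional value at risk of a discrete set of profit scenarios using the well-known Rockafellar-Uryasev form, max over zeta of zeta - E[max(zeta - profit, 0)] / (1 - alpha). The scenario profits, probabilities and function name are hypothetical, and this stand-alone calculation is not the MILP formulation of the paper, where the same expression would appear as linear constraints.

```python
import math

def cvar(profits, probs, alpha):
    """CVaR of profit at confidence level alpha (0 <= alpha < 1) for discrete scenarios.

    The objective is concave and piecewise linear in zeta with breakpoints at the
    scenario profits, so it is enough to try each scenario profit as zeta.
    """
    best = -math.inf
    for zeta in profits:
        shortfall = sum(pr * max(zeta - x, 0.0) for x, pr in zip(profits, probs))
        best = max(best, zeta - shortfall / (1.0 - alpha))
    return best

# Hypothetical equiprobable profit scenarios for one trading decision.
profits = [10.0, 20.0, 30.0, 40.0]
probs = [0.25, 0.25, 0.25, 0.25]
tail_profit = cvar(profits, probs, 0.5)  # mean of the worst half of the scenarios
```

    With alpha = 0 the measure reduces to the expected profit, and as alpha approaches 1 it approaches the worst-case scenario, which is how a decision maker tunes risk aversion.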

  20. Video based object representation and classification using multiple covariance matrices.

    Science.gov (United States)

    Zhang, Yurong; Liu, Quan

    2017-01-01

    Video based object recognition and classification has been widely studied in the computer vision and image processing area. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for the image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices with each covariance matrix representing one cluster of images. Firstly, we use the Nonnegative Matrix Factorization (NMF) method to do image clustering within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. Finally, we adopt KLDA and the nearest neighborhood classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.

  1. 3D NEAREST NEIGHBOUR SEARCH USING A CLUSTERED HIERARCHICAL TREE STRUCTURE

    Directory of Open Access Journals (Sweden)

    A. Suhaibah

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. This is to ensure that newly opened stores are at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since the business of franchising and chain stores in urban areas runs within high-rise multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify the surrounding information, such as the level at which the franchise unit will be located or whether the franchise unit is located at the best level for visibility purposes. One of the commonly used analyses for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing approaches of spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  2. Spinon decay in the spin-1/2 Heisenberg chain with weak next nearest neighbour exchange

    International Nuclear Information System (INIS)

    Groha, Stefan; Essler, Fabian H L

    2017-01-01

    Integrable models support elementary excitations with infinite lifetimes. In the spin-1/2 Heisenberg chain these are known as spinons. We consider the stability of spinons when a weak integrability breaking perturbation is added to the Heisenberg chain in a magnetic field. We focus on the case where the perturbation is a next nearest neighbour exchange interaction. We calculate the spinon decay rate in leading order in perturbation theory using methods of integrability and identify the dominant decay channels. The decay rate is found to be small, which indicates that spinons remain well-defined excitations even though integrability is broken. (paper)

  3. Dispersion of a layered electron gas with nearest neighbour-tunneling

    International Nuclear Information System (INIS)

    Miesenboeck, H.M.

    1988-09-01

    The dispersion of the first plasmon band is calculated within the Random Phase Approximation for a superlattice of two-dimensional electron-gases, mutually interacting, and with nearest neighbour hopping between the planes. It is further shown that the deviations of this dispersion from the one in systems with zero interplane motion are very small in commonly realized experimental situations and that they are expected to be observable only in samples with plane distances of 100A and less. (author). 15 refs, 3 figs, 1 tab

  4. Unwanted Behaviors and Nuisance Behaviors Among Neighbors in a Belgian Community Sample.

    Science.gov (United States)

    Michaux, Emilie; Groenen, Anne; Uzieblo, Katarzyna

    2015-06-30

    Unwanted behaviors between (ex-)intimates have been extensively studied, while those behaviors within other contexts such as neighbors have received much less scientific consideration. Research indicates that residents are likely to encounter problem behaviors from their neighbors. Besides the lack of clarity in the conceptualization of problem behaviors among neighbors, little is known on which types of behaviors characterize neighbor problems. In this study, the occurrence of two types of problem behaviors encountered by neighbors was explored within a Belgian community sample: unwanted behaviors such as threats and neighbor nuisance issues such as noise nuisance. By clearly distinguishing those two types of behaviors, this study aimed at contributing to the conceptualization of neighbor problems. Next, the coping strategies used to deal with the neighbor problems were investigated. Our results indicated that unwanted behaviors were more frequently encountered by residents compared with nuisance problems. Four out of 10 respondents reported both unwanted pursuit behavior and nuisance problems. It was especially unlikely to encounter nuisance problems in isolation of unwanted pursuit behaviors. While different coping styles (avoiding the neighbor, confronting the neighbor, and enlisting help from others) were equally used by the stalked participants, none of them was perceived as being more effective in reducing the stalking behaviors. Strikingly, despite being aware of specialized help services such as community mediation services, only a very small subgroup enlisted this kind of professional help. © The Author(s) 2015.

  5. Self-avoiding trails with nearest-neighbour interactions on the square lattice

    International Nuclear Information System (INIS)

    Bedini, A; Owczarek, A L; Prellberg, T

    2013-01-01

    Self-avoiding walks and self-avoiding trails, two models of a polymer coil in dilute solution, have been shown to be governed by the same universality class. On the other hand, self-avoiding walks interacting via nearest-neighbour contacts (ISAW) and self-avoiding trails interacting via multiply visited sites (ISAT) are two models of the coil-globule, or collapse transition of a polymer in dilute solution. On the square lattice it has been established numerically that the collapse transition of each model lies in a different universality class. The models differ in two substantial ways. They differ in the types of subsets of random walk configurations utilized (site self-avoidance versus bond self-avoidance) and in the type of attractive interaction. It is therefore of some interest to consider self-avoiding trails interacting via nearest-neighbour attraction (INNSAT) in order to ascertain the source of the difference in the collapse universality class. Using the flatPERM algorithm, we have performed computer simulations of this model. We present numerical evidence that the singularity in the free energy of INNSAT at the collapse transition has a similar exponent to that of the ISAW model rather than the ISAT model. This would indicate that the type of interaction used in ISAW and ISAT is the source of the difference in the universality class. (paper)

  6. Evaluating a k-nearest neighbours-based classifier for locating faulty areas in power systems

    Directory of Open Access Journals (Sweden)

    Juan José Mora Flórez

    2008-09-01

    This paper reports a strategy for identifying and locating faults in a power distribution system. The strategy was based on the K-nearest neighbours technique. This technique simply helps to estimate a distance from the features used for describing a particular fault being classified to the faults presented during the training stage. If new data is presented to the proposed fault locator, it is classified according to the nearest example recovered. A characterisation of the voltage and current measurements obtained at one single line end is also presented in this document for assigning the area in the case of a fault in a power system. The proposed strategy was tested in a real power distribution system, with average confidence indexes of 93% being obtained, which is a good indicator of the proposal’s high performance. The results showed how a fault could be located by using features obtained from voltage and current, improving utility response and thereby improving system continuity indexes in power distribution systems.
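
    The classification step described in this abstract (measure the distance from a new fault's features to the stored training faults, then label it by the nearest examples) can be sketched as follows. The feature values, zone labels, and the name `knn_classify` are hypothetical; the paper's actual features and distance measure are not given in the abstract, so Euclidean distance is assumed.

```python
import math
from collections import Counter

def knn_classify(train, query, k=1):
    """Label `query` by majority vote among its k nearest training examples."""
    # Rank training examples by Euclidean distance to the query (assumed metric).
    ranked = sorted(train, key=lambda example: math.dist(example[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (voltage, current) features, each tagged with a faulted zone.
training_faults = [
    ((0.10, 0.90), "zone_A"),
    ((0.15, 0.85), "zone_A"),
    ((0.60, 0.40), "zone_B"),
    ((0.65, 0.35), "zone_B"),
]

predicted_zone = knn_classify(training_faults, (0.58, 0.45), k=3)
```

    With k=1 this reduces to the "nearest example recovered" rule the abstract mentions; larger k trades sensitivity to single noisy examples against blurring of zone boundaries.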

  7. Estimating persistence of brominated and chlorinated organic pollutants in air, water, soil, and sediments with the QSPR-based classification scheme.

    Science.gov (United States)

    Puzyn, T; Haranczyk, M; Suzuki, N; Sakurai, T

    2011-02-01

    We have estimated degradation half-lives of both brominated and chlorinated dibenzo-p-dioxins (PBDDs and PCDDs), furans (PBDFs and PCDFs), biphenyls (PBBs and PCBs), naphthalenes (PBNs and PCNs), diphenyl ethers (PBDEs and PCDEs) as well as selected unsubstituted polycyclic aromatic hydrocarbons (PAHs) in air, surface water, surface soil, and sediments (in total 1,431 compounds in four compartments). Next, we compared the persistence between chloro- (relatively well-studied) and bromo- (less studied) analogs. The predictions have been performed based on the quantitative structure-property relationship (QSPR) scheme with use of a k-nearest neighbors (kNN) classifier and the semi-quantitative system of persistence classes. The classification models utilized principal components derived from the principal component analysis of a set of 24 constitutional and quantum mechanical descriptors as input variables. Accuracies of classification (based on an external validation) were 86, 85, 87, and 75% for air, surface water, surface soil, and sediments, respectively. The persistence of all chlorinated species increased with increasing halogenation degree. In the case of brominated organic pollutants (Br-OPs), the trend was the same for air and sediments. However, we noticed the opposite trend for persistence in surface water and soil. The results suggest that, due to high photoreactivity of C-Br chemical bonds, photolytic processes occurring in surface water and soil are able to play a significant role in transforming and removing Br-OPs from these compartments. This contribution is the first attempt at classifying Br-OPs and Cl-OPs together according to their persistence in particular environmental compartments.

  8. Multisource multibeam backscatter data: developing a strategy for the production of benthic habitat maps using semi-automated seafloor classification methods

    Science.gov (United States)

    Lacharité, Myriam; Brown, Craig J.; Gazzola, Vicki

    2018-06-01

    The establishment of multibeam echosounders (MBES) as a mainstream tool in ocean mapping has facilitated integrative approaches towards nautical charting, benthic habitat mapping, and seafloor geotechnical surveys. The bathymetric and backscatter information generated by MBES enables marine scientists to present highly accurate bathymetric data with a spatial resolution closely matching that of terrestrial mapping, and can generate customized thematic seafloor maps to meet multiple ocean management needs. However, when a variety of MBES systems are used, the creation of objective habitat maps can be hindered by the lack of backscatter calibration, due for example, to system-specific settings, yielding relative rather than absolute values. Here, we describe an approach using object-based image analysis to combine 4 non-overlapping and uncalibrated (backscatter) MBES coverages to form a seamless habitat map on St. Anns Bank (Atlantic Canada), a marine protected area hosting a diversity of benthic habitats. The benthoscape map was produced by analysing each coverage independently with supervised classification (k-nearest neighbor) of image-objects based on a common suite of 7 benthoscapes (determined with 4214 ground-truthing photographs at 61 stations, and characterized with backscatter, bathymetry, and bathymetric position index). Manual re-classification based on uncertainty in membership values to individual classes—especially at the boundaries between coverages—was used to build the final benthoscape map. Given the costs and scarcity of MBES surveys in offshore marine ecosystems—particularly in large ecosystems in need of adequate conservation strategies, such as in Canadian waters—developing approaches to synthesize multiple datasets to meet management needs is warranted.

  9. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  10. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...... that call for inquiries into the theoretical foundation of bibliographic classification theory....

  11. Unsynchronized influenza epidemics in two neighboring subtropical cities

    Directory of Open Access Journals (Sweden)

    Xiujuan Tang

    2018-04-01

    Objective: The aim of this study was to examine the synchrony of influenza epidemics between Hong Kong and Shenzhen, two neighboring subtropical cities in South China. Methods: Laboratory-confirmed influenza data for the period January 2006 to December 2016 were obtained from the Shenzhen Center for Disease Control and Prevention and the Department of Health in Hong Kong. The population data were retrieved from the 2011 population censuses. The weekly rates of laboratory-confirmed influenza cases were compared between Shenzhen and Hong Kong. Results: Unsynchronized influenza epidemics between Hong Kong and Shenzhen were frequently observed during the study period. Influenza A/H1N1 caused a more severe pandemic in Hong Kong in 2009, but the subsequent seasonal epidemics showed similar magnitudes in both cities. Two influenza A/H3N2 dominant epidemic waves were seen in Hong Kong in 2015, but these epidemics were very minor in Shenzhen. More influenza B epidemics occurred in Shenzhen than in Hong Kong. Conclusions: Influenza epidemics appeared to be unsynchronized between Hong Kong and Shenzhen most of the time. Given the close geographical locations of these two cities, this could be due to the strikingly different age structures of their populations. Keywords: Influenza epidemics, Synchrony, Shenzhen, Hong Kong

  12. Identification of influential users by neighbors in online social networks

    Science.gov (United States)

    Sheikhahmadi, Amir; Nematbakhsh, Mohammad Ali; Zareie, Ahmad

    2017-11-01

    Identification and ranking of influential users in social networks for the sake of news spreading and advertising has recently become an attractive field of research. Given the large number of users in social networks and also the various relations that exist among them, providing an effective method to identify influential users has gradually come to be considered an essential factor. In most of the already-provided methods, those users who are located in an appropriate structural position of the network are regarded as influential users. These methods do not usually pay attention to the interactions among users, and also consider those relations as being binary in nature. This paper, therefore, proposes a new method to identify influential users in a social network by considering those interactions that exist among the users. Since users tend to act within the frame of communities, the network is initially divided into different communities. Then the amount of interaction among users is used as a parameter to set the weight of relations existing within the network. Afterward, by determining the neighbors' role for each user, a two-level method is proposed for both detecting users' influence and also ranking them. Simulation and experimental results on Twitter data show that those users who are selected by the proposed method, compared to those selected by existing ones, are distributed at a more appropriate distance. Moreover, the proposed method outperforms the other ones in terms of both the influential speed and capacity of the users it selects.

  13. A classification scheme for young stellar objects using the wide-field infrared survey explorer AllWISE catalog: revealing low-density star formation in the outer galaxy

    Energy Technology Data Exchange (ETDEWEB)

    Koenig, X. P. [Department of Astronomy, Yale University, New Haven, CT 06511 (United States); Leisawitz, D. T. [NASA Goddard Space Flight Center, Greenbelt, MD 20771 (United States)

    2014-08-20

    We present an assessment of the performance of WISE and the AllWISE data release for a section of the Galactic Plane. We lay out an approach to increasing the reliability of point-source photometry extracted from the AllWISE catalog in Galactic Plane regions using parameters provided in the catalog. We use the resulting catalog to construct a new, revised young star detection and classification scheme combining WISE and 2MASS near- and mid-infrared colors and magnitudes and test it in a section of the outer Milky Way. The clustering properties of the candidate Class I and II stars using a nearest neighbor density calculation and the two-point correlation function suggest that the majority of stars do form in massive star-forming regions, and any isolated mode of star formation is at most a small fraction of the total star forming output of the Galaxy. We also show that the isolated component may be very small and could represent the tail end of a single mechanism of star formation in line with models of molecular cloud collapse with supersonic turbulence and not a separate mode all to itself.
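
    For readers unfamiliar with nearest neighbor density calculations of the kind this abstract refers to, one common estimator takes the distance r_n from each source to its n-th nearest neighbour and assigns a surface density (n - 1) / (pi r_n^2). The sketch below assumes that estimator, flat (small-angle) coordinates, and a hypothetical function name; the abstract does not state which estimator or value of n the authors used.

```python
import math

def nn_surface_density(points, n=6):
    """Surface density at each point from the distance to its n-th nearest neighbour."""
    densities = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r_n = dists[n - 1]  # requires more than n points and r_n > 0
        densities.append((n - 1) / (math.pi * r_n ** 2))
    return densities

# Hypothetical positions: a tight group of five sources plus one isolated source.
sources = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (5.0, 5.0)]
densities = nn_surface_density(sources, n=3)
```

    Sources in the tight group receive densities orders of magnitude higher than the isolated one, which is the contrast a clustered-versus-isolated analysis relies on.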

  14. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility

  15. Infrared polarimetry of the nucleus of Centaurus A: the nearest blazar

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, J; Sparks, W B; Hough, J H; Axon, D J

    1986-07-10

    As one of the nearest examples of an active galaxy, NGC5128 (Centaurus A) has been studied in detail over a wide range of wavelengths. The authors have made polarization observations of the infrared nucleus at wavelengths from 1.2 to 3.8 μm. The nucleus is found to have a large intrinsic polarization of ≈9% at position angle 147°. This position angle is perpendicular to the direction of the X-ray and radio jet. The polarized emission from the nucleus is interpreted as synchrotron radiation from a region whose magnetic field is parallel to the jet direction. The properties of the Cen A nucleus are essentially identical to those of the much more luminous blazars. This suggests that blazar-type activity extends over a very wide range in luminosity, and low-luminosity blazars may be common in elliptical galaxies.

  16. Phase correlation and clustering of a nearest neighbour coupled oscillators system

    International Nuclear Information System (INIS)

    El-Nashar, Hassan F.

    2002-09-01

    We investigated the phases in a system of nearest neighbour coupled oscillators before complete synchronization in frequency occurs. We found that when oscillators under the influence of coupling form a cluster of the same time-average frequency, their phases start to correlate. An order parameter, which measures this correlation, starts to grow at this stage until it reaches maximum. This means that a time-average phase locked state is reached between the oscillators inside the cluster of the same time-average frequency. At this strength the cluster attracts individual oscillators or a cluster to join in. We also observe that clustering in averaged frequencies orders the phases of the oscillators. This behavior is found at all the transition points studied. (author)
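
    The abstract does not define the order parameter it uses. As an assumed stand-in, the sketch below computes the familiar Kuramoto-style measure r = |mean(exp(i θ))|, which is close to 1 when phases are aligned and close to 0 when they are spread evenly around the circle. The function name and example phases are hypothetical.

```python
import cmath
import math

def phase_order_parameter(phases):
    """Kuramoto-style order parameter r in [0, 1] for phases given in radians."""
    # Average the unit vectors exp(i*theta); its length measures phase alignment.
    mean_vector = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(mean_vector)

locked = phase_order_parameter([0.30, 0.31, 0.29, 0.30])                  # nearly aligned
spread = phase_order_parameter([0.0, 2 * math.pi / 3, 4 * math.pi / 3])  # evenly spread
```

    Evaluating such a measure only over the oscillators inside one frequency cluster, rather than over the whole chain, would match the abstract's focus on correlation within a cluster.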

  17. Nearest-cell: a fast and easy tool for locating crystal matches in the PDB

    International Nuclear Information System (INIS)

    Ramraj, V.; Evans, G.; Diprose, J. M.; Esnouf, R. M.

    2012-01-01

    A fast and easy tool to locate unit-cell matches in the PDB is described. When embarking upon X-ray diffraction data collection from a potentially novel macromolecular crystal form, it can be useful to ascertain whether the measured data reflect a crystal form that is already recorded in the Protein Data Bank and, if so, whether it is part of a large family of related structures. Providing such information to crystallographers conveniently and quickly, as soon as the first images have been recorded and the unit cell characterized at an X-ray beamline, has the potential to save time and effort as well as pointing to possible search models for molecular replacement. Given an input unit cell, and optionally a space group, Nearest-cell rapidly scans the Protein Data Bank and retrieves near-matches

  18. Phase correlation and clustering of a nearest neighbour coupled oscillators system

    CERN Document Server

    El-Nashar, H F

    2002-01-01

    We investigated the phases in a system of nearest neighbour coupled oscillators before complete synchronization in frequency occurs. We found that when oscillators under the influence of coupling form a cluster of the same time-average frequency, their phases start to correlate. An order parameter, which measures this correlation, starts to grow at this stage until it reaches maximum. This means that a time-average phase locked state is reached between the oscillators inside the cluster of the same time-average frequency. At this strength the cluster attracts individual oscillators or a cluster to join in. We also observe that clustering in averaged frequencies orders the phases of the oscillators. This behavior is found at all the transition points studied.

  19. Prediction of monthly electric energy consumption using pattern-based fuzzy nearest neighbour regression

    Directory of Open Access Journals (Sweden)

    Pełka Paweł

    2017-01-01

    Electricity demand forecasting plays an important role in power system planning and operation. In this work, fuzzy nearest neighbour regression has been utilised to estimate monthly electricity demands. The forecasting model was based on the pre-processed energy consumption time series, where input and output variables were defined as patterns representing unified fragments of the time series. Relationships between inputs and outputs, which were simplified due to patterns, were modelled using nonparametric regression with a weighting function defined as a fuzzy membership of learning points to the neighbourhood of a query point. In the experimental part of the work the model was evaluated using real-world data. The results are encouraging and show the high performance of the model and its competitiveness compared to other forecasting models.
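
    The regression idea in this abstract, a weighted average of training outputs where each weight is the fuzzy membership of a learning point in the query's neighbourhood, can be sketched as below. The Gaussian-shaped membership function, the `width` parameter, the function name, and the toy data are all assumptions; the abstract does not give the membership function or the pattern definitions actually used.

```python
import math

def fuzzy_nn_regress(train_x, train_y, query, width=1.0):
    """Membership-weighted average of training outputs for one query point."""
    # Assumed membership: decays smoothly with distance from the query point.
    weights = [math.exp(-(math.dist(x, query) / width) ** 2) for x in train_x]
    total = sum(weights)  # would be zero only if every weight underflows
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical one-dimensional input patterns and scalar outputs.
patterns = [(0.0,), (2.0,)]
demands = [10.0, 20.0]
midpoint_forecast = fuzzy_nn_regress(patterns, demands, (1.0,))
```

    A smaller `width` makes the forecast follow the closest learning points more tightly, while a larger one smooths over more of the training set.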

  20. Nearest Neighborhood Grayscale Operator for Hardware-Efficient Microscale Texture Extraction

    Directory of Open Access Journals (Sweden)

    Andreas König

    2007-01-01

    First-stage feature computation and data rate reduction play a crucial role in an efficient visual information processing system. Hardware-based first stages usually win out where power consumption, dynamic range, and speed are the issue, but have severe limitations with regard to flexibility. In this paper, the local orientation coding (LOC), a nearest neighborhood grayscale operator, is investigated and enhanced for hardware implementation. The features produced by this operator are easy and fast to compute, compress the salient information contained in an image, and lend themselves naturally to various medium-to-high-level postprocessing methods such as texture segmentation, image decomposition, and feature tracking. An image sensor architecture based on the LOC has been elaborated that combines high dynamic range (HDR) image acquisition, feature computation, and inherent pixel-level ADC in the pixel cells. The mixed-signal design allows for simple readout as digital memory.

  1. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    DEFF Research Database (Denmark)

    Suhaibah, A.; Uznir, U.; Antón Castro, Francesc/François

    2016-01-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. This is due to assure that new opening stores are at their strategic location to attract the highest possible number of customers. Spatial information is used to manage......, maintain and analyse these store locations. However, since the business of franchising and chain stores in urban areas runs within high rise multi-level buildings, a three-dimensional (3D) method is prominently required in order to locate and identify the surrounding information such as at which level...... of the franchise unit will be located or is the franchise unit located is at the best level for visibility purposes. One of the common used analyses used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However...

  2. EarthFinder: A Precise Radial Velocity Survey Probe Mission of our Nearest Stellar Neighbors for Earth-Mass Habitable Zone Analogs Using High-Resolution UV-Vis-NIR Echelle Spectroscopy on a Space Platform

    Science.gov (United States)

    Plavchan, Peter; EarthFinder Team

    2018-01-01

    We are investigating the science case for a 1.0-1.4 meter space telescope to survey the closest, brightest FGKM main sequence stars to search for Habitable Zone (HZ) Earth analogs using the precise radial velocity (PRV) technique at a precision of 1-10 cm/s. Our baseline instrument concept uses two diffraction-limited spectrographs operating in the 0.4-1.0 microns and 1.0-2.4 microns spectral regions each with a spectral resolution of R=150,000~200,000, with the possibility of a third UV arm. Because the instrument utilizes a diffraction-limited input beam, the spectrograph would be extremely compact, less than 50 cm on a side, and illumination can be stabilized with the coupling of starlight into single mode fibers. With two octaves of wavelength coverage and a cadence unimpeded by any diurnal, seasonal, and atmospheric effects, EarthFinder will offer a unique platform for recovering stellar activity signals from starspots, plages, granulation, etc. to detect exoplanets at velocity semi-amplitudes currently not obtainable from the ground. Variable telluric absorption and emission lines may potentially preclude achieving PRV measurements at or below 10 cm/s in the visible and advantage compared to an annual ~3-6 month observing season from the ground for mitigating stellar activity and detecting the orbital periods of HZ Earth-mass analogs (e.g. ~6-months to ~2 years). Finally, we are compiling a list of ancillary science cases for the observatory, ranging from asteroseismology to the direct measurement of the expansion of the Universe.

  3. A nearest-neighbour discretisation of the regularized stokeslet boundary integral equation

    Science.gov (United States)

    Smith, David J.

    2018-04-01

    The method of regularized stokeslets is extensively used in biological fluid dynamics due to its conceptual simplicity and meshlessness. This simplicity carries a degree of cost in computational expense and accuracy because the number of degrees of freedom used to discretise the unknown surface traction is generally significantly higher than that required by boundary element methods. We describe a meshless method based on nearest-neighbour interpolation that significantly reduces the number of degrees of freedom required to discretise the unknown traction, increasing the range of problems that can be practically solved, without excessively complicating the task of the modeller. The nearest-neighbour technique is tested against the classical problem of rigid body motion of a sphere immersed in very viscous fluid, then applied to the more complex biophysical problem of calculating the rotational diffusion timescales of a macromolecular structure modelled by three closely-spaced non-slender rods. A heuristic for finding the required density of force and quadrature points by numerical refinement is suggested. Matlab/GNU Octave code for the key steps of the algorithm is provided, which predominantly use basic linear algebra operations, with a full implementation being provided on github. Compared with the standard Nyström discretisation, more accurate and substantially more efficient results can be obtained by de-refining the force discretisation relative to the quadrature discretisation: a cost reduction of over 10 times with improved accuracy is observed. This improvement comes at minimal additional technical complexity. Future avenues to develop the algorithm are then discussed.

  4. Supergalactic studies. II. Supergalactic distribution of the nearest intergalactic gas clouds

    International Nuclear Information System (INIS)

    de Vaucouleurs, G.; Corwin, H.G. Jr.

    1975-01-01

    The report by Mathewson, Cleary, and Murray that the nearby "high velocity" H I clouds, and in particular the Magellanic Stream, are strongly concentrated toward the supergalactic plane is confirmed. The observed concentration within ±30° from the supergalactic equator of 21 out of 25 clouds in the north galactic hemisphere and 27 out of 31 clouds in the south galactic hemisphere could occur by chance in less than 7 and 3 percent of random samples from a population having a statistically isotropic Poisson distribution. Since the two galactic hemispheres are substantially independent samples, the combined probability of the chance hypothesis is P -3 . It is found that actually the high-velocity clouds are not so much concentrated toward the supergalactic equator (SGE) as toward the equator of the "Local Cloud" of galaxies inclined 14° to the main supergalactic plane. Both galaxies and H I clouds define the same small circle of maximum concentration and exhibit the same standard deviation (15°) from it, demonstrating closely related space distributions. It is concluded that, with the possible exception of a few of the largest and probably nearest cloud complexes (MS, AC, C), most of the high-velocity clouds are truly intergalactic and associated with the Local Group and nearer groups of galaxies. Half the population in a total sample of 115 nearby galaxies and intergalactic gas clouds is within 11° from the Local equator, indicating a half-thickness of ≈0.75 Mpc for the Local Cloud. Intergalactic gas clouds have already been identified near 10 of the nearest galaxies (including our Galaxy and the Magellanic Clouds), most within ≈3 Mpc. The estimated space density of intergalactic gas clouds is N ≈ 20–25 Mpc$^{-3}$, in approximate agreement with the densities required by the collision theory of ring galaxies.

  5. Phagocytic response of astrocytes to damaged neighboring cells.

    Directory of Open Access Journals (Sweden)

    Nicole M Wakida

    This study aims to understand the phagocytic response of astrocytes to the injury of neurons or other astrocytes at the single-cell level. Laser nanosurgery was used to damage individual cells in both primary mouse cortical astrocytes and an established astrocyte cell line. In both cases, the release of material from laser-irradiated astrocytes or neurons induced a phagocytic response in nearby astrocytes. Propidium iodide-stained DNA originating from irradiated cells was visible in vesicles of neighboring cells, confirming phagocytosis of material from damaged cortical cells. In the presence of an intracellular pH indicator dye, newly formed vesicles corresponded to acidic pH fluorescence, suggesting lysosome-bound degradation of cellular debris. Cells with shared membrane connections prior to laser damage had a significantly higher frequency of induced phagocytosis than isolated cells with no shared membrane. The increase in phagocytic response of cells with a shared membrane occurred regardless of the extent of shared membrane (a thin filopodial connection vs. a cell cluster with significant shared membrane). In addition to the presence (or lack) of a membrane connection, variation in phagocytic ability was also observed with differences in injury location within the cell and in the distance separating isolated astrocytes. These results demonstrate the ability of an astrocyte to respond to the damage of a single cell, be it another astrocyte or a neuron. This single-cell level of analysis gives a better understanding of the role of astrocytes in maintaining homeostasis in the CNS, particularly in the sensing and removal of debris in damaged or pathologic nervous tissue.

  6. Correlation of optical energy gap with the nearest neighbour short range order in amorphous V2O5 films

    International Nuclear Information System (INIS)

    Dhawan, Sahil; Vedeshwar, Agnikumar G; Tandon, R P

    2011-01-01

    The optical and structural properties of well-characterized vacuum-evaporated amorphous V2O5 films were studied in the thickness range 5-500 nm. The structural analyses show that the V-O, O-O and V-V nearest neighbour distances defining the short range order vary nonlinearly with film thickness. The optical absorption shows a thickness-dependent energy gap (E_g), and the nonlinear behaviour of the thickness-dependent E_g is similar to that of the nearest neighbour distances with film thickness. E_g correlates linearly very well with all three nearest neighbour distances. The variation of E_g with film thickness is attributed to the residual stress in the film, which causes the changes in short range order. The change in E_g corresponding to the change in V-O distance was found to be 35 eV nm^-1. This change is almost three times that with V-V distance.

  7. Neighbors Based Discriminative Feature Difference Learning for Kinship Verification

    DEFF Research Database (Denmark)

    Duan, Xiaodong; Tan, Zheng-Hua

    2015-01-01

    In this paper, we present a discriminative feature difference learning method for facial image based kinship verification. To transform feature difference of an image pair to be discriminative for kinship verification, a linear transformation matrix for feature difference between an image pair...... than the commonly used feature concatenation, leading to a low complexity. Furthermore, there is no positive semi-definite constraint on the transformation matrix, as there is in metric learning methods, leading to an easy solution for the transformation matrix. Experimental results on two public...... databases show that the proposed method combined with an SVM classification method outperforms or is comparable to state-of-the-art kinship verification methods. © Springer International Publishing AG, Part of Springer Science+Business Media...

  8. Detect thy neighbor: Identity recognition at the root level in plants

    NARCIS (Netherlands)

    Chen, B.J.W.; During, H.J.; Anten, N.P.R.

    2012-01-01

    Some plant species increase root allocation at the expense of reproduction in the presence of non-self and non-kin neighbors, indicating the capacity of neighbor-identity recognition at the root level. Yet in spite of the potential consequences of root identity recognition for the relationship between

  9. Working with Family, Friend, and Neighbor Caregivers: Lessons from Four Diverse Communities

    Science.gov (United States)

    Powell, Douglas R.

    2011-01-01

    This article is excerpted from "Who's Watching the Babies? Improving the Quality of Family, Friend, and Neighbor Care" by Douglas R. Powell ("ZERO TO THREE," 2008). The article explores questions about program development and implementation strategies for supporting Family, Friend, and Neighbor (FFN) caregivers: How do programs and their host…

  10. Characteristics of Broadband Seismic Noise in Taiwan and Neighboring Islands

    Science.gov (United States)

    Chen, Ching-Wei; Rau, Ruey-Juin

    2017-04-01

    We used seismic waveform data from 115 broad-band stations of BATS (Institute of Earth Science, Academia Sinica) and the Central Weather Bureau Seismic Network from 2012 to 2016 for noise-level mapping in Taiwan and neighboring islands. We computed the Power Spectral Density (PSD) for each station and analyzed the long-term variance of microseism energy and the polarization of noise during severe weather events. The island of Taiwan is surrounded by ocean, and the Central Range, whose highest peak, Jade Mountain, rises to 3,952 meters, occupies more than 66% of the island and divides it into east and west coasts. This geographic setting results in high population density in the western plain and northern Taiwan. The dominant noise source in the microseism band (periods of 4-20 seconds) is the coupling between the near-coast ocean and the sea floor, which produces high noise levels averaging -130 dB along the west coastal area. In the eastern volcanic-arc coastal areas, the noise level is about 7% smaller than on the west coast because of the deeper offshore water. In the shorter-period band (0.1-0.25 seconds), the so-called cultural noise, anthropogenic activity produces the highest levels, up to -103 dB, in metropolitan areas such as Taipei City, whereas the noise level in the Central Range area averages -138 dB. Moreover, the noise shows a daily and temporal evolution mainly related to traffic. Furthermore, we determined the noise level for the entire island of Taiwan during 26-28 September 2016, when Typhoon Megi hit the island, and retrieved the enhancement of secondary microseism energy for each station. Typhoon Megi made landfall in eastern and central Taiwan and reached a maximum wind speed of 45 m/s in the surrounding eyewall. The Central Range, acting as a barrier, decreased the wind speed in southern Taiwan, limiting the enhancement to less than 10 dB, while in northern Taiwan, toward which the typhoon was heading, the enhancement can reach more than 35

  11. Accelerating distributed average consensus by exploring the information of second-order neighbors

    Energy Technology Data Exchange (ETDEWEB)

    Yuan Deming [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Xu Shengyuan, E-mail: syxu02@yahoo.com.c [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Zhao Huanyu [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Chu Yuming [Department of Mathematics, Huzhou Teacher's College, Huzhou 313000, Zhejiang (China)]

    2010-05-17

    This Letter addresses the problem of accelerating distributed average consensus by using the information of second-order neighbors in both the discrete- and continuous-time cases. In both cases, when the information of second-order neighbors is used in each iteration, the network converges faster than under an algorithm that uses only the information of first-order neighbors. Moreover, the problem of using partial information of second-order neighbors is considered, in which the edges are not chosen randomly from the second-order neighbors. In the continuous-time case, the edges are chosen by solving a convex optimization problem formed by a convex relaxation method. In the discrete-time case, for small networks the edges are chosen optimally by brute force. Finally, simulation examples are provided to demonstrate the effectiveness of the proposed algorithm.
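
    The discrete-time idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the update rule x_i <- x_i + eps * sum_j (x_j - x_i), the step size eps = 1/(max degree + 1), the 8-node path graph and the helper names are all assumptions chosen for illustration. The sketch only compares how many iterations plain averaging needs with and without two-hop (second-order) neighbors.

```python
def two_hop_edges(adj):
    """Return a copy of adj in which every node is also linked to its
    second-order (two-hop) neighbors. adj is a list of sets."""
    aug = [set(nbrs) for nbrs in adj]
    for i, nbrs in enumerate(adj):
        for j in nbrs:
            for k in adj[j]:
                if k != i:
                    aug[i].add(k)
    return aug


def consensus_iterations(adj, x0, tol=1e-6, max_iter=100000):
    """Run x_i <- x_i + eps * sum_j (x_j - x_i) until every state is
    within tol of the initial average. Returns (iterations, final states)."""
    n = len(adj)
    # Assumed step size: 1 / (max degree + 1) keeps the update stable.
    eps = 1.0 / (max(len(nbrs) for nbrs in adj) + 1)
    target = sum(x0) / n
    x = list(x0)
    for it in range(max_iter):
        if max(abs(v - target) for v in x) < tol:
            return it, x
        x = [x[i] + eps * sum(x[j] - x[i] for j in adj[i]) for i in range(n)]
    return max_iter, x


# Hypothetical test network: a path graph with 8 nodes.
n = 8
path = [{j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)]
x0 = [float(i) for i in range(n)]

iters_first, x_first = consensus_iterations(path, x0)
iters_second, x_second = consensus_iterations(two_hop_edges(path), x0)
print("first-order neighbors only:", iters_first, "iterations")
print("with second-order neighbors:", iters_second, "iterations")
```

    On this small example the augmented graph reaches the tolerance in fewer iterations, which matches the qualitative claim of the abstract; the optimal edge selection described there is not attempted here.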

  12. Neighboring trees affect ectomycorrhizal fungal community composition in a woodland-forest ecotone.

    Science.gov (United States)

    Hubert, Nathaniel A; Gehring, Catherine A

    2008-09-01

    Ectomycorrhizal fungi (EMF) are frequently species rich and functionally diverse; yet, our knowledge of the environmental factors that influence local EMF diversity and species composition remains poor. In particular, little is known about the influence of neighboring plants on EMF community structure. We tested the hypothesis that the EMF of plants with heterospecific neighbors would differ in species richness and community composition from the EMF of plants with conspecific neighbors. We conducted our study at the ecotone between pinyon (Pinus edulis)-juniper (Juniperus monosperma) woodland and ponderosa pine (Pinus ponderosa) forest in northern Arizona, USA where the dominant trees formed associations with either EMF (P. edulis and P. ponderosa) or arbuscular mycorrhizal fungi (AMF; J. monosperma). We also compared the EMF communities of pinyon and ponderosa pines where their rhizospheres overlapped. The EMF community composition, but not species richness of pinyon pines was significantly influenced by neighboring AM juniper, but not by neighboring EM ponderosa pine. Ponderosa pine EMF communities were different in species composition when growing in association with pinyon pine than when growing in association with a conspecific. The EMF communities of pinyon and ponderosa pines were similar where their rhizospheres overlapped consisting of primarily the same species in similar relative abundance. Our findings suggest that neighboring tree species identity shaped EMF community structure, but that these effects were specific to host-neighbor combinations. The overlap in community composition between pinyon pine and ponderosa pine suggests that these tree species may serve as reservoirs of EMF inoculum for one another.

  13. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Given the explosive growth of data, the growing volume of text places higher demands on text categorization performance, demands that existing classification methods cannot meet. Based on a study of existing text classification technology and semantics, this paper puts forward a Chinese-text-oriented classification algorithm, SAW (Structural Auxiliary Word). The algorithm uses the special space effect of Chinese text where words...

  14. Distribution Route Planning of Clean Coal Based on Nearest Insertion Method

    Science.gov (United States)

    Wang, Yunrui

    2018-01-01

    Clean coal technology has advanced over several decades, but its distribution has received little research attention. Distribution efficiency directly affects the overall development of clean coal technology, and rational route planning is the key to improving it. This paper studies a clean coal distribution system built in a county. A survey of customer demand, distribution routes and distribution vehicles in previous years found that vehicles were deployed by experience alone and that the number of vehicles used each day varied, which wasted transport capacity and increased energy consumption. A mathematical model with the shortest path as its objective function was therefore established, and the distribution route was re-planned using an improved nearest-insertion method. The results showed that the transportation distance was reduced by 37 km, the number of vehicles used fell from a past average of 5 to a fixed 4 per day, and the actual vehicle load increased by 16.25% while the distribution volume stayed the same. This achieved efficient distribution of clean coal, saving energy and reducing consumption.
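
    For readers unfamiliar with the heuristic, the following is a minimal sketch of the textbook nearest-insertion method for a closed route. It is not the improved variant used in the paper, whose details the abstract does not give; the function names and the sample coordinates are hypothetical.

```python
import math


def nearest_insertion(points, start=0):
    """Build a closed tour over points (list of (x, y) tuples), starting
    from index start, with the textbook nearest-insertion heuristic."""
    def dist(a, b):
        return math.dist(points[a], points[b])

    remaining = set(range(len(points))) - {start}
    tour = [start]
    while remaining:
        # Selection: the unvisited point closest to any point on the tour.
        node = min(remaining, key=lambda j: min(dist(j, t) for t in tour))

        # Insertion: the tour edge (tour[i], tour[i + 1]) whose replacement
        # by two edges through node adds the least length.
        def added_length(i):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            return dist(a, node) + dist(node, b) - dist(a, b)

        best = min(range(len(tour)), key=added_length)
        tour.insert(best + 1, node)
        remaining.remove(node)
    return tour


def tour_length(points, tour):
    """Length of the closed tour, including the return to the first point."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Hypothetical depot (index 0) and customer coordinates, in km.
sites = [(0, 0), (4, 1), (1, 3), (5, 4), (2, 6)]
route = nearest_insertion(sites)
print(route, round(tour_length(sites, route), 2))
```

    Nearest insertion is a constructive heuristic: it gives a reasonable route quickly but does not guarantee the shortest one.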

  15. Nearest greedy for solving the waste collection vehicle routing problem: A case study

    Science.gov (United States)

    Mat, Nur Azriati; Benjamin, Aida Mauziah; Abdul-Rahman, Syariza; Wibowo, Antoni

    2017-11-01

    This paper presents a real case study of waste collection in the northern part of Malaysia, solved with a constructive heuristic algorithm known as the Nearest Greedy (NG) technique. This technique has been widely used to devise initial solutions for vehicle routing problems. The waste collection cycle involves the following steps: i) each vehicle starts from a depot, ii) visits a number of customers to collect waste, iii) unloads waste at the disposal site, and lastly, iv) returns to the depot. The sample data set used in this paper consisted of six areas, where each area involved up to 103 customers. The NG technique was employed to construct an initial route for each area. The solution produced by the technique was compared with the present vehicle routes implemented by a waste collection company within the city. The comparison showed that NG offered better vehicle routes, with an 11.07% reduction in the total distance traveled compared with the present vehicle routes.
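
    The four-step cycle above can be sketched as a greedy route construction. This is an illustrative reading, not the paper's implementation: the vehicle capacity, the rule of driving to the disposal site whenever no remaining customer fits, and all names and coordinates are assumptions.

```python
import math


def nearest_greedy_route(depot, disposal, customers, capacity):
    """Greedy route for one vehicle.

    customers maps a name to ((x, y), demand). The vehicle always drives to
    the nearest unvisited customer whose waste still fits; when none fits it
    unloads at the disposal site (assumed rule). Returns the stop names.
    """
    route = ["depot"]
    position, load = depot, 0.0
    unvisited = dict(customers)
    while unvisited:
        feasible = [c for c, (_, demand) in unvisited.items()
                    if load + demand <= capacity]
        if not feasible:
            if load == 0:
                raise ValueError("a customer's demand exceeds the capacity")
            route.append("disposal")  # unload, then keep collecting
            position, load = disposal, 0.0
            continue
        nearest = min(feasible,
                      key=lambda c: math.dist(position, unvisited[c][0]))
        position, demand = unvisited.pop(nearest)
        load += demand
        route.append(nearest)
    # Steps iii) and iv): final unload, then return to the depot.
    route += ["disposal", "depot"]
    return route


# Hypothetical data: three customers on a line, capacity of 8 units.
stops = {"A": ((1, 0), 4), "B": ((2, 0), 4), "C": ((3, 0), 4)}
print(nearest_greedy_route((0, 0), (10, 0), stops, capacity=8))
```

    A route built this way is usually treated as an initial solution that an improvement heuristic can refine afterwards.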

  16. Sleep disturbances in children with epilepsy compared with their nearest-aged siblings.

    Science.gov (United States)

    Wirrell, Elaine; Blackman, Marlene; Barlow, Karen; Mah, Jean; Hamiwka, Lorie

    2005-11-01

    The aim of the study was to compare sleep patterns in children with epilepsy with those of their non-epileptic siblings and to determine which epilepsy-specific factors predict greater sleep disturbance. We conducted a case-control study of 55 children with epilepsy (mean age 10y, range 4 to 16y; 27 males, 28 females) and their nearest-aged non-epileptic sibling (mean age 10y, range 4 to 18y; 26 males, 29 females). Epilepsy was idiopathic generalized in eight children (15%), symptomatic generalized in seven (13%), and focal in 40 (73%); the mean duration was 5 years 8 months. Parents or caregivers completed the Sleep Behavior Questionnaire (SBQ) and Child Behavior Checklist (CBCL) for patients and controls, and the Quality of Life in Childhood Epilepsy (QOLCE) for patients. Patients had a higher (more adverse) Total Sleep score (p<0.001) and scored worse than controls on nearly all subscales of the SBQ. In patients, higher Total Sleep scores were correlated with higher scores on the Withdrawn, Somatic complaints, Social problems, and Attention subscales of the CBCL, and significantly lower Total Quality of Life Scores. Refractory epilepsy, mental retardation, and remote symptomatic etiology predicted greater sleep problems in those with epilepsy. We conclude that children with epilepsy in this current study had significantly greater sleep problems than their non-epileptic siblings.

  17. THE MOLECULAR WIND IN THE NEAREST SEYFERT GALAXY CIRCINUS REVEALED BY ALMA

    Energy Technology Data Exchange (ETDEWEB)

    Zschaechner, Laura K.; Walter, Fabian; Farina, Emanuele P.; Kruijssen, J. M. Diederik [Max Planck Institute für Astronomie—Königstuhl 17, D-69117 Heidelberg (Germany); Bolatto, Alberto; Veilleux, Sylvain [Department of Astronomy and Joint Space Science Institute, University of Maryland, College Park, MD 20642 (United States); Leroy, Adam [Department of Astronomy, The Ohio State University, 140 West 18th Avenue, Columbus, OH 43210 (United States); Meier, David S. [Department of Physics, New Mexico Institute of Mining and Technology, 801 Leroy Place, Socorro, NM 87801 (United States); Ott, Jürgen, E-mail: zschaechner@mpia.de [National Radio Astronomy Observatory—P.O. Box O, 1003 Lopezville Road, Socorro, NM 87801 (United States)

    2016-12-01

    We present ALMA observations of the inner 1′ (1.2 kpc) of the Circinus galaxy, the nearest Seyfert. We target CO (1–0) in the region associated with a well-known multiphase outflow driven by the central active galactic nucleus (AGN). While the geometry of Circinus and its outflow make disentangling the latter difficult, we see indications of outflowing molecular gas at velocities consistent with the ionized outflow. We constrain the mass of the outflowing molecular gas to be 1.5 × 10^5–5.1 × 10^6 M_⊙, yielding a molecular outflow rate of 0.35–12.3 M_⊙ yr^−1. The values within this range are comparable to the star formation (SF) rate in Circinus, indicating that the outflow indeed regulates SF to some degree. The molecular outflow in Circinus is considerably lower in mass and energetics than previously studied AGN-driven outflows, especially given its high ratio of AGN luminosity to bolometric luminosity. The molecular outflow in Circinus is, however, consistent with some trends put forth by Cicone et al., including a linear relation between kinetic power and AGN luminosity, as well as its momentum rate versus bolometric luminosity (although the latter places Circinus among the starburst galaxies in that sample). We detect additional molecular species including CN and C^17O.

  18. Development of K-Nearest Neighbour Regression Method in Forecasting River Stream Flow

    Directory of Open Access Journals (Sweden)

    Mohammad Azmi

    2012-07-01

    Different statistical, non-statistical and black-box methods have been used in forecasting. Among statistical methods, the K-nearest neighbour non-parametric regression method (K-NN) is one of the recommended methods because of its simplicity and mathematical basis. In this study, the K-NN method is explained in full. In addition, development and improvement approaches such as best neighbour estimation, data transformation functions, distance functions and a proposed extrapolation method are described. The K-NN method, together with these development approaches, is used for streamflow forecasting in the upper basin of the Zayandeh-Rud Dam. A comparison of the final results of the classic K-NN method and the modified K-NN (5 neighbours, Range Scaling transformation function, Mahalanobis distance function and the proposed extrapolation method) shows that, on the criteria of goodness of fit, root mean square error, percentage of volume error and correlation, the modified K-NN improved performance by 45%, 59% and 17%, respectively. These results confirm the need to apply the described approaches to derive more accurate forecasts.
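
    As a rough illustration of the classic method, the sketch below implements plain K-NN regression with range scaling of the predictors. It is a simplified stand-in rather than the modified method of the study: it uses Euclidean instead of Mahalanobis distance, an unweighted mean of the neighbours, and no extrapolation step. The function names and the sample data are hypothetical.

```python
import math


def make_range_scaler(rows):
    """Return a function that maps each feature to [0, 1] using the
    minimum and maximum seen in rows (range scaling)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    def scale(row):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return scale


def knn_regress(train_x, train_y, query, k=5):
    """Predict the mean target of the k training rows nearest to query,
    using Euclidean distance on range-scaled features."""
    scale = make_range_scaler(train_x)
    q = scale(query)
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(scale(pair[0]), q))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical predictors (e.g. previous-month flow, rainfall) and flows.
history_x = [[10, 2], [12, 3], [30, 8], [28, 7], [15, 4], [50, 12]]
history_y = [11.0, 13.0, 31.0, 29.0, 16.0, 52.0]
print(knn_regress(history_x, history_y, [14, 3], k=3))
```

    Because the prediction is an average of observed targets, this plain form cannot forecast beyond the range of the training data, which is presumably why the study proposes an extrapolation method.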

  19. Association between distance to nearest supermarket and provision of fruits and vegetables in English nurseries.

    Science.gov (United States)

    Burgoine, Thomas; Gallis, John A; L Penney, Tarra; Monsivais, Pablo; Benjamin Neelon, Sara E

    2017-07-01

    With 796,500 places available for children in England, pre-school nurseries could serve as an important setting for population-wide dietary intervention. It is critical to understand the determinants of healthy food provision in this setting, which may include access to food stores. This study examined the association between objective, GIS-derived supermarket proximity and fruit and vegetable serving frequency, using data from 623 English nurseries. Overall, 116 (18%) nurseries served fruits and vegetables infrequently (supermarket proximity. In adjusted multivariable regression models, nurseries farthest from their nearest supermarket (Q5, 1.7-19.8 km) had 2.38 (95% CI 1.01-5.63) greater odds of infrequent provision. Our results suggest that supermarket access may be important for nurseries in meeting fruit and vegetable provision guidelines. We advance a growing body of international literature, for the first time linking the food practices of institutions to their neighbourhood food retail context. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. RINGED SUBSTRUCTURE AND A GAP AT 1 au IN THE NEAREST PROTOPLANETARY DISK

    Energy Technology Data Exchange (ETDEWEB)

    Andrews, Sean M.; Wilner, David J.; Bai, Xue-Ning; Öberg, Karin I.; Ricci, Luca [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Zhu, Zhaohuan [Department of Astrophysical Sciences, Princeton University, 4 Ivy Lane, Peyton Hall, Princeton, NJ 08544 (United States); Birnstiel, Tilman [Max-Planck-Institut für Astronomie, Königstuhl 17, D-69117 Heidelberg (Germany); Carpenter, John M. [Joint ALMA Observatory (JAO), Alonso de Cordova 3107, Vitacura-Santiago de Chile (Chile); Pérez, Laura M. [Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, D-53121 Bonn (Germany); Hughes, A. Meredith [Department of Astronomy, Wesleyan University, Van Vleck Observatory, 96 Foss Hill Drive, Middletown, CT 06457 (United States); Isella, Andrea, E-mail: sandrews@cfa.harvard.edu [Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, TX 77005 (United States)

    2016-04-01

    We present long baseline Atacama Large Millimeter/submillimeter Array (ALMA) observations of the 870 μm continuum emission from the nearest gas-rich protoplanetary disk, around TW Hya, that trace millimeter-sized particles down to spatial scales as small as 1 au (20 mas). These data reveal a series of concentric ring-shaped substructures in the form of bright zones and narrow dark annuli (1–6 au) with modest contrasts (5%–30%). We associate these features with concentrations of solids that have had their inward radial drift slowed or stopped, presumably at local gas pressure maxima. No significant non-axisymmetric structures are detected. Some of the observed features occur near temperatures that may be associated with the condensation fronts of major volatile species, but the relatively small brightness contrasts may also be a consequence of magnetized disk evolution (the so-called zonal flows). Other features, particularly a narrow dark annulus located only 1 au from the star, could indicate interactions between the disk and young planets. These data signal that ordered substructures on ∼au scales can be common, fundamental factors in disk evolution and that high-resolution microwave imaging can help characterize them during the epoch of planet formation.