WorldWideScience

Sample records for nearest neighbor classifier

  1. Frog sound identification using extended k-nearest neighbor classifier

    Science.gov (United States)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with the Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest-neighbor and mutual-neighborhood-sharing concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample one of their nearest neighbors. To evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forests. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation, and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results show that the EKNN classifier exhibits the best accuracy compared to the competing classifiers, KNN, FKNN, KGNN and MKNN, in all cases.

  2. Finger vein identification using fuzzy-based k-nearest centroid neighbor classifier

    Science.gov (United States)

    Rosdi, Bakhtiar Affendi; Jaafar, Haryati; Ramli, Dzati Athiar

    2015-02-01

    In this paper, a new approach for personal identification using finger vein image is presented. Finger vein is an emerging type of biometrics that attracts attention of researchers in biometrics area. As compared to other biometric traits such as face, fingerprint and iris, finger vein is more secured and hard to counterfeit since the features are inside the human body. So far, most of the researchers focus on how to extract robust features from the captured vein images. Not much research was conducted on the classification of the extracted features. In this paper, a new classifier called fuzzy-based k-nearest centroid neighbor (FkNCN) is applied to classify the finger vein image. The proposed FkNCN employs a surrounding rule to obtain the k-nearest centroid neighbors based on the spatial distributions of the training images and their distance to the test image. Then, the fuzzy membership function is utilized to assign the test image to the class which is frequently represented by the k-nearest centroid neighbors. Experimental evaluation using our own database which was collected from 492 fingers shows that the proposed FkNCN has better performance than the k-nearest neighbor, k-nearest-centroid neighbor and fuzzy-based-k-nearest neighbor classifiers. This shows that the proposed classifier is able to identify the finger vein image effectively.
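
    The surrounding rule mentioned above can be sketched as follows. This is a minimal illustration of the commonly described k-nearest centroid neighbor selection, not the authors' FkNCN implementation: the first neighbor is the training point closest to the query, and each later neighbor is the point whose inclusion keeps the centroid of the selected set closest to the query. The function names are hypothetical, Euclidean distance is an assumption, and the fuzzy membership step is omitted.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def k_nearest_centroid_neighbors(train, query, k):
    """Return indices of the k nearest centroid neighbors of `query`.

    At each step, pick the remaining training point that, together with
    the points already chosen, yields the centroid closest to the query.
    """
    remaining = list(range(len(train)))
    chosen = []
    for _ in range(min(k, len(train))):
        best = min(
            remaining,
            key=lambda i: math.dist(
                centroid([train[j] for j in chosen] + [train[i]]), query
            ),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Points 0 and 1 lie on one side of the query; point 2 lies on the other.
train = [(1.0, 0.0), (1.2, 0.0), (-1.5, 0.0), (0.0, 3.0)]
neighbors = k_nearest_centroid_neighbors(train, (0.0, 0.0), 2)
```

    In this toy data a plain 2-nearest-neighbor search would return points 0 and 1, both on the same side of the query, whereas the centroid criterion selects points 0 and 2, which surround it.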

  3. An Improvement To The k-Nearest Neighbor Classifier For ECG Database

    Science.gov (United States)

    Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul

    2018-03-01

    The k nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, the performance of kNN often degrades due to the lack of information on how the samples are distributed. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN concerns the weighting used in assigning the class label before classification. To address these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of sample distribution. Then, a surrounding rule is employed to obtain the nearest centroid neighbors based on the distributions of training samples and their distance to the query point. Subsequently, the fuzzy membership function is employed to assign the query point to the class label which is most frequently represented by the nearest centroid neighbors. Experimental studies on electrocardiogram (ECG) signals are carried out in this study. The classification performance is evaluated in two experimental steps, i.e., different values of k and different sizes of feature dimensions. Subsequently, a comparative study of the kNN, kNCN, FkNN and MFkNCN classifiers is conducted to evaluate the performance of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds that of kNN, kNCN and FkNN, with a best classification rate of 96.5%.
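
    As a rough illustration of the distance used above, the sketch below computes a Mahalanobis distance for two-dimensional features with an explicit 2x2 covariance inverse. It is an assumption-laden toy (hypothetical function names, two features only, sample covariance estimated from the training points) and does not reproduce the surrounding rule or the fuzzy membership step of MFkNCN.

```python
import math

def covariance_2d(points):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(a, b, cov):
    """sqrt((a-b)^T S^{-1} (a-b)) for a symmetric 2x2 covariance S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # must be non-zero for S to be invertible
    dx, dy = a[0] - b[0], a[1] - b[1]
    # S^{-1} = (1/det) * [[syy, -sxy], [-sxy, sxx]]
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# With variance 4 along x and 1 along y, a step of 2 in x counts the
# same as a step of 1 in y.
cov = (4.0, 0.0, 1.0)
d_x = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), cov)
d_y = mahalanobis_2d((0.0, 1.0), (0.0, 0.0), cov)
```

    Rescaling each direction by its spread is what lets the distance compensate for features with very different ranges; with an identity covariance it reduces to the Euclidean distance.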

  4. Error minimizing algorithms for nearest neighbor classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory]; Hush, Don [Los Alamos National Laboratory]; Zimmer, G. Beate [TEXAS A&M]

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  5. On Competitiveness of Nearest-Neighbor-Based Music Classification: A Methodological Critique

    DEFF Research Database (Denmark)

    Pálmason, Haukur; Jónsson, Björn Thór; Amsaleg, Laurent

    2017-01-01

    The traditional role of nearest-neighbor classification in music classification research is that of a straw man opponent for the learning approach of the hour. Recent work in high-dimensional indexing has shown that approximate nearest-neighbor algorithms are extremely scalable, yielding results...... of reasonable quality from billions of high-dimensional features. With such efficient large-scale classifiers, the traditional music classification methodology of aggregating and compressing the audio features is incorrect; instead the approximate nearest-neighbor classifier should be given an extensive data...... collection to work with. We present a case study, using a well-known MIR classification benchmark with well-known music features, which shows that a simple nearest-neighbor classifier performs very competitively when given ample data. In this position paper, we therefore argue that nearest...

  6. Haldane to Dimer Phase Transition in the Spin-1 Haldane System with Bond-Alternating Nearest-Neighbor and Uniform Next-Nearest-Neighbor Exchange Interactions

    OpenAIRE

    Takashi, Tonegawa; Makoto, Kaburagi; Takeshi, Nakao; Department of Physics, Faculty of Science, Kobe University; Faculty of Cross-Cultural Studies, Kobe University; Department of Physics, Faculty of Science, Kobe University

    1995-01-01

    The Haldane to dimer phase transition is studied in the spin-1 Haldane system with bond-alternating nearest-neighbor and uniform next-nearest-neighbor exchange interactions, where both interactions are antiferromagnetic and thus compete with each other. By using a method of exact diagonalization, the ground-state phase diagram on the ratio of the next-nearest-neighbor interaction constant to the nearest-neighbor one versus the bond-alternation parameter of the nearest-neighbor interactions is...

  7. A Hybrid Instance Selection Using Nearest-Neighbor for Cross-Project Defect Prediction

    Institute of Scientific and Technical Information of China (English)

    Duksan Ryu; Jong-In Jang; Jongmoon Baik

    2015-01-01

    Software defect prediction (SDP) is an active research field in software engineering to identify defect-prone modules. Thanks to SDP, limited testing resources can be effectively allocated to defect-prone modules. Although SDP requires sufficient local data within a company, there are cases where local data are not available, e.g., pilot projects. Companies without local data can employ cross-project defect prediction (CPDP) using external data to build classifiers. The major challenge of CPDP is the difference in distributions between training and test data. To tackle this, instances of source data similar to target data are selected to build classifiers. Software datasets have a class imbalance problem, meaning the ratio of the defective class to the clean class is very low. This usually lowers the performance of classifiers. We propose a Hybrid Instance Selection Using Nearest-Neighbor (HISNN) method that performs a hybrid classification, selectively learning local knowledge (via k-nearest neighbor) and global knowledge (via naïve Bayes). Instances having strong local knowledge are identified via nearest neighbors with the same class label. Previous studies showed low PD (probability of detection) or high PF (probability of false alarm), which is impractical to use. The experimental results show that HISNN produces high overall performance as well as high PD and low PF.

  8. Using K-Nearest Neighbor in Optical Character Recognition

    Directory of Open Access Journals (Sweden)

    Veronica Ong

    2016-03-01

    The growth in computer vision technology has aided society with various kinds of tasks. One of these tasks is recognizing text contained in an image, usually referred to as Optical Character Recognition (OCR). There are many kinds of algorithms that can be implemented in an OCR system. K-Nearest Neighbor is one such algorithm. This research aims to find out the process behind the OCR mechanism by using the K-Nearest Neighbor algorithm, one of the most influential machine learning algorithms. It also aims to find out how precise the algorithm is in an OCR program. To do that, a simple OCR program to classify capital letters of the alphabet is made to produce and compare real results. This research yielded a maximum of 76.9% accuracy with 200 training samples per letter. A set of reasons is also given as to why the program is able to reach said level of accuracy.

  9. Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm

    Directory of Open Access Journals (Sweden)

    E. Parvinnia

    2014-01-01

    Electroencephalogram (EEG) signals are often used to diagnose conditions such as seizures, Alzheimer's disease, and schizophrenia. One main problem with the recorded EEG samples is that they are not equally reliable due to artifacts at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue. It seems that adaptive classifiers can be useful for biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. This classification algorithm assigns a weight to each training sample to control its influence in classifying test samples. The weights of training samples are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects are analyzed for the classification of these two groups. Several features, including fractal dimension, band power and autoregressive (AR) model coefficients, are extracted from the EEG signals. The classification results are evaluated using leave-one-(subject)-out cross-validation for reliable estimation. The results indicate that the combination of WDNN and selected features can significantly outperform the basic nearest-neighbor and the other methods proposed in the past for the classification of these two groups. Therefore, this method can be a complementary tool for specialists to distinguish schizophrenia disorder.

  10. Mixed random walks with a trap in scale-free networks including nearest-neighbor and next-nearest-neighbor jumps

    Science.gov (United States)

    Zhang, Zhongzhi; Dong, Yuze; Sheng, Yibin

    2015-10-01

    Random walks including non-nearest-neighbor jumps appear in many real situations such as the diffusion of adatoms and have found numerous applications including PageRank search algorithm; however, related theoretical results are much less for this dynamical process. In this paper, we present a study of mixed random walks in a family of fractal scale-free networks, where both nearest-neighbor and next-nearest-neighbor jumps are included. We focus on trapping problem in the network family, which is a particular case of random walks with a perfect trap fixed at the central high-degree node. We derive analytical expressions for the average trapping time (ATT), a quantitative indicator measuring the efficiency of the trapping process, by using two different methods, the results of which are consistent with each other. Furthermore, we analytically determine all the eigenvalues and their multiplicities for the fundamental matrix characterizing the dynamical process. Our results show that although next-nearest-neighbor jumps have no effect on the leading scaling of the trapping efficiency, they can strongly affect the prefactor of ATT, providing insight into better understanding of random-walk process in complex systems.

  11. Scalable Nearest Neighbor Algorithms for High Dimensional Data.

    Science.gov (United States)

    Muja, Marius; Lowe, David G

    2014-11-01

    For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

  12. Lectures on the nearest neighbor method

    CERN Document Server

    Biau, Gérard

    2015-01-01

    This text presents a wide-ranging and rigorous overview of nearest neighbor methods, one of the most important paradigms in machine learning. Now in one self-contained volume, this book systematically covers key statistical, probabilistic, combinatorial and geometric ideas for understanding, analyzing and developing nearest neighbor methods. Gérard Biau is a professor at Université Pierre et Marie Curie (Paris). Luc Devroye is a professor at the School of Computer Science at McGill University (Montreal).

  13. Dimensional testing for reverse k-nearest neighbor search

    DEFF Research Database (Denmark)

    Casanova, Guillaume; Englmeier, Elias; Houle, Michael E.

    2017-01-01

    Given a query object q, reverse k-nearest neighbor (RkNN) search aims to locate those objects of the database that have q among their k-nearest neighbors. In this paper, we propose an approximation method for solving RkNN queries, where the pruning operations and termination tests are guided...... by a characterization of the intrinsic dimensionality of the data. The method can accommodate any index structure supporting incremental (forward) nearest-neighbor search for the generation and verification of candidates, while avoiding impractically-high preprocessing costs. We also provide experimental evidence...
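
    The RkNN definition in the first sentence can be made concrete with an exhaustive baseline. The sketch below is only the brute-force definition, not the approximation method the paper proposes; the function name is hypothetical and Euclidean distance is an assumption.

```python
import math

def reverse_knn(database, q, k):
    """Indices of database objects that have q among their k nearest neighbors.

    For each object o, q is one of its k nearest neighbors exactly when
    fewer than k other database objects are strictly closer to o than q is.
    """
    result = []
    for i, o in enumerate(database):
        d_q = math.dist(o, q)
        closer = sum(
            1
            for j, p in enumerate(database)
            if j != i and math.dist(o, p) < d_q
        )
        if closer < k:
            result.append(i)
    return result

database = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
r1 = reverse_knn(database, (2.0, 0.0), 1)
r2 = reverse_knn(database, (2.0, 0.0), 2)
```

    This baseline costs a quadratic number of distance computations per query, which is the kind of expense that pruning and early termination are meant to avoid.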

  14. Fast Most Similar Neighbor (MSN) classifiers for Mixed Data

    OpenAIRE

    Hernández Rodríguez, Selene

    2010-01-01

    The k nearest neighbor (k-NN) classifier has been extensively used in Pattern Recognition because of its simplicity and its good performance. However, in large datasets applications, the exhaustive k-NN classifier becomes impractical. Therefore, many fast k-NN classifiers have been developed; most of them rely on metric properties (usually the triangle inequality) to reduce the number of prototype comparisons. Hence, the existing fast k-NN classifiers are applicable only when the comparison f...
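
    One common way in which the triangle inequality reduces prototype comparisons is pivot-based pruning, sketched below. This is a generic illustration under stated assumptions (a single pivot, Euclidean distance, hypothetical function name), not the method of this thesis: since |d(q, pivot) - d(p, pivot)| is a lower bound on d(q, p), a prototype whose bound already reaches the best distance found so far can be skipped without computing d(q, p).

```python
import math

def nn_with_pivot(prototypes, pivot, pivot_dists, query):
    """1-NN search that skips prototypes ruled out by the triangle inequality.

    `pivot_dists[i]` must hold the precomputed distance from prototypes[i]
    to `pivot`. Returns (index, distance, number of distances computed).
    """
    d_qp = math.dist(query, pivot)
    best_i, best_d, computed = None, math.inf, 0
    for i, p in enumerate(prototypes):
        lower_bound = abs(d_qp - pivot_dists[i])
        if lower_bound >= best_d:
            continue  # cannot beat the current best; no distance computed
        computed += 1
        d = math.dist(query, p)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, computed

prototypes = [(1.0, 0.0), (10.0, 0.0), (20.0, 0.0), (0.0, 2.0)]
pivot = (0.0, 0.0)
pivot_dists = [math.dist(p, pivot) for p in prototypes]  # done once, offline
idx, dist, n_computed = nn_with_pivot(prototypes, pivot, pivot_dists, (1.5, 0.0))
```

    The bound is valid only when the comparison function is a metric, which is why such accelerations do not carry over directly to arbitrary dissimilarity functions.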

  15. Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier

    Science.gov (United States)

    Allen, Victoria W; Shirasu-Hiza, Mimi

    2018-01-01

    Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual control model in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila. PMID:29485401

  16. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Science.gov (United States)

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  17. Secure Nearest Neighbor Query on Crowd-Sensing Data

    Directory of Open Access Journals (Sweden)

    Ke Cheng

    2016-09-01

    Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, sensing data terminals, as the data owners, are numerous and mistrustful; on the other hand, in most cases the terminals find it difficult to complete many security operations due to computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed from protocols of secure two-party computation and a secure Voronoi diagram algorithm. It not only preserves data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes.

  18. The Islands Approach to Nearest Neighbor Querying in Spatial Networks

    DEFF Research Database (Denmark)

    Huang, Xuegang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2005-01-01

    , and versatile approach to k nearest neighbor computation that obviates the need for using several k nearest neighbor approaches for supporting a single service scenario. The experimental comparison with the existing techniques uses real-world road network data and considers both I/O and CPU performance...

  19. Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio

    Science.gov (United States)

    Nababan, A. A.; Sitompul, O. S.; Tulus

    2018-04-01

    K-Nearest Neighbor (KNN) is a good classifier, but several studies have found that its accuracy is still lower than that of other methods. One cause of this low accuracy is that every attribute has the same effect on the classification process, so less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio, in which the Gain Ratio serves as a parameter for assessing the correlation between each attribute and the data, and as the basis for weighting each attribute of the dataset. The accuracy of the results is compared to the accuracy obtained from the original KNN method using 10-fold cross-validation on several datasets from the UCI Machine Learning Repository and the KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the test results, the proposed method was able to increase the classification accuracy of KNN, where the highest difference in accuracy, obtained on the hayes-roth dataset, is 12.73%, and the lowest difference, obtained on the abalone dataset, is 0.07%. Averaged over all datasets, the accuracy increases by 5.33%.
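
    The weighting idea can be sketched in two parts: a gain ratio computed per attribute, and a distance in which each attribute's contribution is scaled by its weight. This is one plausible reading, not the authors' exact procedure; the function names are hypothetical, the gain ratio is shown for categorical (or already discretized) values, and using the weights inside a squared Euclidean sum is an assumption.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of hashable values."""
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of an attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute takes a single value and carries no information
    return (entropy(labels) - conditional) / split_info

def weighted_distance(a, b, weights):
    """Euclidean distance with each squared difference scaled by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

labels = [0, 0, 1, 1]
informative = gain_ratio(["a", "a", "b", "b"], labels)    # separates the classes
uninformative = gain_ratio(["a", "b", "a", "b"], labels)  # tells nothing
d = weighted_distance((0.0, 0.0), (3.0, 4.0), (informative, uninformative))
```

    With these weights the second attribute drops out of the distance entirely, so neighbors are chosen on the informative attribute alone.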

  20. Credit scoring analysis using weighted k nearest neighbor

    Science.gov (United States)

    Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.

    2018-05-01

    Credit scoring is a quantitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether the applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience of customers with similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment by considering the use of several kernels. We use credit data from a private bank in Indonesia. The results show that the Gaussian kernel and the rectangular kernel perform better in terms of the percentage of correctly classified cases, with a value of 82.4%.
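
    A kernel-weighted vote of the kind reviewed above can be sketched as follows. The details are assumptions rather than the paper's setup: distances are Euclidean, they are scaled by the distance to the (k+1)-th neighbor so that kernel arguments fall in [0, 1], and the function names and toy labels are hypothetical.

```python
import math
from collections import defaultdict

def gaussian_kernel(u):
    """Weight that decays smoothly with the scaled distance u."""
    return math.exp(-0.5 * u * u)

def rectangular_kernel(u):
    """Equal weight for every neighbor, i.e. plain majority voting."""
    return 1.0 if abs(u) <= 1.0 else 0.0

def weighted_knn_predict(train, labels, query, k, kernel):
    """Predict the label with the largest kernel-weighted vote among k neighbors."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    neighbors = order[:k]
    # Scale by the (k+1)-th neighbor distance when it exists, else by the k-th.
    ref = order[k] if len(order) > k else neighbors[-1]
    scale = math.dist(train[ref], query) or 1.0
    votes = defaultdict(float)
    for i in neighbors:
        votes[labels[i]] += kernel(math.dist(train[i], query) / scale)
    return max(votes, key=votes.get)

# Two very close "good" cases versus three distant "bad" ones.
train = [(0.05,), (0.1,), (2.8,), (2.9,), (3.0,), (3.05,)]
labels = ["good", "good", "bad", "bad", "bad", "bad"]
by_gauss = weighted_knn_predict(train, labels, (0.0,), 5, gaussian_kernel)
by_rect = weighted_knn_predict(train, labels, (0.0,), 5, rectangular_kernel)
```

    On this toy data the rectangular kernel follows the majority of the five neighbors, while the Gaussian kernel lets the two much closer cases outweigh the three distant ones.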

  1. Dimensionality reduction with unsupervised nearest neighbors

    CERN Document Server

    Kramer, Oliver

    2013-01-01

    This book is devoted to a novel approach for dimensionality reduction based on the famous nearest neighbor method that is a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, reaching from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons to related methodologies taking into account artificial test data sets and also real-world data demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustr...

  2. A Nearest Neighbor Classifier Employing Critical Boundary Vectors for Efficient On-Chip Template Reduction.

    Science.gov (United States)

    Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi

    2016-05-01

    Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using a field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k-means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k-means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.

  3. Multiple k Nearest Neighbor Query Processing in Spatial Network Databases

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2006-01-01

    This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries...... for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case...... where an upper bound on k is known a priori and then extends the techniques to the case where this is not so. Based on empirical studies with real-world data, the paper offers insight into the circumstances under which the different proposed techniques can be used with advantage for multiple k nearest...

  4. Colorectal Cancer and Colitis Diagnosis Using Fourier Transform Infrared Spectroscopy and an Improved K-Nearest-Neighbour Classifier.

    Science.gov (United States)

    Li, Qingbo; Hao, Can; Kang, Xue; Zhang, Jialin; Sun, Xuejun; Wang, Wenbo; Zeng, Haishan

    2017-11-27

    Combining Fourier transform infrared spectroscopy (FTIR) with endoscopy, it is expected that noninvasive, rapid detection of colorectal cancer can be performed in vivo in the future. In this study, Fourier transform infrared spectra were collected from 88 endoscopic biopsy colorectal tissue samples (41 colitis and 47 cancers). A new method, viz., entropy weight local-hyperplane k-nearest-neighbor (EWHK), which is an improved version of K-local hyperplane distance nearest-neighbor (HKNN), is proposed for tissue classification. In order to avoid limiting high dimensions and small values of the nearest neighbor, the new EWHK method calculates feature weights based on information entropy. The average results of the random classification showed that the EWHK classifier for differentiating cancer from colitis samples produced a sensitivity of 81.38% and a specificity of 92.69%.

  5. Prototype Generation Using Multiobjective Particle Swarm Optimization for Nearest Neighbor Classification.

    Science.gov (United States)

    Hu, Weiwei; Tan, Ying

    2016-12-01

    The nearest neighbor (NN) classifier suffers from high time complexity when classifying a test instance because it must search the whole training set. Prototype generation is a widely used approach to reduce the classification time, which generates a small set of prototypes to classify a test instance instead of using the whole training set. In this paper, particle swarm optimization is applied to prototype generation and two novel methods for improving the classification performance are presented: 1) a fitness function named error rank and 2) the multiobjective (MO) optimization strategy. Error rank is proposed to enhance the generation ability of the NN classifier, which takes the ranks of misclassified instances into consideration when designing the fitness function. The MO optimization strategy pursues the performance on multiple subsets of data simultaneously, in order to keep the classifier from overfitting the training set. Experimental results over 31 UCI data sets and 59 additional data sets show that the proposed algorithm outperforms nearly 30 existing prototype generation algorithms.

  6. Nearest neighbors by neighborhood counting.

    Science.gov (United States)

    Wang, Hui

    2006-06-01

    Finding nearest neighbors is a general idea that underlies many artificial intelligence tasks, including machine learning, data mining, natural language understanding, and information retrieval. This idea is explicitly used in the k-nearest neighbors algorithm (kNN), a popular classification method. In this paper, this idea is adopted in the development of a general methodology, neighborhood counting, for devising similarity functions. We turn our focus from neighbors to neighborhoods, a region in the data space covering the data point in question. To measure the similarity between two data points, we consider all neighborhoods that cover both data points. We propose to use the number of such neighborhoods as a measure of similarity. Neighborhood can be defined for different types of data in different ways. Here, we consider one definition of neighborhood for multivariate data and derive a formula for such similarity, called neighborhood counting measure or NCM. NCM was tested experimentally in the framework of kNN. Experiments show that NCM is generally comparable to VDM and its variants, the state-of-the-art distance functions for multivariate data, and, at the same time, is consistently better for relatively large k values. Additionally, NCM consistently outperforms HEOM (a mixture of Euclidean and Hamming distances), the "standard" and most widely used distance function for multivariate data. NCM has a computational complexity in the same order as the standard Euclidean distance function and NCM is task independent and works for numerical and categorical data in a conceptually uniform way. The neighborhood counting methodology is proven sound for multivariate data experimentally. We hope it will work for other types of data.

  7. Nearest Neighbor Search in the Metric Space of a Complex Network for Community Detection

    Directory of Open Access Journals (Sweden)

    Suman Saha

    2016-03-01

    The objective of this article is to bridge the gap between two important research directions: (1) nearest neighbor search, which is a fundamental computational tool for large data analysis; and (2) complex network analysis, which deals with large real graphs but is generally studied via graph theoretic analysis or spectral analysis. In this article, we have studied the nearest neighbor search problem in a complex network by the development of a suitable notion of nearness. The computation of efficient nearest neighbor search among the nodes of a complex network using the metric tree and locality sensitive hashing (LSH) is also studied and tested experimentally. For evaluation of the proposed nearest neighbor search in a complex network, we applied it to a network community detection problem. Experiments are performed to verify the usefulness of nearness measures for complex networks, the role of the metric tree and LSH in computing fast and approximate node nearness, and the efficiency of community detection using nearest neighbor search. We observed that nearest neighbor search between network nodes is a very efficient tool for exploring the community structure of real networks. Several efficient approximation schemes are very useful for large networks: they cause hardly any degradation of results while saving a lot of computational time. The nearest neighbor based community detection approach is very competitive in terms of efficiency and time.

  8. The Application of Determining Students’ Graduation Status of STMIK Palangkaraya Using K-Nearest Neighbors Method

    Science.gov (United States)

    Rusdiana, Lili; Marfuah

    2017-12-01

    The K-Nearest Neighbors method is a classification method that computes distances to find the closest training data. It is used to group a set of data such as students’ graduation status, which is derived from the number of course credits taken, the grade point average (AVG), and the mini-thesis grade. The study examines the results of using the K-Nearest Neighbors method in an application for determining students’ graduation status, so that it can be analyzed in terms of the method used, the data, and the application constructed. The aim of this study is to find out the application results by using the K-Nearest Neighbors concept to determine students’ graduation status using the data of STMIK Palangkaraya students. The software was developed using Extreme Programming, since it suited the goal of finishing the project quickly. The application was created using Microsoft Office Excel 2007 for the training data and Matlab 7 to implement the application. The result of the K-Nearest Neighbors method in the application for determining students’ graduation status was 92.5%. It could determine the graduation predicate for 94 data items out of the 136 initial data items before processing, with a maximum of 50 training data items. The K-Nearest Neighbors method groups a set of data based on the closest value, so it was well suited to this study.
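    The classification step described in this record can be sketched as a plain majority vote over the k closest training rows. This is a minimal illustration only, not the authors' Matlab application: the function name `knn_predict`, the feature layout (credits, grade point average, mini-thesis grade), and the sample rows are all hypothetical, and the features are left unscaled, so the largest-valued feature dominates the distance.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical rows: (course credits, grade point average, mini-thesis grade).
train_X = [(144, 3.8, 90), (144, 3.6, 85), (120, 2.1, 60), (110, 2.0, 55)]
train_y = ["on time", "on time", "late", "late"]

print(knn_predict(train_X, train_y, (140, 3.5, 80), k=3))
```

    In practice the features would normally be normalized before distances are computed, and k would be chosen by validation rather than fixed in advance.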

  9. Anderson localization in one-dimensional quasiperiodic lattice models with nearest- and next-nearest-neighbor hopping

    International Nuclear Information System (INIS)

    Gong, Longyan; Feng, Yan; Ding, Yougen

    2017-01-01

    Highlights: • Quasiperiodic lattice models with next-nearest-neighbor hopping are studied. • Shannon information entropies are used to reflect state localization properties. • Phase diagrams are obtained for the inverse bronze and golden means, respectively. • Our studies present a more complete picture than existing works. - Abstract: We explore the reduced relative Shannon information entropies SR for a quasiperiodic lattice model with nearest- and next-nearest-neighbor hopping, where an irrational number is in the mathematical expression of incommensurate on-site potentials. Based on SR, we respectively unveil the phase diagrams for two irrationalities, i.e., the inverse bronze mean and the inverse golden mean. The corresponding phase diagrams include regions of purely localized phase, purely delocalized phase, pure critical phase, and regions with mobility edges. The boundaries of different regions depend on the values of irrational number. These studies present a more complete picture than existing works.

  10. Anderson localization in one-dimensional quasiperiodic lattice models with nearest- and next-nearest-neighbor hopping

    Energy Technology Data Exchange (ETDEWEB)

    Gong, Longyan, E-mail: lygong@njupt.edu.cn [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093 (China); Feng, Yan; Ding, Yougen [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China)

    2017-02-12

    Highlights: • Quasiperiodic lattice models with next-nearest-neighbor hopping are studied. • Shannon information entropies are used to reflect state localization properties. • Phase diagrams are obtained for the inverse bronze and golden means, respectively. • Our studies present a more complete picture than existing works. - Abstract: We explore the reduced relative Shannon information entropies SR for a quasiperiodic lattice model with nearest- and next-nearest-neighbor hopping, where an irrational number is in the mathematical expression of incommensurate on-site potentials. Based on SR, we respectively unveil the phase diagrams for two irrationalities, i.e., the inverse bronze mean and the inverse golden mean. The corresponding phase diagrams include regions of purely localized phase, purely delocalized phase, pure critical phase, and regions with mobility edges. The boundaries of different regions depend on the values of irrational number. These studies present a more complete picture than existing works.

  11. Consistency Analysis of Nearest Subspace Classifier

    OpenAIRE

    Wang, Yi

    2015-01-01

    The Nearest subspace classifier (NSS) finds an estimation of the underlying subspace within each class and assigns data points to the class that corresponds to its nearest subspace. This paper mainly studies how well NSS can be generalized to new samples. It is proved that NSS is strongly consistent under certain assumptions. For completeness, NSS is evaluated through experiments on various simulated and real data sets, in comparison with some other linear model based classifiers. It is also ...

  12. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the ...

  13. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    National Research Council Canada - National Science Library

    Han, Euihong; Karypis, George; Kumar, Vipin

    1999-01-01

    .... The authors present a nearest neighbor classification scheme for text categorization in which the importance of discriminating words is learned using mutual information and weight adjustment techniques...

  14. Nearest unlike neighbor (NUN): an aid to decision confidence estimation

    Science.gov (United States)

    Dasarathy, Belur V.

    1995-09-01

    The concept of nearest unlike neighbor (NUN), proposed and explored previously in the design of nearest neighbor (NN) based decision systems, is further exploited in this study to develop a measure of confidence in the decisions made by NN-based decision systems. This measure of confidence, on the basis of comparison with a user-defined threshold, may be used to determine the acceptability of the decision provided by the NN-based decision system. The concepts, associated methodology, and some illustrative numerical examples using the now classical Iris data to bring out the ease of implementation and effectiveness of the proposed innovations are presented.

  15. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCD in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are coming into the world like torrential rain. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, recognizing these two types of spectra is a typical problem in automatic spectra classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. For applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the advantage of NN is that it does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results of this work are helpful for studying galaxy and quasar spectra classification.

  16. Comparison of Two Classifiers; K-Nearest Neighbor and Artificial Neural Network, for Fault Diagnosis on a Main Engine Journal-Bearing

    Directory of Open Access Journals (Sweden)

    A. Moosavian

    2013-01-01

    Vibration analysis is an accepted method in condition monitoring of machines, since it can provide useful and reliable information about machine working condition. This paper surveys a new scheme for fault diagnosis of main journal-bearings of an internal combustion (IC) engine based on the power spectral density (PSD) technique and two classifiers, namely, K-nearest neighbor (KNN) and artificial neural network (ANN). Vibration signals for three different conditions of the journal-bearing (normal, oil starvation, and extreme wear fault) were acquired from an IC engine. PSD was applied to process the vibration signals. Thirty features were extracted from the PSD values of the signals as a feature source for fault diagnosis. KNN and ANN were trained on the training data set and then used as diagnostic classifiers. The K value and the hidden neuron count (N) were varied in the range of 1 to 20, with a step size of 1, for KNN and ANN respectively to obtain the best classification results. The roles of the PSD, KNN and ANN techniques were studied. The results show that the performance of ANN is better than that of KNN. The experimental results demonstrate that the proposed diagnostic method can reliably separate different fault conditions in main journal-bearings of an IC engine.

  17. A Novel Preferential Diffusion Recommendation Algorithm Based on User’s Nearest Neighbors

    Directory of Open Access Journals (Sweden)

    Fuguo Zhang

    2017-01-01

    A recommender system is a very efficient way to deal with the problem of information overload for online users. In recent years, network based recommendation algorithms have demonstrated much better performance than the standard collaborative filtering methods. However, most network based algorithms do not give a high enough weight to the influence of the target user’s nearest neighbors in the resource diffusion process, while a user or an object with high degree will obtain larger influence in the standard mass diffusion algorithm. In this paper, we propose a novel preferential diffusion recommendation algorithm considering the significance of the target user’s nearest neighbors and evaluate it on three real-world data sets: MovieLens 100k, MovieLens 1M, and Epinions. Experimental results demonstrate that the novel preferential diffusion recommendation algorithm based on the user’s nearest neighbors can significantly improve the recommendation accuracy and diversity.

  18. Nearest neighbor 3D segmentation with context features

    Science.gov (United States)

    Hristova, Evelin; Schulz, Heinrich; Brosch, Tom; Heinrich, Mattias P.; Nickisch, Hannes

    2018-03-01

    Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are - for a nearest neighbor method - surprisingly accurate, robust as well as data and time efficient.
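    The Dice overlap score quoted in this record is conventionally defined as twice the size of the intersection of two label masks divided by the sum of their sizes. The sketch below assumes flattened binary masks stored as Python sequences; the function name `dice_score` and the empty-mask convention are illustrative choices, not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice overlap 2*|A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * overlap / total


# One of two predicted foreground voxels matches one of two true ones.
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

    For a multi-label segmentation, the score would typically be computed per organ label and then averaged.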

  19. Estimating forest attribute parameters for small areas using nearest neighbors techniques

    Science.gov (United States)

    Ronald E. McRoberts

    2012-01-01

    Nearest neighbors techniques have become extremely popular, particularly for use with forest inventory data. With these techniques, a population unit prediction is calculated as a linear combination of observations for a selected number of population units in a sample that are most similar, or nearest, in a space of ancillary variables to the population unit requiring...

  20. ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms

    DEFF Research Database (Denmark)

    Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander

    2017-01-01

    This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory approximate nearest neighbor algorithms. It provides a standard interface for measuring the performance and quality achieved by nearest neighbor algorithms on different standard data sets. It supports several...... visualise these as images, plots, and websites with interactive plots. ANN-Benchmarks aims to provide a constantly updated overview of the current state of the art of k-NN algorithms. In the short term, this overview allows users to choose the correct k-NN algorithm and parameters...... for their similarity search task; in the longer term, algorithm designers will be able to use this overview to test and refine automatic parameter tuning. The paper gives an overview of the system, evaluates the results of the benchmark, and points out directions for future work. Interestingly, very different...

  1. Collective Behaviors of Mobile Robots Beyond the Nearest Neighbor Rules With Switching Topology.

    Science.gov (United States)

    Ning, Boda; Han, Qing-Long; Zuo, Zongyu; Jin, Jiong; Zheng, Jinchuan

    2018-05-01

    This paper is concerned with the collective behaviors of robots beyond the nearest neighbor rules, i.e., dispersion and flocking, when robots interact with others by applying an acute angle test (AAT)-based interaction rule. Different from a conventional nearest neighbor rule or its variations, the AAT-based interaction rule allows interactions with some far-neighbors and excludes unnecessary nearest neighbors. The resulting dispersion and flocking hold the advantages of scalability, connectivity, robustness, and effective area coverage. For the dispersion, a spring-like controller is proposed to achieve collision-free coordination. With switching topology, a new fixed-time consensus-based energy function is developed to guarantee the system stability. An upper bound of settling time for energy consensus is obtained, and a uniform time interval is accordingly set so that energy distribution is conducted in a fair manner. For the flocking, based on a class of generalized potential functions taking nonsmooth switching into account, a new controller is proposed to ensure that the same velocity for all robots is eventually reached. A co-optimizing problem is further investigated to accomplish additional tasks, such as enhancing communication performance, while maintaining the collective behaviors of mobile robots. Simulation results are presented to show the effectiveness of the theoretical results.

  2. Multi-strategy based quantum cost reduction of linear nearest-neighbor quantum circuit

    Science.gov (United States)

    Tan, Ying-ying; Cheng, Xue-yun; Guan, Zhi-jin; Liu, Yang; Ma, Haiying

    2018-03-01

    With the development of reversible and quantum computing, the study of reversible and quantum circuits has also developed rapidly. Due to physical constraints, most quantum circuits require quantum gates to interact on adjacent quantum bits. However, many existing nearest-neighbor quantum circuits have a large quantum cost. Therefore, how to effectively reduce quantum cost is becoming a popular research topic. In this paper, we propose multiple optimization strategies to reduce the quantum cost of the circuit; that is, we reduce quantum cost through MCT gate decomposition, nearest-neighbor conversion, and circuit simplification, respectively. The experimental results show that the proposed strategies can effectively reduce the quantum cost, and the maximum optimization rate is 30.61% compared to the corresponding results.

  3. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method

    Directory of Open Access Journals (Sweden)

    D.A. Adeniyi

    2016-01-01

    The major problem of many on-line web sites is the presentation of many choices to the client at a time; this usually makes finding the right product or information on the site a strenuous and time consuming task. In this work, we present a study of an automatic web usage data mining and recommendation system based on current user behavior, captured through his/her click stream data on the newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest-Neighbor (KNN) classification method has been trained to be used on-line and in real time to identify clients'/visitors' click stream data, match it to a particular user group, and recommend a tailored browsing option that meets the need of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted and grouped into meaningful sessions, and a data mart was developed. Our result shows that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about data distribution.

  4. CATEGORIZATION OF GELAM, ACACIA AND TUALANG HONEY ODOR PROFILE USING K-NEAREST NEIGHBORS

    Directory of Open Access Journals (Sweden)

    Nurdiyana Zahed

    2018-02-01

    Honey authenticity with respect to honey type is an issue of great importance and interest in agriculture. In current research, several studies document that specific types of honey have their own uses in the medical field. However, it is quite challenging to classify different types of honey with the naked eye alone. This work demonstrated a successful electronic nose (E-nose) application as an instrument for identifying the odor profile patterns of three common honeys in Malaysia (Gelam, Acacia and Tualang honey). The applied E-nose produced signals for odor measurement in the form of numeric resistance (Ω). The data readings were pre-processed using a normalization technique to standardize the scale of the features. Mean features were extracted, and boxplots were used as the statistical tool to present the data pattern for the three types of honey. The extracted mean features were fed into a K-Nearest Neighbors classifier as input features and evaluated using several splitting ratios. Excellent results were obtained, with 100% accuracy, sensitivity and specificity of classification from KNN using weight (k=1), a 90:10 ratio and Euclidean distance. The findings confirmed the ability of the KNN classifier as an intelligent classification method to classify different honey types from E-nose calibration. Compared with other classifiers, KNN required less parameter optimization and achieved promising results.

  5. Distance-Constraint k-Nearest Neighbor Searching in Mobile Sensor Networks.

    Science.gov (United States)

    Han, Yongkoo; Park, Kisung; Hong, Jihye; Ulamin, Noor; Lee, Young-Koo

    2015-07-27

    The κ-Nearest Neighbors (κNN) query is an important spatial query in mobile sensor networks. In this work we extend κNN to include a distance constraint, calling it an l-distant κ-nearest-neighbors (l-κNN) query, which finds the κ sensor nodes nearest to a query point that are also at distance l or greater from each other. The query results indicate the objects nearest to the area of interest that are separated from each other by at least distance l. The l-κNN query can be used in most κNN applications where well distributed query results are needed. To process an l-κNN query, we must discover all sets of κNN sensor nodes and then find all pairs of sensor nodes in each set that are separated by at least a distance l. Given the limited battery and computing power of sensor nodes, this l-κNN query processing is prohibitively expensive in terms of energy consumption. In this paper, we propose a greedy approach for l-κNN query processing in mobile sensor networks. The key idea of the proposed approach is to divide the search space into subspaces whose sides all have length l. By selecting κ sensor nodes from the other subspaces near the query point, we guarantee accurate query results for l-κNN. In our experiments, we show that the proposed method exhibits superior performance compared with a post-processing based method using the κNN query in terms of energy efficiency, query latency, and accuracy.
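    The distance constraint in the l-κNN query can be illustrated with a simple centralized greedy pass: visit nodes in order of distance to the query point and keep a node only if it is at least l away from every node already kept. This is a hedged sketch of the query semantics only. It is not the subspace-partitioning method proposed in the paper, it ignores energy and communication costs, and a greedy pass is not guaranteed to return the set with the smallest total distance. The name `l_distant_knn` and the sample coordinates are hypothetical.

```python
import math


def l_distant_knn(nodes, query, k, l):
    """Greedily pick up to k nodes near query that are pairwise >= l apart."""
    chosen = []
    # Visit candidate nodes from nearest to farthest from the query point.
    for node in sorted(nodes, key=lambda p: math.dist(p, query)):
        # Keep the node only if it respects the distance constraint
        # against every node selected so far.
        if all(math.dist(node, c) >= l for c in chosen):
            chosen.append(node)
            if len(chosen) == k:
                break
    return chosen


# Hypothetical sensor positions; (1.2, 0) is skipped as too close to (1, 0).
sensors = [(1, 0), (1.2, 0), (0, 3), (4, 0), (0, -5)]
print(l_distant_knn(sensors, (0, 0), k=3, l=2))
```

    If fewer than k nodes satisfy the constraint, the function returns a shorter list rather than relaxing l.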

  6. Sistem Rekomendasi Pada E-Commerce Menggunakan K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    Chandra Saha Dewa Prasetya

    2017-09-01

    The growing amount of product information available on the internet brings challenges to both customers and online businesses in the e-commerce environment. Customers often have difficulty finding products on the internet because of the number of products sold there. In addition, online businesses often experience difficulties because they hold a large amount of data about products, customers and transactions, which makes it hard to promote the right product to a particular target customer. Recommendation systems were developed to address these problems with various methods such as Collaborative Filtering, Content Based, and Hybrid. The collaborative filtering method uses customers’ rating data, the content based method uses product content such as the title or description, and the hybrid method uses both as the basis of the recommendation. In this research, the k-nearest neighbor algorithm is used to determine the top-n product recommendations for each buyer. The results of this research show that the Content Based method outperforms the other methods because of data sparsity, that is, the condition where the number of ratings given by customers is relatively small compared to the number of products available in e-commerce. Keywords: recommendation system, k-nearest neighbor, collaborative filtering, content based.

  7. Seismic clusters analysis in Northeastern Italy by the nearest-neighbor approach

    Science.gov (United States)

    Peresan, Antonella; Gentili, Stefania

    2018-01-01

    The main features of earthquake clusters in Northeastern Italy are explored, with the aim of gaining new insights into local scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters, which are identified by a statistical method based on nearest-neighbor distances of events in the space-time-energy domain. The method permits us to highlight and investigate the internal structure of earthquake sequences, and to differentiate the spatial properties of seismicity according to the different topological features of the cluster structure. To analyze the seismicity of Northeastern Italy, we use information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. A preliminary reappraisal of the earthquake bulletins is carried out and the area of sufficient completeness is outlined. Various techniques are considered to estimate the scaling parameters that characterize earthquake occurrence in the region, namely the b-value and the fractal dimension of the epicenter distribution, required for the application of the nearest-neighbor technique. Specifically, average robust estimates of the parameters of the Unified Scaling Law for Earthquakes, USLE, are assessed for the whole outlined region and are used to compute the nearest-neighbor distances. Cluster identification by the nearest-neighbor method turns out to be quite reliable and robust with respect to the minimum magnitude cutoff of the input catalog; the identified clusters are consistent with those obtained from manual aftershock identification of selected sequences. We demonstrate that the earthquake clusters have distinct preferred geographic locations, and we identify two areas that differ substantially in the examined clustering properties. Specifically, burst-like sequences are associated with the north-western part and swarm-like sequences with the south-eastern part of the study

  8. A two-step nearest neighbors algorithm using satellite imagery for predicting forest structure within species composition classes

    Science.gov (United States)

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques have been shown to be useful for predicting multiple forest attributes from forest inventory and Landsat satellite image data. However, in regions lacking good digital land cover information, nearest neighbors selected to predict continuous variables such as tree volume must be selected without regard to relevant categorical variables such...

  9. A Local Weighted Nearest Neighbor Algorithm and a Weighted and Constrained Least-Squared Method for Mixed Odor Analysis by Electronic Nose Systems

    Directory of Open Access Journals (Sweden)

    Jyuo-Min Shyu

    2010-11-01

    A great deal of work has been done to develop techniques for odor analysis by electronic nose systems. These analyses mostly focus on identifying a particular odor by comparing with a known odor dataset. However, in many situations, it would be more practical if each individual odorant could be determined directly. This paper proposes two methods for such odor components analysis for electronic nose systems. First, a K-nearest neighbor (KNN)-based local weighted nearest neighbor (LWNN) algorithm is proposed to determine the components of an odor. According to the component analysis, the odor training data is first categorized into several groups, each of which is represented by its centroid. The examined odor is then assigned to the class of the nearest centroid. The distance between the examined odor and the centroid is calculated based on a weighting scheme, which captures the local structure of each predefined group. To further determine the concentration of each component, odor models are built by regressions. Then, a weighted and constrained least-squares (WCLS) method is proposed to estimate the component concentrations. Experiments were carried out to assess the effectiveness of the proposed methods. The LWNN algorithm is able to classify mixed odors with different mixing ratios, while the WCLS method can provide good estimates of component concentrations.

  10. Antiferromagnetic geometric frustration under the influence of the next-nearest-neighbor interaction. An exactly solvable model

    Science.gov (United States)

    Jurčišinová, E.; Jurčišin, M.

    2018-02-01

    The influence of the next-nearest-neighbor interaction on the properties of the geometrically frustrated antiferromagnetic systems is investigated in the framework of the exactly solvable antiferromagnetic spin-1/2 Ising model in the external magnetic field on the square-kagome recursive lattice, where the next-nearest-neighbor interaction is supposed between sites within each elementary square of the lattice. The thermodynamic properties of the model are investigated in detail and it is shown that the competition between the nearest-neighbor antiferromagnetic interaction and the next-nearest-neighbor ferromagnetic interaction changes properties of the single-point ground states but does not change the frustrated character of the basic model. On the other hand, the presence of the antiferromagnetic next-nearest-neighbor interaction leads to the enhancement of the frustration effects with the formation of additional plateau and single-point ground states at low temperatures. Exact expressions for magnetizations and residual entropies of all ground states of the model are found. It is shown that the model exhibits various ground states with the same value of magnetization but different macroscopic degeneracies as well as the ground states with different values of magnetization but the same value of the residual entropy. The specific heat capacity is investigated and it is shown that the model exhibits the Schottky-type anomaly behavior in the vicinity of each single-point ground state value of the magnetic field. The formation of the field-induced double-peak structure of the specific heat capacity at low temperatures is demonstrated and it is shown that its very existence is directly related to the presence of highly macroscopically degenerated single-point ground states in the model.

  11. The nearest neighbor and the bayes error rates.

    Science.gov (United States)

    Loizou, G; Maybank, S J

    1987-02-01

    The (k, l) nearest neighbor method of pattern classification is compared to the Bayes method. If the two acceptance rates are equal then the asymptotic error rates satisfy the inequalities E_{k,l+1} ≤ E*(λ) ≤ E_{k,l} ≤ dE*(λ), where d is a function of k, l, and the number of pattern classes, and λ is the reject threshold for the Bayes method. An explicit expression for d is given which is optimal in the sense that for some probability distributions E_{k,l} and dE*(λ) are equal.

  12. Thermodynamics of alternating spin chains with competing nearest- and next-nearest-neighbor interactions: Ising model

    Science.gov (United States)

    Pini, Maria Gloria; Rettori, Angelo

    1993-08-01

    The thermodynamical properties of an alternating spin (S,s) one-dimensional (1D) Ising model with competing nearest- and next-nearest-neighbor interactions are exactly calculated using a transfer-matrix technique. In contrast to the case S=s=1/2, previously investigated by Harada, the alternation of different spins (S≠s) along the chain is found to give rise to two-peaked static structure factors, signaling the coexistence of different short-range-order configurations. The relevance of our calculations with regard to recent experimental data by Gatteschi et al. in quasi-1D molecular magnetic materials, R(hfac)3NITEt (R=Gd, Tb, Dy, Ho, Er, . . .), is discussed; hfac is hexafluoro-acetylacetonate and NITEt is 2-Ethyl-4,4,5,5-tetramethyl-4,5-dihydro-1H-imidazolyl-1-oxyl-3-oxide.

  13. A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality.

    Science.gov (United States)

    Wang, Xueyi

    2012-02-08

    The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
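
    The pruning idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' kMkNN implementation: the function names (build_index, knn_query) are hypothetical, the cluster centers are assumed to come from any prior clustering step (k-means in the paper), and only the triangle-inequality bound d(q, p) >= |d(q, c) - d(p, c)| is taken from the description above.

```python
import heapq
import math


def build_index(points, centers):
    """Assign each point to its nearest center and store its distance to that center."""
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[j].append((math.dist(p, centers[j]), p))
    return centers, clusters


def knn_query(index, q, k):
    """Exact k-NN search; returns ([(distance, point), ...], distances_computed)."""
    centers, clusters = index
    # Visit clusters starting from the one whose center is closest to the query.
    order = sorted(range(len(centers)), key=lambda i: math.dist(q, centers[i]))
    best = []      # max-heap of the k best candidates, stored as (-distance, point)
    computed = 0   # number of point-to-query distances actually evaluated
    for j in order:
        d_qc = math.dist(q, centers[j])
        for d_pc, p in clusters[j]:
            # Triangle inequality lower bound: d(q, p) >= |d(q, c) - d(p, c)|.
            # If the bound already exceeds the current k-th best, skip the point.
            if len(best) == k and abs(d_qc - d_pc) >= -best[0][0]:
                continue
            d = math.dist(q, p)
            computed += 1
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), computed
```

    Because a point is skipped only when its lower bound cannot beat the current k-th best distance, the result matches a brute-force search while the second return value counts how many distances were actually computed.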

  14. Elliptic Painlevé equations from next-nearest-neighbor translations on the E_8^{(1)} lattice

    Science.gov (United States)

    Joshi, Nalini; Nakazono, Nobutaka

    2017-07-01

    The well known elliptic discrete Painlevé equation of Sakai is constructed by a standard translation on the E_8^{(1)} lattice, given by nearest neighbor vectors. In this paper, we give a new elliptic discrete Painlevé equation obtained by translations along next-nearest-neighbor vectors. This equation is a generic (8-parameter) version of a 2-parameter elliptic difference equation found by reduction from Adler’s partial difference equation, the so-called Q4 equation. We also provide a projective reduction of the well known equation of Sakai.

  15. Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting

    Science.gov (United States)

    Zhang, Ningning; Lin, Aijing; Shang, Pengjian

    2017-07-01

    In this paper, we propose a new two-stage methodology that combines the ensemble empirical mode decomposition (EEMD) with the multidimensional k-nearest neighbor model (MKNN) in order to forecast the closing price and high price of stocks simultaneously. Modified versions of the k-nearest neighbors (KNN) algorithm are increasingly widely applied to prediction in many fields. Empirical mode decomposition (EMD) decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs); however, it cannot reveal characteristic information of the signal with much accuracy as a result of mode mixing. So ensemble empirical mode decomposition (EEMD), an improved method of EMD, is presented to resolve the weaknesses of EMD by adding white noise to the original data. With EEMD, the components with true physical meaning can be extracted from the time series. Utilizing the advantages of EEMD and MKNN, the newly proposed ensemble empirical mode decomposition combined with the multidimensional k-nearest neighbor model (EEMD-MKNN) has high predictive precision for short-term forecasting. Moreover, we extend this methodology to the two-dimensional case to forecast the closing price and high price of the four stocks (NAS, S&P500, DJI and STI stock indices) at the same time. The results indicate that the proposed EEMD-MKNN model has a higher forecast precision than EMD-KNN, the KNN method and ARIMA.

  16. Introduction to machine learning: k-nearest neighbors.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-06-01

    Machine learning techniques have been widely used in many scientific fields, but their use in medical literature is limited partly because of technical difficulties. k-nearest neighbors (kNN) is a simple method of machine learning. The article introduces some basic ideas underlying the kNN algorithm, and then focuses on how to perform kNN modeling with R. The dataset should be prepared before running the knn() function in R. After prediction of outcome with the kNN algorithm, the diagnostic performance of the model should be checked. Average accuracy is the most widely used statistic to reflect the performance of the kNN algorithm. Factors such as k value, distance calculation and choice of appropriate predictors all have significant impact on the model performance.
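
    The basic kNN idea described here (rank training points by distance, take a majority vote, then check accuracy) can be sketched in a few lines. The article itself works in R with knn(); the Python below is only an illustrative stand-in, and the names knn_predict and accuracy are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    ranked = sorted(zip(train_X, train_y), key=lambda xy: math.dist(xy[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label equals the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)
```

    In practice the choice of k and of the distance function would be tuned, for example by cross-validation, as the abstract notes that both affect model performance.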

  17. Applying an efficient K-nearest neighbor search to forest attribute imputation

    Science.gov (United States)

    Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek

    2006-01-01

    This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby decreasing the time needed to discover the NN subset. Results of five trials show gains...

  18. Linear perturbation renormalization group for the two-dimensional Ising model with nearest- and next-nearest-neighbor interactions in a field

    Science.gov (United States)

    Sznajd, J.

    2016-12-01

    The linear perturbation renormalization group (LPRG) is used to study the phase transition of the weakly coupled Ising chains with intrachain (J ) and interchain nearest-neighbor (J1) and next-nearest-neighbor (J2) interactions forming the triangular and rectangular lattices in a field. The phase diagrams with the frustration point at J2=-J1/2 for a rectangular lattice and J2=-J1 for a triangular lattice have been found. The LPRG calculations support the idea that the phase transition is always continuous except for the frustration point and is accompanied by a divergence of the specific heat. For the antiferromagnetic chains, the external field does not change substantially the shape of the phase diagram. The critical temperature is suppressed to zero according to the power law when approaching the frustration point with an exponent dependent on the value of the field.

  19. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space

    KAUST Repository

    Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos

    2010-01-01

    Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii

  20. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    Science.gov (United States)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.

  1. Monte Carlo study of a ferrimagnetic mixed-spin (2, 5/2) system with the nearest and next-nearest neighbors exchange couplings

    Science.gov (United States)

    Bi, Jiang-lin; Wang, Wei; Li, Qi

    2017-07-01

    In this paper, the effects of the next-nearest neighbors exchange couplings on the magnetic and thermal properties of the ferrimagnetic mixed-spin (2, 5/2) Ising model on a 3D honeycomb lattice have been investigated by the use of Monte Carlo simulation. In particular, the influences of exchange couplings (Ja, Jb, Jan) and the single-ion anisotropy (Da) on the phase diagrams, the total magnetization, the sublattice magnetization, the total susceptibility, the internal energy and the specific heat have been discussed in detail. The results clearly show that the system can exhibit critical and compensation behavior within the next-nearest neighbors exchange coupling. A great variety of M curves, such as N-, Q-, P- and L-types, have been discovered, owing to the competition between the exchange coupling and the temperature. Our results are in excellent agreement with other theoretical and experimental works.

  2. Aftershock identification problem via the nearest-neighbor analysis for marked point processes

    Science.gov (United States)

    Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.

    2007-12-01

    The centennial observations on the world seismicity have revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide most reliable information about the earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time-oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.

  3. The influence of As/III pressure ratio on nitrogen nearest-neighbor environments in as-grown GaInNAs quantum wells

    International Nuclear Information System (INIS)

    Kudrawiec, R.; Poloczek, P.; Misiewicz, J.; Korpijaervi, V.-M.; Laukkanen, P.; Pakarinen, J.; Dumitrescu, M.; Guina, M.; Pessa, M.

    2009-01-01

    The energy fine structure, corresponding to different nitrogen nearest-neighbor environments, was observed in contactless electroreflectance (CER) spectra of as-grown GaInNAs quantum wells (QWs) obtained at various As/III pressure ratios. In the spectral range of the fundamental transition, two CER resonances were detected for samples grown at low As pressures whereas only one CER resonance was observed for samples obtained at higher As pressures. This resonance corresponds to the most favorable nitrogen nearest-neighbor environment in terms of the total crystal energy. It means that the nitrogen nearest-neighbor environment in GaInNAs QWs can be controlled in molecular beam epitaxy process by As/III pressure ratio.

  4. Diagnosis of Diabetes Diseases Using an Artificial Immune Recognition System2 (AIRS2) with Fuzzy K-nearest Neighbor

    OpenAIRE

    CHIKH, Mohamed Amine; SAIDI, Meryem; SETTOUTI, Nesma

    2012-01-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2 where we replace the K-nearest neighbors algorithm with the fuzzy K-nearest neighbors to improve the diagnostic accuracy of diabetes diseases. The diabetes disea...

  5. Nearest neighbors EPR superhyperfine interaction in divalent iridium complexes in alkali halide host lattice

    International Nuclear Information System (INIS)

    Pinhal, N.M.; Vugman, N.V.

    1983-01-01

    Further splitting of chlorine superhyperfine lines on the EPR spectrum of the [Ir(CN)4Cl2]4- molecular species in NaCl lattice indicates a super-superhyperfine interaction with the nearest neighbors sodium atoms. (Author)

  6. Chaotic Synchronization in Nearest-Neighbor Coupled Networks of 3D CNNs

    OpenAIRE

    Serrano-Guerrero, H.; Cruz-Hernández, C.; López-Gutiérrez, R.M.; Cardoza-Avendaño, L.; Chávez-Pérez, R.A.

    2013-01-01

    In this paper, synchronization of Cellular Neural Networks (CNNs) in nearest-neighbor coupled arrays is numerically studied. Synchronization of multiple chaotic CNNs is achieved by appealing to complex systems theory. In particular, we consider dynamical networks composed of 3D CNNs, as interconnected nodes, where the interactions in the networks are defined by coupling the first state of each node. Four cases of interest are considered: i) synchronization without chaotic master, ii) maste...

  7. Constrained parameter estimation for semi-supervised learning : The case of the nearest mean classifier

    NARCIS (Netherlands)

    Loog, M.

    2011-01-01

    A rather simple semi-supervised version of the equally simple nearest mean classifier is presented. However simple, the proposed approach is of practical interest as the nearest mean classifier remains a relevant tool in biomedical applications or other areas dealing with relatively high-dimensional

  8. A new approach to very short term wind speed prediction using k-nearest neighbor classification

    International Nuclear Information System (INIS)

    Yesilbudak, Mehmet; Sagiroglu, Seref; Colak, Ilhami

    2013-01-01

    Highlights: ► Wind speed parameter was predicted from n-tupled inputs using k-NN classification. ► The effects of input parameters, nearest neighbors and distance metrics were analyzed. ► Many useful and reasonable inferences were uncovered using the developed model. Abstract: Wind energy is an inexhaustible energy source and wind power production has been growing rapidly in recent years. However, wind power has a non-schedulable nature due to wind speed variations. Hence, wind speed prediction is an indispensable requirement for power system operators. This paper predicts the wind speed parameter from n-tupled inputs using k-nearest neighbor (k-NN) classification and analyzes the effects of input parameters, nearest neighbors and distance metrics on wind speed prediction. The k-NN classification model was developed using object-oriented programming techniques and, in contrast to the literature, includes the Manhattan and Minkowski distance metrics in addition to the Euclidean distance metric. The k-NN classification model that uses wind direction, air temperature, atmospheric pressure and relative humidity parameters in a 4-tupled space achieved the best wind speed prediction for k = 5 with the Manhattan distance metric. In contrast, the k-NN classification model that uses wind direction, air temperature and atmospheric pressure parameters as 3-tupled inputs gave the worst wind speed prediction for k = 1 with the Minkowski distance metric.
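
    To make the role of the distance metric concrete, the sketch below implements the general Minkowski distance, of which Manhattan (p = 1) and Euclidean (p = 2) are special cases. The function names are hypothetical and the toy points are invented for illustration; the abstract does not state which order p the authors used for their Minkowski metric.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest(points, query, p):
    """Return the point closest to `query` under the order-p Minkowski distance."""
    return min(points, key=lambda pt: minkowski(pt, query, p))


# Toy illustration (made-up points): the nearest neighbor depends on the metric.
A, B = (3, 3), (0, 5)
query = (0, 0)
nearest_manhattan = nearest([A, B], query, p=1)   # B: |0|+|5| = 5 < |3|+|3| = 6
nearest_euclidean = nearest([A, B], query, p=2)   # A: sqrt(18) < sqrt(25)
```

    The same query therefore selects different neighbors under different metrics, which is one reason the choice of metric can change prediction quality.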

  9. Recursive nearest neighbor search in a sparse and multiscale domain for comparing audio signals

    DEFF Research Database (Denmark)

    Sturm, Bob L.; Daudet, Laurent

    2011-01-01

    We investigate recursive nearest neighbor search in a sparse domain at the scale of audio signals. Essentially, to approximate the cosine distance between the signals we make pairwise comparisons between the elements of localized sparse models built from large and redundant multiscale dictionaries...

  10. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    Science.gov (United States)

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.

  11. Mapping change of older forest with nearest-neighbor imputation and Landsat time-series

    Science.gov (United States)

    Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Warren B. Cohen; Robert E. Kennedy; Zhiqiang. Yang

    2012-01-01

    The Northwest Forest Plan (NWFP), which aims to conserve late-successional and old-growth forests (older forests) and associated species, established new policies on federal lands in the Pacific Northwest USA. As part of monitoring for the NWFP, we tested nearest-neighbor imputation for mapping change in older forest, defined by threshold values for forest attributes...

  12. Penerapan Metode K-nearest Neighbor pada Penentuan Grade Dealer Sepeda Motor

    OpenAIRE

    Leidiyana, Henny

    2017-01-01

    Mutually beneficial cooperation is very important for a leasing company and a dealer. Incentives for marketing are given in order to attract as many consumers as possible. But sometimes surveyor objectivity is lost due to collusion in the field between marketing staff and surveyors. To overcome this, leasing companies use a variety of approaches, one of which is ranking dealers. In this study the application of the k-Nearest Neighbor method and Euclidean distance measurement to determine the grade deal...

  13. Moderate-resolution data and gradient nearest neighbor imputation for regional-national risk assessment

    Science.gov (United States)

    Kenneth B. Jr. Pierce; C. Kenneth Brewer; Janet L. Ohmann

    2010-01-01

    This study was designed to test the feasibility of combining a method designed to populate pixels with inventory plot data at the 30-m scale with a new national predictor data set. The new national predictor data set was developed by the USDA Forest Service Remote Sensing Applications Center (hereafter RSAC) at the 250-m scale. Gradient Nearest Neighbor (GNN)...

  14. Morphological type correlation between nearest neighbor pairs of galaxies

    Science.gov (United States)

    Yamagata, Tomohiko

    1990-01-01

    Although the morphological type of galaxies is one of the most fundamental properties of galaxies, its origin and evolutionary processes, if any, are not yet fully understood. It has been established that the galaxy morphology strongly depends on the environment in which the galaxy resides (e.g., Dressler 1980). Galaxy pairs correspond to the smallest scales of galaxy clustering and may provide important clues to how the environment influences the formation and evolution of galaxies. Several investigators pointed out that there is a tendency for pair galaxies to have similar morphological types (Karachentsev and Karachentseva 1974, Page 1975, Noerdlinger 1979). Here, researchers analyze morphological type correlation for 18,364 nearest neighbor pairs of galaxies identified in the magnetic tape version of the Center for Astrophysics Redshift Catalogue.

  15. Designing lattice structures with maximal nearest-neighbor entanglement

    Energy Technology Data Exchange (ETDEWEB)

    Navarro-Munoz, J C; Lopez-Sandoval, R [Instituto Potosino de Investigacion Científica y Tecnologica, Camino a la presa San Jose 2055, 78216 San Luis Potosi (Mexico); Garcia, M E [Theoretische Physik, FB 18, Universitaet Kassel and Center for Interdisciplinary Nanostructure Science and Technology (CINSaT), Heinrich-Plett-Str.40, 34132 Kassel (Germany)

    2009-08-07

    In this paper, we study the numerical optimization of nearest-neighbor concurrence of bipartite one- and two-dimensional lattices, as well as non-bipartite two-dimensional lattices. These systems are described in the framework of a tight-binding Hamiltonian while the optimization of concurrence was performed using genetic algorithms. Our results show that the concurrence of the optimized lattice structures is considerably higher than that of non-optimized systems. In the case of one-dimensional chains, the concurrence increases dramatically when the system begins to dimerize, i.e., it undergoes a structural phase transition (Peierls distortion). This result is consistent with the idea that entanglement is maximal or shows a singularity near quantum phase transitions. Moreover, the optimization of concurrence in two-dimensional bipartite and non-bipartite lattices is achieved when the structures break into smaller subsystems, which are arranged in geometrically distinguishable configurations.

  16. Nearest-neighbor Kitaev exchange blocked by charge order in electron-doped α-RuCl3

    Science.gov (United States)

    Koitzsch, A.; Habenicht, C.; Müller, E.; Knupfer, M.; Büchner, B.; Kretschmer, S.; Richter, M.; van den Brink, J.; Börrnert, F.; Nowak, D.; Isaeva, A.; Doert, Th.

    2017-10-01

    A quantum spin liquid might be realized in α-RuCl3, a honeycomb-lattice magnetic material with substantial spin-orbit coupling. Moreover, α-RuCl3 is a Mott insulator, which implies the possibility that novel exotic phases occur upon doping. Here, we study the electronic structure of this material when intercalated with potassium by photoemission spectroscopy, electron energy loss spectroscopy, and density functional theory calculations. We obtain a stable stoichiometry at K0.5RuCl3. This gives rise to a peculiar charge disproportionation into formally Ru2+ (4d6) and Ru3+ (4d5). Every Ru 4d5 site with one hole in the t2g shell is surrounded by nearest neighbors of 4d6 character, where the t2g level is full and magnetically inert. Thus, each type of Ru site forms a triangular lattice, and nearest-neighbor interactions of the original honeycomb are blocked.

  17. Enhanced Approximate Nearest Neighbor via Local Area Focused Search.

    Energy Technology Data Exchange (ETDEWEB)

    Gonzales, Antonio [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Blazier, Nicholas Paul [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-02-01

    Approximate Nearest Neighbor (ANN) algorithms are increasingly important in machine learning, data mining, and image processing applications. There is a large family of space-partitioning ANN algorithms, such as randomized KD-Trees, that work well in practice but are limited by an exponential increase in similarity comparisons required to optimize recall. Additionally, they only support a small set of similarity metrics. We present Local Area Focused Search (LAFS), a method that enhances the way queries are performed using an existing ANN index. Instead of a single query, LAFS performs a number of smaller (fewer similarity comparisons) queries and focuses on a local neighborhood which is refined as candidates are identified. We show that our technique improves performance on several well known datasets and is easily extended to general similarity metrics using kernel projection techniques.

  18. Predicting Audience Location on the Basis of the k-Nearest Neighbor Multilabel Classification

    Directory of Open Access Journals (Sweden)

    Haitao Wu

    2014-01-01

    Understanding audience location information in online social networks is important in designing recommendation systems, improving information dissemination, and so on. In this paper, we focus on predicting the location distribution of audiences on YouTube. We transform this problem into a multilabel classification problem and find that three problems arise when the classical k-nearest neighbor based algorithm for multilabel classification (ML-kNN) is used to predict location distribution. Firstly, the feature weights are not considered in measuring the degree of similarity. Secondly, it consumes considerable computing time in finding similar items by traversing the whole training set. Thirdly, the goal of ML-kNN is to find relevant labels for every sample, which is different from audience location prediction. To solve these problems, we propose methods for measuring similarity based on weight, quickly finding similar items, and ranking a specific number of labels. On the basis of these methods and ML-kNN, the k-nearest neighbor based model for audience location prediction (AL-kNN) is proposed. The experiments based on massive YouTube data show that the proposed model can more accurately predict the location of YouTube video audiences than the ML-kNN, MLNB, and Rank-SVM methods.

  19. Quality and efficiency in high dimensional Nearest neighbor search

    KAUST Repository

    Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos

    2009-01-01

    Nearest neighbor (NN) search in high dimensional space is an important problem in many applications. Ideally, a practical solution (i) should be implementable in a relational database, and (ii) its query cost should grow sub-linearly with the dataset size, regardless of the data and query distributions. Despite the bulk of NN literature, no solution fulfills both requirements, except locality sensitive hashing (LSH). The existing LSH implementations are either rigorous or adhoc. Rigorous-LSH ensures good quality of query results, but requires expensive space and query cost. Although adhoc-LSH is more efficient, it abandons quality control, i.e., the neighbor it outputs can be arbitrarily bad. As a result, currently no method is able to ensure both quality and efficiency simultaneously in practice. Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast high-dimensional NN search with excellent quality. The combination of several LSB-trees leads to a structure called the LSB-forest that ensures the same result quality as rigorous-LSH, but reduces its space and query cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though the latter has no quality guarantee. Besides its appealing theoretical properties, the LSB-tree itself also serves as an effective index that consumes linear space, and supports efficient updates. Our extensive experiments confirm that the LSB-tree is faster than (i) the state of the art of exact NN search by two orders of magnitude, and (ii) the best (linear-space) method of approximate retrieval by an order of magnitude, and at the same time, returns neighbors with much better quality. © 2009 ACM.

  20. A γ dose distribution evaluation technique using the k-d tree for nearest neighbor searching

    International Nuclear Information System (INIS)

    Yuan Jiankui; Chen Weimin

    2010-01-01

    Purpose: The authors propose an algorithm based on the k-d tree for nearest neighbor searching to improve the γ calculation time for 2D and 3D dose distributions. Methods: The γ calculation method has been widely used for comparisons of dose distributions in clinical treatment plans and quality assurance. By specifying the acceptable dose and distance-to-agreement criteria, the method provides a quantitative measurement of the agreement between the reference and evaluation dose distributions. The γ value indicates the acceptability. In regions where γ≤1, the predefined criterion is satisfied and thus the agreement is acceptable; otherwise, the agreement fails. Although the concept of the method is not complicated and a quick naive implementation is straightforward, an efficient and robust implementation is not trivial. Recent algorithms based on exhaustive searching within a maximum radius, the geometric Euclidean distance, and the table lookup method have been proposed to improve the computational time for multidimensional dose distributions. Motivated by the fact that finding a nearest neighbor with a k-d tree can take as little as O(log N) time, where N is the total number of dose points, the authors propose an algorithm based on the k-d tree for the γ evaluation in this work. Results: In the experiment, the authors found that the average k-d tree construction time per reference point is O(log N), while the nearest neighbor searching time per evaluation point is proportional to O(N^(1/k)), where k is between 2 and 3 for two-dimensional and three-dimensional dose distributions, respectively. Conclusions: Compared with other algorithms such as exhaustive search and sorted list (O(N)), the k-d tree algorithm for γ evaluation is much more efficient.
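
    The γ evaluation described in this abstract can be illustrated with a minimal sketch. It assumes the common definition in which positions are divided by the distance-to-agreement criterion and doses by the dose criterion, so that γ at an evaluation point is the Euclidean distance to the nearest reference point in the scaled space. The function name, the 1D profile layout and the linear scan are illustrative assumptions, not the authors' implementation; a k-d tree built over the scaled reference points would replace the inner scan.

```python
import math

def gamma_index(eval_points, ref_points, dose_tol, dist_tol):
    """Return the gamma value for each evaluation point.

    Each point is (position, dose) with a scalar position (1D profile).
    This layout is an assumption made for the sketch.
    """
    # Scale positions and doses so that both criteria become unit lengths.
    scaled_ref = [(x / dist_tol, d / dose_tol) for x, d in ref_points]
    gammas = []
    for x, d in eval_points:
        q = (x / dist_tol, d / dose_tol)
        # Linear scan; a k-d tree over scaled_ref would replace this loop.
        gammas.append(min(math.dist(q, r) for r in scaled_ref))
    return gammas
```

    Evaluation points with γ ≤ 1 satisfy the chosen criteria; all others fail.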

  1. River Flow Prediction Using the Nearest Neighbor Probabilistic Ensemble Method

    Directory of Open Access Journals (Sweden)

    H. Sanikhani

    2016-02-01

    Introduction: In recent years, researchers have become interested in probabilistic forecasting of hydrologic variables such as river flow. A probabilistic approach aims at quantifying the prediction reliability through a probability distribution function or a prediction interval for the unknown future value. The evaluation of the uncertainty associated with the forecast is seen as fundamental information, not only to correctly assess the prediction, but also to compare forecasts from different methods and to evaluate actions and decisions conditionally on the expected values. Several probabilistic approaches have been proposed in the literature, including (1) methods that use resampling techniques to assess parameter and model uncertainty, such as the Metropolis algorithm or the Generalized Likelihood Uncertainty Estimation (GLUE) methodology for an application to runoff prediction, (2) methods based on processing the forecast errors of past data to produce the probability distributions of future values, and (3) methods that evaluate how the uncertainty propagates from the rainfall forecast to the river discharge prediction, such as the Bayesian forecasting system. Materials and Methods: In this study, two different probabilistic methods are used for river flow prediction, and the uncertainty related to the forecast is then quantified. One approach is based on linear predictors and the other on nearest neighbors. The nonlinear probabilistic ensemble can be used for nonlinear time series analysis using locally linear predictors, while NNPE utilizes a method adapted for one-step-ahead nearest neighbor methods. In this regard, daily river discharge data (twelve years) from the Dizaj and Mashin stations, on the Baranduz-Chay basin in West Azerbaijan province and the Zard-River basin in Khouzestan province, respectively, were used. The first six years of data were used for fitting the model, the next three years for calibration, and the remaining three years for testing the models

  2. Diagnosis of diabetes diseases using an Artificial Immune Recognition System2 (AIRS2) with fuzzy K-nearest neighbor.

    Science.gov (United States)

    Chikh, Mohamed Amine; Saidi, Meryem; Settouti, Nesma

    2012-10-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2, in which we replace the K-nearest neighbors algorithm with the fuzzy K-nearest neighbors algorithm to improve the diagnostic accuracy for diabetes. The diabetes dataset used in our work is retrieved from the UCI machine learning repository. The performances of AIRS2 and MAIRS2 are evaluated in terms of classification accuracy, sensitivity and specificity. The highest classification accuracies obtained with AIRS2 and MAIRS2 using 10-fold cross-validation were 82.69% and 89.10%, respectively.
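
    To make the fuzzy K-nearest neighbors step concrete, here is a minimal sketch of one widely used formulation, in which each of the k nearest training samples votes for its class with an inverse-distance weight and the votes are normalized into class memberships. The function name, the fuzzifier parameter m and the use of crisp training labels are assumptions for illustration; the abstract does not specify the variant used inside MAIRS2.

```python
import math
from collections import defaultdict

def fuzzy_knn(train, query, k=3, m=2.0):
    """Return {label: membership} for `query` from crisp-labeled training data.

    train: list of (feature_vector, label); m > 1 is the fuzzifier (assumed).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    weights = defaultdict(float)
    for features, label in nearest:
        dist = math.dist(features, query)
        # Inverse-distance weight; the small constant avoids division by zero.
        weights[label] += 1.0 / (dist ** (2.0 / (m - 1.0)) + 1e-12)
    total = sum(weights.values())
    # Normalize so that the memberships over all voted classes sum to one.
    return {label: w / total for label, w in weights.items()}
```

    The predicted class is the label with the largest membership, and the membership value itself can serve as a rough confidence score.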

  3. Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds

    Directory of Open Access Journals (Sweden)

    Chin-Hsing Chen

    2015-06-01

    A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds; however, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician's subjective diagnostic experience. This study has developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering, to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle, for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications.

  4. A Novel Hybrid Model Based on Extreme Learning Machine, k-Nearest Neighbor Regression and Wavelet Denoising Applied to Short-Term Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-05-01

    Electric load forecasting plays an important role in electricity markets and power systems. Because electric load time series are complicated and nonlinear, it is very difficult to achieve a satisfactory forecasting accuracy. In this paper, a hybrid model, Wavelet Denoising-Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EWKM), which combines k-Nearest Neighbor (KNN) and Extreme Learning Machine (ELM) based on a wavelet denoising technique, is proposed for short-term load forecasting. The proposed hybrid model first decomposes the time series into a low frequency-associated main signal and some detailed signals associated with high frequencies, then uses KNN to determine the independent and dependent variables from the low-frequency signal. Finally, the ELM is used to capture the non-linear relationship between these variables and produce the final prediction result for the electric load. Compared with three other models, Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EKM), Wavelet Denoising-Extreme Learning Machine (WKM) and Wavelet Denoising-Back Propagation Neural Network optimized by k-Nearest Neighbor Regression (WNNM), the model proposed in this paper can improve the accuracy effectively. New South Wales is the economic powerhouse of Australia, so we use the proposed model to predict electric demand for that region. Accurate prediction is of considerable significance.

  5. False-nearest-neighbors algorithm and noise-corrupted time series

    International Nuclear Information System (INIS)

    Rhodes, C.; Morari, M.

    1997-01-01

    The false-nearest-neighbors (FNN) algorithm was originally developed to determine the embedding dimension for autonomous time series. For noise-free computer-generated time series, the algorithm does a good job in predicting the embedding dimension. However, the problem of predicting the embedding dimension when the time-series data are corrupted by noise was not fully examined in the original studies of the FNN algorithm. Here it is shown that with large data sets, even small amounts of noise can lead to incorrect prediction of the embedding dimension. Surprisingly, as the length of the time series analyzed by FNN grows larger, the cause of incorrect prediction becomes more pronounced. An analysis of the effect of noise on the FNN algorithm and a solution for dealing with the effects of noise are given here. Some results on the theoretically correct choice of the FNN threshold are also presented. copyright 1997 The American Physical Society
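
    The basic FNN test discussed in this abstract can be sketched as follows. The sketch assumes the usual distance-ratio criterion: a nearest neighbor found in the d-dimensional delay embedding is counted as false if adding the next delay coordinate separates the pair by more than a tolerance times their original distance. The function name, the default tolerance and the brute-force neighbor search are illustrative assumptions, and the noise correction proposed by the authors is not included.

```python
import math

def false_neighbor_fraction(series, dim, delay=1, ratio_tol=10.0):
    """Fraction of nearest neighbors in `dim` dimensions that are false."""
    # Keep only indices whose (dim + 1)-th delay coordinate exists.
    n = len(series) - dim * delay
    vectors = [[series[i + j * delay] for j in range(dim)] for i in range(n)]
    false_count = 0
    for i in range(n):
        # Brute-force nearest neighbor of vector i in the dim-dimensional embedding.
        j = min((c for c in range(n) if c != i),
                key=lambda c: math.dist(vectors[i], vectors[c]))
        base = math.dist(vectors[i], vectors[j])
        extra = abs(series[i + dim * delay] - series[j + dim * delay])
        # False neighbor: the added coordinate separates the pair strongly.
        if extra > ratio_tol * base:
            false_count += 1
    return false_count / n
```

    The embedding dimension is then usually taken as the smallest `dim` at which this fraction drops to (nearly) zero, which is exactly the decision that noise can distort.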

  6. Nearest Neighbor Estimates of Entropy for Multivariate Circular Distributions

    Directory of Open Access Journals (Sweden)

    Neeraj Misra

    2010-05-01

    In molecular sciences, the estimation of entropies of molecules is important for the understanding of many chemical and biological processes. Motivated by these applications, we consider the problem of estimating the entropies of circular random vectors and introduce non-parametric estimators based on circular distances between n sample points and their kth nearest neighbors (NN), where k (≤ n – 1) is a fixed positive integer. The proposed NN estimators are based on two different circular distances, and are proven to be asymptotically unbiased and consistent. The performance of one of the circular-distance estimators is investigated and compared with that of the already established Euclidean-distance NN estimator using Monte Carlo samples from an analytic distribution of six circular variables of an exactly known entropy and a large sample of seven internal-rotation angles in the molecule of tartaric acid, obtained by a realistic molecular-dynamics simulation.

  7. A Comparison of the Spatial Linear Model to Nearest Neighbor (k-NN) Methods for Forestry Applications

    Science.gov (United States)

    Jay M. Ver Hoef; Hailemariam Temesgen; Sergio Gómez

    2013-01-01

    Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of aerial totals or averages are required. In this paper, two approaches for mapping and estimating totals, the spatial linear model (SLM) and k-NN (k-Nearest Neighbor), are compared, theoretically,...

  8. Sequential nearest-neighbor effects on computed ¹³Cα chemical shifts

    Energy Technology Data Exchange (ETDEWEB)

    Vila, Jorge A. [Cornell University, Baker Laboratory of Chemistry and Chemical Biology (United States); Serrano, Pedro; Wuethrich, Kurt [The Scripps Research Institute, Department of Molecular Biology (United States); Scheraga, Harold A., E-mail: has5@cornell.ed [Cornell University, Baker Laboratory of Chemistry and Chemical Biology (United States)

    2010-09-15

    To evaluate sequential nearest-neighbor effects on quantum-chemical calculations of ¹³Cα chemical shifts, we selected the structure of the nucleic acid binding (NAB) protein from the SARS coronavirus determined by NMR in solution (PDB id 2K87). NAB is a 116-residue α/β protein, which contains 9 prolines and has 50% of its residues located in loops and turns. Overall, the results presented here show that sizeable nearest-neighbor effects are seen only for residues preceding proline, where Pro introduces an overestimation, on average, of 1.73 ppm in the computed ¹³Cα chemical shifts. A new ensemble of 20 conformers representing the NMR structure of the NAB, which was calculated with an input containing backbone torsion angle constraints derived from the theoretical ¹³Cα chemical shifts as supplementary data to the NOE distance constraints, exhibits very similar topology and comparable agreement with the NOE constraints as the published NMR structure. However, the two structures differ in the patterns of differences between observed and computed ¹³Cα chemical shifts, Δ_ca,i, for the individual residues along the sequence. This indicates that the Δ_ca,i values for the NAB protein are primarily a consequence of the limited sampling by the bundles of 20 conformers used, as in common practice, to represent the two NMR structures, rather than of local flaws in the structures.

  9. Estimating cavity tree and snag abundance using negative binomial regression models and nearest neighbor imputation methods

    Science.gov (United States)

    Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett

    2009-01-01

    Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....

  10. FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

    OpenAIRE

    Lu Si; Jie Yu; Shasha Li; Jun Ma; Lei Luo; Qingbo Wu; Yongqi Ma; Zhengji Liu

    2017-01-01

    Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rul...

  11. Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance

    Science.gov (United States)

    Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi

    2017-11-01

    K-nearest neighbors (KNN) is a common classification algorithm and also a subroutine in various more complicated machine learning tasks. In this paper, we present a quantum algorithm (QKNN) that implements it based on the metric of Hamming distance. We put forward a quantum circuit for computing the Hamming distance between a testing sample and each feature vector in the training set. Taking advantage of this method, we realize a good analog of the classical KNN algorithm by setting a distance threshold value t to select the k nearest neighbors. As a result, QKNN achieves O(n^3) performance, which depends only on the dimension of the feature vectors, and high classification accuracy, outperforming Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
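
    A purely classical analog of the thresholding idea may help in reading this abstract: neighbors are selected by a Hamming-distance threshold t instead of a fixed k, and the majority label among them is returned. The names and the bit-string representation below are assumptions for illustration; nothing here models the quantum circuit.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def classify_by_threshold(train, query, t):
    """Majority label among training vectors within Hamming distance t."""
    votes = Counter(label for bits, label in train if hamming(bits, query) <= t)
    # No neighbor within the threshold: leave the sample unclassified.
    return votes.most_common(1)[0][0] if votes else None
```

    Raising t admits more neighbors, so t plays the role that k plays in the classical algorithm.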

  12. A Novel Quantum Solution to Privacy-Preserving Nearest Neighbor Query in Location-Based Services

    Science.gov (United States)

    Luo, Zhen-yu; Shi, Run-hua; Xu, Min; Zhang, Shun

    2018-04-01

    We present a cheating-sensitive quantum protocol for Privacy-Preserving Nearest Neighbor Query based on Oblivious Quantum Key Distribution and Quantum Encryption. Compared with the classical related protocols, our proposed protocol has higher security, because the security of our protocol is based on basic physical principles of quantum mechanics, instead of difficulty assumptions. Especially, our protocol takes single photons as quantum resources and only needs to perform single-photon projective measurement. Therefore, it is feasible to implement this protocol with the present technologies.

  13. Competing growth processes induced by next-nearest-neighbor interactions: Effects on meandering wavelength and stiffness

    Science.gov (United States)

    Blel, Sonia; Hamouda, Ajmi BH.; Mahjoub, B.; Einstein, T. L.

    2017-02-01

    In this paper we explore the meandering instability of vicinal steps with a kinetic Monte Carlo (kMC) simulation model that includes attractive next-nearest-neighbor (NNN) interactions. kMC simulations show that increasing the NNN interaction strength leads to a considerable reduction of the meandering wavelength and to a weaker dependence of the wavelength on the deposition rate F. The dependences of the meandering wavelength on the temperature and the deposition rate obtained with simulations are in good quantitative agreement with the experimental result on the meandering instability of Cu(0 2 24) [T. Maroutian et al., Phys. Rev. B 64, 165401 (2001), 10.1103/PhysRevB.64.165401]. The effective step stiffness is found to depend not only on the strength of NNN interactions and the Ehrlich-Schwoebel barrier, but also on F. We argue that attractive NNN interactions intensify the incorporation of adatoms at step edges and enhance step roughening. Competition between NNN and nearest-neighbor interactions results in an alternative form of meandering instability which we call "roughening-limited" growth, rather than attachment-detachment-limited growth that governs the Bales-Zangwill instability. The computed effective wavelength and the effective stiffness behave as λ_eff ∼ F^(-q) and β̃_eff ∼ F^(-p), respectively, with q ≈ p/2.

  14. Implementation of Nearest Neighbor using HSV to Identify Skin Disease

    Science.gov (United States)

    Gerhana, Y. A.; Zulfikar, W. B.; Ramdani, A. H.; Ramdhani, M. A.

    2018-01-01

    Today, Android is one of the most widely used operating systems in the world. Most Android devices have a camera that can capture an image, and this feature can be used to identify skin disease. Skin disease is a health problem caused by bacteria, fungi, and viruses, and its symptoms are usually visible. In this work, the symptoms are captured as an image, and every pixel of the image carries HSV values. The HSV values are extracted and used to calculate a Euclidean distance. The nearest neighbor algorithm compares these distances to find the training image closest to the test image, which decides the class label, or type of skin disease. The test results show that 166 of 200 classifications, or about 80%, are accurate. Several factors influence the result of the classification model, such as the number of training images and the quality of the Android device's camera.
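
    The pipeline in this abstract (convert pixels to HSV, compute a Euclidean distance, pick the nearest training image) can be sketched with the Python standard library. Summarizing each image by its mean HSV value is an assumption made here to keep the example short; the abstract does not say how per-pixel HSV values are aggregated, and the function names are hypothetical.

```python
import colorsys
import math

def mean_hsv(pixels):
    """Average HSV of an image given as a list of (r, g, b) values in 0..255."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    # Plain averaging ignores that hue is circular; acceptable for a sketch.
    return tuple(sum(channel) / len(hsv) for channel in zip(*hsv))

def nearest_label(training, query_pixels):
    """Label of the training image whose mean HSV is closest to the query's."""
    query = mean_hsv(query_pixels)
    return min(training, key=lambda item: math.dist(mean_hsv(item[0]), query))[1]
```

    Here `training` is a list of (pixels, label) pairs; with more training images per class, the same distance could feed a k-nearest-neighbor vote instead of a single nearest match.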

  15. Classification of matrix-product ground states corresponding to one-dimensional chains of two-state sites of nearest neighbor interactions

    International Nuclear Information System (INIS)

    Fatollahi, Amir H.; Khorrami, Mohammad; Shariati, Ahmad; Aghamohammadi, Amir

    2011-01-01

    A complete classification is given for one-dimensional chains with nearest-neighbor interactions having two states in each site, for which a matrix product ground state exists. The Hamiltonians and their corresponding matrix product ground states are explicitly obtained.

  16. Correction of dental artifacts within the anatomical surface in PET/MRI using active shape models and k-nearest-neighbors

    DEFF Research Database (Denmark)

    Ladefoged, Claes N.; Andersen, Flemming L.; Keller, Sune H.

    2014-01-01

    In combined PET/MR, attenuation correction (AC) is performed indirectly based on the available MR image information. Metal implant-induced susceptibility artifacts and subsequent signal voids challenge MR-based AC. Several papers acknowledge the problem in PET attenuation correction when dental...... artifacts are ignored, but none of them attempts to solve the problem. We propose a clinically feasible correction method which combines Active Shape Models (ASM) and k-Nearest-Neighbors (kNN) into a simple approach which finds and corrects the dental artifacts within the surface boundaries of the patient...... anatomy. ASM is used to locate a number of landmarks in the T1-weighted MR-image of a new patient. We calculate a vector of offsets from each voxel within a signal void to each of the landmarks. We then use kNN to classify each voxel as belonging to an artifact or an actual signal void using this offset...

  17. k-Nearest Neighbors Algorithm in Profiling Power Analysis Attacks

    Directory of Open Access Journals (Sweden)

    Z. Martinasek

    2016-06-01

    Power analysis presents the typical example of successful attacks against trusted cryptographic devices such as RFID (Radio-Frequency IDentification) and contact smart cards. In recent years, the cryptographic community has explored new approaches in power analysis based on machine learning models such as Support Vector Machine (SVM), RF (Random Forest) and Multi-Layer Perceptron (MLP). In this paper, we made an extensive comparison of machine learning algorithms in power analysis. For this purpose, we implemented a verification program that always chooses the optimal settings of individual machine learning models in order to obtain the best classification accuracy. In our research, we used three datasets, the first containing the power traces of an unprotected AES (Advanced Encryption Standard) implementation. The second and third datasets were created independently from publicly available power traces corresponding to a masked AES implementation (DPA Contest v4). The obtained results revealed some interesting facts, namely, that an elementary k-NN (k-Nearest Neighbors) algorithm, which has not been commonly used in power analysis yet, shows great application potential in practice.

  18. Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor

    OpenAIRE

    Samir Brahim Belhaouari

    2009-01-01

    By taking advantage of both k-NN, which is highly accurate, and K-means clustering, which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as a "variable k"-NN dealing with the centroid or mean point of all subclasses generated by a clustering algorithm. In general, the K-means clustering algorithm is not stable in terms of accuracy; for that reason we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less ...

  19. Nearest neighbor spacing distributions of low-lying levels of vibrational nuclei

    International Nuclear Information System (INIS)

    Abul-Magd, A.Y.; Simbel, M.H.

    1996-01-01

    Energy-level statistics are considered for nuclei whose Hamiltonian is divided into intrinsic and collective-vibrational terms. The levels are described as a random superposition of independent sequences, each corresponding to a given number of phonons. The intrinsic motion is assumed chaotic. The level spacing distribution is found to be intermediate between the Wigner and Poisson distributions and similar in form to the spacing distribution of a system with classical phase space divided into separate regular and chaotic domains. We have obtained approximate expressions for the nearest neighbor spacing and cumulative spacing distribution valid when the level density is described by a constant-temperature formula and not involving additional free parameters. These expressions have been able to achieve good agreement with the experimental spacing distributions. copyright 1996 The American Physical Society

  20. Common Nearest Neighbor Clustering—A Benchmark

    Directory of Open Access Journals (Sweden)

    Oliver Lemke

    2018-02-01

    Cluster analyses are often conducted with the goal of characterizing an underlying probability density, for which the data-point density serves as an estimate. We here test and benchmark the common nearest neighbor (CNN) cluster algorithm. This algorithm assigns a spherical neighborhood R to each data point and estimates the data-point density between two data points as the number of data points N in the overlapping region of their neighborhoods (step 1). The main principle in the CNN cluster algorithm is cluster growing. This grows the clusters by sequentially adding data points and thereby effectively positions the border of the clusters along an iso-surface of the underlying probability density. This yields a strict partitioning with outliers, for which the clusters represent peaks in the underlying probability density, termed core sets (step 2). The removal of the outliers on the basis of a threshold criterion is optional (step 3). The benchmark datasets address a series of typical challenges, including datasets with a very high dimensional state space and datasets in which the cluster centroids are aligned along an underlying structure (Birch sets). The performance of the CNN algorithm is evaluated with respect to these challenges. The results indicate that the CNN cluster algorithm can be useful in a wide range of settings. Cluster algorithms are particularly important for the analysis of molecular dynamics (MD) simulations. We demonstrate how the CNN cluster results can be used as a discretization of the molecular state space for the construction of a core-set model of the MD, improving the accuracy compared to conventional full-partitioning models. The software for the CNN clustering is available on GitHub.
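
    Steps 1 and 2 of this abstract can be sketched in a few lines. The sketch assumes that two points are density-connected when at least `min_shared` other points lie within `radius` of both of them, and that clusters are grown by following such connections; points that connect to nothing are labeled -1 as outliers. The names, the brute-force neighbor search and the exact linkage rule are assumptions, and the published software may impose further conditions.

```python
import math

def cnn_cluster(points, radius, min_shared):
    """Label points by common-nearest-neighbor linkage; -1 marks outliers."""
    n = len(points)
    # Neighborhood of each point: indices of other points within `radius`.
    hood = [{j for j in range(n)
             if j != i and math.dist(points[i], points[j]) <= radius}
            for i in range(n)]
    labels = [-1] * n
    next_label = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        # Grow a cluster from the seed by following density-connected pairs.
        members, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                shared = len(hood[i] & hood[j])
                if j not in members and labels[j] == -1 and shared >= min_shared:
                    members.add(j)
                    frontier.append(j)
        # Singletons stay labeled -1; larger groups receive a cluster id.
        if len(members) > 1:
            for i in members:
                labels[i] = next_label
            next_label += 1
    return labels
```

    Increasing `min_shared` or decreasing `radius` raises the density threshold, which moves cluster borders toward the density peaks and turns more points into outliers.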

  1. Mapping wildland fuels and forest structure for land management: a comparison of nearest neighbor imputation and other methods

    Science.gov (United States)

    Kenneth B. Pierce; Janet L. Ohmann; Michael C. Wimberly; Matthew J. Gregory; Jeremy S. Fried

    2009-01-01

    Land managers need consistent information about the geographic distribution of wildland fuels and forest structure over large areas to evaluate fire risk and plan fuel treatments. We compared spatial predictions for 12 fuel and forest structure variables across three regions in the western United States using gradient nearest neighbor (GNN) imputation, linear models (...

  2. K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

    Directory of Open Access Journals (Sweden)

    Cheng Lu

    2015-01-01

    The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it is not directly applicable to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is modified to handle the interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.

  3. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications

    Directory of Open Access Journals (Sweden)

    Jaime Vitola

    2017-02-01

    Civil and military structures are susceptible and vulnerable to damage due to environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated great versatility and benefit, since the inspection system can be automated. This automation is carried out with signal processing tasks aimed at pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally, (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed.

  4. Influence of geometry on light harvesting in dendrimeric systems. II. nth-nearest neighbor effects and the onset of percolation

    International Nuclear Information System (INIS)

    Bentz, Jonathan L.; Kozak, John J.

    2006-01-01

    We explore the effect of imposing different constraints (biases, boundary conditions) on the mean time to trapping (or mean walklength) for a particle (excitation) migrating on a finite dendrimer lattice with a centrally positioned trap. By mobilizing the theory of finite Markov processes, we are able to obtain exact analytic expressions for site-specific walklengths as well as the overall walklength for both nearest-neighbor and second-nearest-neighbor displacements. This allows the comparison with and generalization of earlier results [A. Bar-Haim, J. Klafter, J. Phys. Chem. B 102 (1998) 1662; A. Bar-Haim, J. Klafter, J. Lumin. 76, 77 (1998) 197; O. Flomenbom, R.J. Amir, D. Shabat, J. Klafter, J. Lumin. 111 (2005) 315; J.L. Bentz, F.N. Hosseini, J.J. Kozak, Chem. Phys. Lett. 370 (2003) 319]. A novel feature of this work is the establishment of a connection between the random walk models studied here and percolation theory. The full dynamical behavior was also determined via solution of the stochastic master equation, and the results obtained compared with recent spectroscopic experiments
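
    The mean walklength to a trap mentioned in this abstract follows from a standard result for absorbing Markov chains: if Q holds the transition probabilities among non-trap sites, the vector t of mean steps to trapping solves (I - Q) t = 1. The sketch below applies this to an unbiased nearest-neighbor walk on an arbitrary small graph. The function name, the adjacency-dict input and the three-site chain in the usage note are illustrative assumptions and do not reproduce the dendrimer lattices of the paper.

```python
def mean_walklengths(neighbors, trap):
    """Mean number of steps to reach `trap` from every other site.

    neighbors: dict mapping each site to the list of sites it can hop to,
    each hop taken with equal probability (unbiased walk, an assumption).
    """
    sites = [s for s in neighbors if s != trap]
    index = {s: i for i, s in enumerate(sites)}
    n = len(sites)
    # Augmented system (I - Q) t = 1, with Q the non-trap transition matrix.
    rows = [[0.0] * n + [1.0] for _ in range(n)]
    for s in sites:
        i = index[s]
        rows[i][i] += 1.0
        for nb in neighbors[s]:
            if nb != trap:
                rows[i][index[nb]] -= 1.0 / len(neighbors[s])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {s: rows[index[s]][n] for s in sites}
```

    For the chain 0 - 1 - 2 with the trap at site 0, `mean_walklengths({0: [1], 1: [0, 2], 2: [1]}, 0)` gives site-specific walklengths of 3 and 4 steps, so the overall walklength averaged over starting sites is 3.5. Second-nearest-neighbor displacements can be modeled by adding those sites to each adjacency list.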

  5. Influence of geometry on light harvesting in dendrimeric systems. II. nth-nearest neighbor effects and the onset of percolation

    Energy Technology Data Exchange (ETDEWEB)

    Bentz, Jonathan L. [Department of Chemistry, Iowa State University, Ames, IA, 50011 (United States)]. E-mail: jnbntz@iastate.edu; Kozak, John J. [Beckman Institute, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125-7400 (United States)

    2006-11-15

    We explore the effect of imposing different constraints (biases, boundary conditions) on the mean time to trapping (or mean walklength) for a particle (excitation) migrating on a finite dendrimer lattice with a centrally positioned trap. By mobilizing the theory of finite Markov processes, we are able to obtain exact analytic expressions for site-specific walklengths as well as the overall walklength for both nearest-neighbor and second-nearest-neighbor displacements. This allows the comparison with and generalization of earlier results [A. Bar-Haim, J. Klafter, J. Phys. Chem. B 102 (1998) 1662; A. Bar-Haim, J. Klafter, J. Lumin. 76, 77 (1998) 197; O. Flomenbom, R.J. Amir, D. Shabat, J. Klafter, J. Lumin. 111 (2005) 315; J.L. Bentz, F.N. Hosseini, J.J. Kozak, Chem. Phys. Lett. 370 (2003) 319]. A novel feature of this work is the establishment of a connection between the random walk models studied here and percolation theory. The full dynamical behavior was also determined via solution of the stochastic master equation, and the results obtained compared with recent spectroscopic experiments.

  6. Chaotic synchronization of nearest-neighbor diffusive coupling Hindmarsh-Rose neural networks in noisy environments

    International Nuclear Information System (INIS)

    Fang Xiaoling; Yu Hongjie; Jiang Zonglai

    2009-01-01

    The chaotic synchronization of Hindmarsh-Rose (HR) neural networks linked by a nonlinear coupling function is discussed. HR neural networks with nearest-neighbor diffusive coupling are treated as numerical examples. By constructing a special nonlinear coupling term, the chaotic system is coupled symmetrically. For networks of three and four neurons, a region of coupling strength corresponding to full synchronization is given, and the effects of network structure and noise position are analyzed. For networks of five or more neurons, full synchronization is very difficult to realize. All the results have been confirmed by calculating the maximum conditional Lyapunov exponent.

  7. A LITERATURE SURVEY ON VARIOUS ILLUMINATION NORMALIZATION TECHNIQUES FOR FACE RECOGNITION WITH FUZZY K NEAREST NEIGHBOUR CLASSIFIER

    Directory of Open Access Journals (Sweden)

    A. Thamizharasi

    2015-05-01

    Face recognition is now widely used in video surveillance, social networks and criminal identification. Its performance is affected by variations in illumination, pose and aging, and by partial occlusion of the face by hats, scarves, glasses, etc. Illumination variation remains a challenging problem in face recognition. The aim is to compare various illumination normalization techniques. These include: log transformations, power-law transformations, histogram equalization, adaptive histogram equalization, contrast stretching, Retinex, multi-scale Retinex, difference of Gaussian, DCT, DCT normalization, DWT, gradient face, self quotient, multi-scale self quotient and homomorphic filter. The proposed work consists of three steps. The first step is to preprocess the face image with the above illumination normalization techniques; the second step is to create the train and test databases from the preprocessed face images; and the third step is to recognize the face images using a fuzzy k-nearest neighbor classifier. The face recognition accuracy of all preprocessing techniques is compared using the AR face database of color images.

  8. Sistem Klasifikasi Kualitas Kopra Berdasarkan Warna dan Tekstur Menggunakan Metode Nearest Mean Classifier (NMC)

    Directory of Open Access Journals (Sweden)

    Abdullah Abdullah

    2017-12-01

    Computer-aided classification of copra quality using image processing can speed up human work. Data mining techniques can be used to classify copra quality based on RGB color (red, green, blue) and texture (energy, contrast, correlation, homogeneity). The problem is the difficulty of predicting copra quality as grade A (80-85%), grade B (70-75%) or grade C (60-65%). The purpose of this study is to develop an application for classifying copra quality based on color and texture. The method used is the nearest mean classifier (NMC). Before classification, preprocessing performs background subtraction using the pixel subtraction method to separate the object from the background. The benefits of this research are that it can save time in classifying copra quality and can facilitate the determination of the copra price. Evaluation by cross-validation gave an average accuracy of 80.67% with a standard deviation of 1.17%. Keywords: classification, image, copra, nearest mean classifier, pixel subtraction, RGB color, texture
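
    The nearest mean classifier named in this abstract can be sketched in a few lines: each class is summarized by the mean of its training feature vectors, and a new sample takes the label of the closest mean. This is a minimal sketch under stated assumptions: the two-component feature vectors, the grade labels and the function names fit_class_means and predict are hypothetical, and Euclidean distance is assumed because the abstract does not state the metric.

```python
import math


def fit_class_means(samples, labels):
    """Return {label: mean feature vector} computed from the training data."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(means, features):
    """Assign the label whose class mean is nearest (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(means[label], features))


# Hypothetical (color, texture) features for two grades, for illustration only.
train = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2]]
grades = ["A", "A", "C", "C"]
class_means = fit_class_means(train, grades)
predicted_grade = predict(class_means, [0.7, 0.7])
```

    Because only one mean per class is stored, prediction cost grows with the number of classes rather than with the number of training images.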

  9. Studying nearest neighbor correlations by atom probe tomography (APT) in metallic glasses as exemplified for Fe40Ni40B20 glassy ribbons

    KAUST Repository

    Shariq, Ahmed

    2012-01-01

    A next nearest neighbor evaluation procedure of atom probe tomography data provides distributions of the distances between atoms. The width of these distributions for metallic glasses studied so far is a few Angstroms, reflecting the spatial resolution of the analytical technique. However, fitting Gaussian distributions to the distribution of atomic distances yields average distances with statistical uncertainties of 2 to 3 hundredths of an Angstrom. Fe40Ni40B20 metallic glass ribbons are characterized this way in the as-quenched state and for a state heat treated at 350 °C for 1 h, revealing a change in the structure on the sub-nanometer scale. By applying the statistical tool of the χ2 test, a slight deviation from a random distribution of B atoms in the as-quenched sample is perceived, whereas a pronounced elemental inhomogeneity of boron is detected for the annealed state. In addition, the distance distribution of the first fifteen atomic neighbors is determined by using this algorithm for both annealed and as-quenched states. The next neighbor evaluation algorithm evinces a steric periodicity of the atoms when the next neighbor distances are normalized by the first next neighbor distance. A comparison of the nearest neighbor atomic distribution for the as-quenched and annealed states shows accumulation of Ni and B. Moreover, it also reveals the tendency of Fe and B to move slightly away from each other, an incipient step to Ni-rich boride formation. © 2011 Elsevier B.V.

  10. Polymers with nearest- and next nearest-neighbor interactions on the Husimi lattice

    Science.gov (United States)

    Oliveira, Tiago J.

    2016-04-01

    The exact grand-canonical solution of a generalized interacting self-avoiding walk (ISAW) model, placed on a Husimi lattice built with squares, is presented. In this model, beyond the traditional interaction $\omega_1 = e^{\varepsilon_1/k_B T}$ between (nonconsecutive) monomers on nearest-neighbor (NN) sites, an additional energy $\varepsilon_2$ is associated with next-NN (NNN) monomers. Three definitions of NNN sites/interactions are considered, where each monomer can have, effectively, at most two, four, or six NNN monomers on the Husimi lattice. The phase diagrams found in all cases have (qualitatively) the same thermodynamic properties: a non-polymerized (NP) and a polymerized (P) phase separated by a critical and a coexistence surface that meet at a tricritical (θ-) line. This θ-line is found even when one of the interactions is repulsive, existing for $\omega_1$ in the range $[0,\infty)$, i.e., for $\varepsilon_1/k_B T$ in the range $[-\infty,\infty)$. Thus, counterintuitively, a θ-point exists even for an infinite repulsion between NN monomers ($\omega_1 = 0$), being associated with a coil-‘soft globule’ transition. In the limit of an infinite repulsive force between NNN monomers, however, the coil-globule transition disappears, and only a continuous NP-P transition is observed. This particular case, with $\omega_2 = 0$, is also solved exactly on the square lattice, using a transfer matrix calculation where a discontinuous NP-P transition is found. For attractive and repulsive forces between NN and NNN monomers, respectively, the model becomes quite similar to the semiflexible-ISAW one, whose crystalline phase is not observed here, as a consequence of the frustration due to competing NN and NNN forces. The mapping of the phase diagrams into canonical ones is discussed and compared with recent results from Monte Carlo simulations on the square lattice.

  11. Classifier Fusion With Contextual Reliability Evaluation.

    Science.gov (United States)

    Liu, Zhunga; Pan, Quan; Dezert, Jean; Han, Jun-Wei; He, You

    2018-05-01

    Classifier fusion is an efficient strategy to improve the classification performance for the complex pattern recognition problem. In practice, the multiple classifiers to combine can have different reliabilities and the proper reliability evaluation plays an important role in the fusion process for getting the best classification performance. We propose a new method for classifier fusion with contextual reliability evaluation (CF-CRE) based on inner reliability and relative reliability concepts. The inner reliability, represented by a matrix, characterizes the probability of the object belonging to one class when it is classified to another class. The elements of this matrix are estimated from the k-nearest neighbors of the object. A cautious discounting rule is developed under the belief functions framework to revise the classification result according to the inner reliability. The relative reliability is evaluated based on a new incompatibility measure, which makes it possible to reduce the level of conflict between the classifiers by applying the classical evidence discounting rule to each classifier before their combination. The inner reliability and relative reliability capture different aspects of the classification reliability. The discounted classification results are combined with Dempster-Shafer's rule for the final class decision making support. The performance of CF-CRE has been evaluated and compared with that of the main classical fusion methods using real data sets. The experimental results show that CF-CRE can produce substantially higher accuracy than other fusion methods in general. Moreover, CF-CRE is robust to changes in the number of nearest neighbors chosen for estimating the reliability matrix, which is appealing for applications.

  12. Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization

    Science.gov (United States)

    Jamaluddin; Siringoringo, Rimbun

    2017-12-01

    Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The fuzzy concepts in this method improve its performance on almost all classification problems. The main drawback of FkNN is that its parameters, the number of neighbors (k) and the fuzzy strength (m), are difficult to determine. Both parameters are very sensitive, and no theory or guideline indicates how ‘m’ and ‘k’ should be chosen, which makes FkNN difficult to control. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best values of ‘k’ and ‘m’. MPSO here is based on the Constriction Factor Method, an improvement of PSO intended to avoid local optima. The proposed model was tested on the German Credit Dataset, a standardized test dataset from the UCI Machine Learning Repository that is widely applied to classification problems. Applying MPSO to the determination of the FkNN parameters is expected to increase classification performance. The experiments indicate that the proposed model gives better classification performance than the FkNN model alone: the proposed model has an accuracy of 81%, while the FkNN model has an accuracy of 70%. Finally, the proposed model is compared with two other classification models, Naive Bayes and Decision Tree. The proposed model performs better; Naive Bayes has an accuracy of 75% and the decision tree model 70%.
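
    To make the roles of the two parameters concrete, the sketch below implements a common form of fuzzy k-NN in which each of the k nearest neighbors votes with weight 1/d^(2/(m-1)), so that m controls how strongly distance discounts a vote. It assumes crisp training labels and Euclidean distance; the abstract does not say which FkNN variant the study tunes, and the function name fuzzy_knn and the sample data are hypothetical. The MPSO search over k and m is not shown.

```python
import math


def fuzzy_knn(train, labels, query, k=3, m=2.0):
    """Return {label: membership} for `query`; memberships sum to 1.

    k is the number of neighbors, m > 1 is the fuzzy strength.
    """
    neighbors = sorted(zip(train, labels),
                       key=lambda pair: math.dist(pair[0], query))[:k]
    weights, total = {}, 0.0
    for point, label in neighbors:
        # Guard against a zero distance when the query equals a training point.
        distance = max(math.dist(point, query), 1e-12)
        weight = 1.0 / distance ** (2.0 / (m - 1.0))
        weights[label] = weights.get(label, 0.0) + weight
        total += weight
    return {label: weight / total for label, weight in weights.items()}


# Hypothetical two-class data, for illustration only.
points = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
classes = ["a", "a", "b", "b"]
memberships = fuzzy_knn(points, classes, [0.0, 0.5], k=3, m=2.0)
best_label = max(memberships, key=memberships.get)
```

    Larger m flattens the weights toward a plain majority vote, while m close to 1 lets the single nearest neighbor dominate, which is one way to see why the two parameters interact and are hard to set by hand.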

  13. Nearest neighbor imputation using spatial-temporal correlations in wireless sensor networks.

    Science.gov (United States)

    Li, YuanYuan; Parker, Lynne E

    2014-01-01

    Missing data is common in Wireless Sensor Networks (WSNs), especially with multi-hop communications. There are many reasons for this phenomenon, such as unstable wireless communications, synchronization issues, and unreliable sensors. Unfortunately, missing data creates a number of problems for WSNs. First, since most sensor nodes in the network are battery-powered, it is too expensive to have the nodes retransmit missing data across the network. Data re-transmission may also cause time delays when detecting abnormal changes in an environment. Furthermore, localized reasoning techniques on sensor nodes (such as machine learning algorithms to classify states of the environment) are generally not robust enough to handle missing data. Since sensor data collected by a WSN is generally correlated in time and space, we illustrate how replacing missing sensor values with spatially and temporally correlated sensor values can significantly improve the network's performance. However, our studies show that it is important to determine which nodes are spatially and temporally correlated with each other. Simple techniques based on Euclidean distance are not sufficient for complex environmental deployments. Thus, we have developed a novel Nearest Neighbor (NN) imputation method that estimates missing data in WSNs by learning spatial and temporal correlations between sensor nodes. To improve the search time, we utilize a k-d tree data structure, which is a non-parametric, data-driven binary search tree. Instead of using traditional mean and variance of each dimension for k-d tree construction, and Euclidean distance for k-d tree search, we use weighted variances and weighted Euclidean distances based on measured percentages of missing data. We have evaluated this approach through experiments on sensor data from a volcano dataset collected by a network of Crossbow motes, as well as experiments using sensor data from a highway traffic monitoring application. Our experimental

  14. Algoritma Interpolasi Nearest-Neighbor untuk Pendeteksian Sampul Pulsa Oscilometri Menggunakan Mikrokontroler Berbiaya Rendah

    Directory of Open Access Journals (Sweden)

    Firdaus Firdaus

    2017-12-01

    Non-invasive blood pressure measurement devices are widely available in the marketplace. Most of these devices use the oscillometric principle, storing and analyzing oscillometric waveforms during cuff deflation to obtain mean arterial pressure, systolic blood pressure and diastolic blood pressure. Those pressure values are determined from the oscillometric waveform envelope. Several methods for detecting the envelope of oscillometric pulses use complex algorithms that require a large memory capacity and are therefore difficult to run on an embedded system with little memory. A simple nearest-neighbor interpolation method is applied to oscillometric pulse envelope detection in non-invasive blood pressure measurement using a microcontroller such as the ATmega328. The experiment yields an average computation time of 59 seconds with a 3.6% average percent error in blood pressure measurement.
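
    Nearest-neighbor interpolation itself is simple enough to show directly: the value at any query position is copied from the closest known sample, so no arithmetic beyond comparisons is needed. The sketch below is an illustration only; the pulse-peak times and amplitudes are made up, the function name is hypothetical, and it is written in Python rather than as microcontroller firmware.

```python
def nearest_neighbor_interpolate(xs, ys, x):
    """Return the y value of the known sample whose x position is closest to `x`."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[nearest]


# Hypothetical pulse peaks: times in seconds and peak amplitudes (arbitrary units).
peak_times = [0.0, 1.0, 2.0]
peak_amplitudes = [10, 30, 20]

# Envelope estimate on a denser time grid, one value per query time.
query_times = [0.2, 0.4, 0.6, 0.8, 1.2, 1.7]
envelope = [nearest_neighbor_interpolate(peak_times, peak_amplitudes, t)
            for t in query_times]
```

    The resulting envelope is piecewise constant, which trades smoothness for very low memory and computation cost.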

  15. Phase Transition and Critical Values of a Nearest-Neighbor System with Uncountable Local State Space on Cayley Trees

    International Nuclear Information System (INIS)

    Jahnel, Benedikt; Külske, Christof; Botirov, Golibjon I.

    2014-01-01

    We consider a ferromagnetic nearest-neighbor model on a Cayley tree of degree k ⩾ 2 with uncountable local state space [0,1], where the energy function depends on a parameter θ ∊ [0, 1). We show that for $0 \le \theta \le \frac{5}{3k}$ the model has a unique translation-invariant Gibbs measure. If $\frac{5}{3k} < \theta < 1$, there is a phase transition; in particular, there are three translation-invariant Gibbs measures.

  16. Spin canting in a Dy-based single-chain magnet with dominant next-nearest-neighbor antiferromagnetic interactions

    Science.gov (United States)

    Bernot, K.; Luzon, J.; Caneschi, A.; Gatteschi, D.; Sessoli, R.; Bogani, L.; Vindigni, A.; Rettori, A.; Pini, M. G.

    2009-04-01

    We investigate theoretically and experimentally the static magnetic properties of single crystals of the molecular-based single-chain magnet of formula [Dy(hfac)3NIT(C6H4OPh)]∞ comprising alternating Dy3+ and organic radicals. The magnetic molar susceptibility χM displays a strong angular variation for sample rotations around two directions perpendicular to the chain axis. A peculiar inversion between maxima and minima in the angular dependence of χM occurs on increasing temperature. Using information regarding the monomeric building block as well as an ab initio estimation of the magnetic anisotropy of the Dy3+ ion, this “anisotropy-inversion” phenomenon can be assigned to weak one-dimensional ferromagnetism along the chain axis. This indicates that antiferromagnetic next-nearest-neighbor interactions between Dy3+ ions dominate, despite the large Dy-Dy separation, over the nearest-neighbor interactions between the radicals and the Dy3+ ions. Measurements of the field dependence of the magnetization, both along and perpendicularly to the chain, and of the angular dependence of χM in a strong magnetic field confirm such an interpretation. Transfer-matrix simulations of the experimental measurements are performed using a classical one-dimensional spin model with antiferromagnetic Heisenberg exchange interaction and noncollinear uniaxial single-ion anisotropies favoring a canted antiferromagnetic spin arrangement, with a net magnetic moment along the chain axis. The fine agreement obtained with experimental data provides estimates of the Hamiltonian parameters, essential for further study of the dynamics of rare-earth-based molecular chains.

  17. Reentrant behavior in the nearest-neighbor Ising antiferromagnet in a magnetic field

    Science.gov (United States)

    Neto, Minos A.; de Sousa, J. Ricardo

    2004-12-01

    Motivated by the H-T phase diagram of the bcc Ising antiferromagnet with nearest-neighbor interactions obtained by Monte Carlo simulation [Landau, Phys. Rev. B 16, 4164 (1977)], which shows reentrant behavior at low temperature, with two critical temperatures in a magnetic field about 2% greater than the critical value Hc=8J, we apply the effective field renormalization group (EFRG) approach to this model on three-dimensional lattices (simple cubic, sc, and body-centered cubic, bcc). We find that the critical curve TN(H) exhibits a maximum around H≃Hc only in the bcc lattice case. We also discuss the critical behavior using effective field theory in clusters with one (EFT-1) and two (EFT-2) spins, and reentrant behavior is observed for the sc and bcc lattices. We have compared our EFRG results for the bcc lattice with Monte Carlo and series expansion results, and we observe good agreement between the methods.

  18. Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Cobaugh Christian W

    2004-08-01

    Background: A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1. Results: The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases. Conclusion: Our analysis indicates that the current set of nearest-neighbor energy parameters in conjunction with the Mfold folding algorithm is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.

  19. Evaluating a k-nearest neighbours-based classifier for locating faulty areas in power systems

    Directory of Open Access Journals (Sweden)

    Juan José Mora Flórez

    2008-09-01

    This paper reports a strategy for identifying and locating faults in a power distribution system. The strategy is based on the k-nearest neighbours technique, which estimates the distance from the features describing a particular fault being classified to the faults presented during the training stage. When new data is presented to the proposed fault locator, it is classified according to the nearest example retrieved. A characterisation of the voltage and current measurements obtained at a single line end is also presented for assigning the area in the case of a fault in a power system. The proposed strategy was tested on a real power distribution system; average confidence indexes of 93% were obtained, a good indicator of the proposal’s high performance. The results show how a fault can be located using features obtained from voltage and current, improving utility response and thereby improving system continuity indexes in power distribution systems.

  20. Phosphorous vacancy nearest neighbor hopping induced instabilities in InP capacitors II. Computer simulation

    International Nuclear Information System (INIS)

    Juang, M.T.; Wager, J.F.; Van Vechten, J.A.

    1988-01-01

    Drain current drift in InP metal insulator semiconductor devices displays distinct activation energies and pre-exponential factors. The authors have given evidence that these result from two physical mechanisms: thermionic tunneling of electrons into native oxide traps and phosphorous vacancy nearest neighbor hopping (PVNNH). They here present a computer simulation of the effect of the PVNNH mechanism on flatband voltage shift vs. bias stress time measurements. The simulation is based on an analysis of the kinetics of the PVNNH defect reaction sequence in which the electron concentration in the channel is related to the applied bias by a solution of the Poisson equation. The simulation demonstrates quantitatively that the temperature dependence of the flatband shift is associated with PVNNH for temperatures above room temperature.

  1. Two tree-formation methods for fast pattern search using nearest-neighbour and nearest-centroid matching

    NARCIS (Netherlands)

    Schomaker, Lambertus; Mangalagiu, D.; Vuurpijl, Louis; Weinfeld, M.; Schomaker, Lambert; Vuurpijl, Louis

    2000-01-01

    This paper describes tree-based classification of character images, comparing two methods of tree formation and two methods of matching: nearest neighbor and nearest centroid. The first method, Preprocess Using Relative Distances (PURD), is a tree-based reorganization of a flat list of patterns,

  2. Forecasting of steel consumption with use of nearest neighbors method

    Directory of Open Access Journals (Sweden)

    Rogalewicz Michał

    2017-01-01

    In the process of building a steel construction, its design is usually commissioned to the design office. Then a quotation is made and the finished offer is delivered to the customer. Its final shape is influenced by steel consumption to a great extent. Correct determination of the potential consumption of this material most often determines the profitability of the project. Because of a long waiting time for a final project from the design office, it is worthwhile to pre-analyze the project’s profitability and feasibility using historical data on already realized orders. The paper presents an innovative approach to decision-making support in one of the Polish construction companies. The authors have defined and prioritized the most important factors that differentiate the executed orders and have the greatest impact on steel consumption. These are, among others: height and width of steel structure, number of aisles, type of roof, etc. Then they applied and adapted the method of k-nearest neighbors to the specificity of the discussed problem. The goal was to search a set of historical orders and find the most similar to the analyzed one. On this basis, consumption of steel can be estimated. The method was programmed within the EXPLOR application.
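
    The estimation step described here, finding the most similar historical orders and deriving a consumption figure from them, can be sketched as k-nearest-neighbor regression. Everything specific below is an assumption for illustration: the feature vectors (height, width), the consumption figures, the optional per-factor weights standing in for the prioritized factors, the plain averaging rule and the function name knn_estimate. The abstract does not describe how the EXPLOR application combines the neighbors.

```python
import math


def knn_estimate(history, query, k=3, weights=None):
    """Estimate a value for `query` as the mean over the k most similar records.

    history: list of (feature_vector, observed_value) pairs.
    weights: optional per-feature importance used in the distance.
    """
    if weights is None:
        weights = [1.0] * len(query)

    def distance(features):
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(weights, features, query)))

    nearest = sorted(history, key=lambda record: distance(record[0]))[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical past orders: (height m, width m) -> steel consumption in tonnes.
past_orders = [([10.0, 20.0], 100.0), ([12.0, 22.0], 120.0), ([50.0, 60.0], 500.0)]
estimate = knn_estimate(past_orders, [11.0, 21.0], k=2)
```

    In practice the features would need to be put on comparable scales before distances are computed, otherwise the factor with the largest numeric range dominates the similarity.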

  3. Rapid and Robust Cross-Correlation-Based Seismic Phase Identification Using an Approximate Nearest Neighbor Method

    Science.gov (United States)

    Tibi, R.; Young, C. J.; Gonzales, A.; Ballard, S.; Encarnacao, A. V.

    2016-12-01

    The matched filtering technique involving the cross-correlation of a waveform of interest with archived signals from a template library has proven to be a powerful tool for detecting events in regions with repeating seismicity. However, waveform correlation is computationally expensive, and therefore impractical for large template sets unless dedicated distributed computing hardware and software are used. In this study, we introduce an Approximate Nearest Neighbor (ANN) approach that enables the use of very large template libraries for waveform correlation without requiring a complex distributed computing system. Our method begins with a projection into a reduced dimensionality space based on correlation with a randomized subset of the full template archive. Searching for a specified number of nearest neighbors is accomplished by using randomized K-dimensional trees. We used the approach to search for matches to each of 2700 analyst-reviewed signal detections reported for May 2010 for the IMS station MKAR. The template library in this case consists of a dataset of more than 200,000 analyst-reviewed signal detections for the same station from 2002-2014 (excluding May 2010). Of these signal detections, 60% are teleseismic first P, and 15% regional phases (Pn, Pg, Sn, and Lg). The analyses, performed on a standard desktop computer, show that the proposed approach performs the search of the large template libraries about 20 times faster than the standard full linear search, while achieving recall rates greater than 80%, with the recall rate increasing for higher correlation values. To decide whether to confirm a match, we use a hybrid method involving a cluster approach for queries with two or more matches, and correlation score for single matches. Of the signal detections that passed our confirmation process, 52% were teleseismic first P, and 30% were regional phases.

  4. Third nearest neighbor parameterized tight binding model for graphene nano-ribbons

    Directory of Open Access Journals (Sweden)

    Van-Truong Tran

    2017-07-01

    The existing tight binding models can very well reproduce the ab initio band structure of a 2D graphene sheet. For graphene nano-ribbons (GNRs), the current sets of tight binding parameters can successfully describe the semi-conducting behavior of all armchair GNRs. However, they still fail to reproduce accurately the slope of the bands, which is directly associated with the group velocity and the effective mass of electrons. In this work, both density functional theory and tight binding calculations were performed, and a new set of tight binding parameters up to the third nearest neighbors, including overlap terms, is introduced. The results obtained with this model offer excellent agreement with the predictions of density functional theory in most cases of ribbon structures, even in the high-energy region. Moreover, this set can induce electron-hole asymmetry as manifested in results from density functional theory. Relevant outcomes are also achieved for armchair ribbons of various widths as well as for zigzag structures, thus opening a route for multi-scale atomistic simulation of large systems that cannot be considered using density functional theory.

  5. A Distributed Approach to Continuous Monitoring of Constrained k-Nearest Neighbor Queries in Road Networks

    Directory of Open Access Journals (Sweden)

    Hyung-Ju Cho

    2012-01-01

    Given two positive parameters k and r, a constrained k-nearest neighbor (CkNN) query returns the k closest objects within a network distance r of the query location in road networks. In terms of the scalability of monitoring these CkNN queries, existing solutions based on central processing at a server suffer from a sudden and sharp rise in server load as well as messaging cost as the number of queries increases. In this paper, we propose a distributed and scalable scheme called DAEMON for the continuous monitoring of CkNN queries in road networks. Our query processing is distributed among clients (query objects) and the server. Specifically, the server evaluates CkNN queries issued at intersections of road segments, retrieves the objects on the road segments between neighboring intersections, and sends responses to the query objects. Finally, each client makes its own query result using this server response. As a result, our distributed scheme achieves close-to-optimal communication costs and scales well to large numbers of monitoring queries. Exhaustive experimental results demonstrate that our scheme substantially outperforms its competitor in terms of query processing time and messaging cost.
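
    The CkNN query defined in the first sentence can be illustrated with a small centralized sketch: run Dijkstra's algorithm from the query location, stop expanding beyond network distance r, and return the k nearest objects found. This shows only the query semantics, not the distributed DAEMON scheme. The toy road graph, the simplification that objects sit exactly on intersections, and the function name cknn are assumptions made for illustration.

```python
import heapq


def cknn(graph, objects, source, k, r):
    """Return up to k object ids within network distance r of `source`, nearest first.

    graph: {node: [(neighbor, segment_length), ...]}
    objects: {object_id: node where the object is located}
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            new_dist = d + length
            # Never expand past the range constraint r.
            if new_dist <= r and new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    in_range = sorted((dist[node], obj) for obj, node in objects.items() if node in dist)
    return [obj for _, obj in in_range[:k]]


# Hypothetical road network: q is the query location, lengths are in km.
roads = {
    "q": [("a", 1.0), ("b", 4.0)],
    "a": [("q", 1.0), ("c", 2.0)],
    "b": [("q", 4.0)],
    "c": [("a", 2.0)],
}
located = {"o1": "a", "o2": "c", "o3": "b"}
result = cknn(roads, located, "q", k=3, r=3.5)
```

    With r = 3.5 the object at node b (network distance 4.0) is excluded even though k = 3 was requested, which is the difference between a constrained and an unconstrained k-nearest neighbor query.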

  6. An Investigation to Improve Classifier Accuracy for Myo Collected Data

    Science.gov (United States)

    2017-02-01

    Bad Samples Effect on Classification Accuracy; 5.1 Naïve Bayes (NB) Classifier Accuracy; 5.2 Logistic Model Tree (LMT); 5.3 K-Nearest Neighbor...gesture, pitch feature, user 06. All samples exhibit reversed movement...Fig. A-2 Come gesture, pitch feature, user 14. All samples exhibit reversed movement

  7. Microscopic theory of the nearest-neighbor valence bond sector of the spin-1/2 kagome antiferromagnet

    Science.gov (United States)

    Ralko, Arnaud; Mila, Frédéric; Rousochatzakis, Ioannis

    2018-03-01

    The spin-1/2 Heisenberg model on the kagome lattice, which is closely realized in layered Mott insulators such as ZnCu3(OH)6Cl2, is one of the oldest and most enigmatic spin-1/2 lattice models. While the numerical evidence has accumulated in favor of a quantum spin liquid, the debate is still open as to whether it is a Z2 spin liquid with very short-range correlations (some kind of resonating valence bond spin liquid), or an algebraic spin liquid with power-law correlations. To address this issue, we have pushed the program started by Rokhsar and Kivelson in their derivation of the effective quantum dimer model description of Heisenberg models to unprecedented accuracy for the spin-1/2 kagome, by including all the most important virtual singlet contributions on top of the orthogonalization of the nearest-neighbor valence bond singlet basis. Quite remarkably, the resulting picture is a competition between a Z2 spin liquid and a diamond valence bond crystal with a 12-site unit cell, as in the density-matrix renormalization group simulations of Yan et al. Furthermore, we found that, on cylinders of finite diameter d, there is a transition between the Z2 spin liquid at small d and the diamond valence bond crystal at large d, the prediction of the present microscopic description for the two-dimensional lattice. These results show that, if the ground state of the spin-1/2 kagome antiferromagnet can be described by nearest-neighbor singlet dimers, it is a diamond valence bond crystal, and, a contrario, that, if the system is a quantum spin liquid, it has to involve long-range singlets, consistent with the algebraic spin liquid scenario.

  8. Weak doping dependence of the antiferromagnetic coupling between nearest-neighbor Mn2+ spins in (Ba1-xKx)(Zn1-yMny)2As2

    Science.gov (United States)

    Surmach, M. A.; Chen, B. J.; Deng, Z.; Jin, C. Q.; Glasbrenner, J. K.; Mazin, I. I.; Ivanov, A.; Inosov, D. S.

    2018-03-01

    Dilute magnetic semiconductors (DMS) are nonmagnetic semiconductors doped with magnetic transition metals. The recently discovered DMS material (Ba1-xKx)(Zn1-yMny)2As2 offers a unique and versatile control of the Curie temperature TC by decoupling the spin (Mn2+, S = 5/2) and charge (K+) doping in different crystallographic layers. In an attempt to describe from first-principles calculations the role of hole doping in stabilizing ferromagnetic order, it was recently suggested that the antiferromagnetic exchange coupling J between the nearest-neighbor Mn ions would experience a nearly twofold suppression upon doping 20% of holes by potassium substitution. At the same time, further-neighbor interactions become increasingly ferromagnetic upon doping, leading to a rapid increase of TC. Using inelastic neutron scattering, we have observed a localized magnetic excitation at about 13 meV associated with the destruction of the nearest-neighbor Mn-Mn singlet ground state. Hole doping results in a notable broadening of this peak, evidencing significant particle-hole damping, but with only a minor change in the peak position. We argue that this unexpected result can be explained by a combined effect of superexchange and double-exchange interactions.

  9. Geometric k-nearest neighbor estimation of entropy and mutual information

    Science.gov (United States)

    Lord, Warren M.; Sun, Jie; Bollt, Erik M.

    2018-03-01

    Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induce a bias due to a poor description of the local geometry of the underlying probability measure. We introduce a new class of knn estimators that we call geometric knn estimators (g-knn), which use more complex local volume elements to better model the local geometry of the probability measures. As an example of this class of estimators, we develop a g-knn estimator of entropy and mutual information based on elliptical volume elements, capturing the local stretching and compression common to a wide range of dynamical system attractors. A series of numerical examples in which the thickness of the underlying distribution and the sample sizes are varied suggest that local geometry is a source of problems for knn methods such as the Kraskov-Stögbauer-Grassberger estimator when local geometric effects cannot be removed by global preprocessing of the data. The g-knn method performs well despite the manipulation of the local geometry. In addition, the examples suggest that the g-knn estimators can be of particular relevance to applications in which the system is large, but the data size is limited.
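
    The classical knn entropy estimator that g-knn generalizes can be sketched briefly. The code below is the standard Kozachenko-Leonenko form with spherical (Euclidean ball) volume elements, not the elliptical g-knn estimator described in the abstract; the function names and the default k = 3 are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def knn_entropy(points, k=3):
    """Kozachenko-Leonenko estimate of differential entropy, in nats.

    points: list of equal-length tuples; duplicates are assumed absent,
    because a zero neighbor distance would make the logarithm undefined.
    """
    n = len(points)
    d = len(points[0])
    # Log-volume of the d-dimensional unit ball (the spherical volume element).
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_radius_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        log_radius_sum += math.log(dists[k - 1])  # distance to the k-th neighbor
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_radius_sum / n

if __name__ == "__main__":
    random.seed(0)
    sample = [(random.random(), random.random()) for _ in range(400)]
    # A uniform sample on the unit square has true entropy 0 nats.
    print(knn_entropy(sample, k=3))
```

    Because every volume element here is a ball, the estimate cannot adapt to locally stretched or compressed distributions, which is the bias the abstract attributes to knn methods.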

  10. Hole motion in the t-J and Hubbard models: Effect of a next-nearest-neighbor hopping

    International Nuclear Information System (INIS)

    Gagliano, E.; Bacci, S.; Dagotto, E.

    1990-01-01

    Using exact diagonalization techniques, we study one dynamical hole in the two-dimensional t-J and Hubbard models on a square lattice including a next-nearest-neighbor hopping t'. We present the phase diagram in the parameter space (J/t,t'/t), discussing the ground-state properties of the hole. At J=0, a crossing of levels exists at some value of t' separating a ferromagnetic from an antiferromagnetic ground state. For nonzero J, at least four different regions appear where the system behaves like an antiferromagnet or a (not fully saturated) ferromagnet. We study the quasiparticle behavior of the hole, showing that for small values of |t'| the previously presented string picture is still valid. We also find that, for a realistic set of parameters derived from the Cu-O Hamiltonian, the hole has momentum (π/2,π/2), suggesting an enhancement of the p-wave superconducting mode due to the second-neighbor interactions in the spin-bag picture. Results for the t-t'-U model are also discussed with conclusions similar to those of the t-t'-J model. In general we found that t'=0 is not a singular point of these models.

  11. Spatiotemporal distribution of Oklahoma earthquakes: Exploring relationships using a nearest-neighbor approach

    Science.gov (United States)

    Vasylkivska, Veronika S.; Huerta, Nicolas J.

    2017-07-01

    Determining the spatiotemporal characteristics of natural and induced seismic events holds the opportunity to gain new insights into why these events occur. Linking the seismicity characteristics with other geologic, geographic, natural, or anthropogenic factors could help to identify the causes and suggest mitigation strategies that reduce the risk associated with such events. The nearest-neighbor approach utilized in this work represents a practical first step toward identifying statistically correlated clusters of recorded earthquake events. Detailed study of the Oklahoma earthquake catalog's inherent errors, empirical model parameters, and model assumptions is presented. We found that the cluster analysis results are stable with respect to empirical parameters (e.g., fractal dimension) but are sensitive to epicenter location errors and seismicity rates. Most critically, we show that the patterns in the distribution of earthquake clusters in Oklahoma are primarily defined by spatial relationships between events. This observation is in stark contrast to California (also known for induced seismicity) where a comparable cluster distribution is defined by both spatial and temporal interactions between events. These results highlight the difficulty in understanding the mechanisms and behavior of induced seismicity but provide insights for future work.
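
    The abstract does not state the distance it uses. A minimal sketch, assuming the space-time-magnitude nearest-neighbor distance often used in earthquake cluster analysis, η_ij = t_ij · r_ij^d_f · 10^(−b·m_i), where event i precedes event j. The values of b and of the fractal dimension d_f, the planar coordinates, and the event tuple layout are all placeholders rather than values from the study.

```python
import math

def nearest_neighbor_parents(events, b=1.0, d_f=1.6):
    """For each event, find the earlier event with the smallest distance eta.

    events: list of (time, x_km, y_km, magnitude) tuples (hypothetical layout).
    Returns a list of (parent_index_or_None, eta) in the same order.
    """
    result = []
    for j, (tj, xj, yj, _mj) in enumerate(events):
        best, best_eta = None, math.inf
        for i, (ti, xi, yi, mi) in enumerate(events):
            dt = tj - ti
            if dt <= 0:          # only earlier events can be parents
                continue
            r = math.hypot(xj - xi, yj - yi)
            eta = dt * (r ** d_f) * 10 ** (-b * mi)
            if eta < best_eta:
                best, best_eta = i, eta
        result.append((best, best_eta))
    return result

if __name__ == "__main__":
    catalog = [(0.0, 0.0, 0.0, 4.0), (1.0, 1.0, 0.0, 2.0),
               (2.0, 50.0, 50.0, 2.0), (2.5, 51.0, 50.0, 2.0)]
    for parent, eta in nearest_neighbor_parents(catalog):
        print(parent, eta)
```

    Small η values link an event to a likely parent; thresholding η is one way to separate clustered from background events, and the fractal dimension enters only through the exponent d_f.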

  12. [Classification of Children with Attention-Deficit/Hyperactivity Disorder and Typically Developing Children Based on Electroencephalogram Principal Component Analysis and k-Nearest Neighbor].

    Science.gov (United States)

    Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling

    2016-04-01

    This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using an electroencephalogram signal detection method. Firstly, in our experiments, we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop, and we completed electroencephalogram data preprocessing including filtering, segmentation, removal of artifacts and so on. Secondly, we selected the subset of electroencephalogram electrodes using the principal component analysis (PCA) method, and we collected the common channels of the optimal electrodes whose occurrence rates were more than 90% in each kind of stimulation. We then extracted the latency (200~450 ms) mean amplitude features of the common electrodes. Finally, we used the k-nearest neighbor (KNN) classifier based on Euclidean distance and the support vector machine (SVM) classifier based on a radial basis kernel function to classify. From the experiment, at the same kind of interference control task, the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction time. The N2 emerged in prefrontal cortex while P2 presented in the inferior parietal area when all kinds of stimuli were presented. Meanwhile, the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2 amplitude compared to typically developing children. KNN resulted in better classification accuracy than the SVM classifier, and the best classification rate was 89.29% in the StI task. The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task, which provided a scientific basis for the clinical diagnosis of attention
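
    The KNN step with Euclidean distance can be sketched as a plain majority vote. The feature vectors below are made-up toy numbers standing in for mean-amplitude features, not data from the study, and the function name and k = 3 are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training vectors."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tied vote, Counter keeps first-seen order, so the class of the
    # closest tied neighbor wins.
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy two-feature vectors (placeholders for amplitude features).
    features = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2),
                (3.0, 0.5), (3.2, 0.4), (2.9, 0.7)]
    labels = ["TD", "TD", "TD", "ADHD", "ADHD", "ADHD"]
    print(knn_predict(features, labels, (1.1, 2.0)))
```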

  13. Magnetization reversal in magnetic dot arrays: Nearest-neighbor interactions and global configurational anisotropy

    Energy Technology Data Exchange (ETDEWEB)

    Van de Wiele, Ben [Department of Electrical Energy, Systems and Automation, Ghent University, Technologiepark 913, B-9052 Ghent-Zwijnaarde (Belgium); Fin, Samuele [Dipartimento di Fisica e Scienze della Terra, Università degli Studi di Ferrara, 44122 Ferrara (Italy); Pancaldi, Matteo [CIC nanoGUNE, E-20018 Donostia-San Sebastian (Spain); Vavassori, Paolo [CIC nanoGUNE, E-20018 Donostia-San Sebastian (Spain); IKERBASQUE, Basque Foundation for Science, E-48013 Bilbao (Spain); Sarella, Anandakumar [Physics Department, Mount Holyoke College, 211 Kendade, 50 College St., South Hadley, Massachusetts 01075 (United States); Bisero, Diego [Dipartimento di Fisica e Scienze della Terra, Università degli Studi di Ferrara, 44122 Ferrara (Italy); CNISM, Unità di Ferrara, 44122 Ferrara (Italy)

    2016-05-28

    Various proposals for future magnetic memories, data processing devices, and sensors rely on a precise control of the magnetization ground state and magnetization reversal process in periodically patterned media. In finite dot arrays, such control is hampered by the magnetostatic interactions between the nanomagnets, leading to the non-uniform magnetization state distributions throughout the sample while reversing. In this paper, we evidence how during reversal typical geometric arrangements of dots in an identical magnetization state appear that originate in the dominance of either Global Configurational Anisotropy or Nearest-Neighbor Magnetostatic interactions, which depends on the fields at which the magnetization reversal sets in. Based on our findings, we propose design rules to obtain the uniform magnetization state distributions throughout the array, and also suggest future research directions to achieve non-uniform state distributions of interest, e.g., when aiming at guiding spin wave edge-modes through dot arrays. Our insights are based on the Magneto-Optical Kerr Effect and Magnetic Force Microscopy measurements as well as the extensive micromagnetic simulations.

  14. Obstacle Detection for Intelligent Transportation Systems Using Deep Stacked Autoencoder and k-Nearest Neighbor Scheme

    KAUST Repository

    Dairi, Abdelkader; Harrou, Fouzi; Sun, Ying; Senouci, Mohamed

    2018-01-01

    Obstacle detection is an essential element for the development of intelligent transportation systems so that accidents can be avoided. In this study, we propose a stereovision-based method for detecting obstacles in urban environments. The proposed method uses a deep stacked auto-encoder (DSA) model that combines the greedy learning features with the dimensionality reduction capacity and employs an unsupervised k-nearest neighbors algorithm (KNN) to accurately and reliably detect the presence of obstacles. We consider obstacle detection as an anomaly detection problem. We evaluated the proposed method by using practical data from three publicly available datasets, the Malaga stereovision urban dataset (MSVUD), the Daimler urban segmentation dataset (DUSD), and the Bahnhof dataset. Also, we compared the efficiency of the DSA-KNN approach to the deep belief network (DBN)-based clustering schemes. Results show that the DSA-KNN is suitable to visually monitor urban scenes.
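
    Treating detection as an anomaly problem with unsupervised KNN can be sketched as follows. This is a generic distance-based scheme, assuming the score is the mean distance to the k nearest obstacle-free reference vectors and the threshold is a leave-one-out quantile; the abstract does not give these details, and the two-dimensional points stand in for encoded DSA features.

```python
import math

def knn_score(reference, x, k=3):
    """Mean distance from x to its k nearest reference vectors."""
    dists = sorted(math.dist(r, x) for r in reference)
    return sum(dists[:k]) / k

def fit_threshold(reference, k=3, quantile=0.95):
    """Leave-one-out scores of the reference set, cut at the given quantile."""
    scores = sorted(
        knn_score(reference[:i] + reference[i + 1:], r, k)
        for i, r in enumerate(reference)
    )
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]

def is_anomaly(reference, x, threshold, k=3):
    """Flag x when it lies farther from the reference set than the threshold."""
    return knn_score(reference, x, k) > threshold

if __name__ == "__main__":
    # Toy stand-in for encoded features of obstacle-free scenes.
    normal = [(i * 0.1, j * 0.1) for i in range(5) for j in range(5)]
    limit = fit_threshold(normal)
    print(is_anomaly(normal, (5.0, 5.0), limit), is_anomaly(normal, (0.25, 0.25), limit))
```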

  15. Obstacle Detection for Intelligent Transportation Systems Using Deep Stacked Autoencoder and k-Nearest Neighbor Scheme

    KAUST Repository

    Dairi, Abdelkader

    2018-04-30

    Obstacle detection is an essential element for the development of intelligent transportation systems so that accidents can be avoided. In this study, we propose a stereovision-based method for detecting obstacles in urban environments. The proposed method uses a deep stacked auto-encoder (DSA) model that combines the greedy learning features with the dimensionality reduction capacity and employs an unsupervised k-nearest neighbors algorithm (KNN) to accurately and reliably detect the presence of obstacles. We consider obstacle detection as an anomaly detection problem. We evaluated the proposed method by using practical data from three publicly available datasets, the Malaga stereovision urban dataset (MSVUD), the Daimler urban segmentation dataset (DUSD), and the Bahnhof dataset. Also, we compared the efficiency of the DSA-KNN approach to the deep belief network (DBN)-based clustering schemes. Results show that the DSA-KNN is suitable to visually monitor urban scenes.

  16. Fracton topological order from nearest-neighbor two-spin interactions and dualities

    Science.gov (United States)

    Slagle, Kevin; Kim, Yong Baek

    2017-10-01

    Fracton topological order describes a remarkable phase of matter, which can be characterized by fracton excitations with constrained dynamics and a ground-state degeneracy that increases exponentially with the length of the system on a three-dimensional torus. However, previous models exhibiting this order require many-spin interactions, which may be very difficult to realize in a real material or cold atom system. In this work, we present a more physically realistic model which has the so-called X-cube fracton topological order [Vijay, Haah, and Fu, Phys. Rev. B 94, 235157 (2016), 10.1103/PhysRevB.94.235157] but only requires nearest-neighbor two-spin interactions. The model lives on a three-dimensional honeycomb-based lattice with one to two spin-1/2 degrees of freedom on each site and a unit cell of six sites. The model is constructed from two orthogonal stacks of Z2 topologically ordered Kitaev honeycomb layers [Kitaev, Ann. Phys. 321, 2 (2006), 10.1016/j.aop.2005.10.005], which are coupled together by a two-spin interaction. It is also shown that a four-spin interaction can be included to instead stabilize 3+1D Z2 topological order. We also find dual descriptions of four quantum phase transitions in our model, all of which appear to be discontinuous first-order transitions.

  17. Local Order in the Unfolded State: Conformational Biases and Nearest Neighbor Interactions

    Directory of Open Access Journals (Sweden)

    Siobhan Toal

    2014-07-01

    The discovery of Intrinsically Disordered Proteins, which contain significant levels of disorder yet perform complex biological functions, as well as unwanted aggregation, has motivated numerous experimental and theoretical studies aimed at describing residue-level conformational ensembles. Multiple lines of evidence gathered over the last 15 years strongly suggest that amino acid residues display unique and restricted conformational preferences in the unfolded state of peptides and proteins, contrary to one of the basic assumptions of the canonical random coil model. To fully understand residue-level order/disorder, however, one has to gain a quantitative, experimentally based picture of conformational distributions and to determine the physical basis underlying residue-level conformational biases. Here, we review the experimental, computational and bioinformatic evidence for conformational preferences of amino acid residues in (mostly short) peptides that can be utilized as suitable model systems for unfolded states of peptides and proteins. In this context particular attention is paid to the alleged high polyproline II preference of alanine. We discuss how these conformational propensities may be modulated by peptide-solvent interactions and so-called nearest-neighbor interactions. The relevance of conformational propensities for the protein folding problem and the understanding of IDPs is briefly discussed.

  18. Kinetic Models for Topological Nearest-Neighbor Interactions

    Science.gov (United States)

    Blanchet, Adrien; Degond, Pierre

    2017-12-01

    We consider systems of agents interacting through topological interactions. These have been shown to play an important part in animal and human behavior. Precisely, the system consists of a finite number of particles characterized by their positions and velocities. At random times a randomly chosen particle, the follower, adopts the velocity of its closest neighbor, the leader. We study the limit of a system size going to infinity and, under the assumption of propagation of chaos, show that the limit kinetic equation is a non-standard spatial diffusion equation for the particle distribution function. We also study the case wherein the particles interact with their K closest neighbors and show that the corresponding kinetic equation is the same. Finally, we prove that these models can be seen as a singular limit of the smooth rank-based model previously studied in Blanchet and Degond (J Stat Phys 163:41-60, 2016). The proofs are based on a combinatorial interpretation of the rank as well as some concentration of measure arguments.
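
    The particle dynamics described above can be sketched as a small simulation: free flight between interaction events, and at each event a randomly chosen follower copies the velocity of its closest neighbor. Exponential waiting times with a per-particle rate are an assumption here (the abstract says only "at random times"), and all names are illustrative.

```python
import math
import random

def free_flight(pos, vel, dt):
    """Move every particle along its current velocity for a time dt."""
    return [tuple(x + v * dt for x, v in zip(p, u)) for p, u in zip(pos, vel)]

def follow_nearest(pos, vel, rng):
    """One interaction: a random follower adopts its closest neighbor's velocity."""
    i = rng.randrange(len(pos))
    j = min((m for m in range(len(pos)) if m != i),
            key=lambda m: math.dist(pos[m], pos[i]))
    vel[i] = vel[j]
    return i, j  # (follower, leader)

def simulate(pos, vel, t_end, rate=1.0, seed=0):
    """Run the jump process up to time t_end; returns final positions and velocities."""
    rng = random.Random(seed)
    pos, vel = list(pos), list(vel)
    t = 0.0
    while True:
        wait = rng.expovariate(rate * len(pos))  # assumed Poisson clocks
        if t + wait > t_end:
            return free_flight(pos, vel, t_end - t), vel
        pos = free_flight(pos, vel, wait)
        follow_nearest(pos, vel, rng)
        t += wait

if __name__ == "__main__":
    start = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    speeds = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
    print(simulate(start, speeds, t_end=2.0))
```

    The interaction is topological because the leader is chosen by rank (the closest neighbor), not by any metric cutoff on the distance.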

  19. Novel qsar combination forecast model for insect repellent coupling support vector regression and k-nearest-neighbor

    International Nuclear Information System (INIS)

    Wang, L.F.; Bai, L.Y.

    2013-01-01

    To improve the precision of quantitative structure-activity relationship (QSAR) modeling for aromatic carboxylic acid derivative insect repellents, a novel nonlinear combination forecast model was proposed integrating support vector regression (SVR) and K-nearest neighbor (KNN). Firstly, search for the optimal kernel function and nonlinearly select molecular descriptors by the rule of minimum MSE value using SVR. Secondly, illuminate the effects of all descriptors on biological activity by multi-round enforcement resistance-selection. Thirdly, construct the sub-models with predicted values of different KNN. Then, get the optimal kernel and corresponding retained sub-models through subtle selection. Finally, make predictions with the leave-one-out (LOO) method on the basis of the retained sub-models. Compared with previous widely used models, our work shows significant improvement in modeling performance, which demonstrates the superiority of the present combination forecast model. (author)

  20. Disordering scaling and generalized nearest-neighbor approach in the thermodynamics of Lennard-Jones systems

    International Nuclear Information System (INIS)

    Vorob'ev, V.S.

    2003-01-01

    We suggest a concept of multiple disordering scaling of the crystalline state. Such a scaling procedure applied to a crystal leads to the liquid and (in the low density limit) gas states. This approach provides an explanation for the high value of configuration (common) entropy of liquefied noble gases, which can be deduced from experimental data. We use the generalized nearest-neighbor approach to calculate free energy and pressure of the Lennard-Jones systems after performing this scaling procedure. These thermodynamic functions depend only on one parameter characterizing the disordering. Condensed states of the system (liquid and solid) correspond to small values of this parameter. When this parameter tends to unity, we get an asymptotically exact equation of state for a gas involving the second virial coefficient. A reasonable choice of the values for the disordering parameter (ranging between zero and unity) allows us to find the lines of coexistence between different phase states in the Lennard-Jones systems, which are in good agreement with the available experimental data.

  1. Heterogeneous autoregressive model with structural break using nearest neighbor truncation volatility estimators for DAX.

    Science.gov (United States)

    Chin, Wen Cheong; Lee, Min Cherng; Yap, Grace Lee Ching

    2016-01-01

    High frequency financial data modelling has become one of the important research areas in the field of financial econometrics. However, possible structural breaks in volatile financial time series often trigger inconsistency issues in volatility estimation. In this study, we propose a structural break heavy-tailed heterogeneous autoregressive (HAR) volatility econometric model with the enhancement of jump-robust estimators. The breakpoints in the volatility are captured by dummy variables after detection by the Bai-Perron sequential multiple breakpoint procedure. In order to further deal with possible abrupt jumps in the volatility, the jump-robust volatility estimators are constructed using the nearest neighbor truncation approach, namely the minimum and median realized volatility. Under the structural break improvements in both the models and volatility estimators, the empirical findings show that the modified HAR model provides the best-performing in-sample and out-of-sample forecast evaluations as compared with the standard HAR models. Accurate volatility forecasts have a direct influence on the application of risk management and investment portfolio analysis.
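
    The two nearest neighbor truncation estimators named above can be sketched directly from their usual definitions (the MinRV and MedRV estimators of Andersen, Dobrev and Schaumburg). The scaling constants below are the standard ones for those estimators; that the paper uses exactly this form is an assumption, and the simulated returns are toy data.

```python
import math
import random

def realized_variance(returns):
    """Plain realized variance: sum of squared intraday returns."""
    return sum(r * r for r in returns)

def min_rv(returns):
    """Minimum realized variance: each return is truncated by its right neighbor."""
    n = len(returns)
    a = [abs(r) for r in returns]
    total = sum(min(a[i], a[i + 1]) ** 2 for i in range(n - 1))
    return (math.pi / (math.pi - 2.0)) * (n / (n - 1)) * total

def med_rv(returns):
    """Median realized variance: median of each return and its two neighbors."""
    n = len(returns)
    a = [abs(r) for r in returns]
    total = sum(sorted(a[i - 1:i + 2])[1] ** 2 for i in range(1, n - 1))
    scale = math.pi / (6.0 - 4.0 * math.sqrt(3.0) + math.pi)
    return scale * (n / (n - 2)) * total

if __name__ == "__main__":
    random.seed(1)
    r = [random.gauss(0.0, 0.01) for _ in range(2000)]
    r[1000] += 0.5  # one artificial price jump
    print(realized_variance(r), min_rv(r), med_rv(r))
```

    A single large jump inflates the plain realized variance by roughly its square, while the two truncated estimators replace the jump return with a neighboring magnitude and stay close to the diffusive variance.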

  2. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Background: As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods: In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results: We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion: Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
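
    The idea of a similarity built as a weighted combination of base measures can be sketched as below. The weights are taken as given (the paper infers them by regression, which is not reproduced here), the similarity-weighted vote is a simple stand-in for the paper's unspecified voting scheme, and the two base measures, field names and toy genes are all hypothetical.

```python
import math

def expression_similarity(a, b):
    """Toy base measure: decreasing function of Euclidean distance between profiles."""
    return 1.0 / (1.0 + math.dist(a["expr"], b["expr"]))

def context_similarity(a, b):
    """Toy base measure: Jaccard overlap between two sets of context labels."""
    union = a["ctx"] | b["ctx"]
    return len(a["ctx"] & b["ctx"]) / len(union) if union else 0.0

def combined_similarity(a, b, measures, weights):
    """Weighted sum of the base similarity measures."""
    return sum(w * m(a, b) for m, w in zip(measures, weights))

def predict_with_confidence(train, labels, query, measures, weights, k=3):
    """Return (label, confidence) from a similarity-weighted vote of the k most similar items."""
    ranked = sorted(
        ((combined_similarity(x, query, measures, weights), y)
         for x, y in zip(train, labels)),
        key=lambda pair: pair[0], reverse=True)[:k]
    totals = {}
    for sim, label in ranked:
        totals[label] = totals.get(label, 0.0) + max(sim, 0.0)
    best = max(totals, key=totals.get)
    denom = sum(totals.values())
    return best, (totals[best] / denom if denom > 0 else 0.0)

if __name__ == "__main__":
    genes = [{"expr": (1.0, 1.0), "ctx": {"a", "b"}},
             {"expr": (1.1, 0.9), "ctx": {"a"}},
             {"expr": (5.0, 5.0), "ctx": {"z"}}]
    classes = ["metabolism", "metabolism", "transport"]
    target = {"expr": (1.0, 1.0), "ctx": {"a", "b"}}
    print(predict_with_confidence(genes, classes, target,
                                  [expression_similarity, context_similarity],
                                  [0.5, 0.5]))
```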

  3. Utilization of Singularity Exponent in Nearest Neighbor Based Classifier

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2013-01-01

    Vol. 30, No. 1 (2013), pp. 3-29. ISSN 0176-4268. Grant - others: Czech Technical University (CZ) CZ68407700. Institutional support: RVO:67985807. Keywords: multivariate data; probability density estimation; classification; probability distribution mapping function; probability density mapping function; power approximation. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.571 (2013).

  4. An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.

    Science.gov (United States)

    Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu

    2017-08-05

    The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
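
    The baseline that BKNN is compared against can be sketched as follows. This is the usual LANDMARC scheme as commonly described: Euclidean distance in RSS space to each reference tag, then an inverse-square-weighted average of the k nearest reference positions. It does not include the Bayesian step of BKNN, and the tag coordinates and RSS readings are made-up toy values.

```python
import math

def landmarc_locate(tracking_rss, reference_tags, k=4):
    """Estimate a tag position from the k reference tags closest in RSS space.

    tracking_rss: RSS vector of the tracked tag, one value per reader.
    reference_tags: list of ((x, y), rss_vector) for tags at known positions.
    """
    scored = []
    for position, rss in reference_tags:
        e = math.dist(rss, tracking_rss)   # signal-space distance E_j
        if e == 0.0:
            return position                 # identical readings: reuse that position
        scored.append((e, position))
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]
    weights = [1.0 / (e * e) for e, _ in nearest]   # w_j proportional to 1 / E_j^2
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return (x, y)

if __name__ == "__main__":
    refs = [((0.0, 0.0), (-50.0, -60.0)), ((1.0, 0.0), (-60.0, -50.0)),
            ((0.0, 1.0), (-70.0, -70.0)), ((1.0, 1.0), (-80.0, -80.0))]
    print(landmarc_locate((-55.0, -55.0), refs, k=2))
```

    Because the weights depend directly on raw RSS differences, a single noisy reading shifts the estimate, which is the fluctuation the abstract sets out to reduce.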

  5. An improved coupled-states approximation including the nearest neighbor Coriolis couplings for diatom-diatom inelastic collision

    Science.gov (United States)

    Yang, Dongzheng; Hu, Xixi; Zhang, Dong H.; Xie, Daiqian

    2018-02-01

    Solving the time-independent close coupling equations of a diatom-diatom inelastic collision system by using the rigorous close-coupling approach is numerically difficult because of its expensive matrix manipulation. The coupled-states approximation decouples the centrifugal matrix by neglecting the important Coriolis couplings completely. In this work, a new approximation method based on the coupled-states approximation is presented and applied to time-independent quantum dynamic calculations. This approach only considers the most important Coriolis coupling with the nearest neighbors and ignores weaker Coriolis couplings with farther K channels. As a result, it reduces the computational costs without a significant loss of accuracy. Numerical tests for para-H2+ortho-H2 and para-H2+HD inelastic collision were carried out and the results showed that the improved method dramatically reduces the errors due to the neglect of the Coriolis couplings in the coupled-states approximation. This strategy should be useful in quantum dynamics of other systems.

  6. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    Science.gov (United States)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from female and male Mackerel (fresh and thawed) to obtain differences between sexes. Several linear and nonlinear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.
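
    Combining classifiers that differ only in their dissimilarity can be sketched with k-NN base models and a majority vote. The abstract does not say which dissimilarities or which combination rule the authors use, so the three measures and the voting rule below are assumptions, and the colour triples are invented toy values.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.dist(a, b)

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def knn_label(train_x, train_y, query, dissimilarity, k=1):
    """k-NN prediction under one chosen dissimilarity."""
    order = sorted(range(len(train_x)),
                   key=lambda i: dissimilarity(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def ensemble_predict(train_x, train_y, query, dissimilarities, k=1):
    """Majority vote over one k-NN classifier per dissimilarity."""
    votes = Counter(knn_label(train_x, train_y, query, d, k)
                    for d in dissimilarities)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    colours = [(50.0, 10.0, 20.0), (52.0, 11.0, 19.0),
               (60.0, 20.0, 10.0), (61.0, 19.0, 11.0)]
    sexes = ["male", "male", "female", "female"]
    measures = [euclidean, manhattan, cosine_dissimilarity]
    print(ensemble_predict(colours, sexes, (59.0, 19.0, 12.0), measures))
```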

  7. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    International Nuclear Information System (INIS)

    Blanco, A; Rodriguez, R; Martinez-Maranon, I

    2014-01-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from female and male Mackerel (fresh and thawed) to obtain differences between sexes. Several linear and nonlinear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.

  8. Dynamical correlation functions of the S=1/2 nearest-neighbor and Haldane-Shastry Heisenberg antiferromagnetic chains in zero and applied fields

    DEFF Research Database (Denmark)

    Lefmann, K.; Rischel, C.

    1996-01-01

    We present a numerical diagonalization study of two one-dimensional S=1/2 antiferromagnetic Heisenberg chains, having nearest-neighbor and Haldane-Shastry (1/r^2) interactions, respectively. We have obtained the T=0 dynamical correlation function, S^αα(q,ω), for chains of length N=8......-28. We have studied S^zz(q,ω) for the Heisenberg chain in zero field, and from finite-size scaling we have obtained a limiting behavior that for large ω deviates from the conjecture proposed earlier by Muller et al. For both chains we describe the behavior of S^zz(q,ω) and S...

  9. The spectrum and the quantum Hall effect on the square lattice with next-nearest-neighbor hopping: Statistics of holons and spinons in the t-J model

    International Nuclear Information System (INIS)

    Hatsugai, Y.; Kohmoto, M.

    1992-01-01

    We investigate the energy spectrum and the Hall effect of electrons on the square lattice with next-nearest-neighbor (NNN) hopping as well as nearest-neighbor hopping. General rational values of magnetic flux per unit cell φ=p/q are considered. In the absence of NNN hopping, the two bands at the center touch for q even, thus the Hall conductance is not well defined at half filling. An energy gap opens there by introducing NNN hopping. When φ=1/2, the NNN model coincides with the mean field Hamiltonian for the chiral spin state proposed by Wen, Wilczek and Zee (WWZ). The Hall conductance is calculated from the Diophantine equation and the E-φ diagram. We find that gaps close for other fillings at certain values of NNN hopping strength. The quantized value of the Hall conductance changes once this phenomenon occurs. In a mean field treatment of the t-J model, the effective Hamiltonian is the same as our NNN model. From this point of view, the statistics of the quasi-particles is not always semion and depends on the filling and the strength of the mean field. (orig.)
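
    The Diophantine equation mentioned above is usually written r = q·s_r + p·t_r for the r-th gap at flux φ = p/q, with t_r giving the Hall conductance in units of e²/h. A minimal sketch, assuming that standard form and the window |t_r| ≤ q/2 that applies to the nearest-neighbor-only model; with NNN hopping the equation fixes t_r only modulo q, which is why the abstract also relies on the E-φ diagram. The function name is hypothetical and sign conventions vary between papers.

```python
import math

def gap_solutions(p, q):
    """Solve r = q*s + p*t with |t| <= q/2 for every gap r = 1 .. q-1.

    p, q: coprime positive integers defining the flux p/q per unit cell.
    Returns {r: (s, t)}. For even q the gap r = q/2 has two candidates,
    t = +q/2 and t = -q/2; this function returns the positive one.
    """
    if math.gcd(p, q) != 1:
        raise ValueError("p and q must be coprime")
    p_inverse = pow(p, -1, q)          # modular inverse (Python 3.8+)
    solutions = {}
    for r in range(1, q):
        t = (r * p_inverse) % q        # t is fixed only modulo q
        if t > q / 2:
            t -= q                     # fold into the window |t| <= q/2
        s = (r - p * t) // q           # exact by construction
        solutions[r] = (s, t)
    return solutions

if __name__ == "__main__":
    print(gap_solutions(1, 3))
```

    Adding any multiple of q to t (and subtracting the same multiple of p from s) gives another solution, so a gap closing and reopening can move the Hall conductance between such values.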

  10. An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    He Xu

    2017-08-01

    The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
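
    The Gaussian filtering of RSS readings can be sketched as an outlier cut around the sample mean. The abstract does not give the filter's definition, so the rule below (keep readings within one standard deviation of the mean, then average) and the width parameter are assumptions; the readings are toy values in dBm.

```python
import statistics

def gaussian_filter_rss(samples, width=1.0):
    """Average the RSS samples that lie within width * sigma of the sample mean."""
    mean = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    if sigma == 0.0:
        return mean                      # all readings identical
    kept = [s for s in samples if abs(s - mean) <= width * sigma]
    # Fall back to the plain mean if the cut would discard every reading.
    return statistics.fmean(kept) if kept else mean

if __name__ == "__main__":
    readings = [-60.0, -61.0, -59.0, -60.0, -30.0]  # last value is an outlier
    print(gaussian_filter_rss(readings))
```

    Filtering each reader's RSS series this way before the nearest-neighbor step removes isolated spikes but not a sustained bias, so it complements rather than replaces the positioning algorithm.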

  11. Study of parameters of the nearest neighbour shared algorithm on clustering documents

    Science.gov (United States)

    Mustika Rukmi, Alvida; Budi Utomo, Daryono; Imro’atus Sholikhah, Neni

    2018-03-01

    Document clustering is one way of automatically managing documents, extracting document topics, and quickly filtering information. Preprocessing for clustering is done by text mining and consists of keyword extraction using Rapid Automatic Keyphrase Extraction (RAKE) and representing each document as a concept vector using Latent Semantic Analysis (LSA). Clustering is then performed on the preprocessed documents so that documents with similar topics fall in the same cluster. The Shared Nearest Neighbour (SNN) algorithm is a clustering method based on the number of "nearest neighbors" shared. Its parameters are: k, the number of nearest neighbor documents; ε, the number of shared nearest neighbor documents; and MinT, the minimum number of similar documents that can form a cluster. The SNN algorithm is characterized by shared 'neighbor' properties: each cluster is formed by keywords that are shared by its documents, and a cluster can be built on more than one keyword if the frequency of those keywords in the documents is also high. The choice of parameter values affects the clustering results. A higher k increases the number of neighbor documents of each document, which lowers the similarity among neighboring documents and the accuracy of each cluster. A higher ε makes each document keep only neighbor documents with high similarity when building a cluster, and also leaves more unclassified documents (noise). A higher MinT decreases the number of clusters, since groups of fewer than MinT similar documents cannot form a cluster. The parameters of the SNN algorithm thus determine the quality of the clustering result and the amount of noise (unclustered documents). The Silhouette coefficient is almost the same across many experiments, above 0.9, which means that the SNN algorithm works well.
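
    The core quantity behind SNN clustering, the number of nearest neighbors two documents share, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes documents are already numeric concept vectors, uses Euclidean distance where the study may use another similarity measure, and the helper names `knn_lists` and `snn_similarity` are hypothetical.

```python
import math

def knn_lists(vectors, k):
    """Return, for each vector, the set of indices of its k nearest other vectors."""
    neighbors = []
    for i, v in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: math.dist(v, vectors[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors

def snn_similarity(neighbors, i, j):
    """Number of nearest neighbors that documents i and j have in common."""
    return len(neighbors[i] & neighbors[j])

# Two well-separated groups of toy "concept vectors" (made-up values).
docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
nbrs = knn_lists(docs, k=2)
same_group = snn_similarity(nbrs, 0, 1)   # documents from the same group
cross_group = snn_similarity(nbrs, 0, 3)  # documents from different groups
```

    In this picture, a pair of documents would be linked only when its shared-neighbor count reaches ε, and a group would become a cluster only when it contains at least MinT documents.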

  12. Representative Vector Machines: A Unified Framework for Classical Classifiers.

    Science.gov (United States)

    Gui, Jie; Liu, Tongliang; Tao, Dacheng; Sun, Zhenan; Tan, Tieniu

    2016-08-01

    Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods such as the nearest neighbor (NN) classifier, support vector machine (SVM), and sparse representation-based classification (SRC) have been proposed in the literature. These typical and widely used classifiers were originally developed from different theory or application motivations and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely, representative vector machines (or RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers because NN, SVM, and SRC can be interpreted as the special cases of RVMs with different definitions of representative vectors. Thus, the underlying relationship among a number of classical classifiers is revealed for better understanding of pattern classification. On the other hand, novel and advanced classifiers are inspired in the framework of RVMs. For example, a robust pattern classification method called discriminant vector machine (DVM) is motivated from RVMs. Given a test example, DVM first finds its k -NNs and then performs classification based on the robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks such as face recognition (Yale and face recognition grand challenge databases), object categorization (Caltech-101 dataset), and action recognition (Action Similarity LAbeliNg) demonstrate the advantages of DVM over other classifiers.

  13. Randomized Approaches for Nearest Neighbor Search in Metric Space When Computing the Pairwise Distance Is Extremely Expensive

    Science.gov (United States)

    Wang, Lusheng; Yang, Yong; Lin, Guohui

    Finding the closest object for a query in a database is a classical problem in computer science. For some modern biological applications, computing the similarity between two objects might be very time consuming. For example, it takes a long time to compute the edit distance between two whole chromosomes and the alignment cost of two 3D protein structures. In this paper, we study the nearest neighbor search problem in metric space, where the pair-wise distance between two objects in the database is known and we want to minimize the number of distances computed on-line between the query and objects in the database in order to find the closest object. We have designed two randomized approaches for indexing metric space databases, where objects are purely described by their distances with each other. Analysis and experiments show that our approaches only need to compute O(log n) objects in order to find the closest object, where n is the total number of objects in the database.

  14. A Diagnosis Method for Rotation Machinery Faults Based on Dimensionless Indexes Combined with K-Nearest Neighbor Algorithm

    Directory of Open Access Journals (Sweden)

    Jianbin Xiong

    2015-01-01

    It is difficult to clearly distinguish the dimensionless indexes of normal petrochemical rotating machinery equipment from those of equipment with complex faults. When the conflict of evidence is too large, the diagnosis becomes uncertain. This paper presents a diagnosis method for rotating machinery faults based on dimensionless indexes combined with the K-nearest neighbor (KNN) algorithm. This method uses a KNN algorithm and an evidence fusion theoretical formula to process fuzzy data, incomplete data, and accurate data. This method can transfer the signals from the petrochemical rotating machinery sensors to the reliability manners using dimensionless indexes and KNN algorithm. The input information is further integrated by an evidence synthesis formula to get the final data. The type of fault will be decided based on these data. The experimental results show that the proposed method can integrate data to provide a more reliable and reasonable result, thereby reducing the decision risk.

  15. Diagnostics of synchronous motor based on analysis of acoustic signals with application of MFCC and Nearest Mean classifier

    OpenAIRE

    Adam Głowacz; Witold Głowacz; Andrzej Głowacz

    2010-01-01

    The paper presents a method for diagnosing imminent failure conditions of a synchronous motor. The method is based on a study of acoustic signals generated by the synchronous motor. The sound recognition system is based on data processing algorithms such as MFCC and a Nearest Mean classifier with cosine distance. Software to recognize the sounds of the synchronous motor was implemented. The studies were carried out for four imminent failure conditions of the synchronous motor. The results confirm that the sys...

  16. Clustered K nearest neighbor algorithm for daily inflow forecasting

    NARCIS (Netherlands)

    Akbari, M.; Van Overloop, P.J.A.T.M.; Afshar, A.

    2010-01-01

    Instance based learning (IBL) algorithms are a common choice among data driven algorithms for inflow forecasting. They are based on the similarity principle and prediction is made by the finite number of similar neighbors. In this sense, the similarity of a query instance is estimated according to

  17. Just-in-time adaptive classifiers-part II: designing the classifier.

    Science.gov (United States)

    Alippi, Cesare; Roveri, Manuel

    2008-12-01

    Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few results addressing nonstationary environments. This paper proposes a methodology based on k-nearest neighbor (NN) classifiers for designing adaptive classification systems able to react to changing conditions just-in-time (JIT), i.e., exactly when it is needed. k-NN classifiers have been selected for their computational-free training phase, the possibility to easily estimate the model complexity k and keep under control the computational complexity of the classifier through suitable data reduction mechanisms. A JIT classifier requires a temporal detection of a (possible) process deviation (aspect tackled in a companion paper) followed by an adaptive management of the knowledge base (KB) of the classifier to cope with the process change. The novelty of the proposed approach resides in the general framework supporting the real-time update of the KB of the classification system in response to novel information coming from the process both in stationary conditions (accuracy improvement) and in nonstationary ones (process tracking) and in providing a suitable estimate of k. It is shown that the classification system grants consistency once the change targets the process generating the data in a new stationary state, as it is the case in many real applications.

  18. ESTIMATING PHOTOMETRIC REDSHIFTS OF QUASARS VIA THE k-NEAREST NEIGHBOR APPROACH BASED ON LARGE SURVEY DATABASES

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Yanxia; Ma He; Peng Nanbo; Zhao Yongheng [Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, 100012 Beijing (China); Wu Xuebing, E-mail: zyx@bao.ac.cn [Department of Astronomy, Peking University, 100871 Beijing (China)

    2013-08-01

    We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. kNN performs best when k is different with a special input pattern for a special data set. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results generally show that the more information from more bands, the better performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is met by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.
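
    As a rough illustration of kNN used as a regressor, which is the setting described above, the sketch below predicts a value as the mean target of the k nearest training points. It is an assumption-laden toy: it uses a brute-force search rather than the KD-tree mentioned in the abstract, and the function name `knn_regress`, the single-feature inputs, and the sample values are invented for the example.

```python
import math

def knn_regress(query, training, k=3):
    """Predict a value as the mean target of the k training points nearest to query.

    training: list of (feature_vector, target) pairs, for example
    (photometric features, spectroscopic redshift). Hypothetical layout.
    """
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Made-up one-dimensional features and targets, for illustration only.
train = [((0.0,), 0.1), ((1.0,), 0.3), ((2.0,), 0.5), ((10.0,), 3.0)]
prediction = knn_regress((1.0,), train, k=3)
```

    Changing k or the set of input features changes which neighbors are averaged, which is the kind of dependence on the k value and input pattern that the abstract discusses.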

  19. ESTIMATING PHOTOMETRIC REDSHIFTS OF QUASARS VIA THE k-NEAREST NEIGHBOR APPROACH BASED ON LARGE SURVEY DATABASES

    International Nuclear Information System (INIS)

    Zhang Yanxia; Ma He; Peng Nanbo; Zhao Yongheng; Wu Xuebing

    2013-01-01

    We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. kNN performs best when k is different with a special input pattern for a special data set. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results generally show that the more information from more bands, the better performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is met by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.

  20. COMPARISON OF K-NEAREST NEIGHBOR AND NAIVE BAYES FOR CLASSIFYING LAND SUITABLE FOR PLANTING TEAK TREES

    Directory of Open Access Journals (Sweden)

    Didik Srianto

    2016-10-01

    Data mining is the process of analyzing data from different perspectives and summarizing it into important information that can be used to increase profit, reduce costs, or both. Technically, data mining can be described as the process of finding correlations or patterns among hundreds or thousands of fields in a large relational database. Perum Perhutani KPH SEMARANG currently still uses a manual method to determine the plant type (teak / non-teak). K-Nearest Neighbour, or k-NN, is a data mining algorithm that can be used for classification and regression. The Naive Bayes classifier is a technique that can be used for classification. In this study, k-NN and Naive Bayes are used to classify teak tree data from Perum Perhutani KPH SEMARANG, and the classification results of the two methods are compared. Testing was carried out using the RapidMiner software. After testing, k-NN was judged better than Naive Bayes, with accuracies of 96.66% and 82.63%, respectively. Keywords: k-NN, classification, Naive Bayes, teak tree planting

  1. An Active Learning Classifier for Further Reducing Diabetic Retinopathy Screening System Cost

    Directory of Open Access Journals (Sweden)

    Yinan Zhang

    2016-01-01

    Diabetic retinopathy (DR) screening systems raise a financial problem. To further reduce DR screening cost, an active learning classifier is proposed in this paper. Our approach identifies retinal images based on features extracted by anatomical part recognition and lesion detection algorithms. Kernel extreme learning machine (KELM) is a rapid classifier for solving classification problems in high dimensional space. Both active learning and ensemble techniques improve the performance of KELM when a small training dataset is used. To save cost, the committee proposes only the necessary manual work to the doctor. On the publicly available Messidor database, our classifier is trained with 20%–35% of labeled retinal images and comparative classifiers are trained with 80% of labeled retinal images. Results show that our classifier can achieve better classification accuracy than Classification and Regression Tree, radial basis function SVM, Multilayer Perceptron SVM, Linear SVM, and K Nearest Neighbor. Empirical experiments suggest that our active learning classifier is efficient for further reducing DR screening cost.

  2. A novel implementation of kNN classifier based on multi-tupled meteorological input data for wind power prediction

    International Nuclear Information System (INIS)

    Yesilbudak, Mehmet; Sagiroglu, Seref; Colak, Ilhami

    2017-01-01

    Highlights: • An accurate wind power prediction model is proposed for very short-term horizon. • The k-nearest neighbor classifier is implemented based on the multi-tupled inputs. • The variation of wind power prediction errors is evaluated in various aspects. • Our approach shows the superior prediction performance over the persistence method. - Abstract: With the growing share of wind power production in the electric power grids, many critical challenges for grid operators have emerged in terms of the power balance, power quality, voltage support, frequency stability, load scheduling, unit commitment and spinning reserve calculations. To overcome such problems, numerous studies have been conducted to predict the wind power production, but a small number of them have attempted to improve the prediction accuracy by employing the multidimensional meteorological input data. The novelties of this study lie in the proposal of an efficient and easy to implement very short-term wind power prediction model based on the k-nearest neighbor classifier (kNN), in the usage of wind speed, wind direction, barometric pressure and air temperature parameters as the multi-tupled meteorological inputs and in the comparison of wind power prediction results with respect to the persistence reference model. As a result of the achieved patterns, we characterize the variation of wind power prediction errors according to the input tuples, distance measures and neighbor numbers, and uncover the most influential and the most ineffective meteorological parameters on the optimization of wind power prediction results.

  3. Evidence of codon usage in the nearest neighbor spacing distribution of bases in bacterial genomes

    Science.gov (United States)

    Higareda, M. F.; Geiger, O.; Mendoza, L.; Méndez-Sánchez, R. A.

    2012-02-01

    Statistical analysis of whole genomic sequences usually assumes a homogeneous nucleotide density throughout the genome, an assumption that has been proved incorrect for several organisms since the nucleotide density is only locally homogeneous. To avoid giving a single numerical value to this variable property, we propose the use of spectral statistics, which characterizes the density of nucleotides as a function of its position in the genome. We show that the cumulative density of bases in bacterial genomes can be separated into an average (or secular) plus a fluctuating part. Bacterial genomes can be divided into two groups according to the qualitative description of their secular part: linear and piecewise linear. These two groups of genomes show different properties when their nucleotide spacing distribution is studied. In order to analyze genomes having a variable nucleotide density statistically, the use of unfolding is necessary, i.e., to get a separation between the secular part and the fluctuations. The unfolding allows an adequate comparison with the statistical properties of other genomes. With this methodology, genomes from four genera were analyzed: Burkholderia, Bacillus, Clostridium and Corynebacterium. Interestingly, the nearest neighbor spacing distributions or detrended distance distributions are very similar for species within the same genus but they are very different for species from different genera. This difference can be attributed to the difference in the codon usage.

  4. Instance Selection for Classifier Performance Estimation in Meta Learning

    Directory of Open Access Journals (Sweden)

    Marcin Blachnik

    2017-11-01

    Building an accurate prediction model is challenging and requires appropriate model selection. This process is very time consuming but can be accelerated with meta-learning, that is, automatic model recommendation by estimating the performances of given prediction models without training them. Meta-learning utilizes metadata extracted from the dataset to effectively estimate the accuracy of the model in question. To achieve that goal, metadata descriptors must be gathered efficiently and must be informative to allow the precise estimation of prediction accuracy. In this paper, a new type of metadata descriptors is analyzed. These descriptors are based on the compression level obtained from the instance selection methods at the data-preprocessing stage. To verify their suitability, two types of experiments on real-world datasets have been conducted. In the first one, 11 instance selection methods were examined in order to validate the compression–accuracy relation for three classifiers: k-nearest neighbors (kNN), support vector machine (SVM), and random forest. From this analysis, two methods are recommended (instance-based learning type 2 (IB2) and edited nearest neighbor (ENN)), which are then compared with the state-of-the-art metaset descriptors. The obtained results confirm that the two suggested compression-based meta-features help to predict accuracy of the base model much more accurately than the state-of-the-art solution.
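
    To make the idea of a compression-based descriptor concrete, here is a minimal sketch of edited nearest neighbor (ENN) in its usual form: an instance is kept only if its label agrees with the majority label of its k nearest neighbors. The compression measure shown (the fraction of instances removed) is an assumed definition for this example and may differ from the one used in the paper; the function names and toy data are likewise hypothetical.

```python
import math
from collections import Counter

def edited_nearest_neighbor(points, labels, k=3):
    """Return indices of instances whose label matches the majority of their k nearest neighbors."""
    kept = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        majority = Counter(labels[j] for j in others).most_common(1)[0][0]
        if majority == labels[i]:
            kept.append(i)
    return kept

def compression(n_original, n_kept):
    """Fraction of instances removed by the selection step (assumed definition)."""
    return 1.0 - n_kept / n_original

# Toy data: one 'b' instance sits inside the 'a' group and should be edited out.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5), (5, 6), (6, 5), (6, 6)]
lbl = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
kept_idx = edited_nearest_neighbor(pts, lbl, k=3)
level = compression(len(pts), len(kept_idx))
```

    A dataset with heavily overlapping classes would lose more instances under this rule, which is the intuition for using the compression level as a cheap indicator of how hard the classification problem is.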

  5. Predicting the severity of nuclear power plant transients using nearest neighbors modeling optimized by genetic algorithms on a parallel computer

    International Nuclear Information System (INIS)

    Lin, J.; Bartal, Y.; Uhrig, R.E.

    1995-01-01

    The importance of automatic diagnostic systems for nuclear power plants (NPPs) has been discussed in numerous studies, and various such systems have been proposed. None of those systems were designed to predict the severity of the diagnosed scenario. A classification and severity prediction system for NPP transients is developed. The system is based on nearest neighbors modeling, which is optimized using genetic algorithms. The optimization process is used to determine the most important variables for each of the transient types analyzed. An enhanced version of the genetic algorithms is used in which a local downhill search is performed to further increase the accuracy achieved. The genetic algorithms search was implemented on a massively parallel supercomputer, the KSR1-64, to perform the analysis in a reasonable time. The data for this study were supplied by the high-fidelity simulator of the San Onofre unit 1 pressurized water reactor

  6. DichroMatch at the protein circular dichroism data bank (DM@PCDDB): A web-based tool for identifying protein nearest neighbors using circular dichroism spectroscopy.

    Science.gov (United States)

    Whitmore, Lee; Mavridis, Lazaros; Wallace, B A; Janes, Robert W

    2018-01-01

    Circular dichroism spectroscopy is a well-used, but simple method in structural biology for providing information on the secondary structure and folds of proteins. DichroMatch (DM@PCDDB) is an online tool that is newly available in the Protein Circular Dichroism Data Bank (PCDDB), which takes advantage of the wealth of spectral and metadata deposited therein, to enable identification of spectral nearest neighbors of a query protein based on four different methods of spectral matching. DM@PCDDB can potentially provide novel information about structural relationships between proteins and can be used in comparison studies of protein homologs and orthologs. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  7. Highway Travel Time Prediction Using Sparse Tensor Completion Tactics and K-Nearest Neighbor Pattern Matching Method

    Directory of Open Access Journals (Sweden)

    Jiandong Zhao

    2018-01-01

    Remote transportation microwave sensor (RTMS) technology is being promoted for China’s highways. The distance between RTMSs is about 2 to 5 km, which leads to missing data and data sparseness problems. These two problems seriously restrict the accuracy of travel time prediction. To address the missing-data problem, a tensor completion method based on traffic multimode characteristics is proposed to recover the lost RTMS speed and volume data. To address the data sparseness problem, virtual sensor nodes are set up between real RTMS nodes, and the two-dimensional linear interpolation and piecewise method are applied to estimate the average travel time between two nodes. Next, compared with the traditional K-nearest neighbor method, an optimal KNN method is proposed for travel time prediction. Optimization is made in three aspects. Firstly, the three original state vectors, that is, speed, volume, and time of the day, are subdivided into seven periods. Secondly, the traffic congestion level is added as a new state vector. Thirdly, the cross-validation method is used to calibrate the K value to improve the adaptability of the KNN algorithm. Based on the data collected from Jinggangao highway, all the algorithms are validated. The results show that the proposed method can improve data quality and prediction precision of travel time.

  8. Remaining Useful Life Estimation of Insulated Gate Biploar Transistors (IGBTs) Based on a Novel Volterra k-Nearest Neighbor Optimally Pruned Extreme Learning Machine (VKOPP) Model Using Degradation Data

    Directory of Open Access Journals (Sweden)

    Zhen Liu

    2017-11-01

    The insulated gate bipolar transistor (IGBT) is a high-performance switching device used widely in power electronic systems. How to estimate the remaining useful life (RUL) of an IGBT to ensure the safety and reliability of the power electronics system is currently a challenging issue in the field of IGBT reliability. The aim of this paper is to develop a prognostic technique for estimating IGBTs’ RUL. There is a need for an efficient prognostic algorithm that is able to support in-situ decision-making. In this paper, a novel prediction model with a complete structure based on optimally pruned extreme learning machine (OPELM) and Volterra series is proposed to track the IGBT’s degradation trace and estimate its RUL; we refer to this model as the Volterra k-nearest neighbor OPELM prediction (VKOPP) model. This model uses the minimum entropy rate method and Volterra series to reconstruct phase space for IGBTs’ ageing samples, and a new weight update algorithm, which can effectively reduce the influence of outliers and noise, is utilized to establish the VKOPP network; then a combination of the k-nearest neighbor (KNN) method and the least squares estimation (LSE) method is used to calculate the output weights of OPELM and predict the RUL of the IGBT. The prognostic results show that the proposed approach can predict the RUL of IGBT modules with small error and achieve higher prediction precision and lower time cost than some classic prediction approaches.

  9. Analysis and Identification of Aptamer-Compound Interactions with a Maximum Relevance Minimum Redundancy and Nearest Neighbor Algorithm.

    Science.gov (United States)

    Wang, ShaoPeng; Zhang, Yu-Hang; Lu, Jing; Cui, Weiren; Hu, Jerry; Cai, Yu-Dong

    2016-01-01

    The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, aptamer-compound interaction attracts increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need to design effective computational methods for searching effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, such as Maximum Relevance Minimum Redundancy, as well as incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. Simultaneously, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.

  10. Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition

    International Nuclear Information System (INIS)

    Shen Hongbin; Chou Kuochen

    2005-01-01

    The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen

  11. A novel method for the detection of R-peaks in ECG based on K-Nearest Neighbors and Particle Swarm Optimization

    Science.gov (United States)

    He, Runnan; Wang, Kuanquan; Li, Qince; Yuan, Yongfeng; Zhao, Na; Liu, Yang; Zhang, Henggui

    2017-12-01

    Cardiovascular diseases are associated with high morbidity and mortality. However, it is still a challenge to diagnose them accurately and efficiently. Electrocardiogram (ECG), a bioelectrical signal of the heart, provides crucial information about the dynamical functions of the heart, playing an important role in cardiac diagnosis. Because the QRS complex in ECG is associated with ventricular depolarization, accurate QRS detection is vital for interpreting ECG features. In this paper, we propose a real-time, accurate, and effective algorithm for QRS detection. In the algorithm, a proposed preprocessor with a band-pass filter was first applied to remove baseline wander and power-line interference from the signal. After denoising, a method combining K-Nearest Neighbor (KNN) and Particle Swarm Optimization (PSO) was used for accurate QRS detection in ECGs with different morphologies. The proposed algorithm was tested and validated using 48 ECG records from the MIT-BIH arrhythmia database (MITDB) and achieved a high averaged detection accuracy, sensitivity and positive predictivity of 99.43, 99.69, and 99.72%, respectively, indicating a notable improvement over extant algorithms reported in the literature.

  12. Speaker gender identification based on majority vote classifiers

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching for the optimum tuning of one classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models, including decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor, were tested. The three best performing classifiers among the five contribute by majority voting between their scores. Experiments were performed on three different datasets spoken in three languages, English, German and Arabic, in order to validate the language independence of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
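
    The combination step described above can be sketched as a plain majority vote over the labels returned by several classifiers. This is a simplified illustration: the abstract mentions voting between scores, whereas the sketch votes on hard labels, and the function names, the threshold "classifiers", and the feature values are all invented for the example.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most classifiers for one sample."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(classifiers, sample):
    """Apply every classifier (a callable: sample -> label) and combine by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])

# Three stand-in classifiers using made-up pitch thresholds in Hz.
clfs = [
    lambda s: "female" if s["pitch"] > 165 else "male",
    lambda s: "female" if s["pitch"] > 180 else "male",
    lambda s: "female" if s["pitch"] > 150 else "male",
]
label = ensemble_predict(clfs, {"pitch": 170})
```

    Using an odd number of voters, as with the three classifiers in the abstract, avoids ties in a two-class problem.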

  13. Comparisons and Selections of Features and Classifiers for Short Text Classification

    Science.gov (United States)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.

  14. Near Neighbor Distribution in Sets of Fractal Nature

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel

    2013-01-01

    Vol. 5, No. 1 (2013), pp. 159-166 ISSN 2150-7988 R&D Projects: GA MŠk(CZ) LG12020 Institutional support: RVO:67985807 Keywords: nearest neighbor * fractal set * multifractal * Erlang distribution Subject RIV: BB - Applied Statistics, Operational Research http://www.mirlabs.org/ijcisim/regular_papers_2013/Paper91.pdf

  15. α-K2AgF4: Ferromagnetism induced by the weak superexchange of different eg orbitals from the nearest neighbor Ag ions

    Science.gov (United States)

    Zhang, Xiaoli; Zhang, Guoren; Jia, Ting; Zeng, Zhi; Lin, H. Q.

    2016-05-01

    We study the abnormal ferromagnetism in α-K2AgF4, which is very similar to high-TC parent material La2CuO4 in structure. We find out that the electron correlation is very important in determining the insulating property of α-K2AgF4. The Ag(II) 4d9 in the octahedron crystal field has the t2g^6 eg^3 electron occupation with eg x2-y2 orbital fully occupied and 3z2-r2 orbital partially occupied. The two eg orbitals are very extended indicating both of them are active in superexchange. Using the Hubbard model combined with Nth-order muffin-tin orbital (NMTO) downfolding technique, it is concluded that the exchange interaction between eg 3z2-r2 and x2-y2 from the first nearest neighbor Ag ions leads to the anomalous ferromagnetism in α-K2AgF4.

  16. α-K2AgF4: Ferromagnetism induced by the weak superexchange of different eg orbitals from the nearest neighbor Ag ions

    Directory of Open Access Journals (Sweden)

    Xiaoli Zhang

    2016-05-01

    Full Text Available We study the abnormal ferromagnetism in α-K2AgF4, which is very similar to high-TC parent material La2CuO4 in structure. We find out that the electron correlation is very important in determining the insulating property of α-K2AgF4. The Ag(II) 4d9 in the octahedron crystal field has the t2g^6 eg^3 electron occupation with eg x2-y2 orbital fully occupied and 3z2-r2 orbital partially occupied. The two eg orbitals are very extended indicating both of them are active in superexchange. Using the Hubbard model combined with Nth-order muffin-tin orbital (NMTO) downfolding technique, it is concluded that the exchange interaction between eg 3z2-r2 and x2-y2 from the first nearest neighbor Ag ions leads to the anomalous ferromagnetism in α-K2AgF4.

  17. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.

  18. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space

    KAUST Repository

    Tao, Yufei

    2010-07-01

    Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms an LSB-forest that has strong quality guarantees, but improves dramatically the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, which most of the existing solutions fail to do. We show that, using an LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, yet still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial

  19. Fingerprint prediction using classifier ensembles

    CSIR Research Space (South Africa)

    Molale, P

    2011-11-01

    Full Text Available ); logistic discrimination (LgD), k-nearest neighbour (k-NN), artificial neural network (ANN), association rules (AR), decision tree (DT), naive Bayes classifier (NBC) and the support vector machine (SVM). The performance of several multiple classifier systems...

  20. Improved Multiscale Entropy Technique with Nearest-Neighbor Moving-Average Kernel for Nonlinear and Nonstationary Short-Time Biomedical Signal Analysis

    Directory of Open Access Journals (Sweden)

    S. P. Arunachalam

    2018-01-01

    Full Text Available Analysis of biomedical signals can yield invaluable information for prognosis, diagnosis, therapy evaluation, risk assessment, and disease prevention which is often recorded as short time series data that challenges existing complexity classification algorithms such as Shannon entropy (SE) and other techniques. The purpose of this study was to improve previously developed multiscale entropy (MSE) technique by incorporating nearest-neighbor moving-average kernel, which can be used for analysis of nonlinear and non-stationary short time series physiological data. The approach was tested for robustness with respect to noise analysis using simulated sinusoidal and ECG waveforms. Feasibility of MSE to discriminate between normal sinus rhythm (NSR) and atrial fibrillation (AF) was tested on a single-lead ECG. In addition, the MSE algorithm was applied to identify pivot points of rotors that were induced in ex vivo isolated rabbit hearts. The improved MSE technique robustly estimated the complexity of the signal compared to that of SE with various noises, discriminated NSR and AF on single-lead ECG, and precisely identified the pivot points of ex vivo rotors by providing better contrast between the rotor core and the peripheral region. The improved MSE technique can provide efficient complexity analysis of variety of nonlinear and nonstationary short-time biomedical signals.

  1. Large-Scale Mapping of Carbon Stocks in Riparian Forests with Self-Organizing Maps and the k-Nearest-Neighbor Algorithm

    Directory of Open Access Journals (Sweden)

    Leonhard Suchenwirth

    2014-07-01

    Full Text Available Among the machine learning tools being used in recent years for environmental applications such as forestry, self-organizing maps (SOM) and the k-nearest neighbor (kNN) algorithm have been used successfully. We applied both methods for the mapping of organic carbon (Corg) in riparian forests due to their considerably high carbon storage capacity. Despite the importance of floodplains for carbon sequestration, a sufficient scientific foundation for creating large-scale maps showing the spatial Corg distribution is still missing. We estimated organic carbon in a test site in the Danube Floodplain based on RapidEye remote sensing data and additional geodata. Accordingly, carbon distribution maps of vegetation, soil, and total Corg stocks were derived. Results were compared and statistically evaluated with terrestrial survey data for outcomes with pure remote sensing data and for the combination with additional geodata using bias and the Root Mean Square Error (RMSE). Results show that SOM and kNN approaches enable us to reproduce spatial patterns of riparian forest Corg stocks. While vegetation Corg has very high RMSEs, outcomes for soil and total Corg stocks are less biased with a lower RMSE, especially when remote sensing and additional geodata are conjointly applied. SOMs show similar percentages of RMSE to kNN estimations.
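    The two evaluation measures named in this abstract, bias and RMSE, can be computed as in the following Python sketch. It is a generic illustration, not the authors' code: the sign convention for bias (predicted minus observed) and the way the RMSE percentage is normalised (by the mean of the observations) are assumptions, and the sample values are made up.

```python
import math


def bias(predicted, observed):
    """Mean signed error (assumed convention: predicted minus observed)."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root mean square error between predictions and survey values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent(predicted, observed):
    """RMSE relative to the mean observed value (assumed normalisation)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(predicted, observed) / mean_observed


if __name__ == "__main__":
    # Made-up carbon stock values, for illustration only.
    predicted = [100.0, 120.0, 90.0]
    observed = [110.0, 115.0, 95.0]
    print(bias(predicted, observed), rmse(predicted, observed))
```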

  2. Distance and Density Similarity Based Enhanced k-NN Classifier for Improving Fault Diagnosis Performance of Bearings

    Directory of Open Access Journals (Sweden)

    Sharif Uddin

    2016-01-01

    Full Text Available An enhanced k-nearest neighbor (k-NN) classification algorithm is presented, which uses a density based similarity measure in addition to a distance based similarity measure to improve the diagnostic performance in bearing fault diagnosis. Due to its use of distance based similarity measure alone, the classification accuracy of traditional k-NN deteriorates in case of overlapping samples and outliers and is highly susceptible to the neighborhood size, k. This study addresses these limitations by proposing the use of both distance and density based measures of similarity between training and test samples. The proposed k-NN classifier is used to enhance the diagnostic performance of a bearing fault diagnosis scheme, which classifies different fault conditions based upon hybrid feature vectors extracted from acoustic emission (AE) signals. Experimental results demonstrate that the proposed scheme, which uses the enhanced k-NN classifier, yields better diagnostic performance and is more robust to variations in the neighborhood size, k.

  3. Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines.

    Science.gov (United States)

    Majid, Abdul; Ali, Safdar; Iqbal, Mubashar; Kausar, Nabeela

    2014-03-01

    This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. MTD-SVM model has provided the best values of accuracy, G-mean and Matthews correlation coefficient of 96.71%, 96.70% and 71.98% for cancer/non-cancer dataset, breast/non-breast cancer dataset and colon/non-colon cancer dataset, respectively. We found that hybrid MTD-SVM is the best with respect to prediction performance and computational cost. MTD-KNN model has achieved moderately better prediction as compared to hybrid MTD-NB (Naïve Bayes) but at the expense of higher computing cost. MTD-KNN model is faster than MTD-RF (random forest) but its prediction is not better than MTD-RF. To the best of our knowledge, the reported results are the best results, so far, for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer. This scheme may be useful for study of any sequential information such as protein sequence or any nucleic acid sequence. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
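    For readers unfamiliar with the imbalance-aware measures reported in this abstract, the following Python sketch computes G-mean and the Matthews correlation coefficient from a binary confusion matrix using their standard definitions. The counts in the example are invented for illustration and are not results from the study.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def mcc(tp, fn, tn, fp):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented, imbalanced counts: 100 positives, 10 negatives.
    tp, fn, tn, fp = 90, 10, 5, 5
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    print(accuracy, g_mean(tp, fn, tn, fp), mcc(tp, fn, tn, fp))
```

    On imbalanced data, accuracy can stay high while G-mean and MCC drop, which is why such measures are usually reported alongside it.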

  4. A student class assignment system using the K-Means and K-Nearest Neighbors methods to improve learning quality

    Directory of Open Access Journals (Sweden)

    Gede Aditra Pradnyana

    2018-01-01

    Full Text Available A problem that arises when forming or dividing student classes is the difference in ability among the students in each class, which can make the learning process ineffective. Grouping students of similar ability is very important for improving the quality of teaching and learning. With appropriate grouping, students can help one another in the learning process. In addition, dividing classes according to student ability makes it easier for teaching staff to choose a suitable teaching method or strategy. Using appropriate teaching methods and strategies increases the effectiveness of the teaching and learning process. In this study, a new method for dividing students into course classes was designed by combining the K-Means and K-Nearest Neighbors (KNN) methods. K-Means is used to divide students into classes based on the assessment components of the prerequisite course. The features used for grouping are assignment scores, midterm exam scores, final exam scores, and cumulative grade point average (IPK). KNN is used to predict whether a student will pass a course based on previous data. This prediction is then used as an additional feature when forming the classes with K-Means. The approach used in this study is the Software Development Life Cycle (SDLC) with the waterfall model. Based on the tests carried out, it was concluded that the number of clusters (classes) and the amount of data used affect the quality of the clusters formed by the K-Means and KNN methods. The highest Silhouette Index, 0.534, was obtained when using 100 data records with 10 clusters, which falls in the medium-structure quality category.
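    The cluster-quality score mentioned at the end of this abstract, the Silhouette Index, can be sketched in Python as below. This follows the standard silhouette definition (per point, s = (b - a) / max(a, b), averaged over all points) with Euclidean distance; the toy points are hypothetical and unrelated to the study's student data, and at least two clusters are assumed.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette over all points (requires at least two clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    points = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print(silhouette_index(points, [0, 0, 1, 1]))  # well separated
    print(silhouette_index(points, [0, 1, 0, 1]))  # poorly separated
```

    Values close to 1 indicate compact, well-separated clusters, while values near or below 0 indicate overlapping ones.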

  5. Alpha Centauri: unveiling the secrets of our nearest stellar neighbor

    CERN Document Server

    Beech, Martin

    2015-01-01

    As our closest stellar companion and composed of two Sun-like stars and a third small dwarf star, Alpha Centauri is an ideal testing ground of astrophysical models and has played a central role in the history and development of modern astronomy—from the first guesses at stellar distances to understanding how our own star, the Sun, might have evolved. It is also the host of the nearest known exoplanet, an ultra-hot, Earth-like planet recently discovered. Just 4.4 light years away Alpha Centauri is also the most obvious target for humanity’s first directed interstellar space probe. Such a mission could reveal the small-scale structure of a new planetary system and also represent the first step in what must surely be humanity’s greatest future adventure—exploration of the Milky Way Galaxy itself. For all of its closeness, α Centauri continues to tantalize astronomers with many unresolved mysteries, such as how did it form, how many planets does it contain and where are they, and how might we view its ex...

  6. Predicting persistence in the sediment compartment with a new automatic software based on the k-Nearest Neighbor (k-NN) algorithm.

    Science.gov (United States)

    Manganaro, Alberto; Pizzo, Fabiola; Lombardo, Anna; Pogliaghi, Alberto; Benfenati, Emilio

    2016-02-01

    The ability of a substance to resist degradation and persist in the environment needs to be readily identified in order to protect the environment and human health. Many regulations require the assessment of persistence for substances commonly manufactured and marketed. Besides laboratory-based testing methods, in silico tools may be used to obtain a computational prediction of persistence. We present a new program to develop k-Nearest Neighbor (k-NN) models. The k-NN algorithm is a similarity-based approach that predicts the property of a substance in relation to the experimental data for its most similar compounds. We employed this software to identify persistence in the sediment compartment. Data on half-life (HL) in sediment were obtained from different sources and, after careful data pruning, the final dataset, containing 297 organic compounds, was divided into four experimental classes. We developed several models giving satisfactory performances, considering that both the training and test set accuracy ranged between 0.90 and 0.96. We finally selected one model which will be made available in the near future in the freely available software platform VEGA. This model offers a valuable in silico tool that may be useful for fast and inexpensive screening. Copyright © 2015 Elsevier Ltd. All rights reserved.
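    The similarity-based idea described in this abstract (predict a property from the experimental data of the most similar compounds) corresponds to the basic k-NN classifier, sketched below in Python. This is a generic illustration and not the software presented in the paper: the two-dimensional descriptors, the class names `class_A` and `class_B`, and the use of Euclidean distance as the similarity measure are all assumptions.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict the class of `query` from its k nearest training items.

    `training_set` is a list of (descriptor_vector, class_label) pairs.
    """
    ranked = sorted(training_set, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical descriptor vectors with hypothetical class labels.
    training_set = [
        ((0.10, 0.20), "class_A"),
        ((0.20, 0.10), "class_A"),
        ((0.15, 0.15), "class_A"),
        ((0.90, 0.80), "class_B"),
        ((0.80, 0.90), "class_B"),
    ]
    print(knn_predict(training_set, (0.0, 0.0), k=3))
    print(knn_predict(training_set, (1.0, 1.0), k=3))
```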

  7. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data.

    Directory of Open Access Journals (Sweden)

    Brett A McKinney

    Full Text Available Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to

  8. Classification in medical image analysis using adaptive metric k-NN

    DEFF Research Database (Denmark)

    Chen, Chen; Chernoff, Konstantin; Karemore, Gopal

    2010-01-01

    The performance of the k-nearest neighborhoods (k-NN) classifier is highly dependent on the distance metric used to identify the k nearest neighbors of the query points. The standard Euclidean distance is commonly used in practice. This paper investigates the performance of k-NN classifier...

  9. Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost

    NARCIS (Netherlands)

    Mensink, T.; Verbeek, J.; Perronnin, F.; Csurka, G.

    2013-01-01

    We study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers, the k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers, and introduce a new

  10. A systematic comparison of supervised classifiers.

    Directory of Open Access Journals (Sweden)

    Diego Raphael Amancio

    Full Text Available Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. On many occasions, however, researchers who are not experts in the field of machine learning have to deal with practical classification tasks without an in-depth knowledge about the underlying parameters. Actually, the adequate choice of classifiers and parameters in such practical circumstances constitutes a long-standing problem and is one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near optimal performance for most cases, not including methods such as the support vector machine (SVM). In addition, the k-nearest neighbor method frequently allowed the best accuracy. In certain conditions, it was possible to improve the quality of SVM by more than 20% with respect to their default parameter configuration.

  11. Examining change detection approaches for tropical mangrove monitoring

    Science.gov (United States)

    Myint, Soe W.; Franklin, Janet; Buenemann, Michaela; Kim, Won; Giri, Chandra

    2014-01-01

    This study evaluated the effectiveness of different band combinations and classifiers (unsupervised, supervised, object-oriented nearest neighbor, and object-oriented decision rule) for quantifying mangrove forest change using multitemporal Landsat data. A discriminant analysis using spectra of different vegetation types determined that bands 2 (0.52 to 0.6 μm), 5 (1.55 to 1.75 μm), and 7 (2.08 to 2.35 μm) were the most effective bands for differentiating mangrove forests from surrounding land cover types. A ranking of thirty-six change maps, produced by comparing the classification accuracy of twelve change detection approaches, was used. The object-based Nearest Neighbor classifier produced the highest mean overall accuracy (84 percent) regardless of band combinations. The automated decision rule-based approach (mean overall accuracy of 88 percent) as well as a composite of bands 2, 5, and 7 used with the unsupervised classifier and the same composite or all band difference with the object-oriented Nearest Neighbor classifier were the most effective approaches.

  12. Velocity correlations and spatial dependencies between neighbors in a unidirectional flow of pedestrians

    Science.gov (United States)

    Porzycki, Jakub; Wąs, Jarosław; Hedayatifar, Leila; Hassanibesheli, Forough; Kułakowski, Krzysztof

    2017-08-01

    The aim of the paper is an analysis of self-organization patterns observed in the unidirectional flow of pedestrians. On the basis of experimental data from Zhang et al. [J. Zhang et al., J. Stat. Mech. (2011) P06004, 10.1088/1742-5468/2011/06/P06004], we analyze the mutual positions and velocity correlations between pedestrians when walking along a corridor. The angular and spatial dependencies of the mutual positions reveal a spatial structure that remains stable during the crowd motion. This structure differs depending on the value of n, for the consecutive nth-nearest-neighbor position set. The preferred position for the first-nearest neighbor is on the side of the pedestrian, while for further neighbors, this preference shifts to the axis of movement. The velocity correlations vary with the angle formed by the pair of neighboring pedestrians and the direction of motion and with the time delay between pedestrians' movements. The delay dependence of the correlations shows characteristic oscillations, produced by the velocity oscillations when striding; however, a filtering of the main frequency of individual striding out reduces the oscillations only partially. We conclude that pedestrians select their path directions so as to evade the necessity of continuously adjusting their speed to their neighbors'. They try to keep a given distance, but follow the person in front of them, as well as accepting and observing pedestrians on their sides. Additionally, we show an empirical example that illustrates the shape of a pedestrian's personal space during movement.

  13. Classifying dysmorphic syndromes by using artificial neural network based hierarchical decision tree.

    Science.gov (United States)

    Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf

    2018-05-01

    Dysmorphic syndromes have different facial malformations. These malformations are significant for the early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define certain features of each syndrome by considering facial malformations and automatically classify Fragile X, Hurler, Prader Willi, Down and Wolf Hirschhorn syndromes and healthy groups. Reference points are marked on the face images and the ratios between the points' distances are used as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with the k-NN, ANN and hierarchical decision tree methods, respectively. The same images were then shown to a clinical expert, who achieved a recognition rate of 46.7%. We develop an efficient system that recognizes different syndrome types automatically and with high accuracy from simple, non-invasive imaging data, independently of the patient's age, sex and race. The promising results indicate that our method can be used by clinical experts for pre-diagnosis of the dysmorphic syndromes.

  14. The square Ising model with second-neighbor interactions and the Ising chain in a transverse field

    International Nuclear Information System (INIS)

    Grynberg, M.D.; Tanatar, B.

    1991-06-01

    We consider the thermal and critical behaviour of the square Ising lattice with frustrated first- and second-neighbor interactions. A low-temperature domain wall analysis including kinks and dislocations shows that there is a close relation between this classical model and the Hamiltonian of an Ising chain in a transverse field provided that the ratio of the next-nearest to nearest-neighbor coupling is close to 1/2. Due to the field inversion symmetry of the Ising chain Hamiltonian, the thermal properties of the classical system are symmetrical with respect to this coupling ratio. In the neighborhood of this regime critical exponents of the model turn out to belong to the Ising universality class. Our results are compared with previous Monte Carlo simulations. (author). 23 refs, 6 figs

  15. Hybrid Neuro-Fuzzy Classifier Based On Nefclass Model

    Directory of Open Access Journals (Sweden)

    Bogdan Gliwa

    2011-01-01

    Full Text Available The paper presents a hybrid neuro-fuzzy classifier based on a modified NEFCLASS model. The presented classifier was compared to popular classifiers: neural networks and k-nearest neighbours. The efficiency of the modifications in the classifier was compared with the methods used in the original NEFCLASS model (learning methods). The accuracy of the classifier was tested using 3 datasets from the UCI Machine Learning Repository: iris, wine and breast cancer wisconsin. Moreover, the influence of ensemble classification methods on classification accuracy was presented.

  16. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data

    Directory of Open Access Journals (Sweden)

    Yousef Malik

    2016-12-01

    Full Text Available The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points with respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved the highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
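    One way to read the co-clustering distance described in this abstract is sketched below in Python. The abstract does not give the exact formula, so the definition used here (one minus the fraction of clusterings in which two points share a cluster) is an assumption, as are the function names and the toy data; the kNN step then simply votes among the labeled points closest under that distance.

```python
from collections import Counter


def co_cluster_distance(clusterings, i, j):
    """Assumed definition: 1 - fraction of clusterings where i and j co-cluster.

    `clusterings` is a list of label lists, one per clustering run.
    """
    together = sum(1 for labels in clusterings if labels[i] == labels[j])
    return 1.0 - together / len(clusterings)


def ec_knn_predict(clusterings, labeled, query, k=3):
    """kNN vote using the co-clustering distance.

    `labeled` maps point index -> class label; `query` is a point index.
    """
    ranked = sorted(
        labeled, key=lambda idx: co_cluster_distance(clusterings, idx, query)
    )
    votes = Counter(labeled[idx] for idx in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Three hypothetical clustering runs over five points (indices 0..4).
    clusterings = [
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 0, 0, 1],
    ]
    labeled = {0: "positive", 1: "positive", 2: "negative", 3: "negative"}
    print(ec_knn_predict(clusterings, labeled, query=4, k=3))
```

    Because the distance depends only on cluster memberships, the unlabeled query points have to be included in the clustering runs along with the labeled ones.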

  17. Structure of the first- and second-neighbor shells of simulated water: Quantitative relation to translational and orientational order

    Science.gov (United States)

    Yan, Zhenyu; Buldyrev, Sergey V.; Kumar, Pradeep; Giovambattista, Nicolas; Debenedetti, Pablo G.; Stanley, H. Eugene

    2007-11-01

    We perform molecular dynamics simulations of water using the five-site transferable interaction potential (TIP5P) model to quantify structural order in both the first shell (defined by four nearest neighbors) and second shell (defined by twelve next-nearest neighbors) of a central water molecule. We find that the anomalous decrease of orientational order upon compression occurs in both shells, but the anomalous decrease of translational order upon compression occurs mainly in the second shell. The decreases of translational order and orientational order upon compression (called the “structural anomaly”) are thus correlated only in the second shell. Our findings quantitatively confirm the qualitative idea that the thermodynamic, structural, and hence dynamic anomalies of water are related to changes upon compression in the second shell.

  18. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    Directory of Open Access Journals (Sweden)

    Jiangyuan Mei

    2014-01-01

    the basis of classifiers, for example, the k-nearest neighbors classifier. Experiments on benchmark data sets demonstrate that the proposed algorithm compares favorably with the state-of-the-art methods.

  19. Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation.

    Science.gov (United States)

    Silva, Carlos Alberto; Klauberg, Carine; Hudak, Andrew T; Vierling, Lee A; Liesenberg, Veraldo; Bernett, Luiz G; Scheraiber, Clewerson F; Schoeninger, Emerson R

    2018-01-01

    Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR-derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj. R2) and root mean squared differences (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against conventional inventory procedures.

  20. Exotic lagomorph may influence eagle abundances and breeding spatial aggregations: a field study and meta-analysis on the nearest neighbor distance

    Directory of Open Access Journals (Sweden)

    Facundo Barbar

    2018-05-01

    The introduction of alien species could be changing food source composition, ultimately restructuring demography and spatial distribution of native communities. In Argentine Patagonia, the exotic European hare has one of the highest numbers recorded worldwide and is now a widely consumed prey for many predators. We examine the potential relationship between abundance of this relatively new prey and the abundance and breeding spacing of one of its main consumers, the Black-chested Buzzard-Eagle (Geranoaetus melanoleucus). First we analyze the abundance of individuals of a raptor guild in relation to hare abundance through a correspondence analysis. We then estimate the Nearest Neighbor Distance (NND) of the Black-chested Buzzard-Eagle in the two areas with high hare abundances. Finally, we perform a meta-regression between the NND and the body masses of Accipitridae raptors, to evaluate whether the Black-chested Buzzard-Eagle NND deviates from the value expected according to its mass. We found that eagle abundance was highly associated with hare abundance, more than with any other raptor species in the study area. In the two areas with high hare abundance, the eagles' NND was significantly lower than the value expected for a raptor species of this size. Our results support the hypothesis that high local abundance of prey leads to a reduction of the breeding spacing of its main predator, which could potentially alter other interspecific interactions, and thus the entire community.
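
For readers unfamiliar with the Nearest Neighbor Distance statistic used above, here is a minimal sketch. It assumes nest sites are given as planar coordinates in a projected system (so Euclidean distance is meaningful); the coordinates are made up for illustration and are not data from the study.

```python
import math

def nearest_neighbor_distances(sites):
    """For each site, the distance to its closest other site."""
    return [
        min(math.dist(p, q) for j, q in enumerate(sites) if j != i)
        for i, p in enumerate(sites)
    ]

# Toy nest coordinates in kilometres (assumed data).
sites = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]

nnd = nearest_neighbor_distances(sites)
mean_nnd = sum(nnd) / len(nnd)
print(nnd, mean_nnd)
```

A lower mean NND corresponds to tighter breeding spacing, which is the quantity the meta-regression compares against body mass.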

  1. Arabic Text Categorization Using Improved k-Nearest neighbour Algorithm

    Directory of Open Access Journals (Sweden)

    Wail Hamood KHALED

    2014-10-01

    The quantity of text information published in the Arabic language on the net requires the implementation of effective techniques for the extraction and classification of relevant information contained in large corpora of texts. In this paper we present an implementation of an enhanced k-NN Arabic text classifier. We apply the traditional k-NN and Naive Bayes from the Weka Toolkit for comparison. Our proposed modified k-NN algorithm features an improved decision rule that skips the classes that are less similar and identifies the right class from the k nearest neighbours, which increases the accuracy. The study evaluates the improved decision rule technique using the standard measures of recall, precision and F-measure as the basis of comparison. We conclude that the effectiveness of the proposed classifier is promising and that it outperforms the classical k-NN classifier.

  2. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients

    Science.gov (United States)

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.

  3. Experimental Validation of an Efficient Fan-Beam Calibration Procedure for k-Nearest Neighbor Position Estimation in Monolithic Scintillator Detectors

    Science.gov (United States)

    Borghi, Giacomo; Tabacchini, Valerio; Seifert, Stefan; Schaart, Dennis R.

    2015-02-01

    Monolithic scintillator detectors can achieve excellent spatial resolution and coincidence resolving time. However, their practical use for positron emission tomography (PET) and other applications in the medical imaging field is still limited due to drawbacks of the different methods used to estimate the position of interaction. Common statistical methods, for example, require the collection of an extensive dataset of reference events with a narrow pencil beam aimed at a fine grid of reference positions. Such procedures are time consuming and not straightforwardly implemented in systems composed of many detectors. Here, we experimentally demonstrate for the first time a new calibration procedure for k-nearest neighbor (k-NN) position estimation that utilizes reference data acquired with a fan beam. The procedure is tested on two detectors consisting of 16 mm × 16 mm × 10 mm and 16 mm × 16 mm × 20 mm monolithic, Ca-codoped LSO:Ce crystals and digital photon counter (DPC) arrays. For both detectors, the spatial resolution and the bias obtained with the new method are found to be practically the same as those obtained with the previously used method based on pencil-beam irradiation, while the calibration time is reduced by a factor of 20. Specifically, a FWHM of 1.1 mm and a FWTM of 2.7 mm were obtained using the fan-beam method with the 10 mm crystal, whereas a FWHM of 1.5 mm and a FWTM of 6 mm were achieved with the 20 mm crystal. Using a fan beam made with a 4.5 MBq 22Na point source and a tungsten slit collimator with 0.5 mm aperture, the total measurement time needed to acquire the reference dataset was 3 hours for the thinner crystal and 2 hours for the thicker one.

  4. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) are very successful, but have recently suffered from slow updating because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we propose a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest splitting some large superfamilies into smaller clusters. © 2016 Elsevier B.V.

  5. Data-driven method based on particle swarm optimization and k-nearest neighbor regression for estimating capacity of lithium-ion battery

    International Nuclear Information System (INIS)

    Hu, Chao; Jain, Gaurav; Zhang, Puqiang; Schmidt, Craig; Gomadam, Parthasarathy; Gorka, Tom

    2014-01-01

    Highlights: • We develop a data-driven method for battery capacity estimation. • Five charge-related features that are indicative of the capacity are defined. • The kNN regression model captures the dependency of the capacity on the features. • Results with 10 years’ continuous cycling data verify the effectiveness of the method. Abstract: Reliability of lithium-ion (Li-ion) rechargeable batteries used in implantable medical devices has been recognized as of high importance by a broad range of stakeholders, including medical device manufacturers, regulatory agencies, physicians, and patients. To ensure Li-ion batteries in these devices operate reliably, it is important to be able to assess the battery health condition by estimating the battery capacity over the lifetime. This paper presents a data-driven method for estimating the capacity of a Li-ion battery based on the charge voltage and current curves. The contributions of this paper are three-fold: (i) the definition of five characteristic features of the charge curves that are indicative of the capacity, (ii) the development of a non-linear kernel regression model, based on k-nearest neighbor (kNN) regression, that captures the complex dependency of the capacity on the five features, and (iii) the adaptation of particle swarm optimization (PSO) to finding the optimal combination of feature weights for creating a kNN regression model that minimizes the cross-validation (CV) error in the capacity estimation. Verification with 10 years’ continuous cycling data suggests that the proposed method is able to accurately estimate the capacity of a Li-ion battery throughout its whole lifetime.
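
The interplay between feature weights and cross-validation error described above can be sketched as follows. This is an assumption-laden illustration, not the paper's model: it uses an unweighted average of the k nearest targets rather than a kernel-weighted one, leave-one-out rather than whatever CV scheme the authors used, and it omits the PSO search entirely, showing only the objective such a search would minimize. All names and data are hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance with a non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_regress(train_x, train_y, query, weights, k=2):
    """Predict by averaging the targets of the k nearest training points."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: weighted_distance(train_x[i], query, weights),
    )
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def loo_cv_error(xs, ys, weights, k=2):
    """Leave-one-out mean squared error for a given weight vector."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        pred = knn_regress(rest_x, rest_y, xs[i], weights, k)
        total += (pred - ys[i]) ** 2
    return total / len(xs)

# Toy data (assumed): feature 0 tracks the target, feature 1 is noise.
xs = [(0.0, 5.0), (1.0, 0.0), (2.0, 4.0), (3.0, 1.0)]
ys = [0.0, 1.0, 2.0, 3.0]

# An optimizer such as PSO would search over weight vectors to minimize this.
print(loo_cv_error(xs, ys, (1.0, 0.0), k=1))
print(loo_cv_error(xs, ys, (0.0, 1.0), k=1))
```

On this toy data, weighting the informative feature gives a lower cross-validation error than weighting the noisy one, which is the signal a weight search would exploit.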

  6. Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation

    Directory of Open Access Journals (Sweden)

    CARLOS ALBERTO SILVA

    Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR-derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj. R2) and root mean squared differences (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against conventional inventory procedures.

  7. A Robust and Fast Computation Touchless Palm Print Recognition System Using LHEAT and the IFkNCN Classifier

    Directory of Open Access Journals (Sweden)

    Haryati Jaafar

    2015-01-01

    Mobile implementation is a current trend in biometric design. This paper proposes a new approach to palm print recognition, in which smart phones are used to capture palm print images at a distance. A touchless system was developed because of public demand for privacy and sanitation. Robust hand tracking, image enhancement, and fast computation processing algorithms are required for effective touchless and mobile-based recognition. In this project, hand tracking and the region of interest (ROI) extraction method are discussed. A sliding neighborhood operation with local histogram equalization, followed by local adaptive thresholding (the LHEAT approach), is proposed in the image enhancement stage to manage low-quality palm print images. To accelerate the recognition process, a new classifier, the improved fuzzy-based k nearest centroid neighbor (IFkNCN), is implemented. By removing outliers and reducing the amount of training data, this classifier exhibits faster computation. Our experimental results demonstrate that a touchless palm print system using LHEAT and IFkNCN achieves a promising recognition rate of 98.64%.

  8. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal.

    Science.gov (United States)

    Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza

    2013-03-01

    Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of the EEG signal for discriminating depression patients and normal controls. Forty-five unmedicated depressed patients and 45 normal subjects participated in this study. The power of four EEG bands and four nonlinear features, including detrended fluctuation analysis (DFA), Higuchi fractal dimension, correlation dimension and Lyapunov exponent, were extracted from the EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression (LR) are then used as the classifiers. Among the nonlinear features, the highest classification accuracy of 83.3% is obtained by correlation dimension with the LR classifier. For further improvement, all nonlinear features are combined and applied to the classifiers. A classification accuracy of 90% is achieved by all nonlinear features and the LR classifier. In all experiments, a genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists in diagnosing depressed patients. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  9. Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.

    Science.gov (United States)

    Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N

    2017-06-21

    Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k -Nearest Neighbors ( k -NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction

  10. Computerized index for teaching files

    International Nuclear Information System (INIS)

    Bramble, J.M.

    1989-01-01

    A computerized index can be used to retrieve cases from a teaching file that have radiographic findings similar to an unknown case. The probability that a user will review cases with a correct diagnosis was estimated with use of radiographic findings of arthritis in hand radiographs of 110 cases from a teaching file. The nearest-neighbor classification algorithm was used as a computer index to 110 cases of arthritis. Each case was treated as an unknown and inputted to the computer index. The accuracy of the computer index in retrieving cases with the same diagnosis (including rheumatoid arthritis, gout, psoriatic arthritis, inflammatory osteoarthritis, and pyrophosphate arthropathy) was measured. A Bayes classifier algorithm was also tested on the same database. Results: The estimated accuracy of the nearest-neighbor algorithm was 83%. By comparison, the estimated accuracy of the Bayes classifier algorithm was 78%. Conclusions: A computerized index to a teaching file based on the nearest-neighbor algorithm should allow the user to review cases with the correct diagnosis of an unknown case, by entering the findings of the unknown case.
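
The evaluation protocol above, where each case in turn is treated as the unknown and matched against the remaining cases, amounts to leave-one-out nearest-neighbor retrieval. The sketch below assumes findings are encoded as binary present/absent vectors compared by Hamming distance; the abstract does not state the encoding or the distance, so both are assumptions, and the toy cases use neutral labels rather than real diagnoses.

```python
def hamming(a, b):
    """Number of findings on which two cases disagree."""
    return sum(x != y for x, y in zip(a, b))

def leave_one_out_accuracy(cases):
    """cases: list of (findings, diagnosis); each case is queried in turn."""
    hits = 0
    for i, (findings, diagnosis) in enumerate(cases):
        others = [c for j, c in enumerate(cases) if j != i]
        nearest = min(others, key=lambda c: hamming(c[0], findings))
        hits += nearest[1] == diagnosis
    return hits / len(cases)

# Toy teaching file (assumed data): four cases, two diagnoses.
cases = [
    ((1, 1, 0, 0), "A"),
    ((1, 1, 1, 0), "A"),
    ((0, 0, 1, 1), "B"),
    ((0, 1, 1, 1), "B"),
]

print(leave_one_out_accuracy(cases))
```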

  11. Data Mining Learning Models and Algorithms on a Scada System Data Repository

    Directory of Open Access Journals (Sweden)

    Mircea Rîşteiu

    2010-06-01

    This paper presents three data mining techniques applied on a SCADA system data repository: Naive Bayes, k-Nearest Neighbor and Decision Trees. Finally, based on the mining results and their explanation, we conclude that k-Nearest Neighbor is a suitable method for classifying the large amount of data considered. The experiments are built on the training data set and evaluated on a new test set with the machine learning tool WEKA.

  12. Data characteristics that determine classifier performance

    CSIR Research Space (South Africa)

    Van der Walt, Christiaan M

    2006-11-01

    available at [11]. The kNN uses a LinearNN nearest neighbour search algorithm with a Euclidean distance metric [8]. The optimal k value is determined by performing 10-fold cross-validation. An optimal k value between 1 and 10 is used for Experiments 1... classifiers. 10-fold cross-validation is used to evaluate and compare the performance of the classifiers on the different data sets. 3.1. Artificial data generation Multivariate Gaussian distributions are used to generate artificial data sets. We use d...
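
The selection of k by 10-fold cross-validation over the range 1 to 10 can be sketched as below. This is a plain-Python illustration with a brute-force Euclidean neighbor search, not the WEKA configuration the excerpt refers to; the interleaved fold assignment and the toy data are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training items nearest to query."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cv_accuracy(data, k, folds=10):
    """Accuracy under k-fold cross-validation with interleaved folds."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

def best_k(data, candidates=range(1, 11), folds=10):
    """Candidate k with the highest cross-validated accuracy (first on ties)."""
    return max(candidates, key=lambda k: cv_accuracy(data, k, folds))

# Toy data (assumed): two well-separated classes, interleaved in the list.
data = []
for i in range(20):
    data.append(((i * 0.1, 0.0), "a"))
    data.append(((10.0 + i * 0.1, 0.0), "b"))

print(best_k(data), cv_accuracy(data, 3))
```

On real data the accuracy curve over k is rarely flat, and ties are usually broken toward smaller or odd k; the tie rule here is simply whatever `max` returns first.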

  13. Pathological Brain Detection Using Weiner Filtering, 2D-Discrete Wavelet Transform, Probabilistic PCA, and Random Subspace Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Debesh Jha

    2017-01-01

    Accurate diagnosis of pathological brain images is important for patient care, particularly in the early phase of the disease. Although numerous studies have used machine-learning techniques for the computer-aided diagnosis (CAD) of pathological brain, previous methods encountered challenges in terms of the diagnostic efficiency owing to deficiencies in the choice of proper filtering techniques, neuroimaging biomarkers, and limited learning models. Magnetic resonance imaging (MRI) is capable of providing enhanced information regarding the soft tissues, and therefore MR images are included in the proposed approach. In this study, we propose a new model that includes Wiener filtering for noise reduction, 2D-discrete wavelet transform (2D-DWT) for feature extraction, probabilistic principal component analysis (PPCA) for dimensionality reduction, and a random subspace ensemble (RSE) classifier along with the K-nearest neighbors (KNN) algorithm as a base classifier to classify brain images as pathological or normal ones. The proposed methods provide a significant improvement in classification results when compared to other studies. Based on 5×5 cross-validation (CV), the proposed method outperforms 21 state-of-the-art algorithms in terms of classification accuracy, sensitivity, and specificity for all four datasets used in the study.
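
A minimal sketch of a random subspace ensemble with a kNN base classifier, the last stage of the pipeline described above. The filtering, wavelet and PPCA stages are omitted, and the ensemble size, subspace size, k, seed and toy feature vectors are all assumptions chosen for illustration; the paper's actual settings are not given in the abstract.

```python
import math
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k, features):
    """kNN vote using only the listed feature indices."""
    def dist(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
    ranked = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return Counter(train_y[i] for i in ranked[:k]).most_common(1)[0][0]

def random_subspace_predict(train_x, train_y, query,
                            n_members=9, subspace_size=2, k=3, seed=0):
    """Each member sees a random feature subset; members vote on the label."""
    rng = random.Random(seed)
    n_features = len(train_x[0])
    votes = Counter()
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        votes[knn_predict(train_x, train_y, query, k, features)] += 1
    return votes.most_common(1)[0][0]

# Toy reduced feature vectors (assumed data), four features per image.
train_x = [
    (0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 0.0, 1.0),
    (5.0, 5.0, 5.0, 5.0), (4.0, 5.0, 4.0, 5.0), (5.0, 4.0, 5.0, 4.0),
]
train_y = ["normal"] * 3 + ["pathological"] * 3

print(random_subspace_predict(train_x, train_y, (0.5, 0.5, 0.5, 0.5)))
print(random_subspace_predict(train_x, train_y, (4.5, 4.5, 4.5, 4.5)))
```

Training each member on a different feature subset decorrelates the members' errors, which is the usual motivation for pairing random subspaces with a stable base learner such as kNN.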

  14. Training set optimization and classifier performance in a top-down diabetic retinopathy screening system

    Science.gov (United States)

    Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.

    2013-03-01

    Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has been shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and k-NN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve give evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.

  15. A three-parameter model for classifying anurans into four genera based on advertisement calls.

    Science.gov (United States)

    Gingras, Bruno; Fitch, William Tecumseh

    2013-01-01

    The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.

  16. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer

    Science.gov (United States)

    Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant

    2015-01-01

    Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029

  17. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor

    Directory of Open Access Journals (Sweden)

    Chang Xu

    2018-05-01

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.

  18. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    Science.gov (United States)

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
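
SMOTE (Synthetic Minority Over-sampling Technique) balances a dataset by interpolating new minority-class samples between existing ones and their nearest minority neighbors. The sketch below shows only that core interpolation step under simple assumptions (Euclidean distance, uniform random choices, a fixed seed); it is not the authors' code, and the feature vectors are made up.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority samples and neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen base sample.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        partner = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, partner))
        )
    return synthetic

# Toy minority-class feature vectors (assumed data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

new_points = smote(minority, n_new=5)
print(len(new_points))
```

The synthetic points are added to the training set only; evaluating on synthetic samples would overstate accuracy.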

  19. Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use

    Science.gov (United States)

    Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil

    2013-01-01

    The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
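
    To illustrate why a histogram can separate classes that share the same mean, here is a small Python sketch. The abstract does not name the curve matching measures used, so histogram intersection is an assumption chosen for illustration; the class names and pixel values are hypothetical.

```python
def histogram(values, n_bins=4, lo=0, hi=256):
    """Normalised histogram of digital numbers in [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def classify_by_histogram(pixels, class_histograms):
    """Assign the class whose reference histogram overlaps most."""
    h = histogram(pixels)
    return max(class_histograms, key=lambda c: intersection(h, class_histograms[c]))


def classify_by_mean(pixels, class_means):
    """Baseline: nearest neighbour to the class mean."""
    m = sum(pixels) / len(pixels)
    return min(class_means, key=lambda c: abs(m - class_means[c]))


# Hypothetical training objects: "roof" is unimodal, "mixed" is bimodal,
# yet both have almost the same mean digital number.
training = {
    "roof": [120, 125, 130, 135, 128, 132],
    "mixed": [30, 35, 220, 225, 40, 230],
}
class_histograms = {c: histogram(v) for c, v in training.items()}
class_means = {c: sum(v) / len(v) for c, v in training.items()}

image_object = [25, 45, 215, 228]  # bimodal object, mean close to 128
by_histogram = classify_by_histogram(image_object, class_histograms)
by_mean = classify_by_mean(image_object, class_means)
```

    Here the histogram rule returns "mixed" while the nearest-to-mean rule returns "roof", because the mean alone discards the bimodal shape.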

  20. Evaluation of the maximum-likelihood adaptive neural system (MLANS) applications to noncooperative IFF

    Science.gov (United States)

    Chernick, Julian A.; Perlovsky, Leonid I.; Tye, David M.

    1994-06-01

    This paper describes applications of maximum likelihood adaptive neural system (MLANS) to the characterization of clutter in IR images and to the identification of targets. The characterization of image clutter is needed to improve target detection and to enhance the ability to compare performance of different algorithms using diverse imagery data. Enhanced unambiguous IFF is important for fratricide reduction while automatic cueing and targeting is becoming an ever increasing part of operations. We utilized MLANS which is a parametric neural network that combines optimal statistical techniques with a model-based approach. This paper shows that MLANS outperforms classical classifiers, the quadratic classifier and the nearest neighbor classifier, because on the one hand it is not limited to the usual Gaussian distribution assumption and can adapt in real time to the image clutter distribution; on the other hand MLANS learns from fewer samples and is more robust than the nearest neighbor classifiers. Future research will address uncooperative IFF using fused IR and MMW data.

  1. Feature Selection and Predictors of Falls with Foot Force Sensors Using KNN-Based Algorithms

    Directory of Open Access Journals (Sweden)

    Shengyun Liang

    2015-11-01

    Full Text Available The aging process may lead to the degradation of lower extremity function in the elderly population, which can restrict their daily quality of life and gradually increase the fall risk. We aimed to determine whether objective measures of physical function could predict subsequent falls. Ground reaction force (GRF) data, which was quantified by sample entropy, was collected by foot force sensors. Thirty eight subjects (23 fallers and 15 non-fallers) participated in functional movement tests, including walking and sit-to-stand (STS). A feature selection algorithm was used to select relevant features to classify the elderly into two groups: at risk and not at risk of falling down, for three KNN-based classifiers: local mean-based k-nearest neighbor (LMKNN), pseudo nearest neighbor (PNN), and local mean pseudo nearest neighbor (LMPNN) classification. We compared classification performances, and achieved the best results with LMPNN, with sensitivity, specificity and accuracy all 100%. Moreover, a subset of GRFs was significantly different between the two groups via Wilcoxon rank sum test, which is compatible with the classification results. This method could potentially be used by non-experts to monitor balance and the risk of falling down in the elderly population.
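
    As a rough illustration of the first of these classifiers, the sketch below follows the commonly described local mean-based k-nearest neighbor rule: for each class, average the k training points of that class closest to the query, then pick the class whose local mean is nearest. It is a simplified sketch, not the study's code; the two-feature vectors standing in for sample-entropy features are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lmknn_predict(train, query, k=2):
    """Local mean-based kNN over a list of (features, label) pairs."""
    by_class = {}
    for features, label in train:
        by_class.setdefault(label, []).append(features)

    best_label, best_dist = None, float("inf")
    for label, points in by_class.items():
        # k nearest training points of this class only
        nearest = sorted(points, key=lambda p: squared_distance(p, query))[:k]
        # local mean vector of those neighbours
        local_mean = [sum(col) / len(nearest) for col in zip(*nearest)]
        d = squared_distance(local_mean, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical features, e.g. sample entropy during walking and sit-to-stand.
train = [
    ((0.9, 0.8), "faller"), ((1.0, 0.9), "faller"), ((0.8, 1.0), "faller"),
    ((0.3, 0.2), "non-faller"), ((0.4, 0.3), "non-faller"), ((0.2, 0.4), "non-faller"),
]
label_high = lmknn_predict(train, (0.85, 0.9), k=2)
label_low = lmknn_predict(train, (0.3, 0.3), k=2)
```

    Using a local mean instead of individual neighbours is intended to reduce the influence of outliers, which matters for small samples such as the 38 subjects here.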

  2. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

    Directory of Open Access Journals (Sweden)

    Lina Zhang

    Full Text Available Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.

  3. Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Directory of Open Access Journals (Sweden)

    Tsatsoulis Costas

    2010-05-01

    Full Text Available Background: There is increasing evidence that gene location and surrounding genes influence the functionality of genes in the eukaryotic genome. Knowing the Gene Ontology Slim terms associated with a gene gives us insight into a gene's functionality by informing us how its gene product behaves in a cellular context using three different ontologies: molecular function, biological process, and cellular component. In this study, we analyzed if we could classify a gene in Saccharomyces cerevisiae to its correct Gene Ontology Slim term using information about its location in the genome and information from its nearest-neighbouring genes using classification learning. Results: We performed experiments to establish that the MultiBoostAB algorithm using the J48 classifier could correctly classify Gene Ontology Slim terms of a gene given information regarding the gene's location and information from its nearest-neighbouring genes for training. Different neighbourhood sizes were examined to determine how many nearest neighbours should be included around each gene to provide better classification rules. Our results show that by just incorporating neighbour information from each gene's two-nearest neighbours, the percentage of correctly classified genes to their correct Gene Ontology Slim term for each ontology reaches over 80% with high accuracy (reflected in F-measures over 0.80) of the classification rules produced. Conclusions: We confirmed that in classifying genes to their correct Gene Ontology Slim term, the inclusion of neighbour information from those genes is beneficial. Knowing the location of a gene and the Gene Ontology Slim information from neighbouring genes gives us insight into that gene's functionality. This benefit is seen by just including information from a gene's two-nearest neighbouring genes.

  4. Applying cost-sensitive classification for financial fraud detection under high class-imbalance

    CSIR Research Space (South Africa)

    Moepya, SO

    2014-12-01

    Full Text Available , sensitivity, specificity, recall and precision using PCA and Factor Analysis. Weighted Support Vector Machines (SVM) were shown superior to the cost-sensitive Naive Bayes (NB) and K-Nearest Neighbors classifiers....

  5. Multi-feature classifiers for burst detection in single EEG channels from preterm infants

    Science.gov (United States)

    Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.

    2017-08-01

    Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might be under-performing with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾ 36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained by visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The best-performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier in an unlabeled database of 21 infants ⩾ 36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
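
    The agreement score quoted above, Cohen's kappa, corrects raw agreement for the agreement expected by chance. A minimal Python sketch of the standard formula follows; the label sequences are hypothetical and are not taken from the study.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance agreement implied by each rater's label frequencies.
    Undefined (division by zero) if both raters always use the same single label.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Hypothetical per-epoch labels: 1 = burst, 0 = inter-burst interval.
human = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
detector = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(human, detector)
```

    In this toy example the raters agree on 8 of 10 epochs and chance agreement is 0.5, so kappa is about 0.6, noticeably lower than the raw 80% agreement.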

  6. Differential profiling of volatile organic compound biomarker signatures utilizing a logical statistical filter-set and novel hybrid evolutionary classifiers

    Science.gov (United States)

    Grigsby, Claude C.; Zmuda, Michael A.; Boone, Derek W.; Highlander, Tyler C.; Kramer, Ryan M.; Rizki, Mateen M.

    2012-06-01

    A growing body of discoveries in molecular signatures has revealed that volatile organic compounds (VOCs), the small molecules associated with an individual's odor and breath, can be monitored to reveal the identity and presence of a unique individual, as well as their overall physiological status. Given the analysis requirements for differential VOC profiling via gas chromatography/mass spectrometry, our group has developed a novel informatics platform, Metabolite Differentiation and Discovery Lab (MeDDL). In its current version, MeDDL is a comprehensive tool for time-series spectral registration and alignment, visualization, comparative analysis, and machine learning to facilitate the efficient analysis of multiple, large-scale biomarker discovery studies. The MeDDL toolset can therefore identify a large differential subset of registered peaks, where their corresponding intensities can be used as features for classification. This initial screening of peaks yields result sets that are typically too large for incorporation into a portable, electronic nose based system in addition to including VOCs that are not amenable to classification; consequently, it is also important to identify an optimal subset of these peaks to increase classification accuracy and to decrease the cost of the final system. MeDDL's learning tools include a classifier similar to a K-nearest neighbor classifier used in conjunction with a genetic algorithm (GA) that simultaneously optimizes the classifier and subset of features. The GA uses ROC curves to produce classifiers having maximal area under their ROC curve. Experimental results on over a dozen recognition problems show many examples of classifiers and feature sets that produce perfect ROC curves.

  7. Detecting epileptic seizure with different feature extracting strategies using robust machine learning classification techniques by applying advance parameter optimization approach.

    Science.gov (United States)

    Hussain, Lal

    2018-06-01

    Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. The brain activity of patients suffering from seizures is monitored through the electroencephalogram (EEG) to detect epileptic seizures. The performance of EEG-based epilepsy detection depends on the feature extraction strategy. In this research, we extracted features using several strategies based on time and frequency domain characteristics, nonlinear measures, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels are evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we varied the distance metric, the neighbor weights and the number of neighbors. Similarly, for the decision trees we tuned the parameters based on maximum splits and split criteria, and ensemble classifiers are evaluated based on different ensemble methods and learning rate. For training/testing, tenfold cross-validation was employed and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis was performed using diverse feature extraction strategies and robust machine learning classifiers with more advanced parameter options. Support Vector Machine with a linear kernel and KNN with the city block distance metric give the overall highest accuracy of 99.5%, which was higher than that obtained using the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, K-nearest neighbors with inverse squared distance weights gives higher performance at different numbers of neighbors. Moreover, in distinguishing postictal heart rate oscillations from epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
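
    Two of the KNN options mentioned above, the city block (Manhattan) metric and inverse squared distance weighting, can be sketched in a few lines of Python. This is an illustrative sketch under assumed toy data, not the study's configuration; the feature values and labels are hypothetical.

```python
def city_block(a, b):
    """City block (Manhattan, L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def weighted_knn_predict(train, query, k=3, eps=1e-12):
    """kNN vote where each neighbour counts 1 / distance**2.

    eps avoids division by zero when the query coincides with a training point.
    """
    nearest = sorted(train, key=lambda item: city_block(item[0], query))[:k]
    scores = {}
    for features, label in nearest:
        d = city_block(features, query)
        scores[label] = scores.get(label, 0.0) + 1.0 / (d * d + eps)
    return max(scores, key=scores.get)


# Hypothetical two-feature EEG descriptors.
train = [
    ((0.0, 0.0), "normal"),
    ((0.2, 0.1), "normal"),
    ((2.0, 2.0), "normal"),
    ((1.0, 1.0), "seizure"),
]
prediction = weighted_knn_predict(train, (0.9, 0.9), k=3)
```

    With k = 3 the neighbours are one nearby "seizure" point and two more distant "normal" points. An unweighted majority vote would say "normal", whereas the inverse squared weighting lets the much closer neighbour dominate and returns "seizure".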

  8. Comparative analysis of instance selection algorithms for instance-based classifiers in the context of medical decision support

    International Nuclear Information System (INIS)

    Mazurowski, Maciej A; Tourassi, Georgia D; Malof, Jordan M

    2011-01-01

    When constructing a pattern classifier, it is important to make best use of the instances (a.k.a. cases, examples, patterns or prototypes) available for its development. In this paper we present an extensive comparative analysis of algorithms that, given a pool of previously acquired instances, attempt to select those that will be the most effective to construct an instance-based classifier in terms of classification performance, time efficiency and storage requirements. We evaluate seven previously proposed instance selection algorithms and compare their performance to simple random selection of instances. We perform the evaluation using k-nearest neighbor classifier and three classification problems: one with simulated Gaussian data and two based on clinical databases for breast cancer detection and diagnosis, respectively. Finally, we evaluate the impact of the number of instances available for selection on the performance of the selection algorithms and conduct initial analysis of the selected instances. The experiments show that for all investigated classification problems, it was possible to reduce the size of the original development dataset to less than 3% of its initial size while maintaining or improving the classification performance. Random mutation hill climbing emerges as the superior selection algorithm. Furthermore, we show that some previously proposed algorithms perform worse than random selection. Regarding the impact of the number of instances available for the classifier development on the performance of the selection algorithms, we confirm that the selection algorithms are generally more effective as the pool of available instances increases. In conclusion, instance selection is generally beneficial for instance-based classifiers as it can improve their performance, reduce their storage requirements and improve their response time. However, choosing the right selection algorithm is crucial.
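
    The selection algorithm singled out above, random mutation hill climbing, can be sketched as follows: start from a random subset of instances, repeatedly swap one selected instance for an unselected one, and keep the swap if a nearest neighbor classifier built on the subset does not get worse. This Python sketch makes simplifying assumptions that are not from the paper: fitness is 1-NN accuracy measured on the whole pool, and the data and names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def one_nn_accuracy(prototypes, data):
    """Fraction of data classified correctly by 1-NN over the prototypes."""
    correct = 0
    for features, label in data:
        nearest = min(prototypes, key=lambda p: sq_dist(p[0], features))
        correct += nearest[1] == label
    return correct / len(data)


def rmhc_select(pool, n_keep, n_iter=200, seed=0):
    """Random mutation hill climbing over subsets of n_keep pool indices.

    Requires n_keep < len(pool) so that an unused instance always exists.
    """
    rng = random.Random(seed)
    selected = rng.sample(range(len(pool)), n_keep)
    best = one_nn_accuracy([pool[i] for i in selected], pool)
    for _ in range(n_iter):
        candidate = list(selected)
        unused = [i for i in range(len(pool)) if i not in selected]
        # mutation: replace one randomly chosen slot with an unused instance
        candidate[rng.randrange(n_keep)] = rng.choice(unused)
        score = one_nn_accuracy([pool[i] for i in candidate], pool)
        if score >= best:  # accept mutations that do not hurt
            selected, best = candidate, score
    return selected, best


# Hypothetical pool: two well-separated one-dimensional clusters.
pool = [((0.1 * i, 0.0), "benign") for i in range(10)]
pool += [((5.0 + 0.1 * i, 0.0), "malignant") for i in range(10)]
selected, accuracy = rmhc_select(pool, n_keep=2)
```

    In this toy pool, two prototypes (10% of the instances) are enough for a perfect 1-NN classifier, which mirrors the paper's observation that a small selected subset can maintain classification performance. A held-out validation set would be a more careful fitness estimate than the pool itself.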

  9. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Feature selection and nearest centroid classification for protein mass spectrometry

    Directory of Open Access Journals (Sweden)

    Levner Ilya

    2005-03-01

    Full Text Available Abstract Background The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. Results This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. Student-t test, Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms, does not hold across all data sets. 
Conclusion This study tested a number of popular feature

  11. Measurement of near neighbor separations of surface atoms

    International Nuclear Information System (INIS)

    Cohen, P.I.

    Two techniques are being developed to measure the nearest neighbor distances of atoms at the surfaces of solids. Both measure extended fine structure in the excitation probability of core level electrons which are excited by an incident electron beam. This is an important problem because the structures of most surface systems are as yet unknown, even though the location of surface atoms is the basis for any quantitative understanding of the chemistry and physics of surfaces and interfaces. These methods would allow any laboratory to make in situ determinations of surface structure in conjunction with most other laboratory probes of surfaces. Each of these two techniques has different advantages; further, the combination of the two will increase confidence in the results by reducing systematic error in the data analysis.

  12. Inferring feature relevances from metric learning

    DEFF Research Database (Denmark)

    Schulz, Alexander; Mokbel, Bassam; Biehl, Michael

    2015-01-01

    Powerful metric learning algorithms have been proposed in the last years which do not only greatly enhance the accuracy of distance-based classifiers and nearest neighbor database retrieval, but which also enable the interpretability of these operations by assigning explicit relevance weights...

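  13. Eksperimen Seleksi Fitur Pada Parameter Proyek Untuk Software Effort Estimation dengan K-Nearest Neighbor [Feature Selection Experiments on Project Parameters for Software Effort Estimation with K-Nearest Neighbor]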

    Directory of Open Access Journals (Sweden)

    Fachruddin Fachruddin

    2017-07-01

    Full Text Available Software effort estimation is the process of estimating software cost, an important step in carrying out a software project. Previous studies have estimated software effort using various methods, both machine learning and non-machine learning. This study conducts a set of attribute selection experiments on project parameters, using the k-nearest neighbours technique as the estimator and performing attribute selection with information gain and mutual information, and examines how to find the most representative project parameters for software effort estimation. The software effort estimation datasets used in the experiments are albrecht, china, kemerer and mizayaki94, which can be obtained from a data repository dedicated to software effort estimation at http://openscience.us/repo/effort/. The researchers then built an attribute selection application to select project parameters. This system produces ARFF datasets containing the selected attributes. The application was built in Java using the NetBeans IDE. The generated datasets contain the selected parameters, which are then compared when performing software effort estimation with the WEKA tool. Feature selection succeeded in lowering the estimation error (represented by the RAE and RMSE values). That is, the lower the error (RAE and RMSE), the more accurate the resulting estimate. Estimation improved after feature selection, whether with information gain or mutual information. From the resulting error values, it can be concluded that the datasets produced by feature selection with information gain are better than those produced with mutual information, although the difference between the two is not very significant.
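
    The two error measures named in this abstract, RAE and RMSE, can be computed as in the following Python sketch. It uses the standard definitions: RMSE is the root of the mean squared error, and RAE is the total absolute error relative to that of always predicting the mean of the actual values. The effort values are hypothetical and are not drawn from the datasets above.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rae(actual, predicted):
    """Relative absolute error as a ratio (multiply by 100 for a percentage).

    Values below 1 mean the estimator beats the predict-the-mean baseline.
    """
    mean_actual = sum(actual) / len(actual)
    numerator = sum(abs(p - a) for a, p in zip(actual, predicted))
    denominator = sum(abs(a - mean_actual) for a in actual)
    return numerator / denominator


# Hypothetical effort values (e.g. person-months) and kNN estimates.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
error_rmse = rmse(actual, predicted)
error_rae = rae(actual, predicted)
```

    For these toy values the absolute errors sum to 10 against a baseline of 40, so RAE is 0.25, and RMSE is the square root of 6.5, roughly 2.55. Lower values of both indicate a more accurate estimate, which is how the abstract compares the feature selection methods.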

  14. Identifying Different Transportation Modes from Trajectory Data Using Tree-Based Ensemble Classifiers

    Directory of Open Access Journals (Sweden)

    Zhibin Xiao

    2017-02-01

    Full Text Available Recognition of transportation modes can be used in different applications including human behavior research, transport management and traffic control. Previous work on transportation mode recognition has often relied on using multiple sensors or matching Geographic Information System (GIS) information, which is not possible in many cases. In this paper, an approach based on ensemble learning is proposed to infer hybrid transportation modes using only Global Position System (GPS) data. First, in order to distinguish between different transportation modes, we used a statistical method to generate global features and extract several local features from sub-trajectories after trajectory segmentation, before these features were combined in the classification stage. Second, to obtain a better performance, we used tree-based ensemble models (Random Forest, Gradient Boosting Decision Tree, and XGBoost) instead of traditional methods (K-Nearest Neighbor, Decision Tree, and Support Vector Machines) to classify the different transportation modes. The experimental results on the latter have shown the efficacy of our proposed approach. Among them, the XGBoost model produced the best performance with a classification accuracy of 90.77% obtained on the GEOLIFE dataset, and we used a tree-based ensemble method to ensure accurate feature selection to reduce the model complexity.

  15. Integrating the Supervised Information into Unsupervised Learning

    Directory of Open Access Journals (Sweden)

    Ping Ling

    2013-01-01

    Full Text Available This paper presents an assembling unsupervised learning framework that adopts the information coming from the supervised learning process and gives the corresponding implementation algorithm. The algorithm consists of two phases: first extracting and clustering data representatives (DRs) to obtain labeled training data, and then classifying non-DRs based on the labeled DRs. The implementation algorithm is called SDSN since it employs the tuning-scaled Support vector domain description to collect DRs, uses a spectrum-based method to cluster DRs, and adopts the nearest neighbor classifier to label non-DRs. The validity of the first-phase clustering procedure is analyzed theoretically. A new data-dependent metric is defined in the second phase to allow the nearest neighbor classifier to work with the informed information. A fast training approach for DRs’ extraction is provided to improve efficiency. Experimental results on synthetic and real datasets verify the correctness and performance of the proposed idea and show that SDSN is better suited to practical use than the traditional pure clustering procedure.

  16. Carbon-hydrogen defects with a neighboring oxygen atom in n-type Si

    Science.gov (United States)

    Gwozdz, K.; Stübner, R.; Kolkovsky, Vl.; Weber, J.

    2017-07-01

    We report on the electrical activation of neutral carbon-oxygen complexes in Si by wet-chemical etching at room temperature. Two deep levels, E65 and E75, are observed by deep level transient spectroscopy in n-type Czochralski Si. The activation enthalpies of E65 and E75 are obtained as EC-0.11 eV (E65) and EC-0.13 eV (E75). The electric field dependence of their emission rates relates both levels to single acceptor states. From the analysis of the depth profiles, we conclude that the levels belong to two different defects, which contain only one hydrogen atom. A configuration is proposed, where the CH1BC defect, with hydrogen in the bond-centered position between neighboring C and Si atoms, is disturbed by interstitial oxygen in the second nearest neighbor position to substitutional carbon. The significant reduction of the CH1BC concentration in samples with high oxygen concentrations limits the use of this defect for the determination of low concentrations of substitutional carbon in Si samples.

  17. Nearest Neighbour Corner Points Matching Detection Algorithm

    Directory of Open Access Journals (Sweden)

    Zhang Changlong

    2015-01-01

    Full Text Available Accurate corner detection plays an important part in camera calibration. To deal with the instability and inaccuracy of present corner detection algorithms, a nearest neighbour corner matching detection algorithm is proposed. First, it dilates the binary image of the photographed picture, then searches for and retains the quadrilateral outlines in the image. Second, the blocks which accord with chessboard corners are grouped into one class. If there are too many blocks in a class, it will be deleted; if not, it will be added, and the midpoint of the two vertex coordinates is taken as the rough position of the corner. Finally, it precisely locates the position of the corners. The experimental results have shown that the algorithm has obvious advantages in accuracy and validity in corner detection, and that it can provide a sound basis for camera calibration in traffic accident measurement.

  18. Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.

    Directory of Open Access Journals (Sweden)

    Daniel Ting

    2010-04-01

    Full Text Available Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: (1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); (2) filtering of suspect conformations and outliers using B-factors or other features; (3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); (4) the method used for determining probability densities, ranging from simple histograms to modern nonparametric density estimation; and (5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.

  19. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Science.gov (United States)

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition. Among the classifiers, support vector machine (SVM) might be the best classifier. However, SVM is a classifier for two classes. When it is used for multi-classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computation consumption of SVM. Experiments of NC-SVM classification for OPCCR have been done. The results of the experiments have shown that the NC-SVM we proposed can effectively reduce the computation time in OPCCR.

  20. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    in road networks. Such queries may be of use in many services. Specifically, we present an easily implementable data model that serves well as a foundation for such queries. We also present the design of a prototype system that implements the queries based on the data model. The algorithm used...

  1. A localized navigation algorithm for Radiation Evasion for nuclear facilities. Part II: Optimizing the “Nearest Exit” Criterion

    Energy Technology Data Exchange (ETDEWEB)

    Khasawneh, Mohammed A., E-mail: mkha@ieee.org [Department of Electrical Engineering, Jordan University of Science and Technology (Jordan); Al-Shboul, Zeina Aman M., E-mail: xeinaaman@gmail.com [Department of Electrical Engineering, Jordan University of Science and Technology (Jordan); Jaradat, Mohammad A., E-mail: majaradat@just.edu.jo [Department of Mechanical Engineering, Jordan University of Science and Technology (Jordan); Malkawi, Mohammad I., E-mail: mmalkawi@aimws.com [College of Engineering, Jadara University, Irbid 221 10 (Jordan)

    2013-06-15

    Highlights: ► A new navigation algorithm for Radiation Evasion around nuclear facilities. ► An optimization criteria minimized under algorithm operation. ► A man-borne device guiding the occupational worker towards paths that warrant least radiation × time products. ► Benefits of using localized navigation as opposed to global navigation schemas. ► A path discrimination function for finding the navigational paths exhibiting the least amounts of radiation. -- Abstract: In this extension from part I (Khasawneh et al., in press), we modify the navigation algorithm which was presented with the objective of optimizing the “Radiation Evasion” Criterion so that navigation would optimize the criterion of “Nearest Exit”. Under this modification, algorithm would yield navigation paths that would guide occupational workers towards Nearest Exit points. Again, under this optimization criterion, algorithm leverages the use of localized information acquired through a well designed and distributed wireless sensor network, as it averts the need for any long-haul communication links or centralized decision and monitoring facility thereby achieving a more reliable performance under dynamic environments. As was done in part I, the proposed algorithm under the “Nearest Exit” Criterion is designed to leverage nearest neighbor information coming in through the sensory network overhead, in computing successful navigational paths from one point to another. For comparison purposes, the proposed algorithm is tested under the two optimization criteria: “Radiation Evasion” and “Nearest Exit”, for different numbers of step look-ahead. We verify the performance of the algorithm by means of simulations, whereby navigational paths are calculated for different radiation fields. We, via simulations, also, verify the performance of the algorithm in comparison with a well-known global navigation algorithm upon which we draw our conclusions.

  2. A localized navigation algorithm for Radiation Evasion for nuclear facilities. Part II: Optimizing the “Nearest Exit” Criterion

    International Nuclear Information System (INIS)

    Khasawneh, Mohammed A.; Al-Shboul, Zeina Aman M.; Jaradat, Mohammad A.; Malkawi, Mohammad I.

    2013-01-01

    Highlights: ► A new navigation algorithm for Radiation Evasion around nuclear facilities. ► An optimization criteria minimized under algorithm operation. ► A man-borne device guiding the occupational worker towards paths that warrant least radiation × time products. ► Benefits of using localized navigation as opposed to global navigation schemas. ► A path discrimination function for finding the navigational paths exhibiting the least amounts of radiation. -- Abstract: In this extension from part I (Khasawneh et al., in press), we modify the navigation algorithm which was presented with the objective of optimizing the “Radiation Evasion” Criterion so that navigation would optimize the criterion of “Nearest Exit”. Under this modification, algorithm would yield navigation paths that would guide occupational workers towards Nearest Exit points. Again, under this optimization criterion, algorithm leverages the use of localized information acquired through a well designed and distributed wireless sensor network, as it averts the need for any long-haul communication links or centralized decision and monitoring facility thereby achieving a more reliable performance under dynamic environments. As was done in part I, the proposed algorithm under the “Nearest Exit” Criterion is designed to leverage nearest neighbor information coming in through the sensory network overhead, in computing successful navigational paths from one point to another. For comparison purposes, the proposed algorithm is tested under the two optimization criteria: “Radiation Evasion” and “Nearest Exit”, for different numbers of step look-ahead. We verify the performance of the algorithm by means of simulations, whereby navigational paths are calculated for different radiation fields. We, via simulations, also, verify the performance of the algorithm in comparison with a well-known global navigation algorithm upon which we draw our conclusions

  3. Segmentation of anatomical structures in chest radiographs using supervised methods: a comparative study on a public database

    DEFF Research Database (Denmark)

    van Ginneken, Bram; Stegmann, Mikkel Bille; Loog, Marco

    2006-01-01

    classification method that employs a multi-scale filter bank of Gaussian derivatives and a k-nearest-neighbors classifier. The methods have been tested on a publicly available database of 247 chest radiographs, in which all objects have been manually segmented by two human observers. A parameter optimization...

  4. Texture Classification in Lung CT Using Local Binary Patterns

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Shaker, Saher B.; de Bruijne, Marleen

    2008-01-01

    the k nearest neighbor classifier with histogram similarity as distance measure. The proposed method is evaluated on a set of 168 regions of interest comprising normal tissue and different emphysema patterns, and compared to a filter bank based on Gaussian derivatives. The joint LBP and intensity...

  5. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  6. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to text classification. This paper surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifier, Rocchio’s algorithm, k Nearest Neighbor (kNN), and Support Vector Machines (SVM).

  7. Cerebellum segmentation in MRI using atlas registration and local multi-scale image descriptors

    DEFF Research Database (Denmark)

    van der Lijn, F.; de Bruijne, M.; Hoogendam, Y.Y.

    2009-01-01

    We propose a novel cerebellum segmentation method for MRI, based on a combination of statistical models of the structure's expected location in the brain and its local appearance. The appearance model is obtained from a k-nearest-neighbor classifier, which uses a set of multi-scale local image...

  8. Chirality dependence of dipole matrix element of carbon nanotubes in axial magnetic field: A third neighbor tight binding approach

    Science.gov (United States)

    Chegel, Raad; Behzad, Somayeh

    2014-02-01

    We have studied the electronic structure and dipole matrix element, D, of carbon nanotubes (CNTs) under magnetic field, using the third nearest neighbor tight binding model. It is shown that the 1NN and 3NN-TB band structures show differences such as the spacing and mixing of neighbor subbands. Applying the magnetic field leads to breaking the degeneracy behavior in the D transitions and creates new allowed transitions corresponding to the band modifications. It is found that |D| is proportional to the inverse tube radius and chiral angle. Our numerical results show that the amount of field-induced splitting for the first optical peak is proportional to the magnetic field by the splitting rate ν11. It is shown that ν11 changes linearly and parabolically with the chiral angle and radius, respectively.

  9. Comparison of feature selection and classification for MALDI-MS data

    Directory of Open Access Journals (Sweden)

    Yang Mary

    2009-07-01

    Full Text Available Abstract Introduction: In the classification of Mass Spectrometry (MS) proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. Results: We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE), and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS), that effectively performs microarray data analysis. We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC), Naïve Bayes Classifier (NBC), Nearest Mean Scaled Classifier (NMSC), uncorrelated normal based quadratic Bayes Classifier recorded as UDC, Support Vector Machines, and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN) based on Mahalanobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion: Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing

  10. The use of hyperspectral data for tree species discrimination: Combining binary classifiers

    CSIR Research Space (South Africa)

    Dastile, X

    2010-11-01

    Full Text Available classifier Classification system 7 class 1 class 2 new sample For 5-nearest neighbour classification: assign new sample to class 1. RU SASA 2010. Given learning task {(x1,t1),(x2,t2),…,(xp,tp)} (xi ∈ Rn feature vectors, ti ∈ {?1,…,?c...). A review on the combination of binary classifiers in multiclass problems. Springer science and Business Media B.V [7] Dietterich T.G and Bakiri G.(1995). Solving Multiclass Learning Problem via Error-Correcting Output Codes. AI Access Foundation...
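
The 5-nearest-neighbour rule mentioned in this record can be sketched as a plain majority vote. This is a minimal illustration, not the authors' method: the function name `knn_predict`, the Euclidean distance, and the toy two-class data are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=5):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; Euclidean distance is
    an assumption of this sketch, not something the record specifies.
    """
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three samples per class.
train = [
    ((0.0, 0.0), "class 1"), ((0.1, 0.2), "class 1"), ((0.2, 0.1), "class 1"),
    ((1.0, 1.0), "class 2"), ((0.9, 1.1), "class 2"), ((1.2, 0.8), "class 2"),
]

# Three of the five nearest neighbours belong to class 1, so class 1 wins.
prediction = knn_predict(train, (0.3, 0.3), k=5)
```

An odd k avoids tied votes in the two-class case; with more classes a tie-breaking rule would be needed.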

  11. Gait Recognition Based on Outermost Contour

    Directory of Open Access Journals (Sweden)

    Lili Liu

    2011-10-01

    Full Text Available Gait recognition aims to identify people by the way they walk. In this paper, a simple but effective gait recognition method based on Outermost Contour is proposed. For each gait image sequence, an adaptive silhouette extraction algorithm is first used to segment the frames of the sequence, and a series of postprocessing steps is applied to obtain the normalized silhouette images with less noise. Then a novel feature extraction method based on Outermost Contour is performed. Principal Component Analysis (PCA) is adopted to reduce the dimensionality of the distance signals derived from the Outermost Contours of silhouette images. Then Multiple Discriminant Analysis (MDA) is used to optimize the separability of gait features belonging to different classes. Nearest Neighbor (NN) classifier and Nearest Neighbor classifier with respect to class Exemplars (ENN) are used to classify the final feature vectors produced by MDA. In order to verify the effectiveness and robustness of our feature extraction algorithm, we also use two other classifiers: Backpropagation Neural Network (BPNN) and Support Vector Machine (SVM) for recognition. Experimental results on a gait database of 100 people show that the accuracy of using MDA, BPNN and SVM can achieve 97.67%, 94.33% and 94.67%, respectively.

  12. Comparison of four approaches to a rock facies classification problem

    Science.gov (United States)

    Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  13. Hyperplane distance neighbor clustering based on local discriminant analysis for complex chemical processes monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Chunhong; Xiao, Shaoqing; Gu, Xiaofeng [Jiangnan University, Wuxi (China)

    2014-11-15

    The collected training data often include both normal and faulty samples for complex chemical processes. However, some monitoring methods, such as partial least squares (PLS), principal component analysis (PCA), independent component analysis (ICA) and Fisher discriminant analysis (FDA), require fault-free data to build the normal operation model. These techniques are applicable after the preliminary step of data clustering is applied. We here propose a novel hyperplane distance neighbor clustering (HDNC) based on the local discriminant analysis (LDA) for chemical process monitoring. First, faulty samples are separated from normal ones using the HDNC method. Then, the optimal subspace for fault detection and classification can be obtained using the LDA approach. The proposed method takes the multimodality within the faulty data into account, and thus improves the capability of process monitoring significantly. The HDNC-LDA monitoring approach is applied to two simulation processes and then compared with the conventional FDA based on the K-nearest neighbor (KNN-FDA) method. The results obtained in two different scenarios demonstrate the superiority of the HDNC-LDA approach in terms of fault detection and classification accuracy.

  14. Hyperplane distance neighbor clustering based on local discriminant analysis for complex chemical processes monitoring

    International Nuclear Information System (INIS)

    Lu, Chunhong; Xiao, Shaoqing; Gu, Xiaofeng

    2014-01-01

    The collected training data often include both normal and faulty samples for complex chemical processes. However, some monitoring methods, such as partial least squares (PLS), principal component analysis (PCA), independent component analysis (ICA) and Fisher discriminant analysis (FDA), require fault-free data to build the normal operation model. These techniques are applicable after the preliminary step of data clustering is applied. We here propose a novel hyperplane distance neighbor clustering (HDNC) based on the local discriminant analysis (LDA) for chemical process monitoring. First, faulty samples are separated from normal ones using the HDNC method. Then, the optimal subspace for fault detection and classification can be obtained using the LDA approach. The proposed method takes the multimodality within the faulty data into account, and thus improves the capability of process monitoring significantly. The HDNC-LDA monitoring approach is applied to two simulation processes and then compared with the conventional FDA based on the K-nearest neighbor (KNN-FDA) method. The results obtained in two different scenarios demonstrate the superiority of the HDNC-LDA approach in terms of fault detection and classification accuracy

  15. From Physiological data to Emotional States: Conducting a User Study and Comparing Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Ali Mehmood KHAN

    2016-06-01

    Full Text Available Recognizing emotional states is becoming a major part of a user's context for wearable computing applications. The system should be able to acquire a user's emotional states by using physiological sensors. We want to develop a personal emotional states recognition system that is practical, reliable, and can be used for health-care related applications. We propose to use the eHealth platform, which is a ready-made, lightweight, small and easy-to-use device, for recognizing a few emotional states like ‘Sad’, ‘Dislike’, ‘Joy’, ‘Stress’, ‘Normal’, ‘No-Idea’, ‘Positive’ and ‘Negative’ using decision tree (J48) and k-Nearest Neighbors (IBK) classifiers. In this paper, we present an approach to build a system that exhibits this property and provides evidence based on data for 8 different emotional states collected from 24 different subjects. Our results indicate that the system has an accuracy rate of approximately 98 %. In our work, we used four physiological sensors, i.e. ‘Blood Volume Pulse’ (BVP), ‘Electromyogram’ (EMG), ‘Galvanic Skin Response’ (GSR), and ‘Skin Temperature’, in order to recognize emotional states (i.e. Stress, Joy/Happy, Sad, Normal/Neutral, Dislike, No-idea, Positive and Negative).

  16. Constructing a logical, regular axis topology from an irregular topology

    Science.gov (United States)

    Faraj, Daniel A.

    2014-07-01

    Constructing a logical regular topology from an irregular topology including, for each axial dimension and recursively, for each compute node in a subcommunicator until returning to a first node: adding to a logical line of the axial dimension a neighbor specified in a nearest neighbor list; calling the added compute node; determining, by the called node, whether any neighbor in the node's nearest neighbor list is available to add to the logical line; if a neighbor in the called compute node's nearest neighbor list is available to add to the logical line, adding, by the called compute node to the logical line, any neighbor in the called compute node's nearest neighbor list for the axial dimension not already added to the logical line; and, if no neighbor in the called compute node's nearest neighbor list is available to add to the logical line, returning to the calling compute node.
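
The recursive procedure in this record can be sketched for a single axial dimension as a depth-first walk over nearest neighbor lists. This is only an illustration under stated assumptions: the function name `build_logical_line`, the dictionary `neighbor_lists`, and the four-node example are hypothetical, and an in-process recursive call stands in for one compute node calling another.

```python
def build_logical_line(first_node, neighbor_lists):
    """Order nodes into one logical line, starting from first_node.

    neighbor_lists maps each node to its nearest neighbor list for a single
    axial dimension (a hypothetical in-memory stand-in for per-node state).
    """
    line = [first_node]
    placed = {first_node}

    def visit(node):
        # The called node checks each of its neighbors in turn.
        for neighbor in neighbor_lists.get(node, []):
            if neighbor not in placed:
                # Add the available neighbor to the line, then call it.
                placed.add(neighbor)
                line.append(neighbor)
                visit(neighbor)
        # No neighbor left to add: control returns to the calling node.

    visit(first_node)
    return line


# Hypothetical irregular topology with four compute nodes.
neighbor_lists = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
logical_line = build_logical_line("A", neighbor_lists)
```

Repeating the walk with a separate neighbor list per axial dimension would give one logical line per dimension, which is how the record describes building the regular topology.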

  17. Analytic nearest neighbour model for FCC metals

    International Nuclear Information System (INIS)

    Idiodi, J.O.A.; Garba, E.J.D.; Akinlade, O.

    1991-06-01

    A recently proposed analytic nearest-neighbour model for fcc metals is criticised and two alternative nearest-neighbour models derived from the separable potential method (SPM) are recommended. Results for copper and aluminium illustrate the utility of the recommended models. (author). 20 refs, 5 tabs

  18. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

    Directory of Open Access Journals (Sweden)

    Ali Dashti

    Full Text Available This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
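
For orientation, the brute-force exact k-NNG that this record parallelizes can be written in a few lines on a single CPU. The sketch below is not the paper's GPU implementation: the function name `knn_graph`, the Euclidean metric, and the four sample points are assumptions made for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest other points.

    Every pair of points is compared, so the cost grows quadratically with
    the number of points; this is the work a parallel version distributes.
    """
    graph = {}
    for i, p in enumerate(points):
        candidates = (
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph[i] = [j for _, j in heapq.nsmallest(k, candidates)]
    return graph


# Hypothetical 2-D points; the record's data has about 15 k dimensions.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(points, k=2)
```

Because each row of the distance computation is independent of the others, the outer loop is the natural unit to split across cluster nodes and GPUs.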

  19. Facial Expression Recognition via Non-Negative Least-Squares Sparse Coding

    Directory of Open Access Journals (Sweden)

    Ying Chen

    2014-05-01

    Full Text Available Sparse coding is an active research subject in signal processing, computer vision, and pattern recognition. A novel method of facial expression recognition via non-negative least squares (NNLS) sparse coding is presented in this paper. The NNLS sparse coding is used to form a facial expression classifier. To test the performance of the presented method, local binary patterns (LBP) and the raw pixels are extracted for facial feature representation. Facial expression recognition experiments are conducted on the Japanese Female Facial Expression (JAFFE) database. Compared with other widely used methods such as linear support vector machines (SVM), sparse representation-based classifier (SRC), nearest subspace classifier (NSC), K-nearest neighbor (KNN) and radial basis function neural networks (RBFNN), the experiment results indicate that the presented NNLS method performs better than the other methods on facial expression recognition tasks.

  20. Norrie disease and MAO genes: nearest neighbors.

    Science.gov (United States)

    Chen, Z Y; Denney, R M; Breakefield, X O

    1995-01-01

    The Norrie disease and MAO genes are tandemly arranged in the p11.4-p11.3 region of the human X chromosome in the order tel-MAOA-MAOB-NDP-cent. This relationship is conserved in the mouse in the order tel-MAOB-MAOA-NDP-cent. The MAO genes appear to have arisen by tandem duplication of an ancestral MAO gene, but their positional relationship to NDP appears to be random. Distinctive X-linked syndromes have been described for mutations in the MAOA and NDP genes, and in addition, individuals have been identified with contiguous gene syndromes due to chromosomal deletions which encompass two or three of these genes. Loss of function of the NDP gene causes a syndrome of congenital blindness and progressive hearing loss, sometimes accompanied by signs of CNS dysfunction, including variable mental retardation and psychiatric symptoms. Other mutations in the NDP gene have been found to underlie another X-linked eye disease, exudative vitreo-retinopathy. An MAOA deficiency state has been described in one family to date, with features of altered amine and amine metabolite levels, low normal intelligence, apparent difficulty in impulse control and cardiovascular difficulty in affected males. A contiguous gene syndrome in which all three genes are lacking, as well as other as yet unidentified flanking genes, results in severe mental retardation, small stature, seizures and congenital blindness, as well as altered amine and amine metabolites. Issues that remain to be resolved are the function of the NDP gene product, the frequency and phenotype of the MAOA deficiency state, and the possible occurrence and phenotype of an MAOB deficiency state.

  1. Analytical approach for collective diffusion: one-dimensional lattice with the nearest neighbor and the next nearest neighbor lateral interactions

    Czech Academy of Sciences Publication Activity Database

    Tarasenko, Alexander

    2018-01-01

    Vol. 95, Jan (2018), pp. 37-40 ISSN 1386-9477 R&D Projects: GA MŠk LO1409; GA MŠk LM2015088 Institutional support: RVO:68378271 Keywords: lattice gas systems * kinetic Monte Carlo simulations * diffusion and migration Subject RIV: BE - Theoretical Physics OECD field: Atomic, molecular and chemical physics (physics of atoms and molecules including collision, interaction with radiation, magnetic resonances, Mössbauer effect) Impact factor: 2.221, year: 2016

  2. The influence of further-neighbor spin-spin interaction on a ground state of 2D coupled spin-electron model in a magnetic field

    Science.gov (United States)

    Čenčariková, Hana; Strečka, Jozef; Gendiar, Andrej; Tomašovičová, Natália

    2018-05-01

    An exhaustive ground-state analysis of an extended two-dimensional (2D) correlated spin-electron model consisting of the Ising spins localized on nodal lattice sites and mobile electrons delocalized over pairs of decorating sites is performed within the framework of rigorous analytical calculations. The investigated model, defined on an arbitrary 2D doubly decorated lattice, takes into account the kinetic energy of mobile electrons, the nearest-neighbor Ising coupling between the localized spins and mobile electrons, the further-neighbor Ising coupling between the localized spins and the Zeeman energy. The ground-state phase diagrams are examined for a wide range of model parameters for both ferromagnetic and antiferromagnetic interaction between the nodal Ising spins and a non-zero value of the external magnetic field. It is found that non-zero values of the further-neighbor interaction lead to the formation of new quantum states as a consequence of competition between all considered interaction terms. Moreover, the new quantum states are accompanied by different magnetic features, and thus several kinds of field-driven phase transitions are observed.

  3. NeighborHood

    OpenAIRE

    Corominola Ocaña, Víctor

    2015-01-01

    NeighborHood is a cloud-based application that adapts to any device (mobile, tablet, desktop). The goal of the application is to let users enter the people in their immediate surroundings and to make those people visible to the other users.

  4. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists was formerly based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which serves as a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.

  5. Neighboring and Urbanism: Commonality versus Friendship.

    Science.gov (United States)

    Silverman, Carol J.

    1986-01-01

    Examines a dimension of neighboring that need not assume friendship as the role model. When the model assumes only a sense of connectedness as defining neighboring, then the residential correlation, shown in many studies between urbanism and neighboring, disappears. Theories of neighboring, study variables, methods, and analysis are discussed.

  6. Detecting android malicious apps and categorizing benign apps with ensemble of classifiers

    KAUST Repository

    Wang, Wei

    2017-01-17

    Android platform has dominated the markets of smart mobile devices in recent years. The number of Android applications (apps) has seen a massive surge. Unsurprisingly, Android platform has also become the primary target of attackers. The management of the explosively expansive app markets has thus become an important issue. On the one hand, it requires effectively detecting malicious applications (malapps) in order to keep the malapps out of the app market. On the other hand, it needs to automatically categorize a big number of benign apps so as to ease the management, such as correcting an app’s category falsely designated by the app developer. In this work, we propose a framework to effectively and efficiently manage a big app market in terms of detecting malapps and categorizing benign apps. We extract 11 types of static features from each app to characterize the behaviors of the app, and employ the ensemble of multiple classifiers, namely, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Naive Bayes (NB), Classification and Regression Tree (CART) and Random Forest (RF), to detect malapps and to categorize benign apps. An alarm will be triggered if an app is identified as malicious. Otherwise, the benign app will be identified as a specific category. We evaluate the framework on a large app set consisting of 107,327 benign apps as well as 8,701 malapps. The experimental results show that our method achieves the accuracy of 99.39% in the detection of malapps and achieves the best accuracy of 82.93% in the categorization of benign apps.

  7. Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.

    Science.gov (United States)

    Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F

    2018-03-01

    Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histograms of oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cuts were prepared from each condyle, with images limited to the head of the mandibular condyle. To extract image features, we first use LBP and then the histogram of oriented gradients. To reduce dimensionality, Singular Value Decomposition (SVD) is applied to the feature-vector matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used the Receiver Operating Characteristic (ROC) to evaluate the hypothesis. The K nearest neighbor classifier achieves very good accuracy (0.9242), with desirable sensitivity (0.9470) and specificity (0.9015), whereas the other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN was the best classifier in our experiments for distinguishing patients from healthy individuals, with 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.
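
    The three figures reported above are related through the confusion matrix. The sketch below shows that relationship. The abstract does not give the confusion matrix, so the counts are an assumption: they were chosen to be consistent with the reported rates if each of the 132 + 132 joints is scored once.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # fraction of TMD joints detected
    specificity = tn / (tn + fp)               # fraction of normal joints cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts, not taken from the paper: 132 TMD joints, 132 normal joints.
acc, sens, spec = binary_metrics(tp=125, fn=7, tn=119, fp=13)
print(f"{acc:.4f} {sens:.4f} {spec:.4f}")  # prints "0.9242 0.9470 0.9015"
```

    With equal class sizes, accuracy is simply the mean of sensitivity and specificity, which is why 0.9242 sits midway between 0.9470 and 0.9015.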

  8. Hybrid RGSA and Support Vector Machine Framework for Three-Dimensional Magnetic Resonance Brain Tumor Classification

    Directory of Open Access Journals (Sweden)

    R. Rajesh Sharma

    2015-01-01

    algorithm (RGSA). Support vector machines, over backpropagation network, and k-nearest neighbor are used to evaluate the goodness of classifier approach. The preliminary evaluation of the system is performed using 320 real-time brain MRI images. The system is trained and tested by using a leave-one-case-out method. The performance of the classifier is tested using the receiver operating characteristic curve of 0.986 (±002). The experimental results demonstrate the systematic and efficient feature extraction and feature selection algorithm to the performance of state-of-the-art feature classification methods.
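
    A leave-one-out evaluation of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the paper's protocol: it assumes each sample is its own case, uses a plain 1-nearest-neighbor rule, and the feature vectors and class names are made up.

```python
import math

def nearest_label(query, samples):
    """Label of the sample closest to `query` (Euclidean distance)."""
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]

def leave_one_out_accuracy(samples):
    """Hold out each sample once, classify it from the rest, and average the hits."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]   # training set without sample i
        hits += nearest_label(features, rest) == label
    return hits / len(samples)

# Toy 2-D feature vectors with hypothetical class names.
toy = [((0.0, 0.1), "class_a"), ((0.2, 0.0), "class_a"), ((0.1, 0.3), "class_a"),
       ((2.0, 2.1), "class_b"), ((2.2, 1.9), "class_b"), ((1.9, 2.3), "class_b")]
print(leave_one_out_accuracy(toy))  # prints 1.0 for these well-separated clusters
```

    If one case contributes several images, all of them would have to be held out together to avoid leakage; that grouping is not shown here.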

  9. Design ensemble machine learning model for breast cancer diagnosis.

    Science.gov (United States)

    Hsieh, Sheau-Ling; Hsieh, Sung-Huai; Cheng, Po-Hsun; Chen, Chi-Huang; Hsu, Kai-Ping; Lee, I-Shun; Wang, Zhenyu; Lai, Feipei

    2012-10-01

    In this paper, we classify breast cancer from medical diagnostic data. Information gain has been adapted for feature selection. Neural fuzzy (NF), k-nearest neighbor (KNN), and quadratic classifier (QC) models, each as a single-model scheme and as an associated ensemble, have been developed for classification. In addition, a combined ensemble model with these three schemes has been constructed for further validation. The experimental results indicate that ensemble learning performs better than the individual single models. Moreover, the combined ensemble model achieves the highest breast cancer classification accuracy among all models.
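
    The abstract does not say how the combined ensemble merges the outputs of the three schemes. One common choice is majority voting, sketched below as an assumption. The function name and the example labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model labels by plurality; ties go to the label that appears first."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical outputs of the three single models for one patient record.
votes = {"NF": "malignant", "KNN": "benign", "QC": "malignant"}
print(majority_vote(list(votes.values())))  # prints "malignant"
```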

  10. IMPROVING NEAREST NEIGHBOUR SEARCH IN 3D SPATIAL ACCESS METHOD

    Directory of Open Access Journals (Sweden)

    A. Suhaibaha

    2016-10-01

    Nearest Neighbour (NN) is one of the important queries and analyses for spatial applications. In normal practice, a spatial access method structure is used during Nearest Neighbour query execution to retrieve information from the database. However, most spatial access method structures still face unresolved issues such as overlap among nodes and repetitive data entry. This leads to excessive Input/Output (IO) operations, which is inefficient for data retrieval. The situation becomes more critical when dealing with 3D data. 3D data is usually large due to its detailed geometry and other attached information. In this research, a clustered 3D hierarchical structure is introduced as a 3D spatial access method structure. The structure is expected to improve the retrieval of Nearest Neighbour information for 3D objects. Several tests are performed in answering Single Nearest Neighbour search and k Nearest Neighbour (kNN) search. The tests indicate that the clustered hierarchical structure is efficient in handling Nearest Neighbour queries compared to its competitor. From the results, the clustered hierarchical structure reduced repetitive data entry and the number of accessed pages. The proposed structure also produced minimal Input/Output operations. The query response time also outperformed that of the competitor. As a future outlook for this research, several possible applications are discussed and summarized.

  11. Next neighbors effect along the Ca-Sr-Ba-åkermanite join: Long-range vs. short-range structural features

    Science.gov (United States)

    Dondi, Michele; Ardit, Matteo; Cruciani, Giuseppe

    2013-06-01

    An original approach has been developed herein to explore the correlations between short- and long-range structural properties of solid solutions. X-ray diffraction (XRD) and electronic absorption spectroscopy (EAS) data were combined on a (Ca,Sr,Ba)2(Mg0.7Co0.3)Si2O7 join to determine average and local distances, respectively. Instead of varying the EAS-active ion concentration along the join, as has commonly been performed in previous studies, the constant replacement of Mg2+ by a minimal fraction of a similar size cation (Co2+) has been used to assess the effects of varying second-nearest neighbor cations (Ca, Sr, Ba) on the local distances of the first shell. A comparison between doped and un-doped series has shown that, although the overall symmetry of the Co-centered T1-site was retained, greater relaxation occurs at the CoO4 tetrahedra which become increasingly large and more distorted than the MgO4 tetrahedra. This is indicated by an increase in both the quadratic elongation (λT1) and the bond angle variance (σ2T1) distortion indices, as the whole structure expands due to an increase in size in the second-nearest neighbors. This behavior highlights the effect of the different electronic configurations of Co2+ (3d7) and Mg2+ (2p6) in spite of their very similar ionic size. Furthermore, although the overall symmetry of the Co-centered T1-site is retained, relatively limited (Co2+-O occur along the solid solution series and large changes are found in molar absorption coefficients showing that EAS Co2+-bands are highly sensitive to change in the local structure.

  12. Identifying influential neighbors in animal flocking.

    Directory of Open Access Journals (Sweden)

    Li Jiang

    2017-11-01

    Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.

  13. Identifying influential neighbors in animal flocking.

    Science.gov (United States)

    Jiang, Li; Giuggioli, Luca; Perna, Andrea; Escobedo, Ramón; Lecheval, Valentin; Sire, Clément; Han, Zhangang; Theraulaz, Guy

    2017-11-01

    Schools of fish and flocks of birds can move together in synchrony and decide on new directions of movement in a seamless way. This is possible because group members constantly share directional information with their neighbors. Although detecting the directionality of other group members is known to be important to maintain cohesion, it is not clear how many neighbors each individual can simultaneously track and pay attention to, and what the spatial distribution of these influential neighbors is. Here, we address these questions on shoals of Hemigrammus rhodostomus, a species of fish exhibiting strong schooling behavior. We adopt a data-driven analysis technique based on the study of short-term directional correlations to identify which neighbors have the strongest influence over the participation of an individual in a collective U-turn event. We find that fish mainly react to one or two neighbors at a time. Moreover, we find no correlation between the distance rank of a neighbor and its likelihood to be influential. We interpret our results in terms of fish allocating sequential and selective attention to their neighbors.

  14. The magnetic properties of a mixed spin-1/2 and spin-1 Heisenberg ferrimagnetic system on a two-dimensional square lattice

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Ai-Yuan, E-mail: huaiyuanhuyuanai@126.com [School of Physics and Electronic Engineering, Chongqing Normal University, Chongqing 401331 (China); Zhang, A.-Jie [Military Operational Research Teaching Division of the 4th Department, PLA Academy of National Defense Information, Wuhan 430000 (China)

    2016-02-01

    The magnetic properties of a mixed spin-1/2 and spin-1 Heisenberg ferrimagnetic system on a two-dimensional square lattice are investigated by means of the double-time Green's function technique within the random phase decoupling approximation. The role of the nearest-, next-nearest-neighbors interactions and the exchange anisotropy in the Hamiltonian is explored. And their effects on the critical and compensation temperature are discussed in detail. Our investigation indicates that both the next-nearest-neighbor interactions and the anisotropy have a great effect on the phase diagram. - Highlights: • Spin-1/2 and spin-1 ferrimagnetic model is examined. • Green's function technique is used. • The role of the nearest-, next-nearest-neighbors interactions and the exchange anisotropy in the Hamiltonian is explored. • The next-nearest-neighbor interactions and the anisotropy have a great effect on the phase diagram.

  15. Social aggregation in pea aphids: experiment and random walk modeling.

    Directory of Open Access Journals (Sweden)

    Christa Nilsen

    From bird flocks to fish schools and ungulate herds to insect swarms, social biological aggregations are found across the natural world. An ongoing challenge in the mathematical modeling of aggregations is to strengthen the connection between models and biological data by quantifying the rules that individuals follow. We model aggregation of the pea aphid, Acyrthosiphon pisum. Specifically, we conduct experiments to track the motion of aphids walking in a featureless circular arena in order to deduce individual-level rules. We observe that each aphid transitions stochastically between a moving and a stationary state. Moving aphids follow a correlated random walk. The probabilities of motion state transitions, as well as the random walk parameters, depend strongly on distance to an aphid's nearest neighbor. For large nearest neighbor distances, when an aphid is essentially isolated, its motion is ballistic with aphids moving faster, turning less, and being less likely to stop. In contrast, for short nearest neighbor distances, aphids move more slowly, turn more, and are more likely to become stationary; this behavior constitutes an aggregation mechanism. From the experimental data, we estimate the state transition probabilities and correlated random walk parameters as a function of nearest neighbor distance. With the individual-level model established, we assess whether it reproduces the macroscopic patterns of movement at the group level. To do so, we consider three distributions, namely distance to nearest neighbor, angle to nearest neighbor, and percentage of population moving at any given time. For each of these three distributions, we compare our experimental data to the output of numerical simulations of our nearest neighbor model, and of a control model in which aphids do not interact socially. Our stochastic, social nearest neighbor model reproduces salient features of the experimental data that are not captured by the control.
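
    The structure of such a model (a two-state process whose transition probabilities and random-walk parameters depend on nearest neighbor distance) can be sketched as below. Every number in the sketch is a hypothetical placeholder, not an estimate from the paper. A single distance threshold stands in for the paper's fitted functions of distance, and the arena boundary is ignored.

```python
import math
import random

def nearest_neighbor_distance(i, positions):
    """Distance from aphid i to the closest other aphid."""
    xi, yi = positions[i]
    return min(math.hypot(xi - x, yi - y)
               for j, (x, y) in enumerate(positions) if j != i)

def step(positions, headings, moving, rng):
    """Advance all aphids by one time step and return the new positions."""
    # Distances are taken from the positions at the start of the step.
    dists = [nearest_neighbor_distance(i, positions) for i in range(len(positions))]
    new_positions = list(positions)
    for i, d in enumerate(dists):
        isolated = d > 1.0                       # hypothetical distance threshold
        p_stop = 0.05 if isolated else 0.30      # hypothetical transition probabilities
        p_start = 0.30 if isolated else 0.10
        if moving[i]:
            moving[i] = rng.random() >= p_stop   # keep moving unless a stop is drawn
        else:
            moving[i] = rng.random() < p_start
        if moving[i]:
            # Correlated random walk: the new heading is the old one plus a random turn.
            headings[i] += rng.gauss(0.0, 0.1 if isolated else 0.6)
            speed = 0.20 if isolated else 0.05   # hypothetical step lengths
            x, y = positions[i]
            new_positions[i] = (x + speed * math.cos(headings[i]),
                                y + speed * math.sin(headings[i]))
    return new_positions

rng = random.Random(0)                           # fixed seed for repeatability
positions = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]
headings = [0.0, math.pi, math.pi / 2]
moving = [True, True, True]
for _ in range(100):
    positions = step(positions, headings, moving, rng)
print(positions)
```

    Isolated aphids take longer, straighter steps and rarely stop, while aphids near a neighbor slow down, turn more and stop more often, which is the aggregation mechanism the abstract describes.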

  16. Synthesis and electronic properties of chemically functionalized graphene on metal surfaces

    International Nuclear Information System (INIS)

    Grüneis, Alexander

    2013-01-01

    A review on the electronic properties, growth and functionalization of graphene on metals is presented. Starting from the derivation of the electronic properties of an isolated graphene layer using the nearest neighbor tight-binding (TB) approximation for π and σ electrons, the TB model is then extended to third-nearest neighbors and interlayer coupling. The latter is relevant to few-layer graphene and graphite. Next, the conditions under which epitaxial graphene can be obtained by chemical vapor deposition are reviewed with a particular emphasis on the Ni(111) surface. Regarding functionalization, I first discuss the intercalation of monolayer Au into the graphene/Ni(111) interface, which renders graphene quasi-free-standing. The Au intercalated quasi-free-standing graphene is then the basis for chemical functionalization. Functionalization of graphene is classified into covalent, ionic and substitutional functionalization. As archetypical examples for these three possibilities I discuss covalent functionalization by hydrogen, ionic functionalization by alkali metals and substitutional functionalization by nitrogen heteroatoms.

  17. MINIMIZING THE PREPARATION TIME OF A TUBES MACHINE: EXACT SOLUTION AND HEURISTICS

    Directory of Open Access Journals (Sweden)

    Robinson S.V. Hoto

    In this paper we optimize the preparation time of a tubes machine. Tubes are hard tubes made by gluing strips of paper that are packed in paper reels, and some of them may be reused between the production of one tube and another. We present a mathematical model for the minimization of changing reels and movements and also implementations for the heuristics Nearest Neighbor, an improvement of a nearest neighbor (Best Nearest Neighbor), refinements of the Best Nearest Neighbor heuristic and a heuristic of permutation called Best Configuration using the IDE (integrated development environment) WxDev C++. The results obtained by simulations improve the one used by the company.
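
    The plain Nearest Neighbor heuristic for a sequencing problem can be sketched as follows. The authors' implementation is in C++ and their cost model is not given in the abstract, so the setup-cost matrix and function names below are hypothetical. The Best Nearest Neighbor variants are not reproduced because the abstract does not define them.

```python
def nearest_neighbor_order(cost, start=0):
    """Greedy sequence: from the current job, always go to the cheapest unvisited job."""
    order = [start]
    unvisited = set(range(len(cost))) - {start}
    while unvisited:
        current = order[-1]
        # Break cost ties by the lower job index so the result is deterministic.
        nxt = min(unvisited, key=lambda j: (cost[current][j], j))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

def total_cost(cost, order):
    """Sum of changeover costs between consecutive jobs in the sequence."""
    return sum(cost[a][b] for a, b in zip(order, order[1:]))

# Hypothetical changeover costs between four tube types.
setup = [
    [0, 4, 1, 3],
    [4, 0, 2, 5],
    [1, 2, 0, 6],
    [3, 5, 6, 0],
]
order = nearest_neighbor_order(setup)
print(order, total_cost(setup, order))  # prints "[0, 2, 1, 3] 8"
```

    The greedy choice is cheap to compute but can end with an expensive final step (here the jump of cost 5 to job 3), which is the usual motivation for refinements.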

  18. Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

    DEFF Research Database (Denmark)

    Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan

    2009-01-01

    The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify t...... other previously applied intelligent approaches....

  19. Efficient Fingercode Classification

    Science.gov (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems e. g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and the potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.

  20. Autonomous target recognition using remotely sensed surface vibration measurements

    Science.gov (United States)

    Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.

    1993-09-01

    The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.

  1. Estimation and Mapping Forest Attributes Using “k Nearest Neighbor” Method on IRS-P6 LISS III Satellite Image Data

    Directory of Open Access Journals (Sweden)

    Amir Eslam Bonyad

    2015-06-01

    In this study, we explored the utility of the k Nearest Neighbor (kNN) algorithm to integrate IRS-P6 LISS III satellite imagery data and ground inventory data for application in forest attributes (DBH, tree height, volume, basal area, density and forest cover type) estimation and mapping. The ground inventory data was based on a systematic-random sampling grid comprising 408 circular sampling plots in a plantation in Guilan province, north of Iran. We concluded that the kNN method was a useful tool for mapping, with accuracy between 80% and 93.94%. Values of k between 5 and 8 seemed appropriate. The best distance metrics were found to be Euclidean, Fuzzy and Mahalanobis. Results showed that kNN was accurate enough for practical use in mapping forest areas.

  2. Quantitative diagnosis of bladder cancer by morphometric analysis of HE images

    Science.gov (United States)

    Wu, Binlin; Nebylitsa, Samantha V.; Mukherjee, Sushmita; Jain, Manu

    2015-02-01

    In clinical practice, histopathological analysis of biopsied tissue is the main method for bladder cancer diagnosis and prognosis. The diagnosis is performed by a pathologist based on the morphological features in the image of a hematoxylin and eosin (HE) stained tissue sample. This manuscript proposes algorithms to perform morphometric analysis on the HE images, quantify the features in the images, and discriminate bladder cancers with different grades, i.e. high grade and low grade. The nuclei are separated from the background and other types of cells such as red blood cells (RBCs) and immune cells using manual outlining, color deconvolution and image segmentation. A mask of nuclei is generated for each image for quantitative morphometric analysis. The features of the nuclei in the mask image including size, shape, orientation, and their spatial distributions are measured. To quantify local clustering and alignment of nuclei, we propose a 1-nearest-neighbor (1-NN) algorithm which measures nearest neighbor distance and nearest neighbor parallelism. The global distributions of the features are measured using statistics of the proposed parameters. A linear support vector machine (SVM) algorithm is used to classify the high grade and low grade bladder cancers. The results show using a particular group of nuclei such as large ones, and combining multiple parameters can achieve better discrimination. This study shows the proposed approach can potentially help expedite pathological diagnosis by triaging potentially suspicious biopsies.
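
    The two 1-NN measurements named above can be sketched as follows. The abstract does not define "parallelism", so this sketch assumes it is the absolute cosine of the angle between the major axes of a nucleus and its nearest neighbor. The tuple layout and example values are hypothetical.

```python
import math

def nearest_neighbor_features(nuclei):
    """For each nucleus (x, y, theta), return (distance, parallelism) to its nearest neighbor.

    Parallelism is assumed to be |cos(theta_i - theta_j)|:
    1 for parallel major axes, 0 for perpendicular ones.
    """
    features = []
    for i, (xi, yi, ti) in enumerate(nuclei):
        best = None
        for j, (xj, yj, tj) in enumerate(nuclei):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if best is None or d < best[0]:
                best = (d, abs(math.cos(ti - tj)))
        features.append(best)
    return features

# Hypothetical centroids (pixels) and orientations (radians) for three nuclei.
nuclei = [(0.0, 0.0, 0.0), (3.0, 4.0, math.pi / 2), (3.0, 5.0, math.pi / 2)]
for distance, parallelism in nearest_neighbor_features(nuclei):
    print(round(distance, 3), round(parallelism, 3))
```

    Summary statistics of these per-nucleus values (mean, spread) would then serve as global features for a classifier such as the linear SVM mentioned in the abstract.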

  3. Classification of Pulse Waveforms Using Edit Distance with Real Penalty

    Directory of Open Access Journals (Sweden)

    Zhang Dongyu

    2010-01-01

    Advances in sensor and signal processing techniques have provided effective tools for quantitative research in traditional Chinese pulse diagnosis (TCPD). Because of the inevitable intraclass variation of pulse patterns, the automatic classification of pulse waveforms has remained a difficult problem. In this paper, by referring to the edit distance with real penalty (ERP) and the recent progress in k-nearest neighbors (KNN) classifiers, we propose two novel ERP-based KNN classifiers. Taking advantage of the metric property of ERP, we first develop an ERP-induced inner product and a Gaussian ERP kernel, then embed them into difference-weighted KNN classifiers, and finally develop two novel classifiers for pulse waveform classification. The experimental results show that the proposed classifiers are effective for accurate classification of pulse waveforms.
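
    The underlying ERP distance can be computed with a standard dynamic program, sketched below for 1-D sequences with a gap reference value of 0. Only the distance is shown. The ERP-induced inner product, the Gaussian ERP kernel and the difference-weighted KNN classifiers from the paper are not reproduced, and the function name is an assumption.

```python
def erp_distance(x, y, gap=0.0):
    """Edit distance with real penalty between two numeric sequences.

    Aligning x[i] with y[j] costs |x[i] - y[j]|; leaving an element
    unmatched costs its absolute difference from the gap value.
    """
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):              # x[:i] against an empty sequence
        d[i][0] = d[i - 1][0] + abs(x[i - 1] - gap)
    for j in range(1, m + 1):              # empty sequence against y[:j]
        d[0][j] = d[0][j - 1] + abs(y[j - 1] - gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # match x[i-1] with y[j-1]
                d[i - 1][j] + abs(x[i - 1] - gap),           # x[i-1] aligned to a gap
                d[i][j - 1] + abs(y[j - 1] - gap),           # y[j-1] aligned to a gap
            )
    return d[n][m]

print(erp_distance([1, 2, 3], [1, 2, 3, 4]))  # prints 4.0
```

    Because the gap penalty is measured against a fixed reference rather than against a neighboring sample, ERP satisfies the triangle inequality, which is the metric property the abstract relies on.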

  4. Deep learning approach for classifying, detecting and predicting photometric redshifts of quasars in the Sloan Digital Sky Survey stripe 82

    Science.gov (United States)

    Pasquet-Itam, J.; Pasquet, J.

    2018-04-01

    We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN provides good results since its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover 175 new quasar candidates are found with the CNN considering a fixed recall of 0.97. The combination of probabilities given by the CNN and the random forest makes good performance even better with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN presents excellent results which are higher than those obtained with a feature extraction step and different classifiers (a K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN within |Δz| < 0.1 can reach 78.09%, within |Δz| < 0.2 reaches 86.15%, within |Δz| < 0.3 reaches 91.2% and the value of root mean square (rms) is 0.359. The performance of the KNN decreases for the three |Δz| regions, since within the accuracy of |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 is 73.72%, 82.46%, and 90.09% respectively, and the value of rms amounts to 0.395. So the CNN successfully reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for the future of big databases such as the Large Synoptic Survey Telescope. A table of the candidates is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A97

  5. Heart murmur detection based on wavelet transformation and a synergy between artificial neural network and modified neighbor annealing methods.

    Science.gov (United States)

    Eslamizadeh, Gholamhossein; Barati, Ramin

    2017-05-01

    Early recognition of heart disease plays a vital role in saving lives. Heart murmurs are one of the common heart problems. In this study, an Artificial Neural Network (ANN) is trained with Modified Neighbor Annealing (MNA) to classify heart cycles into normal and murmur classes. Heart cycles are separated from heart sounds using a wavelet transform. The network inputs are features extracted from individual heart cycles, and there are two classification outputs. The classification accuracy of the proposed model is compared with that of five multilayer perceptrons trained with the Levenberg-Marquardt, Extreme-learning-machine, back-propagation, simulated-annealing, and neighbor-annealing algorithms. It is also compared with a Self-Organizing Map (SOM) ANN. The proposed model is trained and tested using real heart sounds available in the Pascal database to show the applicability of the proposed scheme. In addition, a device to record real heart sounds has been developed and used for comparison purposes. Based on the results of this study, MNA can be used to produce considerable results as a heart cycle classifier. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Nonlocal synchronization in nearest neighbour coupled oscillators

    International Nuclear Information System (INIS)

    El-Nashar, H.F.; Elgazzar, A.S.; Cerdeira, H.A.

    2002-02-01

    We investigate a system of nearest neighbour coupled oscillators. We show that the nonlocal frequency synchronization that might appear in such a system occurs as a consequence of the nearest neighbour coupling. The power spectra of nonadjacent oscillators show that there is no complete coincidence between all frequency peaks of the oscillators in the nonlocal cluster, while the peaks for neighbouring oscillators approximately coincide even if they are not yet in a cluster. It is shown that nonadjacent oscillators closer in frequency share slow modes with their adjacent oscillators, which are neighbours in space. It is also shown that when a direct coupling between non-neighbouring oscillators is introduced explicitly, the peaks of the frequency spectra of those non-neighbours coincide. (author)

  7. Performance of svm, k-nn and nbc classifiers for text-independent speaker identification with and without modelling through merging models

    Directory of Open Access Journals (Sweden)

    Yussouf Nahayo

    2016-04-01

    This paper proposes some methods of robust text-independent speaker identification based on the Gaussian Mixture Model (GMM). We implemented a combination of the GMM with a set of classifiers such as Support Vector Machine (SVM), K-Nearest Neighbour (K-NN), and Naive Bayes Classifier (NBC). In order to improve the identification rate, we developed a combination of hybrid systems by using a validation technique. The experiments were performed on the dialect DR1 of the TIMIT corpus. The results have shown better performance for the developed technique than for the individual techniques.

  8. PENERAPAN TEKNIK BAGGING PADA ALGORITMA KLASIFIKASI UNTUK MENGATASI KETIDAKSEIMBANGAN KELAS DATASET MEDIS

    Directory of Open Access Journals (Sweden)

    Rizki Tri Prasetio

    2016-03-01

    ABSTRACT – Class imbalance problems have been reported to severely hinder the classification performance of many standard learning algorithms and have attracted a great deal of attention from researchers in different fields. Therefore, a number of methods, such as sampling methods, cost-sensitive learning methods, and bagging- and boosting-based ensemble methods, have been proposed to solve these problems. Some medical datasets have two classes (binomial) and suffer from an imbalance that reduces classification accuracy. This research proposes combining the bagging technique with classification algorithms to improve accuracy on medical datasets. The bagging technique is used to address the class imbalance problem. The proposed method is applied to three classification algorithms, i.e., naïve Bayes, decision tree and k-nearest neighbor. This research uses five medical datasets obtained from UCI Machine Learning, i.e., breast-cancer, liver-disorder, heart-disease, pima-diabetes and vertebral-column. The results indicate that the proposed method yields a significant improvement for two classification algorithms, i.e., decision tree (t-test p value 0.0184) and k-nearest neighbor (t-test p value 0.0292), but not for naïve Bayes (t-test p value 0.9236). After the bagging technique is applied to the five medical datasets, naïve Bayes has the highest accuracy for the breast-cancer dataset (96.14%, AUC 0.984), heart-disease (84.44%, AUC 0.911) and pima-diabetes (74.73%, AUC 0.806), while k-nearest neighbor has the best accuracy for liver-disorder (62.03%, AUC 0.632) and vertebral-column (82.26%, AUC 0.867). Keywords: ensemble technique, bagging, imbalanced class, medical dataset.
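
    Bagging around a k-nearest neighbor base learner can be sketched as follows: each model sees a bootstrap resample of the training set and the final label is the majority vote. This is a generic illustration with made-up data and hypothetical function names, not the configuration used in the study.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training samples closest to `query`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def bagged_knn_predict(train, query, n_models=25, k=3, seed=0):
    """Vote over k-NN models, each fitted on a bootstrap resample of `train`."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        bootstrap = [rng.choice(train) for _ in train]   # sample with replacement
        votes.append(knn_predict(bootstrap, query, k))
    return Counter(votes).most_common(1)[0][0]

# Made-up, imbalanced two-class data: eight "negative" and three "positive" samples.
train = [((0.1 * i, 0.05 * i), "negative") for i in range(8)]
train += [((3.0, 3.0), "positive"), ((3.2, 2.9), "positive"), ((2.9, 3.3), "positive")]
print(bagged_knn_predict(train, (3.1, 3.0)))
```

    The seed is fixed so that repeated calls give the same vote; in practice the number of models and k would be tuned on validation data.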

  9. Recrafting the neighbor-joining method

    Directory of Open Access Journals (Sweden)

    Pedersen Christian NS

    2006-01-01

    Background: The neighbor-joining method by Saitou and Nei is a widely used method for constructing phylogenetic trees. The formulation of the method gives rise to a canonical Θ(n^3) algorithm upon which all existing implementations are based. Results: In this paper we present techniques for speeding up the canonical neighbor-joining method. Our algorithms construct the same phylogenetic trees as the canonical neighbor-joining method. The best-case running time of our algorithms is O(n^2), but the worst case remains O(n^3). We empirically evaluate the performance of our algorithms on distance matrices obtained from the Pfam collection of alignments. The experiments indicate that the running time of our algorithms evolves as Θ(n^2) on the examined instance collection. We also compare the running time with that of the QuickTree tool, a widely used efficient implementation of the canonical neighbor-joining method. Conclusion: The experiments show that our algorithms also yield a significant speed-up, already for medium-sized instances.
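
    The cubic cost of the canonical method comes from its pair-selection step, sketched below. Each iteration scans all pairs for the minimum of Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of row i of the distance matrix. The scan is quadratic and is repeated a linear number of times as taxa are joined. Only the selection step is shown, not the paper's speed-up techniques. The function name and the small example matrix are illustrative.

```python
def select_pair(dist):
    """Return ((i, j), q) for the pair minimizing the neighbor-joining Q criterion."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]      # r_i, computed once per iteration
    best_pair, best_q = None, float("inf")
    for i in range(n):                         # full scan over all pairs: O(n^2)
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q

# Illustrative symmetric distance matrix for five taxa.
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(select_pair(dist))  # prints "((0, 1), -50)"
```

    Note that the pair with the smallest raw distance, (3, 4), is not the one selected: the row-sum correction favors pairs that are also far from everything else.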

  10. Technique for fast and efficient hierarchical clustering

    Science.gov (United States)

    Stork, Christopher

    2013-10-08

    A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.

  11. Convolutional Neural Networks as Feature Extractors for Data Scarce Visual Searches

    Science.gov (United States)

    2016-09-01

    created with the t-SNE technique, the fully connected layers FC6 and FC7 better represent the features’ compactness in high dimensional space. In the ... This topology is called Fully Connected (FC) layers, and it is shown in Figure 2.2. A CNN model consists of several combinations of CONV and FC ... to generate a new representation of the images. These representations are classified with K-Nearest Neighbors within a target space that has just a few

  12. Atomistic simulation of the point defects in B2-type MoTa alloy

    International Nuclear Information System (INIS)

    Zhang Jianmin; Wang Fang; Xu Kewei; Ji, Vincent

    2009-01-01

    The formation and migration mechanisms of three different point defects (mono-vacancy, anti-site defect and interstitial atom) in B2-type MoTa alloy have been investigated by combining molecular dynamics (MD) simulation with the modified analytic embedded-atom method (MAEAM). From minimization of the formation energy, we find that the anti-site defects Mo_Ta and Ta_Mo are easier to form than Mo and Ta mono-vacancies, while Mo and Ta interstitial atoms are difficult to form in the alloy. Among six migration mechanisms of Mo and Ta mono-vacancies, one nearest-neighbor jump (1NNJ) is the most favorable due to its lowest activation and migration energies, but it will cause disorder in the alloy. One next-nearest-neighbor jump (1NNNJ) and one third-nearest-neighbor jump (1TNNJ) can maintain the ordered property of the alloy but require higher activation and migration energies, so the 1NNNJ and 1TNNJ should be replaced by straight [1 0 0] six nearest-neighbor cyclic jumps (S[1 0 0]6NNCJ) or bent [1 0 0] six nearest-neighbor cyclic jumps (B[1 0 0]6NNCJ) and [1 1 0] six nearest-neighbor cyclic jumps ([1 1 0]6NNCJ), respectively. Although the migrations of Mo and Ta interstitial atoms need much lower energy than those of Mo and Ta mono-vacancies, they are not the main migration mechanisms because these interstitials are difficult to form in the alloy.

  13. Road Short-Term Travel Time Prediction Method Based on Flow Spatial Distribution and the Relations

    Directory of Open Access Journals (Sweden)

    Mingjun Deng

    2016-01-01

    Full Text Available Many short-term road travel time forecasting studies are based on time series alone, but road travel time depends not only on the historical travel time series but also on the historical flow of the road and its adjacent sections. Few studies have taken this into account. This paper exploits the correlation between the spatial distribution of flow and the road travel time series, applying nearest neighbor and nonparametric regression methods to build a forecasting model. For the spatial nearest neighbor search, three different spatial distances are defined. In addition, two forecasting functions are introduced: one combines the neighbors' values with equal (mean) weights, and the other uses the reciprocal of each nearest neighbor's distance as its weight. Each of the three distances is used in the nearest neighbor search and paired with both forecasting functions. Nearest neighbor and nonparametric regression are also applied to the travel time series. The combination model is then established by minimizing the forecast error variance. The empirical results show that the combination model clearly improves forecast performance. In addition, an evaluation of computational complexity shows that the proposed method can satisfy the real-time requirement.
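
    The two forecasting functions mentioned in this record (mean weighting and reciprocal-distance weighting of the nearest neighbors) can be sketched as follows. The Euclidean distance, the one-dimensional "flow" feature, and the sample values are assumptions for illustration; the three spatial distances defined in the paper are not reproduced here.

```python
import math

def knn_forecast(history, query, k=3, weighting="inverse"):
    """Forecast a travel time from the k nearest historical states.

    history: list of (feature_vector, travel_time) pairs.
    weighting: "mean" for equal weights, "inverse" for 1/distance weights.
    """
    ranked = sorted((math.dist(x, query), y) for x, y in history)[:k]
    if weighting == "mean":
        return sum(y for _, y in ranked) / len(ranked)
    # An exact match has zero distance; return its value to avoid dividing by 0.
    for dist, y in ranked:
        if dist == 0.0:
            return y
    weights = [1.0 / dist for dist, _ in ranked]
    return sum(w * y for w, (_, y) in zip(weights, ranked)) / sum(weights)

# Hypothetical history: (flow feature,) -> travel time in seconds.
history = [((1.0,), 10.0), ((2.0,), 20.0), ((4.0,), 40.0), ((10.0,), 100.0)]

print(knn_forecast(history, (2.5,), k=2, weighting="mean"))     # 15.0
print(knn_forecast(history, (2.5,), k=2, weighting="inverse"))  # 17.5
```

    With the reciprocal weights the closer neighbor (distance 0.5) counts three times as much as the farther one (distance 1.5), which pulls the forecast from 15.0 toward 20.0.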

  14. Comparison of classification methods for voxel-based prediction of acute ischemic stroke outcome following intra-arterial intervention

    Science.gov (United States)

    Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.

    2017-03-01

    Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classifiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-one-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
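
    The evaluation described in this record rests on two simple ingredients, sketched below under stated assumptions: the Dice coefficient between a predicted and a known binary lesion mask, and a leave-one-patient-out split in which all voxels of one patient are held out at a time. The flat-list mask representation and the function names are illustrative choices, not the authors' code.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total

def leave_one_patient_out(patient_ids):
    """Yield (held_out, train_indices, test_indices) once per patient."""
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test

# Hypothetical masks: two overlapping voxels, three positives in each mask.
print(dice_coefficient([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))

# Hypothetical voxel-to-patient assignment.
for fold in leave_one_patient_out(["a", "a", "b"]):
    print(fold)
```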

  15. Model of directed lines for square ice with second-neighbor and third-neighbor interactions

    Science.gov (United States)

    Kirov, Mikhail V.

    2018-02-01

    The investigation of the properties of nanoconfined systems is one of the most rapidly developing scientific fields. Recently it has been established that a water monolayer between two graphene sheets forms square ice. Because of the energetic disadvantage, there are no longitudinally arranged molecules in the structure of the square ice. As a result, the structure is formed only by unidirectional straight lines of hydrogen bonds. A simple but accurate discrete model of square ice with second-neighbor and third-neighbor interactions is proposed. According to this model, the ground state includes all configurations which do not contain three neighboring unidirectional chains of hydrogen bonds. Each triplet increases the energy by the same value. This new model differs from an analogous model with long-range interactions where in the ground state all neighboring chains are antiparallel. The new model is suitable for the corresponding system of point electric (and magnetic) dipoles on the square lattice. It allows the different contributions to the total binding energy to be estimated separately and helps to understand the properties of infinite monolayers and finite nanostructures. Calculations of the binding energy for square ice and for the point dipole system are performed using the packages TINKER and LAMMPS.

  16. A local non-parametric model for trade sign inference

    Science.gov (United States)

    Blazejewski, Adam; Coggins, Richard

    2005-03-01

    We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.

  17. Supervised Classification of Agricultural Land Cover Using a Modified k-NN Technique (MNN) and Landsat Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Karsten Schulz

    2009-11-01

    Full Text Available Nearest neighbor techniques are commonly used in remote sensing, pattern recognition and statistics to classify objects into a predefined number of categories based on a given set of predictors. These techniques are especially useful for highly nonlinear relationships between the variables. In most studies the distance measure is adopted a priori. In contrast, we propose a general procedure to find an adaptive metric that combines a local variance reducing technique and a linear embedding of the observation space into an appropriate Euclidean space. To illustrate the application of this technique, two agricultural land cover classifications using mono-temporal and multi-temporal Landsat scenes are presented. The results of the study, compared with standard approaches used in remote sensing such as maximum likelihood (ML) or k-Nearest Neighbor (k-NN), indicate substantial improvement with regard to the overall accuracy and the cardinality of the calibration data set. Also, using MNN in a soft/fuzzy classification framework proved to be a very useful tool for deriving critical areas that need further attention and investment in additional calibration data.

  18. Air Pollution from Livestock Farms Is Associated with Airway Obstruction in Neighboring Residents.

    Science.gov (United States)

    Borlée, Floor; Yzermans, C Joris; Aalders, Bernadette; Rooijackers, Jos; Krop, Esmeralda; Maassen, Catharina B M; Schellevis, François; Brunekreef, Bert; Heederik, Dick; Smit, Lidwien A M

    2017-11-01

    Livestock farm emissions may not only affect respiratory health of farmers but also of neighboring residents. To explore associations between spatial and temporal variation in pollutant emissions from livestock farms and lung function in a general, nonfarming, rural population in the Netherlands. We conducted a cross-sectional study in 2,308 adults (age, 20-72 yr). A pulmonary function test was performed measuring prebronchodilator and post-bronchodilator FEV1, FVC, FEV1/FVC, and maximum mid-expiratory flow (MMEF). Spatial exposure was assessed as (1) number of farms within 500 m and 1,000 m of the home, (2) distance to the nearest farm, and (3) modeled annual average fine dust emissions from farms within 500 m and 1,000 m of the home address. Temporal exposure was assessed as week-average ambient particulate matter livestock farms within a 1,000-m buffer from the home address and MMEF, which was more pronounced in participants without atopy. No associations were found with other spatial exposure variables. Week-average particulate matter livestock air pollution emissions are associated with lung function deficits in nonfarming residents.

  19. Equilibrium structure of monatomic steps on vicinal Si(001)

    NARCIS (Netherlands)

    Zandvliet, Henricus J.W.; Elswijk, H.B.; van Loenen, E.J.; Dijkkamp, D.

    1992-01-01

    The equilibrium structure of monatomic steps on vicinal Si(001) is described in terms of anisotropic nearest-neighbor and isotropic second-nearest-neighbor interactions between dimers. By comparing scanning-tunneling-microscopy data and this equilibrium structure, we obtained interaction energies of

  20. A swarm-trained k-nearest prototypes adaptive classifier with automatic feature selection for interval data.

    Science.gov (United States)

    Silva Filho, Telmo M; Souza, Renata M C R; Prudêncio, Ricardo B C

    2016-08-01

    Some complex data types are capable of modeling data variability and imprecision. These data types are studied in the symbolic data analysis field. One such data type is interval data, which represents ranges of values and is more versatile than classic point data for many domains. This paper proposes a new prototype-based classifier for interval data, trained by a swarm optimization method. Our work has two main contributions: a swarm method which is capable of performing both automatic selection of features and pruning of unused prototypes and a generalized weighted squared Euclidean distance for interval data. By discarding unnecessary features and prototypes, the proposed algorithm deals with typical limitations of prototype-based methods, such as the problem of prototype initialization. The proposed distance is useful for learning classes in interval datasets with different shapes, sizes and structures. When compared to other prototype-based methods, the proposed method achieves lower error rates in both synthetic and real interval datasets.

  1. SpaceTwist

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Jensen, Christian Søndergaard; Xuegang, Huang

    2008-01-01

    -based matching generally fall short in offering practical query accuracy guarantees. Our proposed framework, called SpaceTwist, rectifies these shortcomings for k nearest neighbor (kNN) queries. Starting with a location different from the user's actual location, nearest neighbors are retrieved incrementally...

  2. Machine Learning Methods for Production Cases Analysis

    Science.gov (United States)

    Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.

    2018-03-01

    An approach to the analysis of events occurring during the production process is proposed. The described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing of the applied models. k-Nearest Neighbors and Random Forest methods were used to illustrate and analyze the proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics, such as precision, recall and accuracy.
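
    The three metrics named in this record can be computed directly from the counts of true and false positives and negatives. The sketch below is a minimal version for a binary task; the label encoding (1 for a hazardous event) and the sample vectors are assumptions for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of alarms that were real
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of real events detected
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy

# Hypothetical labels: 1 = hazardous event, 0 = normal operation.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
print(classification_metrics(y_true, y_pred))  # precision 0.5, recall 2/3, accuracy 0.625
```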

  3. THE SLOAN DIGITAL SKY SURVEY CO-ADD: A GALAXY PHOTOMETRIC REDSHIFT CATALOG

    International Nuclear Information System (INIS)

    Reis, Ribamar R. R.; Soares-Santos, Marcelle; Annis, James; Dodelson, Scott; Hao Jiangang; Johnston, David; Kubo, Jeffrey; Lin Huan; Seo, Hee-Jong; Simet, Melanie

    2012-01-01

    We present and describe a catalog of galaxy photometric redshifts (photo-z) for the Sloan Digital Sky Survey (SDSS) Co-add Data. We use the artificial neural network (ANN) technique to calculate the photo-z and the nearest neighbor error method to estimate photo-z errors for ∼13 million objects classified as galaxies in the co-add with r 68 = 0.031. After presenting our results and quality tests, we provide a short guide for users accessing the public data.

  4. New Results on the Nearest OB Association: Sco-Cen (Sco OB2)

    Science.gov (United States)

    Mamajek, Eric E.

    2013-01-01

    The Scorpius-Centaurus OB association (Sco OB2) is the nearest site of recent massive star formation to the Sun. The primary stellar groups in the Sco-Cen complex (including OB subgroups Upper Sco, Upper Cen Lup, and Lower Cen Cru, the neighboring molecular cloud complexes Lup, Cha, CrA, Oph, and dispersed young groups Eta Cha, Epsilon Cha, TW Hya, and Beta Pic) have been participants in a complex episode of stellar birth (and some stellar death) over the past ~20 Myr. Here I summarize some recent results on the Sco-Cen complex from the U. Rochester group: (1) isochronal analysis of the HR diagram positions for >1 Msun stars in the Upper Scorpius subgroup shows it to be twice as old as previously thought (11 Myr vs. 5 Myr), (2) analysis of high resolution optical echelle spectra shows that the subgroups are approximately solar in composition, (3) surveys for lower mass members are showing that the complex has more substructure than previously recognized, including at least one new subgroup ("Lower Sco"), and the velocity and age data for the nearest OB subgroup Lower Cen Cru argue for a bifurcation into a younger (10 Myr) southern part ("Crux") and an older (20 Myr) northern part ("Lower Centaurus"), (4) an eclipsing, multi-ring dust disk system was serendipitously discovered in the SuperWASP and ASAS light curve for the newly discovered K5-type Sco-Cen member 1SWASP J140747.93-394542.6. With regard to some recent results by other investigators, we find that (1) attempts by some authors to subsume the Sco-Cen subgroups into a single sample of a single age are unnecessarily mixing samples with a wide range in ages, and (2) I have been unable to replicate the expansion age determinations claimed by some investigators for the TW Hya and Beta Pic groups (purported to have expansion ages of 8 and 12 Myr, respectively), which have been used by some investigators to independently age-date the Sco-Cen subgroups.
We acknowledge support from NSF grant AST-1008908 and the

  5. Landscape object-based analysis of wetland plant functional types: the effects of spatial scale, vegetation classes and classifier methods

    Science.gov (United States)

    Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.

    2011-12-01

    Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and a Ramsar wetland conservation site, from a 30-m Landsat TM scene at the peak of the spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in the system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes of 900-9000 m^2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and a more detailed 6-class set. We found that classification benefited from the object-based approach, which allowed object shape, texture and context descriptors to be included in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of the Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy.
The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because

  6. Recrafting the Neighbor-Joining Method

    DEFF Research Database (Denmark)

    Mailund; Brodal, Gerth Stølting; Fagerberg, Rolf

    2006-01-01

    Background: The neighbor-joining method by Saitou and Nei is a widely used method for constructing phylogenetic trees. The formulation of the method gives rise to a canonical Θ(n^3) algorithm upon which all existing implementations are based. Methods: In this paper we present techniques for speeding...... up the canonical neighbor-joining method. Our algorithms construct the same phylogenetic trees as the canonical neighbor-joining method. The best-case running time of our algorithms is O(n^2), but the worst case remains O(n^3). We empirically evaluate the performance of our algorithms on distance...... matrices obtained from the Pfam collection of alignments. Results: The experiments indicate that the running time of our algorithms evolves as Θ(n^2) on the examined instance collection. We also compare the running time with that of the QuickTree tool, a widely used efficient implementation of the canonical...

  7. Energetics and Dynamics of Cu(001)-c(2x2)Cl steps

    NARCIS (Netherlands)

    van Dijk, F.R.; Zandvliet, Henricus J.W.; Poelsema, Bene

    2006-01-01

    The energetics of the step faceting transition of Cu(001) [copper (001) surface] upon Cl (chloride) adsorption in contact with HCl (hydrogen chloride) solution is modeled in terms of a solid-on-solid model that incorporates both nearest-neighbor and next-nearest-neighbor interactions. It is shown

  8. Fast Demand Forecast of Electric Vehicle Charging Stations for Cell Phone Application

    Energy Technology Data Exchange (ETDEWEB)

    Majidpour, Mostafa; Qiu, Charlie; Chung, Ching-Yen; Chu, Peter; Gadh, Rajit; Pota, Hemanshu R.

    2014-07-31

    This paper describes the core cellphone application algorithm which has been implemented for the prediction of energy consumption at Electric Vehicle (EV) Charging Stations at UCLA. For this interactive user application, the total time of accessing the database, processing the data and making the prediction needs to be within a few seconds. We analyze four relatively fast Machine Learning based time series prediction algorithms for our prediction engine: Historical Average, k-Nearest Neighbor, Weighted k-Nearest Neighbor, and Lazy Learning. The Nearest Neighbor algorithm (k-Nearest Neighbor with k=1) shows better performance and is selected to be the prediction algorithm implemented for the cellphone application. Two applications have been designed on top of the prediction algorithm: one predicts the expected available energy at the station and the other one predicts the expected charging finishing time. The total time, including accessing the database, data processing, and prediction, is about one second for both applications.

  9. Magneto-structural correlations in trinuclear Cu(II) complexes: a density functional study

    CERN Document Server

    Rodríguez-Forteá, A; Alvarez, S; Centre-De Recera-En-Quimica-Teorica; Alemany, P A; Centre-De Recera-En-Quimica-Teorica

    2003-01-01

    Density functional theoretical methods have been used to study magneto-structural correlations for linear trinuclear hydroxo-bridged copper(II) complexes. The nearest-neighbor exchange coupling constant shows very similar trends to those found earlier for dinuclear compounds for which the Cu-O-Cu angle and the out of plane displacement of the hydrogen atoms at the bridge are the two key structural factors that determine the nature of their magnetic behavior. Changes in these two parameters can induce variations of over 1000 cm^-1 in the value of the nearest-neighbor coupling constant. On the contrary, coupling between next-nearest neighbors is found to be practically independent of structural changes with a value for the coupling constant of about -60 cm^-1. The magnitude calculated for this coupling constant indicates that considering its value to be negligible, as usually done in experimental studies, can lead to considerable errors, especially for compounds in which the nearest-neighbor c...

  10. Fall Detection Using Smartphone Audio Features.

    Science.gov (United States)

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers, k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN), are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.

  11. Cost-Effectiveness of Seven Approaches to Map Vegetation Communities — A Case Study from Northern Australia’s Tropical Savannas

    Directory of Open Access Journals (Sweden)

    Stuart Phinn

    2013-01-01

    Full Text Available Vegetation communities are traditionally mapped from aerial photography interpretation. Other semi-automated methods include pixel- and object-based image analysis. While these methods have been used for decades, there is a lack of comparative research. We evaluated the cost-effectiveness of seven approaches to map vegetation communities in a tropical savanna environment in northern Australia. The seven approaches included: (1) aerial photography interpretation, (2) pixel-based image-only classification (Maximum Likelihood Classifier), (3) pixel-based integrated classification (Maximum Likelihood Classifier), (4) object-based image-only classification (nearest neighbor classifier), (5) object-based integrated classification (nearest neighbor classifier), (6) object-based image-only classification (step-wise ruleset), and (7) object-based integrated classification (step-wise ruleset). Approach 1 was applied to 1:50,000 aerial photography and approaches 2–7 were applied to SPOT5 and Landsat5 TM multispectral data. The integrated approaches (3, 5 and 7) included ancillary data (a digital elevation model, slope model, normalized difference vegetation index and hydrology information). The cost-effectiveness was assessed taking into consideration the accuracy and costs associated with each classification approach and image dataset. Accuracy was assessed in terms of overall accuracy and the costs were evaluated using four main components: field data acquisition and preparation, image data acquisition and preparation, image classification and accuracy assessment. Overall accuracy ranged from 28%, for the image-only pixel-based approach, to 67% for the aerial photography interpretation, while total costs ranged from AU$338,000 to AU$388,180 (Australian dollars), for the pixel-based image-only classification and aerial photography interpretation respectively. The most labor-intensive component was field data acquisition and preparation, followed by image data

  12. Interacting-fermion approximation in the two-dimensional ANNNI model

    International Nuclear Information System (INIS)

    Grynberg, M.D.; Ceva, H.

    1990-12-01

    We investigate the effect of including domain-wall interactions in the two-dimensional axial next-nearest-neighbor Ising (ANNNI) model. At low temperatures this problem is reduced to a one-dimensional system of interacting fermions which can be treated exactly. It is found that the critical boundaries of the low-temperature phases are in good agreement with those obtained using a free-fermion approximation. In contrast with the monotonic behavior derived from the free-fermion approach, the wall density or wave number displays reentrant phenomena when the ratio of the next-nearest-neighbor and nearest-neighbor interactions is greater than one-half. (author). 17 refs, 2 figs

  13. A system for tracking and recognizing pedestrian faces using a network of loosely coupled cameras

    Science.gov (United States)

    Gagnon, L.; Laliberté, F.; Foucher, S.; Branzan Albu, A.; Laurendeau, D.

    2006-05-01

    A face recognition module has been developed for an intelligent multi-camera video surveillance system. The module can recognize a pedestrian face in terms of six basic emotions and the neutral state. Face and facial features detection (eyes, nasal root, nose and mouth) are first performed using cascades of boosted classifiers. These features are used to normalize the pose and dimension of the face image. Gabor filters are then sampled on a regular grid covering the face image to build a facial feature vector that feeds a nearest neighbor classifier with a cosine distance similarity measure for facial expression interpretation and face model construction. A graphical user interface allows the user to adjust the module parameters.
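
    The final classification stage described in this record, a nearest neighbor classifier with a cosine distance measure, can be sketched in a few lines. The three-element vectors stand in for the Gabor-based facial feature vectors and, like the labels and function names, are assumptions for illustration; the detection and normalization stages are not shown.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); insensitive to the vectors' overall scale."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm

def nearest_neighbor_label(gallery, query):
    """Return the label of the gallery vector closest to query in cosine distance."""
    return min(gallery, key=lambda item: cosine_distance(item[0], query))[1]

# Hypothetical gallery of (feature vector, expression label) pairs.
gallery = [
    ((1.0, 0.0, 0.2), "neutral"),
    ((0.1, 1.0, 0.0), "happy"),
    ((0.0, 0.2, 1.0), "surprise"),
]
print(nearest_neighbor_label(gallery, (0.2, 2.0, 0.1)))  # happy
```

    Because the cosine measure ignores vector length, a feature vector scaled by a uniform brightness or contrast factor keeps the same nearest neighbor.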

  14. An efficient architecture for LVQ-SLM for PAPR reduction

    International Nuclear Information System (INIS)

    Khalid, S.; Yasin, M.

    2010-01-01

    In this paper we propose an efficient architecture for the implementation of an LVQ (Learning Vector Quantization) NN (Neural Network), used as a classifier, for PAPR (Peak to Average Power Ratio) reduction. A special feature of the implementation is a combinatorial module for nearest neighbor search that allows online execution of this important operation during classification. The LVQ classifier is programmed in Verilog and the entire circuit is synthesized on FPGAs (Field Programmable Gate Arrays) using Xilinx ISE (Integrated Software Environment) 8.1i. The model is implemented with 64 subcarriers, using the parameter values of the WLAN standard IEEE 802.11a. Using the architecture, efficient on-line classification is achieved. (author)

  15. Automatic music genres classification as a pattern recognition problem

    Science.gov (United States)

    Ul Haq, Ihtisham; Khan, Fauzia; Sharif, Sana; Shaukat, Arsalan

    2013-12-01

    Music genres are the simplest and most effective descriptors for searching music libraries, stores or catalogues. The paper compares the results of two automatic music genre classification systems implemented using two different yet simple classifiers (K-Nearest Neighbor and Naïve Bayes). First a 10-12 second sample is selected and features are extracted from it; then, based on those features, the results of both classifiers are presented as an accuracy table and a confusion matrix. An experiment carried out on 60 test samples suggests that a sample taken from the middle of a song represents the true essence of its genre better than samples taken from the beginning or end of the song. The novel techniques achieved accuracies of 91% and 78% using the Naïve Bayes and KNN classifiers respectively.

  16. Distribution of Steps with Finite-Range Interactions: Analytic Approximations and Numerical Results

    Science.gov (United States)

    González, Diego Luis; Jaramillo, Diego Felipe; Téllez, Gabriel; Einstein, T. L.

    2013-03-01

    While most Monte Carlo simulations assume only nearest-neighbor steps interact elastically, most analytic frameworks (especially the generalized Wigner distribution) posit that each step elastically repels all others. In addition to the elastic repulsions, we allow for possible surface-state-mediated interactions. We investigate analytically and numerically how next-nearest neighbor (NNN) interactions and, more generally, interactions out to q'th nearest neighbor alter the form of the terrace-width distribution and of pair correlation functions (i.e., the sum over n'th neighbor distribution functions, which we investigated recently [2]). For physically plausible interactions, we find modest changes when NNN interactions are included and generally negligible changes when more distant interactions are allowed. We discuss methods for extracting from simulated experimental data the characteristic scale-setting terms in assumed potential forms.

  17. Solitary wave for a nonintegrable discrete nonlinear Schrödinger equation in nonlinear optical waveguide arrays

    Science.gov (United States)

    Ma, Li-Yuan; Ji, Jia-Liang; Xu, Zong-Wei; Zhu, Zuo-Nong

    2018-03-01

    We study a nonintegrable discrete nonlinear Schrödinger (dNLS) equation with a nonlinear nearest-neighbor interaction term, which occurs in nonlinear optical waveguide arrays. By using the discrete Fourier transformation, we obtain numerical approximations of stationary and travelling solitary wave solutions of the nonintegrable dNLS equation. The stability of the stationary solitary waves is analyzed. It is shown that the nonlinear nearest-neighbor interaction term has great influence on the form of the solitary wave. The shape of the solitary wave is important for electric field propagation. If we neglect the nonlinear nearest-neighbor interaction term, much important information about the electric field propagation may be missed. Our numerical simulation also demonstrates the difference in chaos phenomena between the nonintegrable dNLS equation with nonlinear nearest-neighbor interaction and another nonintegrable dNLS equation without the term. Project supported by the National Natural Science Foundation of China (Grant Nos. 11671255 and 11701510), the Ministry of Economy and Competitiveness of Spain (Grant No. MTM2016-80276-P (AEI/FEDER, EU)), and the China Postdoctoral Science Foundation (Grant No. 2017M621964).

  18. Exact Cross-Validation for kNN and applications to passive and active learning in classification

    OpenAIRE

    Célisse, Alain; Mary-Huard, Tristan

    2011-01-01

    In the binary classification framework, a closed form expression of the cross-validation Leave-p-Out (LpO) risk estimator for the k Nearest Neighbor algorithm (kNN) is derived. It is first used to study the LpO risk minimization strategy for choosing k in the passive learning setting. The impact of p on the choice of k and the LpO estimation of the risk are inferred. In the active learning setting, a procedure is proposed that selects new examples using a LpO committee of kNN classifiers. The...

  19. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    Directory of Open Access Journals (Sweden)

    Nisrine Jrad

    2009-01-01

    rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers.

  20. Handwritten Digit Recognition using Edit Distance-Based KNN

    OpenAIRE

    Bernard, Marc; Fromont, Elisa; Habrard, Amaury; Sebban, Marc

    2012-01-01

    We discuss the student project given for the last 5 years to the 1st year Master students who follow the Machine Learning lecture at the University Jean Monnet in Saint Etienne, France. The goal of this project is to develop a GUI that can recognize digits and/or letters drawn manually. The system is based on a string representation of the digits using Freeman codes and on the use of an edit-distance-based K-Nearest Neighbors classifier. In addition to the machine learning knowledge about...
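
    The edit-distance-based K-Nearest Neighbors idea described in this record can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each drawn digit has already been converted to a Freeman chain code string, uses a unit-cost Levenshtein distance (the record does not say which edit costs are used), and the names `edit_distance`, `knn_predict` and the toy `TRAIN` strings are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca by cb
        prev = curr
    return prev[-1]


def knn_predict(query: str, training, k: int = 3) -> str:
    """Majority vote among the k training strings closest to the query.

    training is a list of (chain_code_string, label) pairs.
    """
    neighbors = sorted(training, key=lambda item: edit_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy chain codes (directions 0-7), invented for illustration only.
TRAIN = [
    ("0000666644442222", "0"),
    ("0006666444422222", "0"),
    ("66666666", "1"),
    ("6666666", "1"),
    ("66666665", "1"),
]

predicted = knn_predict("666666", TRAIN, k=3)
```

    With real drawings the chain codes are much longer, so a weighted edit distance or a length normalization may be worth considering; neither is shown here.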

  1. Feature extraction using distribution representation for colorimetric sensor arrays used as explosives detectors

    DEFF Research Database (Denmark)

    Alstrøm, Tommy Sonne; Raich, Raviv; Kostesha, Natalie

    2012-01-01

    We present a colorimetric sensor array which is able to detect explosives such as DNT, TNT, HMX, RDX and TATP and identifying volatile organic compounds in the presence of water vapor in air. To analyze colorimetric sensors with statistical methods, a suitable representation of sensory readings is required. We present a new approach of extracting features from a colorimetric sensor array based on a color distribution representation. For each sensor in the array, we construct a K-nearest neighbor classifier based on the Hellinger distances between color distribution of a test compound and the color...

  2. Acoustic modeling for emotion recognition

    CERN Document Server

    Anne, Koteswara Rao; Vankayalapati, Hima Deepthi

    2015-01-01

    This book presents state-of-the-art research in speech emotion recognition. Readers are first presented with basic research and applications; gradually more advanced information is provided, giving readers comprehensive guidance for classifying emotions through speech. Simulated databases are used and results extensively compared, with the features and the algorithms implemented using MATLAB. Various emotion recognition models like Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Support Vector Machines (SVM) and K-Nearest Neighbor (KNN) are explored in detail using prosody and spectral features, and feature fusion techniques.

  3. Matrix-valued Boltzmann equation for the nonintegrable Hubbard chain.

    Science.gov (United States)

    Fürst, Martin L R; Mendl, Christian B; Spohn, Herbert

    2013-07-01

    The standard Fermi-Hubbard chain becomes nonintegrable by adding to the nearest neighbor hopping additional longer range hopping amplitudes. We assume that the quartic interaction is weak and investigate numerically the dynamics of the chain on the level of the Boltzmann type kinetic equation. Only the spatially homogeneous case is considered. We observe that the huge degeneracy of stationary states in the case of nearest neighbor hopping is lost and the convergence to the thermal Fermi-Dirac distribution is restored. The convergence to equilibrium is exponentially fast. However for small next-nearest neighbor hopping amplitudes one has a rapid relaxation towards the manifold of quasistationary states and slow relaxation to the final equilibrium state.

  4. Satellite structure in 59Co NMR spectrum of magnetically ordered Dy(1-x)Y(x)Co2 intermetallic compound

    International Nuclear Information System (INIS)

    Yoshimura, Kazuyoshi; Hirosawa, Satoshi; Nakamura, Yoji

    1984-01-01

    The magnetic environment effect of cobalt in Dy(1-x)Y(x)Co2 has been studied by means of bulk magnetization and 59Co spin-echo NMR measurements at 4.2 K. Clearly resolved satellite structures of the NMR spectra have been observed. The hyperfine field distributions of 59Co are decomposed into contributions of Co atoms in various nearest neighbor configurations of rare earth atoms. In this analysis the dipole field due to nearest neighbor rare earth moments plays an important role. The result indicates that the magnetic moment of Co in the RCo2 cubic Laves phase pseudobinary compounds is quite sensitive to the nearest neighbor rare earth environment. (author)

  5. Automatic target classification of man-made objects in synthetic aperture radar images using Gabor wavelet and neural network

    Science.gov (United States)

    Vasuki, Perumal; Roomi, S. Mohamed Mansoor

    2013-01-01

    Processing of synthetic aperture radar (SAR) images has led to the development of automatic target classification approaches. These approaches help to classify individual and mass military ground vehicles. This work aims to develop an automatic target classification technique to classify military targets like truck/tank/armored car/cannon/bulldozer. The proposed method consists of three stages, viz. preprocessing, feature extraction, and neural network (NN) classification. The first stage removes speckle noise in a SAR image with the Frost filter and enhances the image by histogram equalization. The second stage uses a Gabor wavelet to extract the image features. The third stage classifies the target by an NN classifier using the image features. The proposed work performs better than its counterparts, like K-nearest neighbor (KNN), and also performs better than the earlier KNN-based methods on databases like moving and stationary target acquisition and recognition.

  6. Automatic tissue characterization from ultrasound imagery

    Science.gov (United States)

    Kadah, Yasser M.; Farag, Aly A.; Youssef, Abou-Bakr M.; Badawi, Ahmed M.

    1993-08-01

    In this work, feature extraction algorithms are proposed to extract the tissue characterization parameters from liver images. Then the resulting parameter set is further processed to obtain the minimum number of parameters representing the most discriminating pattern space for classification. This preprocessing step was applied to over 120 pathology-investigated cases to obtain the learning data for designing the classifier. The extracted features are divided into independent training and test sets and are used to construct both statistical and neural classifiers. The optimal criteria for these classifiers are set to have minimum error, ease of implementation and learning, and the flexibility for future modifications. Various algorithms for implementing various classification techniques are presented and tested on the data. The best performance was obtained using a single layer tensor model functional link network. Also, the voting k-nearest neighbor classifier provided comparably good diagnostic rates.

  7. Compensation phenomena of a mixed spin-2 and spin-1/2 Heisenberg ferrimagnetic model: Green function study

    International Nuclear Information System (INIS)

    Li Jun; Wei Guozhu; Du An

    2005-01-01

    The compensation and critical behaviors of a mixed spin-2 and spin-1/2 Heisenberg ferrimagnetic system on a square lattice are investigated theoretically by the two-time Green's function technique, which takes into account the quantum nature of Heisenberg spins. The model can be relevant for understanding the magnetic behavior of the new class of organometallic ferromagnetic materials that exhibit spontaneous magnetic properties at room temperature. We carry out the calculation of the sublattice magnetizations and the spin-wave spectra of the ground state. In particular, we have studied the effects of the nearest- and next-nearest-neighbor interactions, the crystal field and the external magnetic field on the compensation temperature and the critical temperature. When only the nearest-neighbor interactions and the crystal field are included, no compensation temperature exists; when the next-nearest-neighbor interaction between spin-1/2 is taken into account and exceeds a minimum value, a compensation point appears, and it remains essentially unchanged when the other parameters of the Hamiltonian are fixed. The next-nearest-neighbor interactions between spin-2 and the external magnetic field change the compensation temperature, and there is only a narrow range of Hamiltonian parameters for which the model has a compensation temperature; it exists only for small values of them.

  8. Diagnostic radiology in the nearest future

    International Nuclear Information System (INIS)

    Lindenbraten, L.D.

    1984-01-01

    Basic trends of diagnostic radiology (DR) development in the nearest future are formulated. Possibilities of prospective ways and means of DR studies are described. The problems of strategy, tactics and organization of the diagnostic radiological service are considered. An attempt has been made to outline the professional image of a specialist in the DR of the future. It is shown that prediction of the future development of DR is a planning stage of the present, the choice of the right way of development.

  9. A Coupled k-Nearest Neighbor Algorithm for Multi-Label Classification

    Science.gov (United States)

    2015-05-22

    classification, an image may contain several concepts simultaneously, such as beach, sunset and kangaroo. Such tasks are usually denoted as multi-label...informatics, a gene can belong to both metabolism and transcription classes; and in music categorization, a song may be labeled as Mozart and sad. In the

  10. Renormalization-group studies of antiferromagnetic chains. I. Nearest-neighbor interactions

    International Nuclear Information System (INIS)

    Rabin, J.M.

    1980-01-01

    The real-space renormalization-group method introduced by workers at the Stanford Linear Accelerator Center (SLAC) is used to study one-dimensional antiferromagnetic chains at zero temperature. Calculations using three-site blocks (for the Heisenberg-Ising model) and two-site blocks (for the isotropic Heisenberg model) are compared with exact results. In connection with the two-site calculation a duality transformation is introduced under which the isotropic Heisenberg model is self-dual. Such duality transformations can be defined for models other than those considered here, and may be useful in various block-spin calculations

  11. MOST OBSERVATIONS OF OUR NEAREST NEIGHBOR: FLARES ON PROXIMA CENTAURI

    Energy Technology Data Exchange (ETDEWEB)

    Davenport, James R. A. [Department of Physics and Astronomy, Western Washington University, 516 High Street, Bellingham, WA 98225 (United States); Kipping, David M. [Department of Astronomy, Columbia University, 550 West 120th Street, New York, NY 10027 (United States); Sasselov, Dimitar [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Matthews, Jaymie M. [Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1 (Canada); Cameron, Chris [Department of Mathematics, Physics and Geology, Cape Breton University, 1250 Grand Lake Road, Sydney, NS B1P 6L2 (Canada)

    2016-10-01

    We present a study of white-light flares from the active M5.5 dwarf Proxima Centauri using the Canadian microsatellite Microvariability and Oscillations of STars. Using 37.6 days of monitoring data from 2014 to 2015, we have detected 66 individual flare events, the largest number of white-light flares observed to date on Proxima Cen. Flare energies in our sample range from 10^29 to 10^31.5 erg. The flare rate is lower than that of other classic flare stars of a similar spectral type, such as UV Ceti, which may indicate Proxima Cen had a higher flare rate in its youth. Proxima Cen does have an unusually high flare rate given its slow rotation period, however. Extending the observed power-law occurrence distribution down to 10^28 erg, we show that flares with flux amplitudes of 0.5% occur 63 times per day, while superflares with energies of 10^33 erg occur ∼8 times per year. Small flares may therefore pose a great difficulty in searches for transits from the recently announced 1.27 M_⊕ Proxima b, while frequent large flares could have significant impact on the planetary atmosphere.

  12. Performance modeling of neighbor discovery in proactive routing protocols

    Directory of Open Access Journals (Sweden)

    Andres Medina

    2011-07-01

    It is well known that neighbor discovery is a critical component of proactive routing protocols in wireless ad hoc networks. However there is no formal study on the performance of proposed neighbor discovery mechanisms. This paper provides a detailed model of key performance metrics of neighbor discovery algorithms, such as node degree and the distribution of the distance to symmetric neighbors. The model accounts for the dynamics of neighbor discovery as well as node density, mobility, radio and interference. The paper demonstrates a method for applying these models to the evaluation of global network metrics. In particular, it describes a model of network connectivity. Validation of the models shows that the degree estimate agrees, within 5% error, with simulations for the considered scenarios. The work presented in this paper serves as a basis for the performance evaluation of remaining performance metrics of routing protocols, vital for large scale deployment of ad hoc networks.

  13. Efficient Pruning Method for Ensemble Self-Generating Neural Networks

    Directory of Open Access Journals (Sweden)

    Hirotaka Inoue

    2003-12-01

    Recently, multiple classifier systems (MCS) have been used for practical applications to improve classification accuracy. Self-generating neural networks (SGNN) are one of the suitable base-classifiers for MCS because of their simple setting and fast learning. However, the computation cost of the MCS increases in proportion to the number of SGNN. In this paper, we propose an efficient pruning method for the structure of the SGNN in the MCS. We compare the pruned MCS with two sampling methods. Experiments have been conducted to compare the pruned MCS with an unpruned MCS, the MCS based on C4.5, and the k-nearest neighbor method. The results show that the pruned MCS can improve its classification accuracy as well as reduce the computation cost.

  14. Comparative Analysis of Automatic Exudate Detection between Machine Learning and Traditional Approaches

    Science.gov (United States)

    Sopharak, Akara; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Thomas

    To prevent blindness from diabetic retinopathy, periodic screening and early diagnosis are necessary. Due to the lack of expert ophthalmologists in rural areas, automated early detection of exudates (one of the visible signs of diabetic retinopathy) could help to reduce the incidence of blindness in diabetic patients. Traditional automatic exudate detection methods are based on specific parameter configurations, while machine learning approaches, which seem more flexible, may have a high computational cost. A comparative analysis of traditional and machine learning methods of exudate detection, namely, mathematical morphology, fuzzy c-means clustering, naive Bayesian classifier, Support Vector Machine and Nearest Neighbor classifier, is presented. Detected exudates are validated against expert ophthalmologists' hand-drawn ground truths. The sensitivity, specificity, precision, accuracy and time complexity of each method are also compared.

  15. Case-Based Reasoning untuk Diagnosis Penyakit Jantung

    Directory of Open Access Journals (Sweden)

    Eka Wahyudi

    2017-01-01

    The test results using medical records data validated by an expert indicate that the system is able to recognize heart diseases using the nearest neighbor similarity, Minkowski distance similarity and Euclidean distance similarity methods, with correct recognition of 100% for each. The accuracy obtained is 86.21% with nearest neighbor, 100% with Minkowski, and 94.83% with Euclidean similarity.
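
    The distance-based case retrieval mentioned in this record can be illustrated with a short sketch. It is not the system's implementation: the mapping from distance to similarity, the Minkowski order, the feature encoding and the toy `CASE_BASE` are all assumptions made for illustration, and the weighted nearest neighbor similarity the record also mentions is not shown.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean_distance(a, b):
    """Euclidean distance is the Minkowski distance of order 2."""
    return minkowski_distance(a, b, 2)


def similarity(distance_value):
    """Assumed mapping from a distance to a similarity score in (0, 1]."""
    return 1.0 / (1.0 + distance_value)


def retrieve_most_similar(case_base, query, distance):
    """Return the stored case whose features are closest to the query."""
    return min(case_base, key=lambda case: distance(case["features"], query))


# Hypothetical case base; features and diagnoses are invented placeholders.
CASE_BASE = [
    {"features": (1.0, 0.0, 1.0), "diagnosis": "case A"},
    {"features": (0.0, 1.0, 0.0), "diagnosis": "case B"},
]

best = retrieve_most_similar(CASE_BASE, (0.9, 0.1, 0.8), euclidean_distance)
```

    In a case-based reasoning loop the retrieved case would typically be reused only when its similarity exceeds some threshold; that threshold is not specified in the record.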

  16. Performance Evaluation of Downscaling Sentinel-2 Imagery for Land Use and Land Cover Classification by Spectral-Spatial Features

    Directory of Open Access Journals (Sweden)

    Hongrui Zheng

    2017-12-01

    Land Use and Land Cover (LULC) classification is vital for environmental and ecological applications. Sentinel-2 is a new generation land monitoring satellite with the advantages of novel spectral capabilities, wide coverage and fine spatial and temporal resolutions. The effects of different spatial resolution unification schemes and methods on LULC classification have been scarcely investigated for Sentinel-2. This paper bridges this gap by comparing upscaling with downscaling, as well as different downscaling algorithms, from the point of view of LULC classification accuracy. The studied downscaling algorithms include nearest neighbor resampling and five popular pansharpening methods, namely, Gram-Schmidt (GS), nearest neighbor diffusion (NNDiffusion), the PANSHARP algorithm proposed by Y. Zhang, wavelet transformation fusion (WTF) and high-pass filter fusion (HPF). Two spatial features, textural metrics derived from the Grey-Level Co-occurrence Matrix (GLCM) and extended attribute profiles (EAPs), are investigated to make up for the shortcomings of pixel-based spectral classification. Random forest (RF) is adopted as the classifier. The experiment was conducted in the Xitiaoxi watershed, China. The results demonstrated that downscaling clearly outperforms upscaling in terms of classification accuracy. For downscaling, image sharpening has no obvious advantage over spatial interpolation. Different image sharpening algorithms have distinct effects. The two multiresolution analysis (MRA)-based methods, i.e., WTF and HPF, achieve the best performance. GS achieved an accuracy similar to that of NNDiffusion and PANSHARP. Compared to image sharpening, the introduction of spatial features, both GLCM and EAPs, can greatly improve the classification accuracy for Sentinel-2 imagery. Their effects on overall accuracy are similar but differ significantly for specific classes.
In general, using the spectral bands downscaled by nearest neighbor interpolation can meet

  17. Climatic zonation and land suitability determination for saffron in Khorasan-Razavi province using data mining algorithms

    Directory of Open Access Journals (Sweden)

    Mehdi Bashiri

    2017-12-01

    Yield prediction for agricultural crops plays an important role in export-import planning, purchase guarantees, pricing, securing profits and increasing agricultural productivity. Crop yield is affected by several parameters, especially climate. In this study, the saffron yield in the Khorasan-Razavi province was evaluated by different classification algorithms including artificial neural networks, regression models, local linear trees, decision trees, discriminant analysis, random forest, support vector machine and nearest neighbor analysis. These algorithms analyzed data for 20 years (1989-2009) including 11 climatological parameters. The results showed that only a few climatological parameters affect the saffron yield. The minimum, mean and maximum temperature had the highest positive correlations, and the relative humidity of 6.5h, sunny hours, relative humidity of 18.5h, evaporation, relative humidity of 12.5h and absolute humidity had the highest negative correlations with saffron cultivation areas, respectively. In addition, in the classification of saffron cultivation areas, the discriminant analysis and support vector machine had higher accuracies. The correlation between saffron cultivation area and saffron yield values was relatively high (r=0.38). The nearest neighbor analysis had the best prediction accuracy for classification of cultivation areas. For this algorithm the coefficients of determination were 1 and 0.944 for the training and testing stages, respectively. However, the accuracy of the algorithms for predicting crop yield from climatological parameters was low (the average coefficients of determination were 0.48 and 0.05 for the training and testing stages). The best algorithm, i.e. nearest neighbor analysis, had coefficients of determination equal to 1 and 0.177 for saffron yield prediction. The results showed that climatological parameters and data mining algorithms can be used to classify cultivation areas.
By this way it is possible

  18. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for the classifier not only decreases the computational time and cost, but also improves the classification performance. Among the different approaches to feature selection, however, most suffer from several problems such as lack of robustness, validation issues, etc. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and percentage; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold conditions reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between the two classes of leukemia.

  19. Wavelet Packet Transform Based Driver Distraction Level Classification Using EEG

    Directory of Open Access Journals (Sweden)

    Mousa Kadhim Wali

    2013-01-01

    We classify the driver distraction level (neutral, low, medium, and high) based on different wavelets and classifiers using wireless electroencephalogram (EEG) signals. 50 subjects were used for data collection using 14 electrodes. For this research we considered 4 distraction stimuli: Global Positioning System (GPS), music player, short message service (SMS), and mental tasks. The amplitude spectrum of three different frequency bands (theta, alpha, and beta) of the EEG signals was derived by fusing the discrete wavelet packet transform (DWPT) and FFT. Three different classifiers (subtractive fuzzy clustering, probabilistic neural network, and k-nearest neighbor) were compared based on spectral centroid and power spectral features extracted by different wavelets (db4, db8, sym8, and coif5). The results of this study indicate that the best average accuracy, achieved by the subtractive fuzzy inference system classifier, is 79.21%, based on the power spectral density feature extracted by the sym8 wavelet, which gave a good class discrimination under the ANOVA test.

  20. Fidelity study of superconductivity in extended Hubbard models

    Science.gov (United States)

    Plonka, N.; Jia, C. J.; Wang, Y.; Moritz, B.; Devereaux, T. P.

    2015-07-01

    The Hubbard model with local on-site repulsion is generally thought to possess a superconducting ground state for appropriate parameters, but the effects of more realistic long-range Coulomb interactions have not been studied extensively. We study the influence of these interactions on superconductivity by including nearest- and next-nearest-neighbor extended Hubbard interactions in addition to the usual on-site terms. Utilizing numerical exact diagonalization, we analyze the signatures of superconductivity in the ground states through the fidelity metric of quantum information theory. We find that nearest and next-nearest neighbor interactions have thresholds above which they destabilize superconductivity regardless of whether they are attractive or repulsive, seemingly due to competing charge fluctuations.

  1. Bees do not use nearest-neighbour rules for optimization of multi-location routes.

    Science.gov (United States)

    Lihoreau, Mathieu; Chittka, Lars; Le Comber, Steven C; Raine, Nigel E

    2012-02-23

    Animals collecting patchily distributed resources are faced with complex multi-location routing problems. Rather than comparing all possible routes, they often find reasonably short solutions by simply moving to the nearest unvisited resources when foraging. Here, we report the travel optimization performance of bumble-bees (Bombus terrestris) foraging in a flight cage containing six artificial flowers arranged such that movements between nearest-neighbour locations would lead to a long suboptimal route. After extensive training (80 foraging bouts and at least 640 flower visits), bees reduced their flight distances and prioritized shortest possible routes, while almost never following nearest-neighbour solutions. We discuss possible strategies used during the establishment of stable multi-location routes (or traplines), and how these could allow bees and other animals to solve complex routing problems through experience, without necessarily requiring a sophisticated cognitive representation of space.
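
    The contrast this record draws between a nearest-neighbour rule and the shortest possible route can be made concrete with a small sketch. The layout below is a hypothetical four-flower arrangement on a line (the study used six artificial flowers in a flight cage, whose coordinates are not given here), chosen so that always moving to the nearest unvisited flower yields a longer round trip than the optimum found by exhaustive search.

```python
from itertools import permutations
from math import dist


def route_length(start, order):
    """Length of a round trip: start -> each flower in order -> back to start."""
    points = [start, *order, start]
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


def nearest_neighbour_route(start, flowers):
    """Greedy rule: always fly to the closest flower not yet visited."""
    remaining = list(flowers)
    route, here = [], start
    while remaining:
        nxt = min(remaining, key=lambda p: dist(here, p))
        remaining.remove(nxt)
        route.append(nxt)
        here = nxt
    return route


def shortest_route(start, flowers):
    """Exhaustive search over all visiting orders (fine for a handful of flowers)."""
    return min(permutations(flowers), key=lambda order: route_length(start, order))


# Hypothetical layout: nest at the origin, flowers on a line on both sides of it.
NEST = (0.0, 0.0)
FLOWERS = [(-1.0, 0.0), (1.4, 0.0), (-3.5, 0.0), (6.0, 0.0)]

nn_length = route_length(NEST, nearest_neighbour_route(NEST, FLOWERS))
best_length = route_length(NEST, shortest_route(NEST, FLOWERS))
```

    In this layout the greedy route zigzags across the nest and ends far from home, while the optimal route sweeps out to one end and back through the other. Exhaustive search grows factorially with the number of locations, so it only serves as a reference for small arrays like the one in the study.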

  2. Nuclear hyperfine structure of muonium in CuCl resolved by means of avoided level crossing

    International Nuclear Information System (INIS)

    Schneider, J.W.; Celio, M.; Keller, H.; Kuendig, W.; Odermatt, W.; Puempin, B.; Savic, I.M.; Simmler, H.; Estle, T.L.; Schwab, C.; Kiefl, R.F.; Renker, D.

    1990-01-01

    We report detailed avoided-level-crossing spectra of a muonium center (Mu II) in single-crystal CuCl in a magnetic field range of 4-5 T and at a temperature of 100 K. The hyperfine parameters of the muon and the closest two shells of nuclei indicate that this center consists of muonium at a tetrahedral interstice with four Cu nearest neighbors and six Cl next-nearest neighbors and that the spin density is appreciable on the muon and on the ten neighboring nuclei but negligible elsewhere.

  3. STUDY COMPARISON OF SVM-, K-NN- AND BACKPROPAGATION-BASED CLASSIFIER FOR IMAGE RETRIEVAL

    Directory of Open Access Journals (Sweden)

    Muhammad Athoillah

    2015-03-01

    Classification is a method for organizing data systematically according to rules that have been set previously. In recent years classification methods have proven helpful in many fields of work, such as image classification, medical biology, traffic lights, text classification, etc. There are many methods for solving classification problems. This variety makes it difficult for researchers to determine which method is best for a given problem. This work aims to compare the ability of classification methods, namely Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), and Backpropagation, in the case of image retrieval with an image dataset of five categories. The results show that K-NN has the best average accuracy, at 82%. It is also the fastest in average computation time, with 17.99 seconds during the retrieval session for all categories. Backpropagation, however, is the slowest of the three. On average it needed 883 seconds for the training session and 41.7 seconds for the retrieval session.

  4. A dumbed-down approach to unite Fermilab, its neighbors

    CERN Multimedia

    Constable, B

    2004-01-01

    "...Fermilab is reaching out to its suburban neighbors...With the nation on orange alert, Fermilab scientists no longer can sit on the front porch and invite neighbors in for coffee and quasars" (1 page).

  5. Random ensemble learning for EEG classification.

    Science.gov (United States)

    Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid

    2018-01-01

    Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Iwan Setyawan

    2012-12-01

    This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The pre-processing methods are based on the combinations of several image processing operations, namely edge detection, low pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken with various background conditions. Our experiments showed that the best result is achieved when the pre-processing method consists of only a desaturation operation, achieving a classification accuracy of up to 83.15%.

  7. Velocity statistics for interacting edge dislocations in one dimension from Dyson's Coulomb gas model.

    Science.gov (United States)

    Jafarpour, Farshid; Angheluta, Luiza; Goldenfeld, Nigel

    2013-10-01

    The dynamics of edge dislocations with parallel Burgers vectors, moving in the same slip plane, is mapped onto Dyson's model of a two-dimensional Coulomb gas confined in one dimension. We show that the tail distribution of the velocity of dislocations is power law in form, as a consequence of the pair interaction of nearest neighbors in one dimension. In two dimensions, we show the presence of a pairing phase transition in a system of interacting dislocations with parallel Burgers vectors. The scaling exponent of the velocity distribution at effective temperatures well below this pairing transition temperature can be derived from the nearest-neighbor interaction, while near the transition temperature, the distribution deviates from the form predicted by the nearest-neighbor interaction, suggesting the presence of collective effects.

  8. Neighbors United for Health

    Science.gov (United States)

    Westhoff, Wayne W.; Corvin, Jaime; Virella, Irmarie

    2009-01-01

    Modeled upon the ecclesiastic community group concept of Latin America to unite and strengthen the bond between the Church and neighborhoods, a community-based organization created Vecinos Unidos por la Salud (Neighbors United for Health) to bring health messages into urban Latino neighborhoods. The model is based on five tenets, and incorporates…

  9. Pollinator-mediated interactions in experimental arrays vary with neighbor identity.

    Science.gov (United States)

    Ha, Melissa K; Ivey, Christopher T

    2017-02-01

    Local ecological conditions influence the impact of species interactions on evolution and community structure. We investigated whether pollinator-mediated interactions between coflowering plants vary with plant density, coflowering neighbor identity, and flowering season. We conducted a field experiment in which flowering time and floral neighborhood were manipulated in a factorial design. Early- and late-flowering Clarkia unguiculata plants were placed into arrays with C. biloba neighbors, noncongeneric neighbors, additional conspecific plants, or no additional plants as a density control. We compared whole-plant pollen limitation of seed set, pollinator behavior, and pollen deposition among treatments. Interactions mediated by shared pollinators depended on the identity of the neighbor and possibly changed through time, although flowering-season comparisons were compromised by low early-season plant survival. Interactions with conspecific neighbors were likely competitive late in the season. Interactions with C. biloba appeared to involve facilitation or neutral interactions. Interactions with noncongeners were more consistently competitive. The community composition of pollinators varied among treatment combinations. Pollinator-mediated interactions involved competition and likely facilitation, depending on coflowering neighbor. Experimental manipulation helped to reveal context-dependent variation in indirect biotic interactions. © 2017 Botanical Society of America.

  10. The clinic as a good corporate neighbor.

    Science.gov (United States)

    Sass, Hans-Martin

    2013-02-01

    Clinics today specialize in health repair services similar to car repair shops; procedures and prices are standardized, regulated, and inflexibly uniform. Clinics of the future have to become Health Care Centers in order to be more respected and more effective corporate neighbors in offering outreach services in health education and preventive health care. The traditional concept of care for health is much broader than repair management and includes the promotion of lay health competence and responsibility in healthy social and natural environments. The corporate profile and ethics of the clinic as a good and competitive local neighbor will have to focus on [a] better personalized care, [b] education and services in preventive care, [c] direct or web-based information and advice for general, seasonal, or age related health risks, and on developing and improving trustworthy character traits of the clinic as a corporate person and a good neighbor.

  11. Robotic situational awareness of actions in human teaming

    Science.gov (United States)

    Tahmoush, Dave

    2015-06-01

    When robots can sense and interpret the activities of the people they are working with, they become more of a team member and less of just a piece of equipment. This has motivated work on recognizing human actions using existing robotic sensors like short-range ladar imagers. These produce three-dimensional point cloud movies which can be analyzed for structure and motion information. We skeletonize the human point cloud and apply a physics-based velocity correlation scheme to the resulting joint motions. The twenty actions are then recognized using a nearest-neighbors classifier that achieves good accuracy.

  12. Pair and triplet approximation of a spatial lattice population model with multiscale dispersal using Markov chains for estimating spatial autocorrelation.

    Science.gov (United States)

    Hiebeler, David E; Millett, Nicholas E

    2011-06-21

    We investigate a spatial lattice model of a population employing dispersal to nearest and second-nearest neighbors, as well as long-distance dispersal across the landscape. The model is studied via stochastic spatial simulations, ordinary pair approximation, and triplet approximation. The latter method, which uses the probabilities of state configurations of contiguous blocks of three sites as its state variables, is demonstrated to be greatly superior to pair approximations for estimating spatial correlation information at various scales. Correlations between pairs of sites separated by arbitrary distances are estimated by constructing spatial Markov processes using the information from both approximations. These correlations demonstrate why pair approximation misses basic qualitative features of the model, such as decreasing population density as a large proportion of offspring are dropped on second-nearest neighbors, and why triplet approximation is able to include them. Analytical and numerical results show that, excluding long-distance dispersal, the initial growth rate of an invading population is maximized and the equilibrium population density is also roughly maximized when the population spreads its offspring evenly over nearest and second-nearest neighboring sites. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

    Directory of Open Access Journals (Sweden)

    Carlos E. Galván-Tejada

    2017-02-01

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.

  14. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

    Science.gov (United States)

    Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

    2017-02-14

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.

  15. Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic

    Directory of Open Access Journals (Sweden)

    Fawaz S. Al-Anzi

    2017-04-01

    Cosine similarity is one of the most popular distance measures in text classification problems. In this paper, we used this important measure to investigate the performance of Arabic language text classification. For textual features, the vector space model (VSM) is generally used to represent textual information as numerical vectors. However, Latent Semantic Indexing (LSI) is a better textual representation technique, as it maintains semantic information between the words. Hence, we used the singular value decomposition (SVD) method to extract textual features based on LSI. In our experiments, we compared some of the well-known classification methods, such as Naïve Bayes, k-Nearest Neighbors, Neural Network, Random Forest, Support Vector Machine, and classification tree. We used a corpus that contains 4,000 documents on ten topics (400 documents for each topic). The corpus contains 2,127,197 words with about 139,168 unique words. The testing set contains 400 documents, 40 documents for each topic. As a weighting scheme, we used Term Frequency.Inverse Document Frequency (TF.IDF). This study reveals that the classification methods that use LSI features significantly outperform the TF.IDF-based methods. It also reveals that k-Nearest Neighbors (based on the cosine measure) and support vector machine are the best performing classifiers.
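
    As a rough illustration of the k-Nearest Neighbors classifier with a cosine measure mentioned in this abstract, the sketch below ranks training vectors by cosine similarity to a query and takes a majority vote among the top k. The function names, the toy two-dimensional vectors (standing in for LSI features) and the topic labels are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def knn_cosine_predict(train_vectors, train_labels, query, k=3):
    """Majority label among the k training vectors most similar to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine_similarity(pair[0], query),
        reverse=True,  # highest similarity first
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical toy data: two reduced features per document.
train_vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
train_labels = ["sports", "sports", "politics", "politics"]
predicted = knn_cosine_predict(train_vectors, train_labels, [0.8, 0.3], k=3)
print(predicted)
```

    In practice the vectors would come from an SVD-reduced term-document matrix rather than being written by hand.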

  16. Voting-based Classification for E-mail Spam Detection

    Directory of Open Access Journals (Sweden)

    Bashar Awad Al-Shboul

    2016-06-01

    The problem of spam e-mail has gained a tremendous amount of attention. Although entities tend to use e-mail spam filter applications to filter out received spam e-mails, marketing companies still tend to send unsolicited e-mails in bulk and users still receive a reasonable amount of spam e-mail despite those filtering applications. This work proposes a new method for classifying e-mails into spam and non-spam. First, several e-mail content features are extracted and then those features are used for classifying each e-mail individually. The classification results of three different classifiers (i.e. Decision Trees, Random Forests and k-Nearest Neighbor) are combined in various voting schemes (i.e. majority vote, average probability, product of probabilities, minimum probability and maximum probability) for making the final decision. To validate our method, two different spam e-mail collections were used.
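
    The five voting schemes listed in this abstract can be sketched as follows. This is a minimal sketch under assumptions: each classifier is taken to output a pair [P(non-spam), P(spam)], the per-class scores are reduced across classifiers, and the class with the highest combined score wins. The abstract does not give the exact definitions used by the authors, so the function `combine_votes` and the example probabilities are hypothetical.

```python
import math
from collections import Counter

CLASSES = ("non-spam", "spam")


def combine_votes(prob_rows, scheme):
    """Combine per-classifier [P(non-spam), P(spam)] rows into one label."""
    if scheme == "majority":
        # Each classifier votes for its own most probable class.
        votes = Counter(
            max(range(len(CLASSES)), key=row.__getitem__) for row in prob_rows
        )
        return CLASSES[votes.most_common(1)[0][0]]

    reducers = {
        "average": lambda column: sum(column) / len(column),
        "product": math.prod,  # requires Python 3.8 or later
        "minimum": min,
        "maximum": max,
    }
    reduce_fn = reducers[scheme]
    # Transpose so that each column holds one class's probabilities.
    scores = [reduce_fn(column) for column in zip(*prob_rows)]
    return CLASSES[max(range(len(CLASSES)), key=scores.__getitem__)]


# Hypothetical outputs of three classifiers for a single e-mail.
rows = [[0.40, 0.60], [0.45, 0.55], [0.95, 0.05]]
for scheme in ("majority", "average", "product", "minimum", "maximum"):
    print(scheme, combine_votes(rows, scheme))
```

    With these toy numbers the majority vote and the probability-based rules disagree, because one classifier is very confident while the other two lean only slightly the other way.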

  17. Unwanted Behaviors and Nuisance Behaviors Among Neighbors in a Belgian Community Sample.

    Science.gov (United States)

    Michaux, Emilie; Groenen, Anne; Uzieblo, Katarzyna

    2015-06-30

    Unwanted behaviors between (ex-)intimates have been extensively studied, while those behaviors within other contexts such as neighbors have received much less scientific consideration. Research indicates that residents are likely to encounter problem behaviors from their neighbors. Besides the lack of clarity in the conceptualization of problem behaviors among neighbors, little is known on which types of behaviors characterize neighbor problems. In this study, the occurrence of two types of problem behaviors encountered by neighbors was explored within a Belgian community sample: unwanted behaviors such as threats and neighbor nuisance issues such as noise nuisance. By clearly distinguishing those two types of behaviors, this study aimed at contributing to the conceptualization of neighbor problems. Next, the coping strategies used to deal with the neighbor problems were investigated. Our results indicated that unwanted behaviors were more frequently encountered by residents compared with nuisance problems. Four out of 10 respondents reported both unwanted pursuit behavior and nuisance problems. It was especially unlikely to encounter nuisance problems in isolation of unwanted pursuit behaviors. While different coping styles (avoiding the neighbor, confronting the neighbor, and enlisting help from others) were equally used by the stalked participants, none of them was perceived as being more effective in reducing the stalking behaviors. Strikingly, despite being aware of specialized help services such as community mediation services, only a very small subgroup enlisted this kind of professional help. © The Author(s) 2015.

  18. Structure and Bonding in Noncrystalline Solids Abstracts

    Science.gov (United States)

    1983-06-02

    displacement cascades are unlikely. Related damage studies as diffuse X-ray scattering, magnetic susceptibility and positron-annihilation lifetime...the positron annihilation lifetime data; diffuse X-ray scattering studies give evidence for "amorphized" clusters in neutron but not in electron...feldspar glasses and glasses in the system CaO-MgO-SiO2. These results indicate that the nearest-neighbor and next-nearest-neighbor environments are very

  19. Anomalous magnon Nernst effect of topological magnonic materials

    OpenAIRE

    Wang, X. S.; Wang, X. R.

    2017-01-01

    The magnon transport driven by thermal gradient in a perpendicularly magnetized honeycomb lattice is studied. The system with the nearest-neighbor pseudodipolar interaction and the next-nearest-neighbor Dzyaloshinskii-Moriya interaction (DMI) has various topologically nontrivial phases. When an in-plane thermal gradient is applied, a transverse in-plane magnon current is generated. This phenomenon is termed as the anomalous magnon Nernst effect that closely resembles the anomalous Nernst effe...

  20. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Regina Lionnie

    2013-09-01

    This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The pre-processing methods are based on the combinations of several image processing operations, namely edge detection, low pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken with various background conditions. Our experiments showed that the best result is achieved when the pre-processing method consists of only a desaturation operation, achieving a classification accuracy of up to 83.15%.

  1. Neighbor Rupture Degree of Some Middle Graphs

    Directory of Open Access Journals (Sweden)

    Gökşen BACAK-TURAN

    2017-12-01

    Networks have an important place in our daily lives. Internet networks, electricity networks, water networks, transportation networks, social networks and biological networks are some of the networks we run into in every aspect of our lives. A network consists of centers connected by links. A network is represented as a graph when its centers and connections are modelled by vertices and edges, respectively. The vulnerability of a network measures its resistance to the failure of some centers or connection lines until communication is interrupted. In this study, the neighbor rupture degree, a parameter that explores the vulnerability of the graphs that result when some centers of a communication network fail and their neighboring centers become nonfunctional, was applied to some middle graphs, and the neighbor rupture degrees of $M(C_{n})$, $M(P_{n})$, $M(K_{1,n})$, $M(W_{n})$, $M(P_{n}\times K_{2})$ and $M(C_{n}\times K_{2})$ have been found.

  2. ALIGNMENTS OF GROUP GALAXIES WITH NEIGHBORING GROUPS

    International Nuclear Information System (INIS)

    Wang Yougang; Chen Xuelei; Park, Changbom; Yang Xiaohu; Choi, Yun-Young

    2009-01-01

    Using a sample of galaxy groups found in the Sloan Digital Sky Survey Data Release 4, we measure the following four types of alignment signals: (1) the alignment between the distributions of the satellites of each group relative to the direction of the nearest neighbor group (NNG); (2) the alignment between the major axis direction of the central galaxy of the host group (HG) and the direction of the NNG; (3) the alignment between the major axes of the central galaxies of the HG and the NNG; and (4) the alignment between the major axes of the satellites of the HG and the direction of the NNG. We find a strong signal of alignment between the satellite distribution and the orientation of the central galaxy relative to the direction of the NNG, even when the NNG is located beyond 3r_{vir} of the host group. The major axis of the central galaxy of the HG is aligned with the direction of the NNG. The alignment signals are more prominent for groups that are more massive and with early-type central galaxies. We also find that there is a preference for the two major axes of the central galaxies of the HG and NNG to be parallel for systems in which both central galaxies are early type, but not for systems in which both central galaxies are late type. For the orientation of satellite galaxies, we do not find any significant alignment signals relative to the direction of the NNG. From these four types of alignment measurements, we conclude that the large-scale environment traced by the nearby group affects primarily the shape of the host dark matter halo, and hence also affects the distribution of satellite galaxies and the orientation of central galaxies. In addition, the NNG directly affects the distribution of the satellite galaxies by inducing asymmetric alignment signals, and the NNG at very small separation may also contribute a second-order impact on the orientation of the central galaxy in the HG.

  3. Computer Simulation of Energy Parameters and Magnetic Effects in Fe-Si-C Ternary Alloys

    Science.gov (United States)

    Ridnyi, Ya. M.; Mirzoev, A. A.; Mirzaev, D. A.

    2018-06-01

    The paper presents ab initio simulation with the WIEN2k software package of the equilibrium structure and properties of silicon and carbon atoms dissolved in iron with the body-centered cubic crystal system of the lattice. Silicon and carbon atoms manifest a repulsive interaction in the first two nearest neighbors, in the second neighbor the repulsion being stronger than in the first. In the third and next-nearest neighbors a very weak repulsive interaction occurs and tends to zero with increasing distance between atoms. Silicon and carbon dissolution reduces the magnetic moment of iron atoms.

  4. Co-Expression of Neighboring Genes in the Zebrafish (Danio rerio) Genome

    Directory of Open Access Journals (Sweden)

    Daryi Wang

    2009-08-01

    Neighboring genes in the eukaryotic genome have a tendency to express concurrently, and the proximity of two adjacent genes is often considered a possible explanation for their co-expression behavior. However, the actual contribution of the physical distance between two genes to their co-expression behavior has yet to be defined. To further investigate this issue, we studied the co-expression of neighboring genes in zebrafish, which has a compact genome and has experienced a whole genome duplication event. Our analysis shows that the proportion of highly co-expressed neighboring pairs (Pearson’s correlation coefficient R>0.7) is low (0.24% ~ 0.67%); however, it is still significantly higher than that of random pairs. In particular, the statistical result implies that the co-expression tendency of neighboring pairs is negatively correlated with their physical distance. Our findings therefore suggest that physical distance may play an important role in the co-expression of neighboring genes. Possible mechanisms related to the neighboring genes’ co-expression are also discussed.
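
    The proportion of highly co-expressed pairs described in this abstract could be computed along the following lines. This is only a sketch: the gene names, expression values and helper functions are hypothetical, and the 0.7 threshold is the one quoted in the abstract. Pearson's correlation is computed directly from its definition.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # convention chosen here for constant profiles
    return cov / math.sqrt(var_x * var_y)


def highly_coexpressed_fraction(expression, gene_pairs, threshold=0.7):
    """Fraction of gene pairs whose expression profiles have R > threshold."""
    hits = sum(
        1
        for first, second in gene_pairs
        if pearson_r(expression[first], expression[second]) > threshold
    )
    return hits / len(gene_pairs)


# Hypothetical expression levels across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
neighbor_pairs = [("geneA", "geneB"), ("geneB", "geneC")]
fraction = highly_coexpressed_fraction(expression, neighbor_pairs)
print(fraction)
```

    Running the same function on randomly drawn pairs would give the baseline against which the neighboring-pair proportion is compared.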

  5. New Sliding Puzzle with Neighbors Swap Motion

    OpenAIRE

    Prihardono, Ariyanto; Kawagoe, Kenichi

    2015-01-01

    The sliding puzzles (15-puzzle, 8-puzzle, 5-puzzle) are known to come in 2 kinds: solvable puzzles and unsolvable puzzles. In this thesis, we make a new puzzle with only 1 kind, the solvable puzzle. This new puzzle is made by adapting the sliding puzzle with several additional rules from the M13 puzzle, the puzzle that is formed from the Mathieu group M13. This puzzle has a movement called a neighbors swap motion, a rule of movement that enables every pair of neighboring points to swap. This extr...

  6. The advantages of the surface Laplacian in brain-computer interface research.

    Science.gov (United States)

    McFarland, Dennis J

    2015-09-01

    Brain-computer interface (BCI) systems frequently use signal processing methods, such as spatial filtering, to enhance performance. The surface Laplacian can reduce spatial noise and aid in identification of sources. In BCI research, these two functions of the surface Laplacian correspond to prediction accuracy and signal orthogonality. In the present study, an off-line analysis of data from a sensorimotor rhythm-based BCI task dissociated these functions of the surface Laplacian by comparing nearest-neighbor and next-nearest neighbor Laplacian algorithms. The nearest-neighbor Laplacian produced signals that were more orthogonal while the next-nearest Laplacian produced signals that resulted in better accuracy. Both prediction and signal identification are important for BCI research. Better prediction of user's intent produces increased speed and accuracy of communication and control. Signal identification is important for ruling out the possibility of control by artifacts. Identifying the nature of the control signal is relevant both to understanding exactly what is being studied and in terms of usability for individuals with limited motor control. Copyright © 2014 Elsevier B.V. All rights reserved.
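
    A common way to approximate a surface Laplacian in BCI work is to subtract, from each electrode's signal, the mean signal of a ring of surrounding electrodes; the nearest-neighbor and next-nearest-neighbor variants then differ only in which ring is used. The sketch below assumes that simple equal-weight form, which the abstract itself does not spell out, and uses hypothetical channel data with conventional electrode labels.

```python
def laplacian_filter(signals, neighbors):
    """Subtract the mean of each channel's neighbor ring, sample by sample.

    signals:   dict mapping channel name to a list of samples
    neighbors: dict mapping channel name to the channel names in its ring
    """
    filtered = {}
    for channel, ring in neighbors.items():
        filtered[channel] = [
            sample - sum(signals[name][t] for name in ring) / len(ring)
            for t, sample in enumerate(signals[channel])
        ]
    return filtered


# Hypothetical two-sample recording around electrode C3.
signals = {
    "C3": [10.0, 12.0],
    "FC3": [8.0, 10.0],
    "CP3": [8.0, 10.0],
    "C1": [9.0, 11.0],
    "C5": [7.0, 9.0],
}
nearest_ring = {"C3": ["FC3", "CP3", "C1", "C5"]}
filtered = laplacian_filter(signals, nearest_ring)
print(filtered["C3"])
```

    Passing a wider ring of electrodes for the same center channel would give the next-nearest-neighbor variant compared in the study.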

  7. The role of orthography in the semantic activation of neighbors.

    Science.gov (United States)

    Hino, Yasushi; Lupker, Stephen J; Taylor, Tamsen E

    2012-09-01

    There is now considerable evidence that a letter string can activate semantic information appropriate to its orthographic neighbors (e.g., Forster & Hector's, 2002, TURPLE effect). This phenomenon is the focus of the present research. Using Japanese words, we examined whether semantic activation of neighbors is driven directly by orthographic similarity alone or whether there is also a role for phonological similarity. In Experiment 1, using a relatedness judgment task in which a Kanji word-Katakana word pair was presented on each trial, an inhibitory effect was observed when the initial Kanji word was related to an orthographic and phonological neighbor of the Katakana word target but not when the initial Kanji word was related to a phonological but not orthographic neighbor of the Katakana word target. This result suggests that phonology plays little, if any, role in the activation of neighbors' semantics when reading familiar words. In Experiment 2, the targets were transcribed into Hiragana, a script they are typically not written in, requiring readers to engage in phonological coding. In that experiment, inhibitory effects were observed in both conditions. This result indicates that phonologically mediated semantic activation of neighbors will emerge when phonological processing is necessary in order to understand a written word (e.g., when that word is transcribed into an unfamiliar script). PsycINFO Database Record (c) 2012 APA, all rights reserved.

  8. Chimera states in bursting neurons

    OpenAIRE

    Bera, Bidesh K.; Ghosh, Dibakar; Lakshmanan, M.

    2015-01-01

    We study the existence of chimera states in pulse-coupled networks of bursting Hindmarsh-Rose neurons with nonlocal, global and local (nearest neighbor) couplings. Through a linear stability analysis, we discuss the behavior of stability function in the incoherent (i.e. disorder), coherent, chimera and multi-chimera states. Surprisingly, we find that chimera and multi-chimera states occur even using local nearest neighbor interaction in a network of identical bursting neurons alone. This is i...

  9. Highly Anisotropic Magnon Dispersion in Ca_{2}RuO_{4}: Evidence for Strong Spin Orbit Coupling.

    Science.gov (United States)

    Kunkemöller, S; Khomskii, D; Steffens, P; Piovano, A; Nugroho, A A; Braden, M

    2015-12-11

    The magnon dispersion in Ca_{2}RuO_{4} has been determined by inelastic neutron scattering on single crystals containing 1% of Ti. The dispersion is well described by a conventional Heisenberg model suggesting a local moment model with nearest neighbor interaction of J=8 meV. Nearest and next-nearest neighbor interaction as well as interlayer coupling parameters are required to properly describe the entire dispersion. Spin-orbit coupling induces a very large anisotropy gap in the magnetic excitations in apparent contrast with a simple planar magnetic model. Orbital ordering breaking tetragonal symmetry, and strong spin-orbit coupling can thus be identified as important factors in this system.

  10. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

    Science.gov (United States)

    S K, Somasundaram; P, Alli

    2017-11-09

    The main complication of diabetes is diabetic retinopathy (DR), a retinal vascular disease that leads to blindness. Regular screening for early DR detection is a labor- and resource-intensive task, so automatic detection of DR using computational techniques is an attractive solution. An automatic method is more reliable for determining the presence of an abnormality in fundus images (FI), but the classification process is often performed poorly. Recently, a few research works have analyzed the texture discrimination capacity in FI to distinguish healthy images. However, the feature extraction (FE) process did not perform well because of the high dimensionality. Therefore, a Machine Learning Bagging Ensemble Classifier (ML-BEC) is designed to identify retinal features for DR diagnosis and early detection using machine learning and ensemble classification. The ML-BEC method comprises two stages. The first stage extracts the candidate objects from retinal images (RI). The candidate objects, or features, for DR diagnosis include blood vessels, optic nerve, neural tissue, neuroretinal rim, optic disc size, thickness and variance. These features are initially extracted by applying a machine learning technique called t-distributed Stochastic Neighbor Embedding (t-SNE). t-SNE generates a probability distribution across high-dimensional images, where the images are separated into similar and dissimilar pairs. Then, t-SNE defines a similar probability distribution across the points in the low-dimensional map and minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points on the map. The second stage applies ensemble classifiers to the extracted features to provide accurate analysis of digital FI using machine learning.
In this stage, an automatic detection

  11. Chaotic particle swarm optimization with mutation for classification.

    Science.gov (United States)

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence to a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. To demonstrate the effectiveness of the proposed classifier, mutation-based classifier particle swarm optimization, it is tested on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms, including k-nearest neighbor as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.

  12. Chaotic Particle Swarm Optimization with Mutation for Classification

    Science.gov (United States)

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence to a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. To demonstrate the effectiveness of the proposed classifier, mutation-based classifier particle swarm optimization, it is tested on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms, including k-nearest neighbor as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms. PMID:25709937
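
    The four evaluation measures named in the abstract above (accuracy, sensitivity, specificity and the Matthews correlation coefficient) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts. The function name is hypothetical.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity (recall): fraction of actual positives that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only (not results from the paper).
print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

    Unlike accuracy, the Matthews coefficient stays informative when the two classes are unbalanced, which is common in medical datasets.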

  13. Color and neighbor edge directional difference feature for image retrieval

    Institute of Scientific and Technical Information of China (English)

    Chaobing Huang; Shengsheng Yu; Jingli Zhou; Hongwei Lu

    2005-01-01

    A novel image feature termed the neighbor edge directional difference unit histogram is proposed, in which the neighbor edge directional difference unit is defined and computed for every pixel in the image, and is used to generate the neighbor edge directional difference unit histogram. This histogram and the color histogram are used as feature indexes to retrieve color images. The feature is invariant to image scaling and translation and has more descriptive power for natural color images. Experimental results show that the feature can achieve better retrieval performance than other color-spatial features.

  14. Some Observations about the Nearest-Neighbor Model of the Error Threshold

    International Nuclear Information System (INIS)

    Gerrish, Philip J.

    2009-01-01

    I explore some aspects of the 'error threshold' - a critical mutation rate above which a population is nonviable. The phase transition that occurs as mutation rate crosses this threshold has been shown to be mathematically equivalent to the loss of ferromagnetism that occurs as temperature exceeds the Curie point. I will describe some refinements and new results based on the simplest of these mutation models, will discuss the commonly unperceived robustness of this simple model, and I will show some preliminary results comparing qualitative predictions with simulations of finite populations adapting at high mutation rates. I will talk about how these qualitative predictions are relevant to biomedical science and will discuss how my colleagues and I are looking for phase-transition signatures in real populations of Escherichia coli that go extinct as a result of excessive mutation.

  15. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    Directory of Open Access Journals (Sweden)

    Fábio R de Moraes

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study

  16. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    Science.gov (United States)

    de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  17. Improving Predictions of Protein-Protein Interfaces by Combining Amino Acid-Specific Classifiers Based on Structural and Physicochemical Descriptors with Their Weighted Neighbor Averages

    Science.gov (United States)

    de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  18. Ground-state ordering of the J1-J2 model on the simple cubic and body-centered cubic lattices

    Science.gov (United States)

    Farnell, D. J. J.; Götze, O.; Richter, J.

    2016-06-01

    The J1-J2 Heisenberg model is a "canonical" model in the field of quantum magnetism for studying the interplay between frustration and quantum fluctuations as well as quantum phase transitions driven by frustration. Here we apply the coupled cluster method (CCM) to study the spin-half J1-J2 model with antiferromagnetic nearest-neighbor bonds J1>0 and next-nearest-neighbor bonds J2>0 for the simple cubic (sc) and body-centered cubic (bcc) lattices. In particular, we wish to study the ground-state ordering of these systems as a function of the frustration parameter p = z2J2/(z1J1), where z1 (z2) is the number of nearest (next-nearest) neighbors. We wish to determine the positions of the phase transitions using the CCM and we aim to resolve the nature of the phase transition points. We consider the ground-state energy, order parameters, spin-spin correlation functions, as well as the spin stiffness in order to determine the ground-state phase diagrams of these models. We find a direct first-order phase transition at a value of p = 0.528 from a state of nearest-neighbor Néel order to next-nearest-neighbor Néel order for the bcc lattice. For the sc lattice the situation is more subtle. CCM results for the energy, the order parameter, the spin-spin correlation functions, and the spin stiffness indicate that there is no direct first-order transition between ground-state phases with magnetic long-range order; rather, it is more likely that two phases with antiferromagnetic long-range order are separated by a narrow region of a spin-liquid-like quantum phase around p = 0.55. Thus the strong frustration present in the J1-J2 Heisenberg model on the sc lattice may open a window for an unconventional quantum ground state in this three-dimensional spin model.
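
    To make the frustration parameter concrete, the sketch below converts between p and the bare coupling ratio J2/J1. It assumes the reading p = (z2 J2)/(z1 J1) and uses the standard coordination numbers of the two lattices (sc: z1 = 6, z2 = 12; bcc: z1 = 8, z2 = 6), which are not stated in the abstract; the function names are illustrative.

```python
# Nearest (z1) and next-nearest (z2) neighbor counts per lattice.
# These are the standard textbook values, not quoted from the abstract.
COORDINATION = {"sc": (6, 12), "bcc": (8, 6)}


def frustration_parameter(lattice, j1, j2):
    """p = z2*J2 / (z1*J1) for the given lattice and couplings."""
    z1, z2 = COORDINATION[lattice]
    return (z2 * j2) / (z1 * j1)


def coupling_ratio(lattice, p):
    """Invert the definition: J2/J1 = p * z1 / z2."""
    z1, z2 = COORDINATION[lattice]
    return p * z1 / z2


# Coupling ratios corresponding to the p values quoted in the abstract.
print("bcc, p = 0.528 -> J2/J1 =", round(coupling_ratio("bcc", 0.528), 3))
print("sc,  p = 0.55  -> J2/J1 =", round(coupling_ratio("sc", 0.55), 3))
```

    Normalizing by the coordination numbers lets the two lattices be compared on one scale, since p weighs the total next-nearest-neighbor coupling per site against the total nearest-neighbor coupling.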

  19. Object Classification in Semi Structured Enviroment Using Forward-Looking Sonar

    Directory of Open Access Journals (Sweden)

    Matheus dos Santos

    2017-09-01

    Submarine exploration using robots has been increasing in recent years. The automation of tasks such as monitoring, inspection, and underwater maintenance requires an understanding of the robot’s environment. Object recognition in the scene is becoming a critical issue for these systems. In this work, an underwater object classification pipeline applied to acoustic images acquired by Forward-Looking Sonar (FLS) is studied. The object segmentation combines thresholding, connected-pixel search and intensity-peak analysis techniques. The object descriptor extracts intensity and geometric features of the detected objects. A comparison between the Support Vector Machine, K-Nearest Neighbors, and Random Trees classifiers is presented. An open-source tool was developed to annotate and classify the objects and evaluate their classification performance. The proposed method efficiently segments and classifies the structures in the scene using a real dataset acquired by an underwater vehicle in a harbor area. Experimental results demonstrate the robustness and accuracy of the method described in this paper.

  20. Emotion detection model of Filipino music

    Science.gov (United States)

    Noblejas, Kathleen Alexis; Isidro, Daryl Arvin; Samonte, Mary Jane C.

    2017-02-01

    This research explored the creation of a model to detect emotion from Filipino songs. The emotion model used was based on Paul Ekman's six basic emotions. The songs were classified into the following genres: kundiman, novelty, pop, and rock. The songs were annotated by a group of music experts based on the emotion the song induces in the listener. Musical features of the songs were extracted using jAudio, while the lyric features were extracted using a Bag-of-Words feature representation. The audio and lyric features of the Filipino songs were extracted for classification by the three chosen classifiers: Naïve Bayes, Support Vector Machines, and k-Nearest Neighbors. The goal of the research was to determine which classifier would work best for Filipino music. Evaluation was done by 10-fold cross validation, and accuracy, precision, recall, and F-measure results were compared. Models were also tested with unknown test data to further determine the models' accuracy through the prediction results.
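
    The 10-fold cross validation mentioned above partitions the data into ten folds and uses each fold once as the test set. The sketch below shows only the index bookkeeping; it is a generic illustration, not the authors' tooling, and the name `k_fold_indices` is hypothetical. It does not shuffle, so callers would normally shuffle the samples first.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_indices, test_indices) for each of n_folds folds.

    Fold sizes differ by at most one sample. No shuffling is done here.
    """
    if not 1 < n_folds <= n_samples:
        raise ValueError("need 1 < n_folds <= n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Every sample lands in exactly one test fold.
for train_idx, test_idx in k_fold_indices(25, n_folds=10):
    print(len(train_idx), len(test_idx))
```

    Accuracy, precision, recall and F-measure would then be computed on each test fold and averaged across the ten folds.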

  1. Granular computing in decision approximation an application of rough mereology

    CERN Document Server

    Polkowski, Lech

    2015-01-01

    This book presents a study in knowledge discovery in data with knowledge understood as a set of relations among objects and their properties. Relations in this case are implicative decision rules and the paradigm in which they are induced is that of computing with granules defined by rough inclusions, the latter introduced and studied within rough mereology, the fuzzified version of mereology. In this book basic classes of rough inclusions are defined and based on them methods for inducing granular structures from data are highlighted. The resulting granular structures are subjected to classifying algorithms, notably k-nearest neighbors and Bayesian classifiers. Experimental results are given in detail both in tabular and visualized form for fourteen data sets from the UCI data repository. A striking feature of granular classifiers obtained by this approach is that, while preserving their accuracy on the original data, they substantially reduce the size of the granulated data set as well as the set of granular...

  2. Geographical traceability of Marsdenia tenacissima by Fourier transform infrared spectroscopy and chemometrics

    Science.gov (United States)

    Li, Chao; Yang, Sheng-Chao; Guo, Qiao-Sheng; Zheng, Kai-Yan; Wang, Ping-Li; Meng, Zhen-Gui

    2016-01-01

    A combination of Fourier transform infrared spectroscopy with chemometrics tools provided an approach for studying Marsdenia tenacissima according to its geographical origin. A total of 128 M. tenacissima samples from four provinces in China were analyzed with FTIR spectroscopy. Six pattern recognition methods were used to construct the discrimination models: support vector machine-genetic algorithms, support vector machine-particle swarm optimization, K-nearest neighbors, radial basis function neural network, random forest and support vector machine-grid search. Experimental results showed that K-nearest neighbors was superior to other mathematical algorithms after data were preprocessed with wavelet de-noising, with a discrimination rate of 100% in both the training and prediction sets. This study demonstrated that FTIR spectroscopy coupled with K-nearest neighbors could be successfully applied to determine the geographical origins of M. tenacissima samples, thereby providing reliable authentication in a rapid, cheap and noninvasive way.
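
    For readers unfamiliar with the K-nearest neighbors method that performed best in this study, the following is a minimal sketch of the basic classifier: find the k training samples closest to the query and take a majority vote of their labels. Euclidean distance, k = 3 and the toy data are assumptions made for illustration; the abstract does not state the distance metric or k used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest samples.

    train_x is a list of equal-length numeric tuples, train_y the labels.
    Euclidean distance is an assumption of this sketch.
    """
    if not 0 < k <= len(train_x):
        raise ValueError("k must be between 1 and the number of samples")
    # Rank training samples by distance to the query (ties keep input order).
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tied vote, the label of the closer neighbor wins.
    return votes.most_common(1)[0][0]


# Toy two-class data, purely illustrative.
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["A", "A", "B", "B"]
print(knn_predict(samples, labels, (0.2, 0.3), k=3))
```

    In a spectroscopic setting, each tuple would hold the preprocessed spectrum of one sample and each label its province of origin.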

  3. Green function study of a mixed spin-3/2 and spin-1/2 Heisenberg ferrimagnetic model

    International Nuclear Information System (INIS)

    Li Jun; Wei Guozhu; Du An

    2004-01-01

    The magnetic properties of a mixed spin-3/2 and spin-1/2 Heisenberg ferrimagnetic system on a square lattice are investigated theoretically by a multisublattice Green-function technique which takes into account the quantum nature of Heisenberg spins. This model can be relevant for understanding the magnetic behavior of the new class of organometallic materials that exhibit spontaneous magnetic moments at room temperature. We discuss the spontaneous magnetic moments and the finite-temperature phase diagram. We find that there is no compensation point at finite temperature when only the nearest-neighbor interaction and the single-ion anisotropy are included. When the next-nearest-neighbor interaction between the spin-1/2 sites is taken into account and exceeds a minimum value, a compensation point appears, and it remains basically unchanged with the other values in the Hamiltonian fixed. The next-nearest-neighbor interaction between the spin-3/2 sites has the effect of changing the compensation temperature.

  4. Neighboring Genes Show Correlated Evolution in Gene Expression

    Science.gov (United States)

    Ghanbarian, Avazeh T.; Hurst, Laurence D.

    2015-01-01

    When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543

  5. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    Science.gov (United States)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are in strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise, multi-level buildings, a three-dimensional (3D) method is required to locate and identify the surrounding information, such as the level at which a franchise unit will be located or whether the unit is at the best level for visibility. One of the analyses commonly used for retrieving the surrounding information is Nearest Neighbour (NN) analysis, which takes a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, efficient retrieval and analysis of nearest neighbour information becomes more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchise units. The results are presented in this paper. Another advantage of this structure is that it offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.
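
    As a point of reference for the query being accelerated, the sketch below answers a 3D k-nearest-neighbour query by linear scan. This is a brute-force baseline only, not the clustered hierarchical tree structure proposed in the paper; the function name and the sample coordinates are illustrative assumptions.

```python
import heapq
import math


def nearest_neighbours_3d(points, query, k=1):
    """Return the k points closest to `query` (Euclidean distance in 3D).

    Linear scan over all points: O(n log k) per query. A spatial index
    such as the tree structure described above avoids visiting every point.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Hypothetical unit locations as (x, y, z); z stands for height in a building.
units = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0), (0.0, 0.0, 2.0)]
print(nearest_neighbours_3d(units, (0.0, 0.0, 1.2), k=2))
```

    With hundreds of thousands of points, the cost of scanning every point on each query is what motivates index structures that prune most of the dataset.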

  6. Reasons patients leave their nearest healthcare service to attend Karen Park Clinic, Pretoria North

    Directory of Open Access Journals (Sweden)

    Agnes T. Masango-Makgobela

    2013-10-01

    Conclusion: The majority of patients who had attended their nearest clinic were adamant that they would not return. It is necessary to reduce waiting times, thus reducing long queues. This can be achieved by having adequate, satisfied healthcare providers to render a quality service and by organising training for management. Patients can thus be redirected to their nearest clinic and the health centre’s capacity can be increased by procuring adequate drugs. There is a need to follow up on patients’ complaints about staff attitudes.

  7. Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

    Science.gov (United States)

    Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun

    2016-04-01

    This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in their frequency-dependent THz extinction coefficients as well as in their refractive indices and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification based solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements from samples with a thickness of 2 mm were classified, then those from samples with thicknesses of 4 mm and 3 mm, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies.

  8. Local randomization in neighbor selection improves PRM roadmap quality

    KAUST Repository

    McMahon, Troy; Jacobs, Sam; Boyd, Bryan; Tapia, Lydia; Amato, Nancy M.

    2012-01-01

    Probabilistic Roadmap Methods (PRMs) are one of the most used classes of motion planning methods. These sampling-based methods generate robot configurations (nodes) and then connect them to form a graph (roadmap) containing representative feasible pathways. A key step in PRM roadmap construction involves identifying a set of candidate neighbors for each node. Traditionally, these candidates are chosen to be the k-closest nodes based on a given distance metric. In this paper, we propose a new neighbor selection policy called LocalRand(k,K'), that first computes the K' closest nodes to a specified node and then selects k of those nodes at random. Intuitively, LocalRand attempts to benefit from random sampling while maintaining the higher levels of local planner success inherent to selecting more local neighbors. We provide a methodology for selecting the parameters k and K'. We perform an experimental comparison which shows that for both rigid and articulated robots, LocalRand results in roadmaps that are better connected than the traditional k-closest policy or a purely random neighbor selection policy. The cost required to achieve these results is shown to be comparable to k-closest. © 2012 IEEE.
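
    The LocalRand(k,K') policy described above can be sketched in a few lines: rank the other nodes by distance, keep the K' closest, then draw k of them uniformly at random. In this sketch, configurations are plain coordinate tuples and the distance metric is Euclidean; both are simplifying assumptions, as are the function and parameter names.

```python
import math
import random


def local_rand(node, candidates, k, k_prime, rng=None):
    """Select k neighbors at random from the k_prime closest candidates.

    node and candidates are numeric tuples; Euclidean distance stands in
    for the planner's distance metric (an assumption of this sketch).
    """
    if not 0 < k <= k_prime:
        raise ValueError("require 0 < k <= k_prime")
    rng = rng or random.Random()
    others = [c for c in candidates if c != node]
    # Step 1: the K' closest nodes to the specified node.
    closest = sorted(others, key=lambda c: math.dist(node, c))[:k_prime]
    # Step 2: choose k of them uniformly at random, without replacement.
    return rng.sample(closest, min(k, len(closest)))


# Ten nodes on a line; pick 2 of the 4 nodes closest to the origin.
nodes = [(float(i), 0.0) for i in range(10)]
print(local_rand((0.0, 0.0), nodes, k=2, k_prime=4, rng=random.Random(0)))
```

    Setting k_prime equal to k recovers the traditional k-closest policy, while setting k_prime to the number of nodes gives purely random selection, so the two parameters interpolate between the policies compared in the abstract.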

  9. Local randomization in neighbor selection improves PRM roadmap quality

    KAUST Repository

    McMahon, Troy

    2012-10-01

    Probabilistic Roadmap Methods (PRMs) are one of the most used classes of motion planning methods. These sampling-based methods generate robot configurations (nodes) and then connect them to form a graph (roadmap) containing representative feasible pathways. A key step in PRM roadmap construction involves identifying a set of candidate neighbors for each node. Traditionally, these candidates are chosen to be the k-closest nodes based on a given distance metric. In this paper, we propose a new neighbor selection policy called LocalRand(k,K'), that first computes the K' closest nodes to a specified node and then selects k of those nodes at random. Intuitively, LocalRand attempts to benefit from random sampling while maintaining the higher levels of local planner success inherent to selecting more local neighbors. We provide a methodology for selecting the parameters k and K'. We perform an experimental comparison which shows that for both rigid and articulated robots, LocalRand results in roadmaps that are better connected than the traditional k-closest policy or a purely random neighbor selection policy. The cost required to achieve these results is shown to be comparable to k-closest. © 2012 IEEE.

  10. Staining pattern classification of antinuclear autoantibodies based on block segmentation in indirect immunofluorescence images.

    Directory of Open Access Journals (Sweden)

    Jiaqian Li

    Indirect immunofluorescence based on HEp-2 cell substrate is the most commonly used staining method for antinuclear autoantibodies associated with different types of autoimmune pathologies. The aim of this paper is to design an automatic system to identify the staining patterns based on block segmentation, in contrast to the cell segmentation most used in previous research. Various feature descriptors and classifiers are tested and compared in the classification of the staining pattern of blocks, and it is found that the combination of the local binary pattern and the k-nearest neighbor algorithm achieves the best performance. Relying on the results of block pattern classification, experiments on the whole images show that classifier fusion rules are able to identify the staining patterns of the whole well (specimen image) with a total accuracy of about 94.62%.
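
    The abstract does not say which fusion rules combine the per-block predictions into a label for the whole image. As one plausible example, the sketch below uses a simple majority vote; this rule, the function name and the pattern labels in the usage line are assumptions made for illustration only.

```python
from collections import Counter


def fuse_block_labels(block_labels):
    """Fuse per-block predictions into one image-level label by majority vote.

    Majority voting is an assumed fusion rule, not necessarily the one
    used in the paper. Ties go to the label that appears first.
    """
    if not block_labels:
        raise ValueError("at least one block prediction is required")
    counts = Counter(block_labels)
    return counts.most_common(1)[0][0]


# Hypothetical block-level predictions for one specimen image.
print(fuse_block_labels(["homogeneous", "speckled", "homogeneous", "homogeneous"]))
```

    A vote of this kind tolerates a minority of misclassified blocks, which is one reason image-level accuracy can exceed block-level accuracy.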

  11. Texture-based analysis of COPD

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Nielsen, Mads; Lo, Pechin Chien Pau

    2012-01-01

    This study presents a fully automatic, data-driven approach for texture-based quantitative analysis of chronic obstructive pulmonary disease (COPD) in pulmonary computed tomography (CT) images. The approach uses supervised learning where the class labels are, in contrast to previous work, based...... on measured lung function instead of on manually annotated regions of interest (ROIs). A quantitative measure of COPD is obtained by fusing COPD probabilities computed in ROIs within the lung fields where the individual ROI probabilities are computed using a k nearest neighbor (kNN ) classifier. The distance...... and subsequently applied to classify 200 independent images from the same screening trial. The texture-based measure was significantly better at discriminating between subjects with and without COPD than were the two most common quantitative measures of COPD in the literature, which are based on density...

  12. Food Powder Classification Using a Portable Visible-Near-Infrared Spectrometer

    Directory of Open Access Journals (Sweden)

    Hanjong You

    2017-10-01

    Full Text Available Visible-near-infrared (VIS-NIR) spectroscopy is a fast and non-destructive method for analyzing materials. However, most commercial VIS-NIR spectrometers are inappropriate for use in various locations such as in homes or offices because of their size and cost. In this paper, we classified eight food powders using a portable VIS-NIR spectrometer with a wavelength range of 450–1,000 nm. We developed three machine learning models using the spectral data for the eight food powders. The proposed three machine learning models (random forest, k-nearest neighbors, and support vector machine) achieved an accuracy of 87%, 98%, and 100%, respectively. Our experimental results showed that the support vector machine model is the most suitable for classifying non-linear spectral data. We demonstrated the potential of material analysis using a portable VIS-NIR spectrometer.

  13. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach.

    Science.gov (United States)

    Own, Chung-Ming; Meng, Zhaopeng; Liu, Kehan

    2015-09-03

    Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable neighbor discovery among neighboring nodes while saving energy, especially where clock synchronization is difficult. Existing research on neighbor-discovery methods falls into two groups: quorum-based protocols and co-primality-based protocols. They differ in how time slots are arranged: the former uses quorums in a matrix, while the latter adopts numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.

  14. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach

    Directory of Open Access Journals (Sweden)

    Chung-Ming Own

    2015-09-01

    Full Text Available Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable neighbor discovery among neighboring nodes while saving energy, especially where clock synchronization is difficult. Existing research on neighbor-discovery methods falls into two groups: quorum-based protocols and co-primality-based protocols. They differ in how time slots are arranged: the former uses quorums in a matrix, while the latter adopts numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.

  15. Accelerating distributed average consensus by exploring the information of second-order neighbors

    Energy Technology Data Exchange (ETDEWEB)

    Yuan Deming [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Xu Shengyuan, E-mail: syxu02@yahoo.com.c [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Zhao Huanyu [School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu (China)]; Chu Yuming [Department of Mathematics, Huzhou Teacher's College, Huzhou 313000, Zhejiang (China)]

    2010-05-17

    The problem of accelerating distributed average consensus by using the information of second-order neighbors in both the discrete- and continuous-time cases is addressed in this Letter. In both cases, when the information of second-order neighbors is used in each iteration, the network converges faster than with the algorithm that uses only the information of first-order neighbors. Moreover, the problem of using partial information of second-order neighbors is considered, and the edges are not chosen randomly from second-order neighbors. In the continuous-time case, the edges are chosen by solving a convex optimization problem which is formed by using the convex relaxation method. In the discrete-time case, for small networks the edges are chosen optimally via the brute force method. Finally, simulation examples are provided to demonstrate the effectiveness of the proposed algorithm.
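
    The speed-up from second-order neighbors in the discrete-time case can be seen in a small simulation. This is only a sketch under stated assumptions: a ring network, the standard update x_i <- x_i + step * sum_j (x_j - x_i), and hand-picked step sizes below 1/degree. The helper names, the topology and the step sizes are illustrative choices, not the Letter's weights or its optimized edge selection.

```python
def ring_neighbors(n, hops):
    """Neighbors of each node on an n-node ring within `hops` steps."""
    return [
        [(i + d) % n for d in range(-hops, hops + 1) if d != 0]
        for i in range(n)
    ]


def consensus_iterations(x0, neighbors, step, tol=1e-6, max_iter=10_000):
    """Run the averaging update until every state is within tol of the mean.

    Returns the number of iterations used and the final states.
    """
    x = list(x0)
    mean = sum(x) / len(x)  # symmetric updates keep the average unchanged
    for t in range(max_iter):
        if max(abs(v - mean) for v in x) < tol:
            return t, x
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return max_iter, x


x0 = [float(i) for i in range(12)]
# First-order neighbors only (degree 2), assumed step 0.25.
t_first, x_first = consensus_iterations(x0, ring_neighbors(12, 1), step=0.25)
# First- and second-order neighbors (degree 4), assumed step 0.2.
t_second, x_second = consensus_iterations(x0, ring_neighbors(12, 2), step=0.2)
```

    On this ring the second-order run needs fewer iterations because the extra edges shrink the second-largest eigenvalue modulus of the update matrix, which governs the convergence rate.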

  16. Forest structure of Mediterranean yew (Taxus baccata L.) populations and neighbor effects on juvenile yew performance in the NE Iberian Peninsula

    Directory of Open Access Journals (Sweden)

    Pere Casals

    2015-12-01

    Full Text Available Aim of study: In the Mediterranean region, yew (Taxus baccata L.) usually grows with other tree species in mixed forests. Yew recruitment and juvenile growth may depend on the structure of the forest and the net balance between competition for soil water and nutrients with neighbors and facilitation that these neighbors exert by protecting the plants from direct sun exposure. This study aims, at a regional scale, to analyze the structure of forests containing yew, and, on an individual level, to analyze the effect of the surrounding vegetation structure on the performance of yew juveniles. Area of study: The structural typologies of yew populations were defined based on field inventories conducted in 55 plots distributed in 14 localities in the North-Eastern (NE) Iberian Peninsula, covering a wide range of yew distribution in the area. In a second step, an analysis of neighboring species' effects on juveniles was conducted based on the data from 103 plots centered in yew juveniles in five localities. Main Results: A cluster analysis classified the inventoried stands into four forest structural types: two multi-stratified forests with scattered yew and two yew groves. Multiple regression modeling showed that the δ13C measured in last year's leaves positively relates to the basal area of conifer neighbors, but negatively with the cover of the yew crown by other trees. Research highlights: At a stand-level, the density of recruits and juveniles (625 ± 104 recruits ha-1, 259 ± 55 juveniles ha-1) in mixed forests was found to be higher than that on yew dominant stands (181 ± 88 recruits ha-1 and 57 ± 88 juveniles ha-1). At an individual-level, the water stress (estimated from leaf δ13C) of yew juveniles seems alleviated by the crown cover by neighbors while it increases with the basal area of conifers. Yew conservation should focus on selective felling for the reduction of basal area of neighbors surrounding the target tree, but avoid affecting the

  17. Theoretical study of the electronic and magnetic properties of β-TeVO4

    Science.gov (United States)

    Saul, Andres; Radtke, Guillaume

    2014-03-01

    The β phase of this compound can be described by zigzag chains formed by VO5 distorted square pyramids sharing corners. This oxide, with V4+ ions as magnetic centers, can be thus seen as a realization of a quasi-one-dimensional Heisenberg S=1/2 Hamiltonian. The corner-sharing of the VO5 pyramids could lead to the prediction of AFM nearest neighbor interactions mediated by a weak super-exchange mechanism opening the possibility of complex magnetic properties due to competing next nearest-neighbors or inter-chain interactions. In this work we have studied its electronic and magnetic properties using density functional calculations. In particular, we evaluated the magnetic couplings on the basis of broken-symmetry formalism. We have performed extensive calculations comparing the results of the standard GGA (PBE) functional to the hybrid PBE0 functional and two different GGA+U implementations (SIC and AMF). The overall picture that arises from our calculations is of a frustrated AFM system with small FM nearest neighbors interactions but larger AFM nearest neighbors couplings. We discuss our results in the framework of the Kugel-Khomskii model using a projection of the electronic structure in localized Wannier functions.

  18. Classified one-step high-radix signed-digit arithmetic units

    Science.gov (United States)

    Cherri, Abdallah K.

    1998-08-01

    High-radix number systems enable higher information storage density, less complexity, fewer system components, and fewer cascaded gates and operations. A simple one-step fully parallel high-radix signed-digit arithmetic is proposed for parallel optical computing based on new joint spatial encodings. This reduces hardware requirements and improves throughput by reducing the space-bandwidth product needed. The high-radix signed-digit arithmetic operations are based on classifying the neighboring input digit pairs into various groups to reduce the computation rules. A new joint spatial encoding technique is developed to present both the operands and the computation rules. This technique increases the spatial bandwidth product of the spatial light modulators of the system. An optical implementation of the proposed high-radix signed-digit arithmetic operations is also presented. It is shown that our one-step trinary signed-digit and quaternary signed-digit arithmetic units are much simpler and better than all previously reported high-radix signed-digit techniques.

  19. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing all the history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  20. Classification in medical images using adaptive metric k-NN

    Science.gov (United States)

    Chen, C.; Chernoff, K.; Karemore, G.; Lo, P.; Nielsen, M.; Lauze, F.

    2010-03-01

    The performance of the k-nearest neighbors (k-NN) classifier is highly dependent on the distance metric used to identify the k nearest neighbors of the query points. The standard Euclidean distance is commonly used in practice. This paper investigates the performance of the k-NN classifier with respect to different adaptive metrics in the context of medical imaging. We propose using adaptive metrics such that the structure of the data is better described, introducing some unsupervised learning knowledge in k-NN. Four different metrics are investigated: a theoretical metric based on the assumption that images are drawn from the Brownian Image Model (BIM), a normalized metric based on the variance of the data, an empirical metric based on the empirical covariance matrix of the unlabeled data, and an optimized metric obtained by minimizing the classification error. The spectral structure of the empirical covariance also leads to Principal Component Analysis (PCA) performed on it, which results in the subspace metrics. The metrics are evaluated on two data sets: lateral X-rays of the lumbar aortic/spine region, where we use k-NN for performing abdominal aorta calcification detection; and mammograms, where we use k-NN for breast cancer risk assessment. The results show that appropriate choice of metric can improve classification.
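
    How the choice of metric changes a k-NN decision can be shown with a toy sketch of the variance-normalized case only. It assumes a diagonal metric that divides each squared feature difference by that feature's population variance; the function names and the six training points are hypothetical and unrelated to the paper's imaging data.

```python
from collections import Counter
from statistics import pvariance


def knn_predict(train, labels, query, k, scale=None):
    """Majority vote among the k nearest training points.

    `scale` holds one positive divisor per feature for the squared
    differences; None means plain Euclidean distance.
    """
    if scale is None:
        scale = [1.0] * len(query)

    def dist2(p):
        return sum((a - b) ** 2 / s for a, b, s in zip(p, query, scale))

    nearest = sorted(range(len(train)), key=lambda i: dist2(train[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def variance_scale(train):
    """Per-feature population variance, used as the normalizing metric."""
    return [pvariance(col) for col in zip(*train)]


# Hypothetical data: feature 0 separates the classes, feature 1 is
# large-scale noise that dominates the unnormalized distance.
train = [(0.0, 100), (0.1, 300), (0.2, 500), (1.0, 110), (1.1, 290), (0.9, 510)]
labels = ["A", "A", "A", "B", "B", "B"]
query = (0.15, 120)

euclidean_vote = knn_predict(train, labels, query, k=3)
normalized_vote = knn_predict(train, labels, query, k=3, scale=variance_scale(train))
```

    With plain Euclidean distance the noisy second feature decides the vote (class B); after variance normalization the informative first feature carries comparable weight and the vote becomes class A.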

  1. The electronic structures and ferromagnetism of Fe-doped GaSb: The first-principle calculation study

    Science.gov (United States)

    Lin, Xue-ling; Niu, Cao-ping; Pan, Feng-chun; Chen, Huan-ming; Wang, Xu-ming

    2017-09-01

    The electronic structures and the magnetic properties of Fe-doped GaSb have been investigated by first-principles calculations within the framework of the generalized gradient approximation (GGA) and GGA+U schemes. The calculated results indicated that Fe atoms preferentially form anti-ferromagnetic (AFM) coupling at nearest-neighbor positions. Compared with the anti-ferromagnetic coupling, the ferromagnetic interactions occurring at the second and third nearest-neighbor sites are energetically more favorable. The effect of strong electron correlation in the Fe d orbitals on the magnetic properties, as predicted by the GGA+U approach, showed that the ferromagnetic (FM) coupling between the Fe ions is even stronger when the strong electron correlation effect is taken into account. The ferromagnetism in the Fe-doped GaSb system predicted by our investigation implies that doping Fe into GaSb can be a vital route for manufacturing FM semiconductors with higher Curie temperature.

  2. Beyond formal groups: neighboring acts and watershed protection in Appalachia

    Directory of Open Access Journals (Sweden)

    Heather Lukacs

    2016-09-01

    Full Text Available This paper explores how watershed organizations in Appalachia have persisted in addressing water quality issues in areas with a history of coal mining. We identified two watershed groups that have taken responsibility for restoring local creeks that were previously highly degraded and sporadically managed. These watershed groups represent cases of self-organized commons governance in resource-rich, economically poor Appalachian communities. We describe the extent and characteristics of links between watershed group volunteers and watershed residents who are not group members. Through surveys, participant observation, and key-informant consultation, we found that neighbors – group members as well as non-group-members – supported the group's function through informal neighboring acts. Past research has shown that local commons governance institutions benefit from being nested in supportive external structures. We found that the persistence and success of community watershed organizations depends on the informal participation of local residents, affirming the necessity of looking beyond formal, organized groups to understand the resources, expertise, and information needed to address complex water pollution at the watershed level. Our findings augment the concept of nestedness in commons governance to include that of a formal organization acting as a neighbor that exchanges informal neighboring acts with local residents. In this way, we extend the concept of neighboring to include interactions between individuals and a group operating in the same geographic area.

  3. Empirical mode decomposition and k-nearest embedding vectors for timely analyses of antibiotic resistance trends.

    Science.gov (United States)

    Teodoro, Douglas; Lovis, Christian

    2013-01-01

    Antibiotic resistance is a major worldwide public health concern. In clinical settings, timely antibiotic resistance information is key for care providers as it allows appropriate targeted treatment or improved empirical treatment when the specific results of the patient are not yet available. To improve antibiotic resistance trend analysis algorithms by building a novel, fully data-driven forecasting method from the combination of trend extraction and machine learning models for enhanced biosurveillance systems. We investigate a robust model for extraction and forecasting of antibiotic resistance trends using a decade of microbiology data. Our method consists of breaking down the resistance time series into independent oscillatory components via the empirical mode decomposition technique. The resulting waveforms describing intrinsic resistance trends serve as the input for the forecasting algorithm. The algorithm applies the delay coordinate embedding theorem together with the k-nearest neighbor framework to project mappings from past events into the future dimension and estimate the resistance levels. The algorithms that decompose the resistance time series and filter out high frequency components showed statistically significant performance improvements in comparison with a benchmark random walk model. We present further qualitative use-cases of antibiotic resistance trend extraction, where empirical mode decomposition was applied to highlight the specificities of the resistance trends. The decomposition of the raw signal was found not only to yield valuable insight into the resistance evolution, but also to produce novel models of resistance forecasters with boosted prediction performance, which could be utilized as a complementary method in the analysis of antibiotic resistance trends.
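
    The forecasting step, delay coordinate embedding combined with a k-nearest neighbor lookup, can be sketched in a few lines. The sketch assumes a unit delay, squared Euclidean distance between delay vectors, and an unweighted average of the neighbors' successors; the function name and parameters are hypothetical, and the empirical mode decomposition stage that precedes forecasting in the paper is omitted.

```python
def knn_forecast(series, dim, k):
    """One-step forecast from the k past delay vectors closest to the latest one.

    Each delay vector is a window of `dim` consecutive values; the forecast is
    the mean of the values that followed the k most similar past windows.
    """
    n = len(series)
    query = series[n - dim:]
    candidates = []
    for i in range(n - dim):  # only windows that have a known successor
        window = series[i:i + dim]
        d2 = sum((a - b) ** 2 for a, b in zip(window, query))
        candidates.append((d2, series[i + dim]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# A strictly periodic toy series: the pattern (2, 3) has always been followed by 0.
periodic = [0, 1, 2, 3] * 5
next_value = knn_forecast(periodic, dim=2, k=2)
```

    In the setting described above, such a forecaster would be applied to the smoothed components rather than the raw series, since nearest-neighbor lookups are sensitive to high-frequency noise in the delay vectors.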

  4. A predictive toxicogenomics signature to classify genotoxic versus non-genotoxic chemicals in human TK6 cells

    Directory of Open Access Journals (Sweden)

    Andrew Williams

    2015-12-01

    Full Text Available Genotoxicity testing is a critical component of chemical assessment. The use of integrated approaches in genetic toxicology, including the incorporation of gene expression data to determine the DNA damage response pathways involved in response, is becoming more common. In companion papers previously published in Environmental and Molecular Mutagenesis, Li et al. (2015) [6] developed a dose optimization protocol that was based on evaluating expression changes in several well-characterized stress-response genes using quantitative real-time PCR in human lymphoblastoid TK6 cells in culture. This optimization approach was applied to the analysis of TK6 cells exposed to one of 14 genotoxic or 14 non-genotoxic agents, with sampling 4 h post-exposure. Microarray-based transcriptomic analyses were then used to develop a classifier for genotoxicity using the nearest shrunken centroids method. A panel of 65 genes was identified that could accurately classify toxicants as genotoxic or non-genotoxic. In Buick et al. (2015) [1], the utility of the biomarker for chemicals that require metabolic activation was evaluated. In this study, TK6 cells were exposed to increasing doses of four chemicals (two genotoxic that require metabolic activation and two non-genotoxic chemicals) in the presence of rat liver S9 to demonstrate that S9 does not impair the ability to classify genotoxicity using this genomic biomarker in TK6 cells.

  5. Plant neighbor identity influences plant biochemistry and physiology related to defense.

    Science.gov (United States)

    Broz, Amanda K; Broeckling, Corey D; De-la-Peña, Clelia; Lewis, Matthew R; Greene, Erick; Callaway, Ragan M; Sumner, Lloyd W; Vivanco, Jorge M

    2010-06-17

    Chemical and biological processes dictate an individual organism's ability to recognize and respond to other organisms. A small but growing body of evidence suggests that plants may be capable of recognizing and responding to neighboring plants in a species specific fashion. Here we tested whether or not individuals of the invasive exotic weed, Centaurea maculosa, would modulate their defensive strategy in response to different plant neighbors. In the greenhouse, C. maculosa individuals were paired with either conspecific (C. maculosa) or heterospecific (Festuca idahoensis) plant neighbors and elicited with the plant defense signaling molecule methyl jasmonate to mimic insect herbivory. We found that elicited C. maculosa plants grown with conspecific neighbors exhibited increased levels of total phenolics, whereas those grown with heterospecific neighbors allocated more resources towards growth. To further investigate these results in the field, we conducted a metabolomics analysis to explore chemical differences between individuals of C. maculosa growing in naturally occurring conspecific and heterospecific field stands. Similar to the greenhouse results, C. maculosa individuals accumulated higher levels of defense-related secondary metabolites and lower levels of primary metabolites when growing in conspecific versus heterospecific field stands. Leaf herbivory was similar in both stand types; however, a separate field study positively correlated specialist herbivore load with higher densities of C. maculosa conspecifics. Our results suggest that an individual C. maculosa plant can change its defensive strategy based on the identity of its plant neighbors. This is likely to have important consequences for individual and community success.

  6. Does a pear growl? Interference from semantic properties of orthographic neighbors.

    Science.gov (United States)

    Pecher, Diane; de Rooij, Jimmy; Zeelenberg, René

    2009-07-01

    In this study, we investigated whether semantic properties of a word's orthographic neighbors are activated during visual word recognition. In two experiments, words were presented with a property that was not true for the word itself. We manipulated whether the property was true for an orthographic neighbor of the word. Our results showed that rejection of the property was slower and less accurate when the property was true for a neighbor than when the property was not true for a neighbor. These findings indicate that semantic information is activated before orthographic processing is finished. The present results are problematic for the links model (Forster, 2006; Forster & Hector, 2002) that was recently proposed in order to bring form-first models of visual word recognition into line with previously reported findings (Forster & Hector, 2002; Pecher, Zeelenberg, & Wagenmakers, 2005; Rodd, 2004).

  7. A distance weighted-based approach for self-organized aggregation in robot swarms

    KAUST Repository

    Khaldi, Belkacem

    2017-12-14

    In this paper, a Distance-Weighted K Nearest Neighboring (DW-KNN) topology is proposed to study self-organized aggregation as an emergent swarming behavior within robot swarms. A virtual physics approach is applied within the proposed neighborhood topology to keep the robots together. A distance-weighted function based on a Smoothed Particle Hydrodynamic (SPH) interpolation approach is used as a key factor to identify the K-Nearest neighbors taken into account when aggregating the robots. The intra virtual physical connectivity among these neighbors is achieved using a virtual viscoelastic-based proximity model. With the ARGoS-based simulator, we model and evaluate the proposed approach, showing various self-organized aggregations performed by a swarm of N foot-bot robots.

  8. REPTREE CLASSIFIER FOR IDENTIFYING LINK SPAM IN WEB SEARCH ENGINES

    Directory of Open Access Journals (Sweden)

    S.K. Jayanthi

    2013-01-01

    Full Text Available Search engines are used for retrieving information from the web. Most of the time, importance is placed on the top 10 results, and sometimes only the top 5, because of time constraints and reliance on the search engines. Users believe that the top 10 or 5 results are the most relevant. Here arises the problem of spamdexing, a method of degrading search result quality. Falsified metrics, such as inserting an enormous number of keywords or links in a website, may take that website to the top 10 or 5 positions. This paper proposes a classifier based on the Reptree (Regression tree representative). As an initial step, link-based features such as neighbors, pagerank, truncated pagerank, trustrank and assortativity-related attributes are inferred. Based on these features, a tree is constructed. The tree uses the feature inference to differentiate spam sites from legitimate sites. The WEBSPAM-UK-2007 dataset is taken as a base. It is preprocessed and converted into five datasets: FEATA, FEATB, FEATC, FEATD and FEATE. Only link-based features are used in the experiments, as this paper focuses on link spam alone. Finally, a representative tree is created that classifies the web spam entries more precisely. Experiments indicate that regression tree classification performs well.

  9. Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection

    Science.gov (United States)

    Safi’ie, M. A.; Utami, E.; Fatta, H. A.

    2018-03-01

    Universitas Sebelas Maret has a teaching staff of more than 1,500 people, one of whose tasks is to carry out research. However, funding support for research and community service is limited, so research and community service (P2M) proposal submissions need to be evaluated. At the selection stage, research proposal documents are collected as unstructured data, and the volume of stored data is very large. Extracting the information contained in these documents requires text mining technology, which is applied to gain knowledge from the documents by automating information extraction. In this article we use Latent Dirichlet Allocation (LDA) as the model in the feature extraction process, to obtain the terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on their terms.

  10. Boosting nearest-neighbour to long-range integrable spin chains

    International Nuclear Information System (INIS)

    Bargheer, Till; Beisert, Niklas; Loebbert, Florian

    2008-01-01

    We present an integrability-preserving recursion relation for the explicit construction of long-range spin chain Hamiltonians. These chains are generalizations of the Haldane–Shastry and Inozemtsev models and they play an important role in recent advances in string/gauge duality. The method is based on arbitrary nearest-neighbour integrable spin chains and it sheds light on the moduli space of deformation parameters. We also derive the closed chain asymptotic Bethe equations. (letter)

  11. Classifying Microorganisms

    DEFF Research Database (Denmark)

    Sommerlund, Julie

    2006-01-01

    This paper describes the coexistence of two systems for classifying organisms and species: a dominant genetic system and an older naturalist system. The former classifies species and traces their evolution on the basis of genetic characteristics, while the latter employs physiological characteristics. The coexistence of the classification systems does not lead to a conflict between them. Rather, the systems seem to co-exist in different configurations, through which they are complementary, contradictory and inclusive in different situations, sometimes simultaneously. The systems come...

  12. Detect thy neighbor: Identity recognition at the root level in plants

    NARCIS (Netherlands)

    Chen, B.J.W.; During, H.J.; Anten, N.P.R.

    2012-01-01

    Some plant species increase root allocation at the expense of reproduction in the presence of non-self and non-kin neighbors, indicating the capacity of neighbor-identity recognition at the root level. Yet in spite of the potential consequences of root identity recognition for the relationship between

  13. Neighboring trees affect ectomycorrhizal fungal community composition in a woodland-forest ecotone.

    Science.gov (United States)

    Hubert, Nathaniel A; Gehring, Catherine A

    2008-09-01

    Ectomycorrhizal fungi (EMF) are frequently species rich and functionally diverse; yet, our knowledge of the environmental factors that influence local EMF diversity and species composition remains poor. In particular, little is known about the influence of neighboring plants on EMF community structure. We tested the hypothesis that the EMF of plants with heterospecific neighbors would differ in species richness and community composition from the EMF of plants with conspecific neighbors. We conducted our study at the ecotone between pinyon (Pinus edulis)-juniper (Juniperus monosperma) woodland and ponderosa pine (Pinus ponderosa) forest in northern Arizona, USA where the dominant trees formed associations with either EMF (P. edulis and P. ponderosa) or arbuscular mycorrhizal fungi (AMF; J. monosperma). We also compared the EMF communities of pinyon and ponderosa pines where their rhizospheres overlapped. The EMF community composition, but not species richness of pinyon pines was significantly influenced by neighboring AM juniper, but not by neighboring EM ponderosa pine. Ponderosa pine EMF communities were different in species composition when growing in association with pinyon pine than when growing in association with a conspecific. The EMF communities of pinyon and ponderosa pines were similar where their rhizospheres overlapped consisting of primarily the same species in similar relative abundance. Our findings suggest that neighboring tree species identity shaped EMF community structure, but that these effects were specific to host-neighbor combinations. The overlap in community composition between pinyon pine and ponderosa pine suggests that these tree species may serve as reservoirs of EMF inoculum for one another.

  14. Reduction in predator defense in the presence of neighbors in a colonial fish.

    Directory of Open Access Journals (Sweden)

    Franziska C Schädelin

    Full Text Available Predation pressure has long been considered a leading explanation of colonies, where close neighbors may reduce predation via dilution, alarming or group predator attacks. Attacking predators may be costly in terms of energy and survival, leading to the question of how neighbors contribute to predator deterrence in relationship to each other. Two hypotheses explaining the relative efforts made by neighbors are byproduct-mutualism, which occurs when breeders inadvertently attack predators by defending their nests, and reciprocity, which occurs when breeders deliberately exchange predator defense efforts with neighbors. Most studies investigating group nest defense have been performed with birds. However, colonial fish may constitute a more practical model system for an experimental approach because of the greater ability of researchers to manipulate their environment. We investigated in the colonial fish, Neolamprologus caudopunctatus, whether prospecting pairs preferred to breed near conspecifics or solitarily, and how breeders invested in anti-predator defense in relation to neighbors. In a simple choice test, prospecting pairs selected breeding sites close to neighbors versus a solitary site. Predators were then sequentially presented to the newly established test pairs, the previously established stimulus pairs or in between the two pairs. Test pairs attacked the predator eight times more frequently when they were presented on their non-neighbor side compared to between the two breeding sites, where stimulus pairs maintained high attack rates. Thus, by joining an established pair, test pairs were able to reduce their anti-predator efforts near neighbors, at no apparent cost to the stimulus pairs. These findings are unlikely to be explained by reciprocity or byproduct-mutualism. Our results instead suggest a commensal relationship in which new pairs exploit the high anti-predator efforts of established pairs, which invest similarly with or

  15. Plant neighbor identity influences plant biochemistry and physiology related to defense

    Directory of Open Access Journals (Sweden)

    Callaway Ragan M

    2010-06-01

    Full Text Available Abstract Background Chemical and biological processes dictate an individual organism's ability to recognize and respond to other organisms. A small but growing body of evidence suggests that plants may be capable of recognizing and responding to neighboring plants in a species-specific fashion. Here we tested whether individuals of the invasive exotic weed, Centaurea maculosa, would modulate their defensive strategy in response to different plant neighbors. Results In the greenhouse, C. maculosa individuals were paired with either conspecific (C. maculosa) or heterospecific (Festuca idahoensis) plant neighbors and elicited with the plant defense signaling molecule methyl jasmonate to mimic insect herbivory. We found that elicited C. maculosa plants grown with conspecific neighbors exhibited increased levels of total phenolics, whereas those grown with heterospecific neighbors allocated more resources towards growth. To further investigate these results in the field, we conducted a metabolomics analysis to explore chemical differences between individuals of C. maculosa growing in naturally occurring conspecific and heterospecific field stands. Similar to the greenhouse results, C. maculosa individuals accumulated higher levels of defense-related secondary metabolites and lower levels of primary metabolites when growing in conspecific versus heterospecific field stands. Leaf herbivory was similar in both stand types; however, a separate field study positively correlated specialist herbivore load with higher densities of C. maculosa conspecifics. Conclusions Our results suggest that an individual C. maculosa plant can change its defensive strategy based on the identity of its plant neighbors. This is likely to have important consequences for individual and community success.

  16. 3D NEAREST NEIGHBOUR SEARCH USING A CLUSTERED HIERARCHICAL TREE STRUCTURE

    Directory of Open Access Journals (Sweden)

    A. Suhaibah

    2016-06-01

    Full Text Available Locating and analysing sites for new stores or outlets is a common issue facing retailers and franchisers. The aim is to ensure that newly opened stores are strategically located to attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise, multi-level buildings, a three-dimensional (3D) method is required to locate and identify surrounding information, such as the level at which a franchise unit will be located, or whether a unit is located at the best level for visibility. One of the analyses commonly used for retrieving surrounding information is Nearest Neighbour (NN) analysis, which takes a point location and identifies its surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information becomes more complex, and its efficiency more crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  17. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
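
    As a rough illustration of the nearest-neighbor idea in this abstract (not the authors' R code), an individual probability for a binary response can be estimated as the mean 0/1 response among the k nearest training points. The function name `knn_probability`, the toy data, and the values of k are assumptions made only for this sketch.

```python
import math

def knn_probability(train_x, train_y, query, k=3):
    """Estimate P(y = 1 | query) as the mean 0/1 response of the k nearest training points."""
    k = min(k, len(train_x))
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical toy data: two well-separated groups with responses 0 and 1.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

p_low = knn_probability(train_x, train_y, (0.05, 0.05), k=3)
p_high = knn_probability(train_x, train_y, (1.0, 0.95), k=3)
p_all = knn_probability(train_x, train_y, (0.5, 0.5), k=6)
```

    With k equal to the full sample size the estimate collapses to the overall event rate, which is one way to see why k has to grow more slowly than the sample size for such an estimator to be useful.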

  18. Quantum Correlation Properties in Composite Parity-Conserved Matrix Product States

    Science.gov (United States)

    Zhu, Jing-Min

    2016-09-01

    We present a new approach for constructing long-range quantum correlation in quantum many-body systems. Our proposed composite parity-conserved matrix product state has long-range quantum correlation only for two spin blocks whose spin-block length is larger than 1, whereas each of its subsystems has only short-range quantum correlation, and we investigate how the quantum correlation properties of two spin blocks vary with the environment parameter and the number of spacing spins. We also find that the geometry quantum discords of two nearest-neighbor spin blocks and of two next-nearest-neighbor spin blocks become smaller than in any subcomponent, while for other conditions the geometry quantum discord becomes larger; i.e., the increase or production of long-range quantum correlation comes at the cost of reducing short-range quantum correlation, whereas the corresponding classical correlation and total correlation show no such regular behavior. For nearest-neighbor and next-nearest-neighbor blocks, all the correlations take their maximal values at the same points, while for other conditions, whether for the same or for different numbers of spacing spins, the correlations take their maximal values at different points that are very close to one another. We believe that our work helps to give a comprehensive and deep understanding of the organization and structure of quantum correlation, especially long-range quantum correlation, in quantum many-body systems, and further helps with the classification, description and measurement of quantum correlation in such systems.

  19. A multilevel-skin neighbor list algorithm for molecular dynamics simulation

    Science.gov (United States)

    Zhang, Chenglong; Zhao, Mingcan; Hou, Chaofeng; Ge, Wei

    2018-01-01

    Searching for interaction pairs and organizing the interaction processes are important steps in molecular dynamics (MD) algorithms and are critical to the overall efficiency of the simulation. Neighbor lists are widely used for these steps: a thicker skin reduces the frequency of list updates, but this gain is offset by more computation in the distance checks on particle pairs. In this paper, we propose a new neighbor-list-based algorithm with a precisely designed multilevel skin which can reduce unnecessary computation of inter-particle distances. The performance advantages over traditional methods are then analyzed against the main simulation parameters on Intel CPUs and MICs (many integrated cores), and are clearly demonstrated. The algorithm can be generalized for various discrete simulations using neighbor lists.
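
    The trade-off described above can be sketched with a conventional single-skin neighbor list; this is not the multilevel-skin algorithm of the paper, whose details are not given in the abstract. The function names, the brute-force list construction, and the half-skin rebuild criterion are assumptions chosen for illustration.

```python
import math

def build_neighbor_list(positions, cutoff, skin):
    """Return pairs (i, j) with i < j whose distance is below cutoff + skin."""
    r_list = cutoff + skin
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < r_list:
                pairs.append((i, j))
    return pairs

def needs_rebuild(positions, reference, skin):
    """The list stays valid while no particle has moved more than skin / 2 since it was built."""
    max_disp = max(math.dist(p, q) for p, q in zip(positions, reference))
    return max_disp > skin / 2

def interacting_pairs(positions, pairs, cutoff):
    """Distance check restricted to listed pairs; a thicker skin means more pairs to check here."""
    return [(i, j) for i, j in pairs if math.dist(positions[i], positions[j]) < cutoff]

# Hypothetical configuration of four particles on a line.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.3, 0.0, 0.0), (5.0, 0.0, 0.0)]
cutoff, skin = 1.0, 0.4
pairs = build_neighbor_list(positions, cutoff, skin)
active = interacting_pairs(positions, pairs, cutoff)

# Small and large displacements of particle 2 relative to the build-time positions.
moved_small = positions[:2] + [(1.2, 0.0, 0.0)] + positions[3:]
moved_large = positions[:2] + [(1.6, 0.0, 0.0)] + positions[3:]
```

    In this toy case three pairs are listed but only one lies inside the cutoff, which is the extra distance-check work that a thicker skin buys in exchange for less frequent rebuilds.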

  20. Effects of second neighbor interactions on skyrmion lattices in chiral magnets

    International Nuclear Information System (INIS)

    Oliveira, E A S; Silva, R L; Silva, R C; Pereira, A R

    2017-01-01

    In this paper we investigate the influence of second neighbor interactions on a skyrmion lattice in two-dimensional chiral magnets. Such a system contains exchange and Dzyaloshinskii–Moriya spin interactions, and we therefore analyse three situations: first, the second neighbor interaction is present only in the exchange coupling; second, it is present only in the Dzyaloshinskii–Moriya coupling; and finally, the second neighbor interactions are present in both the exchange and Dzyaloshinskii–Moriya couplings. We show that such effects cause important modifications to the helical and skyrmion phases when an external magnetic field is applied. (paper)

  1. Improving Fraudster Detection in Online Auctions by Using Neighbor-Driven Attributes

    Directory of Open Access Journals (Sweden)

    Jun-Lin Lin

    2015-12-01

    Full Text Available Online auction websites use a simple reputation system to help their users evaluate the trustworthiness of sellers and buyers. However, fraudulent users can easily deceive the reputation system and inflate their reputation by creating fake transactions. This inflated-reputation fraud poses a major problem for online auction websites because it can lead legitimate users into scams. Numerous approaches have been proposed in the literature to address this problem, most of which involve using social network analysis (SNA) to derive critical features (e.g., k-core, center weight, and neighbor diversity) for distinguishing fraudsters from legitimate users. This paper discusses the limitations of these SNA features and proposes a class of SNA features referred to as neighbor-driven attributes (NDAs). The NDAs of users are calculated from the features of their neighbors. Because fraudsters require collusive neighbors to provide them with positive ratings in the reputation system, using NDAs can be helpful for detecting fraudsters. Although the idea of NDAs is not entirely new, experimental results on a real-world dataset showed that using NDAs improves classification accuracy compared with state-of-the-art methods that use the k-core, center weight, and neighbor diversity.
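
    The statement that NDAs are calculated from the features of a user's neighbors can be sketched as follows. The abstract does not say how neighbor features are aggregated, so the mean used here, the function name `neighbor_driven_attribute`, and the toy transaction graph are all assumptions for illustration.

```python
def neighbor_driven_attribute(graph, feature):
    """For every user, average a per-user feature over that user's neighbors (0.0 if isolated)."""
    nda = {}
    for user, neighbors in graph.items():
        values = [feature[n] for n in neighbors]
        nda[user] = sum(values) / len(values) if values else 0.0
    return nda

# Hypothetical undirected transaction graph: user -> set of trading partners.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}

# Degree stands in for an SNA feature such as k-core or center weight.
degree = {user: len(neighbors) for user, neighbors in graph.items()}
nda_degree = neighbor_driven_attribute(graph, degree)
```

    Users "b" and "c" have the same own degree as each other but inherit a higher neighbor-driven value from the well-connected user "a", which is the kind of signal a classifier could use alongside the original features.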

  2. Interacting steps with finite-range interactions: Analytical approximation and numerical results

    Science.gov (United States)

    Jaramillo, Diego Felipe; Téllez, Gabriel; González, Diego Luis; Einstein, T. L.

    2013-05-01

    We calculate an analytical expression for the terrace-width distribution P(s) for an interacting step system with nearest- and next-nearest-neighbor interactions. Our model is derived by mapping the step system onto a statistically equivalent one-dimensional system of classical particles. The validity of the model is tested with several numerical simulations and experimental results. We explore the effect of the range of interactions q on the functional form of the terrace-width distribution and pair correlation functions. For physically plausible interactions, we find modest changes when next-nearest neighbor interactions are included and generally negligible changes when more distant interactions are allowed. We discuss methods for extracting from simulated experimental data the characteristic scale-setting terms in assumed potential forms.

  3. Effects of temperature on domain-growth kinetics of fourfold-degenerate (2×1) ordering in Ising models

    DEFF Research Database (Denmark)

    Høst-Madsen, Anders; Shah, Peter Jivan; Hansen, Torben

    1987-01-01

    Computer-simulation techniques are used to study the domain-growth kinetics of (2×1) ordering in a two-dimensional Ising model with nonconserved order parameter and with variable ratio α of next-nearest- and nearest-neighbor interactions. At zero temperature, persistent growth characterized...

  4. Effective-field theory of the Ising model with three alternative layers on the honeycomb and square lattices

    Energy Technology Data Exchange (ETDEWEB)

    Deviren, Bayram [Institute of Science, Erciyes University, Kayseri 38039 (Turkey); Canko, Osman [Department of Physics, Erciyes University, Kayseri 38039 (Turkey); Keskin, Mustafa [Department of Physics, Erciyes University, Kayseri 38039 (Turkey)], E-mail: keskin@erciyes.edu.tr

    2008-09-15

    The Ising model with three alternative layers on the honeycomb and square lattices is studied by using the effective-field theory with correlations. We consider that the nearest-neighbor spins of each layer are coupled ferromagnetically and the adjacent spins of the nearest-neighbor layers are coupled either ferromagnetically or anti-ferromagnetically depending on the sign of the bilinear exchange interactions. We investigate the thermal variations of the magnetizations and present the phase diagrams. The phase diagrams contain the paramagnetic, ferromagnetic and anti-ferromagnetic phases, and the system also exhibits a tricritical behavior.

  5. Effective-field theory of the Ising model with three alternative layers on the honeycomb and square lattices

    International Nuclear Information System (INIS)

    Deviren, Bayram; Canko, Osman; Keskin, Mustafa

    2008-01-01

    The Ising model with three alternative layers on the honeycomb and square lattices is studied by using the effective-field theory with correlations. We consider that the nearest-neighbor spins of each layer are coupled ferromagnetically and the adjacent spins of the nearest-neighbor layers are coupled either ferromagnetically or anti-ferromagnetically depending on the sign of the bilinear exchange interactions. We investigate the thermal variations of the magnetizations and present the phase diagrams. The phase diagrams contain the paramagnetic, ferromagnetic and anti-ferromagnetic phases, and the system also exhibits a tricritical behavior

  6. NMR evidence of a gapless chiral phase in the S=1 zigzag antiferromagnet CaV2O4

    International Nuclear Information System (INIS)

    Fukushima, Hiroyuki; Kikuchi, Hikomitsu; Chiba, Meiro; Fujii, Yutaka; Yamamoto, Yoshiyuki; Hori, Hidenobu

    2002-01-01

    We have performed magnetic susceptibility and 51V NMR experiments with CaV2O4, a model substance for a frustrated S=1 spin chain with competing nearest neighbor (NN) and next-nearest neighbor (NNN) antiferromagnetic interactions. We report on the analysis of the magnetic susceptibility and the 51V NMR experiments suggesting a gapless nature of CaV2O4. The absence of a spin gap is in clear contrast to the case of non-frustrated spin chains, which usually have a Haldane gap. (author)

  7. bufferkdtree

    DEFF Research Database (Denmark)

    Gieseke, Fabian Cristian; Oancea, Cosmin Eugen; Igel, Christian

    2017-01-01

    The bufferkdtree package is an open-source software that provides an efficient implementation for processing huge amounts of nearest neighbor queries in Euclidean spaces of moderate dimensionality. Its underlying implementation resorts to a variant of the classical k-d tree data structure, called...... buffer k-d tree, which can be used to efficiently perform bulk nearest neighbor searches on modern many-core devices. The package, which is based on Python, C, and OpenCL, is made publicly available online at https://github.com/gieseke/bufferkdtree under the GPLv2 license....
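
    For orientation, nearest neighbor search with the classical k-d tree can be sketched in a few lines of pure Python. This is the textbook structure, not the buffer k-d tree variant and not the bufferkdtree package API; the function names and the dictionary-based node layout are assumptions made for this sketch.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points at the median, cycling through the axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to query, pruning subtrees that cannot hold a closer one."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], query) < math.dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Cross the splitting plane only if it is closer than the current best match.
    if abs(diff) < math.dist(best, query):
        best = nearest(far, query, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
query = (9, 2)
result = nearest(tree, query)
```

    Each query descends one branch at a time, which is efficient for single lookups; processing very large batches of queries is a different workload, and that is the setting the package description above refers to.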

  8. Golden mean renormalization for a generalized Harper equation: The Ketoja-Satija orchid

    International Nuclear Information System (INIS)

    Mestel, B.D.; Osbaldestin, A.H.

    2004-01-01

    We provide a rigorous analysis of the fluctuations of localized eigenstates in a generalized Harper equation with golden mean flux and with next-nearest-neighbor interactions. For next-nearest-neighbor interaction above a critical threshold, these self-similar fluctuations are characterized by orbits of a renormalization operator on a universal strange attractor, whose projection was dubbed the "orchid" by Ketoja and Satija [Phys. Rev. Lett. 75, 2762 (1995)]. We show that the attractor is given essentially by an embedding of a subshift of finite type, and give a description of its periodic orbits

  9. The distribution of the number of node neighbors in random hypergraphs

    International Nuclear Information System (INIS)

    López, Eduardo

    2013-01-01

    Hypergraphs, the generalization of graphs in which edges become conglomerates of r nodes called hyperedges of rank r ⩾ 2, are excellent models to study systems with interactions that are beyond the pairwise level. For hypergraphs, the node degree ℓ (number of hyperedges connected to a node) and the number of neighbors k of a node differ from each other in contrast to the case of graphs, where counting the number of edges is equivalent to counting the number of neighbors. In this paper, I calculate the distribution of the number of node neighbors in random hypergraphs in which hyperedges of uniform rank r have a homogeneous (equal for all hyperedges) probability p to appear. This distribution is equivalent to the degree distribution of ensembles of graphs created as projections of hypergraph or bipartite network ensembles, where the projection connects any two nodes in the projected graph when they are also connected in the hypergraph or bipartite network. The calculation is non-trivial due to the possibility that neighbor nodes belong simultaneously to multiple hyperedges (node overlaps). From the exact results, the traditional asymptotic approximation to the distribution in the sparse regime (small p) where overlaps are ignored is rederived and improved; the approximation exhibits Poisson-like behavior accompanied by strong fluctuations modulated by power-law decays in the system size N with decay exponents equal to the minimum number of overlapping nodes possible for a given number of neighbors. It is shown that the dense limit cannot be explained if overlaps are ignored, and the correct asymptotic distribution is provided. The neighbor distribution requires the calculation of a new combinatorial coefficient Q_{r−1}(k, ℓ), which counts the number of distinct labeled hypergraphs of k nodes, ℓ hyperedges of rank r − 1, and where every node is connected to at least one hyperedge. Some identities of Q_{r−1}(k, ℓ) are derived and applied to the
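
    The difference between the node degree ℓ and the number of neighbors k described in this abstract can be checked on a small example. The representation of a hypergraph as a list of node sets and the function name `degree_and_neighbors` are assumptions made for this sketch; it only counts neighbors and does not reproduce the paper's distribution.

```python
def degree_and_neighbors(hyperedges, node):
    """Return (degree, neighbor count) of a node in a hypergraph given as a list of node sets."""
    incident = [edge for edge in hyperedges if node in edge]
    # Neighbors are all other nodes sharing at least one hyperedge with the node.
    neighbors = set().union(*incident) - {node}
    return len(incident), len(neighbors)

# Hypothetical hypergraph of uniform rank r = 3 on nodes 0..5.
hyperedges = [{0, 1, 2}, {0, 2, 3}, {3, 4, 5}]

deg0, nbr0 = degree_and_neighbors(hyperedges, 0)
deg4, nbr4 = degree_and_neighbors(hyperedges, 4)
```

    Node 0 has degree 2, so ignoring overlaps would predict 2 × (r − 1) = 4 neighbors, but node 2 sits in both hyperedges and the actual count is 3. This is the node-overlap effect that makes the calculation non-trivial.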

  10. Epileptic Seizure Detection based on Wavelet Transform Statistics Map and EMD Method for Hilbert-Huang Spectral Analyzing in Gamma Frequency Band of EEG Signals

    Directory of Open Access Journals (Sweden)

    Morteza Behnam

    2015-08-01

    Full Text Available Seizure detection through analysis of the brain signal (EEG) is an important clinical method in drug therapy and in decisions made before brain surgery. In this paper, after signal conditioning with suitable filtering, the gamma frequency band is extracted and the other brain rhythms, ambient noise and other bio-signals are removed. Then, the wavelet transform of the brain signal and the map of the wavelet transform at multiple levels are computed. The color map is divided into epochs, the histogram of each sub-image is obtained, and its statistics are calculated from statistical moments and negentropy values. The statistical feature vector is reduced to one dimension using Principal Component Analysis (PCA). Using the EMD algorithm and sifting procedure to analyze the data by Intrinsic Mode Functions (IMFs), and computing the residues of the brain signal using the spectrum of the Hilbert transform to form the Hilbert-Huang spectrum, one spatial feature based on the Euclidean distance is obtained for signal classification. Using a K-Nearest Neighbor (KNN) classifier with the optimal neighbor parameter, EEG signals are classified into two classes, seizure and non-seizure, with an accuracy of 76.54% and an error variance of 0.3685 across the different tests.

  11. Time Series Analysis Using Geometric Template Matching.

    Science.gov (United States)

    Frank, Jordan; Mannor, Shie; Pineau, Joelle; Precup, Doina

    2013-03-01

    We present a novel framework for analyzing univariate time series data. At the heart of the approach is a versatile algorithm for measuring the similarity of two segments of time series called geometric template matching (GeTeM). First, we use GeTeM to compute a similarity measure for clustering and nearest-neighbor classification. Next, we present a semi-supervised learning algorithm that uses the similarity measure with hierarchical clustering in order to improve classification performance when unlabeled training data are available. Finally, we present a boosting framework called TDEBOOST, which uses an ensemble of GeTeM classifiers. TDEBOOST augments the traditional boosting approach with an additional step in which the features used as inputs to the classifier are adapted at each step to improve the training error. We empirically evaluate the proposed approaches on several datasets, such as accelerometer data collected from wearable sensors and ECG data.

  12. Electronic Nose Odor Classification with Advanced Decision Tree Structures

    Directory of Open Access Journals (Sweden)

    S. Guney

    2013-09-01

    Full Text Available Electronic nose (e-nose) is an electronic device which can measure chemical compounds in air and consequently classify different odors. In this paper, an e-nose device consisting of 8 different gas sensors was designed and constructed. Using this device, 104 different experiments involving 11 different odor classes (moth, angelica root, rose, mint, polis, lemon, rotten egg, egg, garlic, grass, and acetone) were performed. The main contribution of this paper is the finding that, using chemical domain knowledge, it is possible to train an accurate odor classification system. The domain knowledge about chemical compounds is represented by a decision tree whose nodes are composed of classifiers such as Support Vector Machines and k-Nearest Neighbor. The overall accuracy achieved with the proposed algorithm and the constructed e-nose device was 97.18%. Training and testing data sets used in this paper are published online.

  13. Authentications of Myanmar National Registration Card

    Directory of Open Access Journals (Sweden)

    Myint Myint Sein

    2013-04-01

    Full Text Available An automatic identification system for holders of the Myanmar national registration card (NRC) is presented in this paper. The proposed system performs identification using low-quality face and fingerprint images extracted from the Myanmar NRC. Both facial recognition and fingerprint recognition systems are developed for Myanmar citizenship confirmation. An age-invariant face recognition algorithm is based on a combination of DiaPCA (Diagonal Principal Component Analysis) and KNN (Kth nearest neighbor) classifier approaches. A fingerprint recognition algorithm is proposed for recognizing poor-quality fingerprint images with a fabric background. Several experiments were carried out to confirm the effectiveness of the proposed approach.

  14. The classification of hunger behaviour of Lates Calcarifer through the integration of image processing technique and k-Nearest Neighbour learning algorithm

    Science.gov (United States)

    Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.

    2018-04-01

    Fish hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide an accurate feeding routine (under-feeding or over-feeding) may lead to the death of the fish and consequently reduce the quantity of fish produced. Moreover, excess food that is not consumed by the fish dissolves in the water and reduces water quality by lowering the oxygen level. This problem can also lead to the death of the fish or even spur fish diseases. In the present study, a correlation between Barramundi fish-school behaviour and hunger condition is established through the hybrid data integration of an image processing technique. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified with the k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm, namely cosine, cubic and weighted, are assessed on their ability to classify the aforementioned fish hunger behaviour. The study found that the weighted k-NN variation provides the best classification, with an accuracy of 86.5%. Therefore, it can be concluded that the proposed integration technique may assist fish farmers in ascertaining the fish feeding routine.

  15. Watch Out for Your Neighbor: Climbing onto Shrubs Is Related to Risk of Cannibalism in the Scorpion Buthus cf. occitanus.

    Science.gov (United States)

    Sánchez-Piñero, Francisco; Urbano-Tenorio, Fernando

    The distribution and behavior of foraging animals usually imply a balance between resource availability and predation risk. In some predators such as scorpions, cannibalism constitutes an important mortality factor determining their ecology and behavior. Climbing on vegetation by scorpions has been related both to prey availability and to predation (cannibalism) risk. We tested different hypotheses proposed to explain climbing on vegetation by scorpions. We analyzed shrub climbing in Buthus cf. occitanus with regard to the following: a) better suitability of prey size for scorpions foraging on shrubs than on the ground, b) selection of shrub species with higher prey load, c) seasonal variations in prey availability on shrubs, and d) whether or not cannibalism risk on the ground increases the frequency of shrub climbing. Prey availability on shrubs was compared by estimating prey abundance in sticky traps placed in shrubs. A prey sample from shrubs was measured to compare prey size. Scorpions were sampled in six plots (50 m x 10 m) to estimate the proportion of individuals climbing on shrubs. Size difference and distance between individuals and their closest scorpion neighbor were measured to assess cannibalism risk. The results showed that mean prey size was two-fold larger on the ground. Selection of particular shrub species was not related to prey availability. Seasonal variations in the number of scorpions on shrubs were related to the number of active scorpions, but not with fluctuations in prey availability. Size differences between a scorpion and its nearest neighbor were positively related with a higher probability for a scorpion to climb onto a shrub when at a disadvantage, but distance was not significantly related. These results do not support hypotheses explaining shrub climbing based on resource availability. By contrast, our results provide evidence that shrub climbing is related to cannibalism risk.

  16. Anisotropic ordering in a two-temperature lattice gas

    DEFF Research Database (Denmark)

    Szolnoki, Attila; Szabó, György; Mouritsen, Ole G.

    1997-01-01

    We consider a two-dimensional lattice gas model with repulsive nearest- and next-nearest-neighbor interactions that evolves in time according to anisotropic Kawasaki dynamics. The hopping of particles along the principal directions is governed by two heat baths at different temperatures T-x and T...

  17. Out-of-hospital cardiac arrest: Probability of bystander defibrillation relative to distance to nearest automated external defibrillator.

    Science.gov (United States)

    Sondergaard, Kathrine B; Hansen, Steen Moller; Pallisgaard, Jannik L; Gerds, Thomas Alexander; Wissenberg, Mads; Karlsson, Lena; Lippert, Freddy K; Gislason, Gunnar H; Torp-Pedersen, Christian; Folke, Fredrik

    2018-03-01

    Despite wide dissemination of automated external defibrillators (AEDs), bystander defibrillation rates remain low. We aimed to investigate how route distance to the nearest accessible AED was associated with probability of bystander defibrillation in public and residential locations. We used data from the nationwide Danish Cardiac Arrest Registry and the Danish AED Network to identify out-of-hospital cardiac arrests and route distances to nearest accessible registered AED during 2008-2013. The association between route distance and bystander defibrillation was described using restricted cubic spline logistic regression. We included 6971 out-of-hospital cardiac arrest cases. The proportion of arrests according to distance in meters (≤100, 101-200, >200) to the nearest accessible AED was: 4.6% (n=320), 5.3% (n=370), and 90.1% (n=6281), respectively. For cardiac arrests in public locations, the probability of bystander defibrillation at 0, 100 and 200m from the nearest AED was 35.7% (95% confidence interval 28.0%-43.5%), 21.3% (95% confidence interval 17.4%-25.2%), and 13.7% (95% confidence interval 10.1%-16.8%), respectively. The corresponding numbers for cardiac arrests in residential locations were 7.0% (95% confidence interval -2.1%-16.1%), 1.5% (95% confidence interval 0.002%-2.8%), and 0.9% (95% confidence interval 0.0005%-1.7%), respectively. In public locations, the probability of bystander defibrillation decreased rapidly within the first 100m route distance from cardiac arrest to nearest accessible AED whereas the probability of bystander defibrillation was low for all distances in residential areas.

  18. Interactions of galaxies outside clusters and massive groups

    Science.gov (United States)

    Yadav, Jaswant K.; Chen, Xuelei

    2018-06-01

    We investigate the dependence of the physical properties of galaxies on the small- and large-scale density environment. The galaxy population consists mainly of passively evolving galaxies in comparatively low-density regions of the Sloan Digital Sky Survey (SDSS). We adopt (i) the local density, ρ_{20}, derived using an adaptive smoothing kernel, (ii) the projected distance, r_p, to the nearest neighbor galaxy and (iii) the morphology of the nearest neighbor galaxy as the environment parameters of every galaxy in our sample. In order to detect long-range interaction effects, we group galaxy interactions into four cases depending on the morphology of the target and neighbor galaxies. This study builds upon an earlier study by Park and Choi (2009) by including improved definitions of target and neighbor galaxies, thus enabling us to better understand the effect of "the nearest neighbor" interaction on the galaxy. We report that the impact of interaction on galaxy properties is detectable at least up to the pair separation corresponding to the virial radius of the neighbor galaxies. This turns out to be mostly between 210 and 360 h^{-1}kpc for galaxies included in our study. We report that the early-type fraction for isolated galaxies with r_p > r_{vir,nei} is almost independent of the background density and has a very weak density dependence for close pairs. The star formation activity of a galaxy is found to depend crucially on the morphology of the neighbor galaxy. We find the star formation activity parameters and structure parameters of galaxies to be independent of the large-scale background density. We also show that changing the absolute magnitude of the neighbor galaxies does not significantly affect the star formation activity of those target galaxies whose morphology and luminosities are fixed.

  19. Incidence and Prevalence of Tuberculosis in Iran and Neighboring Countries

    Directory of Open Access Journals (Sweden)

    Arezoo Tavakoli

    2017-07-01

    Full Text Available Background: Tuberculosis is a major public health concern in many countries, even though effective treatment is available. Tuberculosis is typically associated with socio-economic problems such as war, malnutrition and HIV prevalence. In Iran, much progress has been made in controlling tuberculosis, but factors such as immigration from neighboring countries affect tuberculosis infection. Objectives: In this paper, the incidence and prevalence of tuberculosis are evaluated in different regions of Iran and neighboring countries. Methods: Data were collected from valid sources such as Scopus and PubMed, as well as reports from the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC), and evaluated for Iran and neighboring countries over a period of 25 years (1990 - 2015). Results: This descriptive-analytical, cross-sectional study covers Iran and neighboring countries since 1990. The information was obtained from valid data in Web of Science. Iran's eastern and western neighbors affected by war and migration, namely Afghanistan, Pakistan and Iraq, are sources of tuberculosis infection that affect tuberculosis prevalence in Iran. The data were analyzed with SPSS 22 and Excel 2013. Conclusions: The incidence of tuberculosis in Iran has decreased because of control measures such as BCG vaccination, an electronic reporting system for tuberculosis and free access to tuberculosis medication. Among Iran's neighboring countries, Tajikistan and Pakistan have the highest incidence of tuberculosis, which poses a challenge for tuberculosis control in Iran, while Saudi Arabia and Turkey have the lowest incidence.

  20. Theory of lithium islands and monolayers: Electronic structure and stability

    International Nuclear Information System (INIS)

    Quassowski, S.; Hermann, K.

    1995-01-01

    Systematic calculations on planar clusters and monolayers of lithium are performed to study geometries and stabilities of the clusters as well as their convergence behavior with increasing cluster size. The calculations are based on ab initio methods using density-functional theory within the local-spin-density approximation for exchange and correlation. The optimized nearest-neighbor distances d_NN of the Li_n clusters, n=1,...,25, of both hexagonal and square geometry increase with cluster size, converging quite rapidly towards the monolayer results. Further, the cluster cohesive energies E_c increase with cluster size and converge towards the respective monolayer values that form upper bounds. Clusters of hexagonal geometry are found to be more stable than square clusters of comparable size, consistent with the monolayer results. The size dependence of the cluster cohesive energies can be described approximately by a coordination model based on the concept of pairwise additive nearest-neighbor binding. This indicates that the average binding in the Li_n clusters and their relative stabilities can be explained by simple geometric effects which derive from the nearest-neighbor coordination.

  1. Missing value imputation in DNA microarrays based on conjugate gradient method.

    Science.gov (United States)

    Dorri, Fatemeh; Azmi, Paeiz; Dorri, Faezeh

    2012-02-01

    Analysis of gene expression profiles needs a complete matrix of gene array values; consequently, imputation methods have been suggested. In this paper, an algorithm based on the conjugate gradient (CG) method is proposed to estimate missing values. The k-nearest neighbors of the missing entry are first selected based on the absolute values of their Pearson correlation coefficients. Then a subset of genes among the k-nearest neighbors is labeled as the most similar ones. The CG algorithm, with this subset as its input, is then used to estimate the missing values. Our proposed CG-based algorithm (CGimpute) is evaluated on different data sets. The results are compared with sequential local least squares (SLLSimpute), Bayesian principal component analysis (BPCAimpute), local least squares imputation (LLSimpute), iterated local least squares imputation (ILLSimpute) and adaptive k-nearest neighbors imputation (KNNKimpute) methods. The average normalized root mean square error (NRMSE) and relative NRMSE in different data sets with various missing rates show that CGimpute outperforms the other methods. Copyright © 2011 Elsevier Ltd. All rights reserved.
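
    The neighbor-selection step described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' CGimpute code: the function names, the use of None for missing entries, and the rule of skipping incomplete candidate rows are all assumptions, and the conjugate gradient estimation step is not reproduced.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_k_correlated_rows(matrix, target, k):
    """Indices of the k rows with the largest |Pearson r| against row `target`.

    Only columns observed (not None) in the target row are compared; candidate
    rows with a missing value in those columns are skipped (sketch assumption).
    """
    cols = [j for j, v in enumerate(matrix[target]) if v is not None]
    ref = [matrix[target][j] for j in cols]
    scored = []
    for i, row in enumerate(matrix):
        if i == target:
            continue
        vals = [row[j] for j in cols]
        if any(v is None for v in vals):
            continue
        # Rank by absolute correlation, so strongly anti-correlated rows also qualify.
        scored.append((abs(pearson(ref, vals)), i))
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]
```

    Ranking by the absolute value means a gene whose profile mirrors the target (r close to -1) is treated as just as informative as one that tracks it (r close to +1).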

  2. Mapping growing stock volume and forest live biomass: a case study of the Polissya region of Ukraine

    Science.gov (United States)

    Bilous, Andrii; Myroniuk, Viktor; Holiaka, Dmytrii; Bilous, Svitlana; See, Linda; Schepaschenko, Dmitry

    2017-10-01

    Forest inventory and biomass mapping are important tasks that require inputs from multiple data sources. In this paper we implement two methods for the Ukrainian region of Polissya: random forest (RF) for tree species prediction and k-nearest neighbors (k-NN) for growing stock volume and biomass mapping. We examined the suitability of the five-band RapidEye satellite image to predict the distribution of six tree species. The accuracy of RF is quite high: ~99% for forest/non-forest mask and 89% for tree species prediction. Our results demonstrate that inclusion of elevation as a predictor variable in the RF model improved the performance of tree species classification. We evaluated different distance metrics for the k-NN method, including Euclidean or Mahalanobis distance, most similar neighbor (MSN), gradient nearest neighbor, and independent component analysis. The MSN with the four nearest neighbors (k = 4) is the most precise (according to the root-mean-square deviation) for predicting forest attributes across the study area. The k-NN method allowed us to estimate growing stock volume with an accuracy of 3 m^3 ha^-1 and live biomass with an accuracy of about 2 t ha^-1 over the study area.

  3. Nearest-Neighbor Interactions and Their Influence on the Structural Aspects of Dipeptides

    Directory of Open Access Journals (Sweden)

    Gunajyoti Das

    2013-01-01

    Full Text Available In this theoretical study, the role of the side chain moiety of the C-terminal residue in influencing the structural and molecular properties of dipeptides is analyzed by considering a series of seven dipeptides. The C-terminal positions of the dipeptides are varied with seven different amino acid residues, namely Val, Leu, Asp, Ser, Gln, His, and Pyl, while their N-terminal positions are kept constant with Sec residues. Full geometry optimization and vibrational frequency calculations are carried out at the B3LYP/6-311++G(d,p) level in gas and aqueous phase. The stereo-electronic effects of the side chain moieties of the C-terminal residues are found to influence the values of the Φ and Ω dihedrals, the planarity of the peptide planes, and the geometry around the C7 α-carbon atoms of the dipeptides. The gas phase intramolecular H-bond combinations of the dipeptides are similar to those in aqueous phase. The theoretical vibrational spectra of the dipeptides reflect the nature of the intramolecular H-bonds existing in the dipeptide structures. Solvation effects of the aqueous environment are evident in the geometrical parameters related to the amide planes, the dipole moments, the HOMO-LUMO energy gaps and the thermodynamic stability of the dipeptides.

  4. Improving Computer Player Intelligence in Fighting Games Based on Weighted K-Nearest Neighbor

    Directory of Open Access Journals (Sweden)

    M Ihsan Alfani Putera

    2018-02-01

    Full Text Available Games are one of the computer technologies that are developing and changing rapidly. The purpose of a game is to entertain and give enjoyment to its users. One important element in game design is a balanced challenge appropriate to each level. In this respect, artificial intelligence (AI) is one of the elements needed in building a game. An AI that does not adapt to the opponent's strategy is easy to predict and repetitive. If the AI is too smart, the player will have difficulty playing the game. Such conditions reduce the player's enjoyment. Therefore, an AI method is needed that can adapt to the ability of the player, so that the difficulty level follows the player's ability and the enjoyment of playing the game is maintained. In previous research, the AI method most often used in fighting games is K-NN. However, that method treats all attributes in the game as equal, which makes the AI's learning results less than optimal. This study proposes an AI method that uses weighted K-NN in fighting games, in which the weighting gives each attribute its own influence, with weights adjusted according to the player's actions. In an evaluation over 50 matches in 3 test scenarios, the proposed weighted K-NN method produced an AI intelligence level with an accuracy of 51%, whereas the previous method, unweighted K-NN, produced an AI intelligence level of only 38% and the random method produced an AI intelligence level of 25%.
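
    The difference between plain and attribute-weighted K-NN can be illustrated with a minimal sketch. The feature layout, labels and weight values below are hypothetical, and the weighting rule used in the paper (weights derived from player actions) is not reproduced; the sketch only shows how per-attribute weights enter the distance.

```python
import math
from collections import Counter

def weighted_knn_predict(train, query, weights, k):
    """Majority vote among the k training samples closest to `query`
    under an attribute-weighted Euclidean distance.

    train:   list of (feature_vector, label) pairs
    weights: one non-negative weight per attribute (hypothetical values)
    """
    def dist(a, b):
        # Each squared attribute difference is scaled by its weight.
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical samples: (distance to opponent, opponent health) -> action
samples = [((0, 0), "punch"), ((0, 10), "punch"), ((5, 0), "block"),
           ((5, 1), "block"), ((1, 10), "punch")]
unweighted = weighted_knn_predict(samples, (4, 10), (1, 1), 3)
first_attribute_only = weighted_knn_predict(samples, (4, 10), (1, 0), 3)
```

    Setting every weight to 1 recovers ordinary K-NN; setting a weight to 0 removes that attribute from the comparison, which is why the two calls above can disagree.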

  5. Improving Computer Player Intelligence in Fighting Games Based on Weighted K-Nearest Neighbor

    OpenAIRE

    M Ihsan Alfani Putera; Darlis Heru Murti

    2018-01-01

    Games are one of the computer technologies that are developing and changing rapidly. The purpose of a game is to entertain and give enjoyment to its users. One important element in game design is a balanced challenge appropriate to each level. In this respect, artificial intelligence (AI) is one of the elements needed in building a game. An AI that does not adapt to the opponent's strategy is easy to predict and repetitive. If ...

  6. Evidence for cultural differences between neighboring chimpanzee communities.

    Science.gov (United States)

    Luncz, Lydia V; Mundry, Roger; Boesch, Christophe

    2012-05-22

    The majority of evidence for cultural behavior in animals has come from comparisons between populations separated by large geographical distances that often inhabit different environments. The difficulty of excluding ecological and genetic variation as potential explanations for observed behaviors has led some researchers to challenge the idea of animal culture. Chimpanzees (Pan troglodytes verus) in the Taï National Park, Côte d'Ivoire, crack Coula edulis nuts using stone and wooden hammers and tree root anvils. In this study, we compare for the first time hammer selection for nut cracking across three neighboring chimpanzee communities that live in the same forest habitat, which reduces the likelihood of ecological variation. Furthermore, the study communities experience frequent dispersal of females at maturity, which eliminates significant genetic variation. We compared key ecological factors, such as hammer availability and nut hardness, between the three neighboring communities and found striking differences in group-specific hammer selection among communities despite similar ecological conditions. Differences were found in the selection of hammer material and hammer size in response to changes in nut resistance over time. Our findings highlight the subtleties of cultural differences in wild chimpanzees and illustrate how cultural knowledge is able to shape behavior, creating differences among neighboring social groups. Copyright © 2012 Elsevier Ltd. All rights reserved.

  7. Segmenting Multiple Sclerosis Lesions using a Spatially Constrained K-Nearest Neighbour approach

    DEFF Research Database (Denmark)

    Lyksborg, Mark; Larsen, Rasmus; Sørensen, Per Soelberg

    2012-01-01

    We propose a method for the segmentation of Multiple Sclerosis lesions. The method is based on probability maps derived from a K-Nearest Neighbours classification. These are used as a nonparametric likelihood in a Bayesian formulation with a prior that assumes connectivity of neighbouring voxels. ...

  8. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    Science.gov (United States)

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. Learning-based classifiers achieve state-of-the-art accuracies, but have been criticized for computational complexity that grows linearly with the number of classes. Nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, the discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree grows only sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies with significantly lower computation cost and memory requirement.

  9. Regional Calibration of SCS-CN L-THIA Model: Application for Ungauged Basins

    Directory of Open Access Journals (Sweden)

    Ji-Hong Jeon

    2014-05-01

    Full Text Available Estimating surface runoff for ungauged watersheds is an important issue. The Soil Conservation Service Curve Number (SCS-CN) method, developed from long-term experimental data, is widely used to estimate surface runoff from gauged or ungauged watersheds. Many modelers have used the documented SCS-CN parameters without calibration, sometimes resulting in significant errors in estimating surface runoff. Several methods for regionalization of SCS-CN parameters were evaluated. The regionalization methods include: (1) average; (2) land use area weighted average; (3) hydrologic soil group area weighted average; (4) area combined land use and hydrologic soil group weighted average; (5) spatial nearest neighbor; (6) inverse distance weighted average; and (7) global calibration method. Model performance for each method was evaluated with application to 14 watersheds located in Indiana. Eight watersheds were used for calibration and six watersheds for validation. For the validation results, the spatial nearest neighbor method provided the highest average Nash-Sutcliffe (NS) value at 0.58 for six watersheds, but it included the lowest NS value, and the variance of NS values for this method was the highest. The global calibration method provided the second highest average NS value at 0.56 with low variation of NS values. Although the spatial nearest neighbor method provided the highest average NS value, this method was not statistically different from the other methods. However, the global calibration method was significantly different from the other methods except the spatial nearest neighbor method. Therefore, we conclude that the global calibration method is appropriate to regionalize SCS-CN parameters for ungauged watersheds.
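
    For readers unfamiliar with the Nash-Sutcliffe (NS) value used above to compare the regionalization methods, a minimal sketch of the standard efficiency formula follows. The function name and the sample runoff values are illustrative assumptions and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1 means a perfect fit; 0 means the model is no better than
    always predicting the mean of the observations; negative values are worse.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread

# Illustrative runoff values (not data from the paper)
obs = [1.0, 2.0, 3.0]
perfect = nash_sutcliffe(obs, [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe(obs, [2.0, 2.0, 2.0])
```

    Because the score is normalized by the spread of the observations, values from different watersheds can be averaged, as the abstract does across its six validation watersheds.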

  10. Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms.

    Science.gov (United States)

    Liu, Rensong; Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan

    2017-01-01

    Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance.

  11. Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms

    Science.gov (United States)

    Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan

    2017-01-01

    Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance. PMID:28874909

  12. Improving Recommendations in Tag-based Systems with Spectral Clustering of Tag Neighbors

    DEFF Research Database (Denmark)

    Pan, Rong; Xu, Guandong; Dolog, Peter

    2012-01-01

    Tags, as useful metadata, reflect the collaborative and conceptual features of documents in social collaborative annotation systems. In this paper, we propose a collaborative approach for expanding tag neighbors and investigate the spectral clustering algorithm to filter out noisy tag neighbors...... in order to get appropriate recommendations for users. The preliminary experiments have been conducted on the MovieLens dataset to compare our proposed approach with the traditional collaborative filtering recommendation approach and the naive tag neighbors expansion approach in terms of precision, and the result...... demonstrates that our approach could considerably improve the performance of recommendations....

  13. Local biotic adaptation of trees and shrubs to plant neighbors

    Science.gov (United States)

    Grady, Kevin C.; Wood, Troy E.; Kolb, Thomas E.; Hersch-Green, Erika; Shuster, Stephen M.; Gehring, Catherine A.; Hart, Stephen C.; Allan, Gerard J.; Whitham, Thomas G.

    2017-01-01

    Natural selection as a result of plant–plant interactions can lead to local biotic adaptation. This may occur where species frequently interact and compete intensely for resources limiting growth, survival, and reproduction. Selection is demonstrated by comparing a genotype interacting with con- or hetero-specific sympatric neighbor genotypes with a shared site-level history (derived from the same source location), to the same genotype interacting with foreign neighbor genotypes (from different sources). Better genotype performance in sympatric than allopatric neighborhoods provides evidence of local biotic adaptation. This pattern might be explained by selection to avoid competition by shifting resource niches (differentiation) or by interactions benefitting one or more members (facilitation). We tested for local biotic adaptation among two riparian trees, Populus fremontii and Salix gooddingii, and the shrub Salix exigua by transplanting replicated genotypes from multiple source locations to a 17 000 tree common garden with sympatric and allopatric treatments along the Colorado River in California. Three major patterns were observed: 1) across species, 62 of 88 genotypes grew faster with sympatric neighbors than allopatric neighbors; 2) these growth rates, on an individual tree basis, were 44, 15 and 33% higher in sympatric than allopatric treatments for P. fremontii, S. exigua and S. gooddingii, respectively; and 3) survivorship was higher in sympatric treatments for P. fremontii and S. exigua. These results support the view that fitness of foundation species supporting diverse communities and dominating ecosystem processes is determined by adaptive interactions among multiple plant species with the outcome that performance depends on the genetic identity of plant neighbors. The occurrence of evolution in a plant-community context for trees and shrubs builds on ecological evolutionary research that has demonstrated co-evolution among herbaceous taxa, and

  14. ProClusEnsem: Predicting membrane protein types by fusing different modes of pseudo amino acid composition

    KAUST Repository

    Wang, Jim Jing-Yan; Li, Yongping; Wang, Quanquan; You, Xinge; Man, Jiaju; Wang, Chao; Gao, Xin

    2012-01-01

    Knowing the type of an uncharacterized membrane protein often provides a useful clue in both basic research and drug discovery. With the explosion of protein sequences generated in the post-genomic era, determination of membrane protein types by experimental methods is expensive and time consuming. It therefore becomes important to develop an automated method to find the possible types of membrane proteins. In view of this, various computational membrane protein prediction methods have been proposed. They extract protein feature vectors, such as PseAAC (pseudo amino acid composition) and PsePSSM (pseudo position-specific scoring matrix), to represent the protein sequence, and then learn a distance metric for the KNN (K nearest neighbor) or NN (nearest neighbor) classifier to predict the final type. Most of the metrics are learned using linear dimensionality reduction algorithms like Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Such metrics are common to all the proteins in the dataset. In fact, they assume that the proteins lie on a uniform distribution, which can be captured by the linear dimensionality reduction algorithm. We doubt this assumption, and learn local metrics which are optimized for local subsets of the proteins. The learning procedure is iterated with the protein clustering. Then a novel ensemble distance metric is given by combining the local metrics through Tikhonov regularization. The experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of the proposed algorithm named ProClusEnsem. © 2012 Elsevier Ltd.

  15. ProClusEnsem: Predicting membrane protein types by fusing different modes of pseudo amino acid composition

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-05-01

    Knowing the type of an uncharacterized membrane protein often provides a useful clue in both basic research and drug discovery. With the explosion of protein sequences generated in the post-genomic era, determination of membrane protein types by experimental methods is expensive and time consuming. It therefore becomes important to develop an automated method to find the possible types of membrane proteins. In view of this, various computational membrane protein prediction methods have been proposed. They extract protein feature vectors, such as PseAAC (pseudo amino acid composition) and PsePSSM (pseudo position-specific scoring matrix), to represent the protein sequence, and then learn a distance metric for the KNN (K nearest neighbor) or NN (nearest neighbor) classifier to predict the final type. Most of the metrics are learned using linear dimensionality reduction algorithms like Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Such metrics are common to all the proteins in the dataset. In fact, they assume that the proteins lie on a uniform distribution, which can be captured by the linear dimensionality reduction algorithm. We doubt this assumption, and learn local metrics which are optimized for local subsets of the proteins. The learning procedure is iterated with the protein clustering. Then a novel ensemble distance metric is given by combining the local metrics through Tikhonov regularization. The experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of the proposed algorithm named ProClusEnsem. © 2012 Elsevier Ltd.

  16. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases: according to WHO data, there were 8.8 million deaths caused by cancer in 2015, and this number will increase every year if the problem is not addressed early. Microarray data have become one of the most popular bases for cancer-identification studies in the field of health, since microarray data can be used to examine levels of gene expression in certain cell samples and thus to analyze thousands of genes simultaneously. By using data mining techniques, we can classify a sample of microarray data and thereby identify whether or not it indicates cancer. In this paper we discuss research that applies several data mining techniques to microarray data, namely Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, together with a simulation of the Random Forest algorithm with dimension reduction using Relief. The paper reports the performance measure (accuracy) of each classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest). The results show that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique on microarray data.

  17. Correlations in a chain of three oscillators with nearest neighbour coupling

    Science.gov (United States)

    Idrus, B.; Konstadopoulou, A.; Spiller, T.; Vourdas, A.

    2010-04-01

    A chain of three oscillators A, B, C with nearest neighbour coupling, is considered. It is shown that the correlations between A, C (which are not coupled directly) can be stronger than the correlations between A, B. Also in some cases various witnesses of entanglement show that A, C are entangled but they cannot lead to any conclusion about A, B.

  18. Density functional approach for the magnetism of β-TeVO4

    Science.gov (United States)

    Saúl, A.; Radtke, G.

    2014-03-01

    Density functional calculations have been carried out to investigate the microscopic origin of the magnetic properties of β-TeVO4. Two different approaches, based either on a perturbative treatment of the multiorbital Hubbard model in the strongly correlated limit or on the calculation of supercell total energy differences, have been employed to evaluate magnetic couplings in this compound. The picture provided by these two approaches is that of weakly coupled frustrated chains with ferromagnetic nearest-neighbor and antiferromagnetic second-nearest-neighbor couplings. These results, differing substantially from previous reports, should motivate further experimental investigations of the magnetic properties of this compound.

  19. Identification of the interstitial Mn site in ferromagnetic (Ga,Mn)As

    CERN Document Server

    AUTHOR|(CDS)2093111; Wahl, Ulrich; Augustyns, Valerie; Silva, Daniel; Granadeiro Costa, Angelo Rafael; Houben, K; Edmonds, Kevin W; Gallagher, BL; Campion, RP; Van Bael, MJ; Castro Ribeiro Da Silva, Manuel; Martins Correia, Joao; Esteves De Araujo, Araujo Joao Pedro; Temst, Kristiaan; Vantomme, André; Da Costa Pereira, Lino Miguel

    2015-01-01

    We determined the lattice location of Mn in ferromagnetic (Ga,Mn)As using the electron emission channeling technique. We show that interstitial Mn occupies the tetrahedral site with As nearest neighbors (TAs) both before and after thermal annealing at 200 °C, whereas the occupancy of the tetrahedral site with Ga nearest neighbors (TGa) is negligible. TAs is therefore the energetically favorable site for interstitial Mn in isolated form as well as when forming complexes with substitutional Mn. These results shed new light on the long standing controversy regarding TAs versus TGa occupancy of interstitial Mn in (Ga,Mn)As.

  20. Low-field susceptibility of classical Heisenberg chains with arbitrary and different nearest-neighbour exchange

    International Nuclear Information System (INIS)

    Cregg, P J; Murphy, K; Garcia-Palacios, J L; Svedlindh, P

    2008-01-01

    Interest in molecular magnets continues to grow, offering a link between the atomic and nanoscale properties. The classical Heisenberg model has been effective in modelling exchange interactions in such systems. In this, the magnetization and susceptibility are calculated through the partition function, where the Hamiltonian contains both Zeeman and exchange energy. For an ensemble of N spins, this requires integrals in 2N dimensions. For two, three and four spin nearest-neighbour chains these integrals reduce to sums of known functions. For the case of the three and four spin chains, the sums are equivalent to results of Joyce. Expanding these sums, the effect of the exchange on the linear susceptibility appears as Langevin functions with exchange term arguments. These expressions are generalized here to describe an N spin nearest-neighbour chain, where the exchange between each pair of nearest neighbours is different and arbitrary. For a common exchange constant, this reduces to the result of Fisher. The high-temperature expansion of the Langevin functions for the different exchange constants leads to agreement with the appropriate high-temperature quantum formula of Schmidt et al, when the spin number is large. Simulations are presented for open linear chains of three, four and five spins with up to four different exchange constants, illustrating how the exchange constants can be retrieved successfully
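
    The Langevin function that this abstract refers to is L(x) = coth(x) - 1/x, with the small-argument (high-temperature) expansion L(x) ≈ x/3 - x^3/45. The sketch below only evaluates this function; the function name and the switching threshold are choices made here, and the chain susceptibility expressions of the paper are not reproduced.

```python
import math

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x.

    Near x = 0 the two terms cancel catastrophically, so the series
    x/3 - x**3/45 is used instead (the threshold is a choice of this sketch).
    """
    if abs(x) < 1e-4:
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

# L(x) rises from 0 at x = 0 and saturates towards 1 for large x.
values = [langevin(x) for x in (0.0, 1.0, 50.0)]
```

    In the abstract's setting the argument is an exchange term, so the small-x branch corresponds to the high-temperature regime.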

  1. Working with Family, Friend, and Neighbor Caregivers: Lessons from Four Diverse Communities

    Science.gov (United States)

    Powell, Douglas R.

    2011-01-01

    This article is excerpted from "Who's Watching the Babies? Improving the Quality of Family, Friend, and Neighbor Care" by Douglas R. Powell ("ZERO TO THREE," 2008). The article explores questions about program development and implementation strategies for supporting Family, Friend, and Neighbor (FFN) caregivers: How do programs and their host…

  2. Supervised Self-Organizing Classification of Superresolution ISAR Images: An Anechoic Chamber Experiment

    Directory of Open Access Journals (Sweden)

    Radoi Emanuel

    2006-01-01

    The problem of the automatic classification of superresolution ISAR images is addressed in the paper. We describe an anechoic chamber experiment involving ten scale-reduced aircraft models. The radar images of these targets are reconstructed using the MUSIC-2D (multiple signal classification) method coupled with two additional processing steps: phase unwrapping and symmetry enhancement. A feature vector is then proposed including Fourier descriptors and moment invariants, which are calculated from the target shape and the scattering center distribution extracted from each reconstructed image. The classification is finally performed by a new self-organizing neural network called SART (supervised ART), which is compared to two standard classifiers, MLP (multilayer perceptron) and fuzzy KNN (K nearest neighbors). While the classification accuracy is similar, SART is shown to outperform the two other classifiers in terms of training speed and classification speed, especially for large databases. It is also easier to use since it does not require any input parameter related to its structure.

  3. Oil palm fresh fruit bunch ripeness classification based on rule-based expert system of ROI image processing technique results

    International Nuclear Information System (INIS)

    Alfatni, M S M; Shariff, A R M; Marhaban, M H; Shafie, S B; Saaed, O M B; Abdullah, M Z; BAmiruddin, M D

    2014-01-01

    There is a processing need for a fast, easy and accurate classification system for oil palm fruit ripeness. Such a system will be invaluable to farmers and plantation managers who need to sell their oil palm fresh fruit bunch (FFB) to the mill, as this will avoid disputes. In this paper, a new approach was developed under the name of expert rules-based system, based on the image processing techniques results of the three different oil palm FFB regions of interest (ROIs), namely: ROI1 (300x300 pixels), ROI2 (50x50 pixels) and ROI3 (100x100 pixels). The results show that the best rule-based ROIs for statistical colour feature extraction with k-nearest neighbors (KNN) classifier at 94% were chosen, and the ROIs that indicated results higher than the rule-based outcome, such as the ROIs of statistical colour feature extraction with artificial neural network (ANN) classifier at 94%, were selected for the further FFB ripeness inspection system.

  4. Method of Menu Selection by Gaze Movement Using AC EOG Signals

    Science.gov (United States)

    Kanoh, Shin'ichiro; Futami, Ryoko; Yoshinobu, Tatsuo; Hoshimiya, Nozomu

    A method to detect the direction and the distance of voluntary eye gaze movement from EOG (electrooculogram) signals was proposed and tested. In this method, AC-amplified vertical and horizontal transient EOG signals were classified into 8-class directions and 2-class distances of voluntary eye gaze movements. The horizontal and vertical EOG signals during eye gaze movement at each sampling time were treated as a two-dimensional vector, and the center of gravity of the sample vectors whose norms were more than 80% of the maximum norm was used as a feature vector to be classified. By the classification using the k-nearest neighbor algorithm, it was shown that the averaged correct detection rates on each subject were 98.9%, 98.7% and 94.4%, respectively. This method can avoid strict EOG-based eye tracking, which requires DC amplification of a very small signal. It would be useful to develop robust human interfacing systems based on menu selection for severely paralyzed patients.
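
    The feature described above (center of gravity of the samples whose norm exceeds 80% of the maximum norm) followed by k-nearest-neighbor voting can be sketched as follows. This is a minimal illustration assuming NumPy arrays of horizontal and vertical samples; the function names, the value of k and the Euclidean distance are choices made here, not details given in the abstract.

```python
import numpy as np

def gaze_feature(h, v, ratio=0.8):
    """Center of gravity of the 2-D samples whose norm exceeds ratio * max norm."""
    samples = np.column_stack([h, v]).astype(float)
    norms = np.linalg.norm(samples, axis=1)
    keep = norms > ratio * norms.max()   # assumes at least one non-zero sample
    return samples[keep].mean(axis=0)

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    train_x = np.asarray(train_x, dtype=float)
    dist = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

    Keeping only the large-norm samples makes the feature depend mainly on the peak of the transient rather than on its onset and decay, which may be why a threshold relative to the maximum norm is used.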

  5. Multi-Model Prediction for Demand Forecast in Water Distribution Networks

    Directory of Open Access Journals (Sweden)

    Rodrigo Lopez Farias

    2018-03-01

    This paper presents a multi-model predictor called Qualitative Multi-Model Predictor Plus (QMMP+) for demand forecast in water distribution networks. QMMP+ is based on the decomposition of the quantitative and qualitative information of the time-series. The quantitative component (i.e., the daily consumption prediction) is forecasted and the pattern mode estimated using a Nearest Neighbor (NN) classifier and a Calendar. The patterns are updated via a simple Moving Average scheme. The NN classifier and the Calendar are executed simultaneously every period and the most suited model for prediction is selected using a probabilistic approach. The proposed solution for water demand forecast is compared against Radial Basis Function Artificial Neural Networks (RBF-ANN), the statistical Autoregressive Integrated Moving Average (ARIMA), and Double Seasonal Holt-Winters (DSHW) approaches, providing the best results when applied to real demand of the Barcelona Water Distribution Network. QMMP+ has demonstrated that the special modelling treatment of water consumption patterns improves the forecasting accuracy.

  6. Exploring the Potential of Active Learning for Automatic Identification of Marine Oil Spills Using 10-Year (2004–2013) RADARSAT Data

    Directory of Open Access Journals (Sweden)

    Yongfeng Cao

    2017-10-01

    This paper intends to find a more cost-effective way of training oil spill classification systems by introducing active learning (AL) and exploring its potential, so that satisfactory classifiers can be learned with a reduced number of labeled samples. The dataset used has 143 oil spills and 124 look-alikes from 198 RADARSAT images covering the east and west coasts of Canada from 2004 to 2013. Six uncertainty-based active sample selecting (ACS) methods are designed to choose the most informative samples. A method for reducing information redundancy amongst the selected samples and a method with varying sample preference are considered. Four classifiers (k-nearest neighbor (KNN), support vector machine (SVM), linear discriminant analysis (LDA) and decision tree (DT)) are coupled with ACS methods to explore the interaction and possible preference between classifiers and ACS methods. Three kinds of measures are adopted to highlight different aspects of the classification performance of these AL-boosted classifiers. Overall, AL proves its strong potential with a 4% to 78% reduction in training samples in different settings. The SVM classifier proves to be the best one for use in the AL framework, with perfect performance evolving curves in different kinds of measures. The exploration and exploitation criterion can further improve the performance of the AL-boosted SVM classifier but not of the other classifiers.
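
    The abstract does not spell out its six uncertainty-based selection methods. As a generic illustration only, the sketch below implements least-confidence sampling, one common uncertainty criterion, which may or may not be among them. It assumes a classifier that outputs class probabilities for each unlabeled sample; the function name is hypothetical.

```python
import numpy as np

def least_confident(probabilities, n_select):
    """Indices of the n_select samples whose top class probability is lowest.

    probabilities: array of shape (n_samples, n_classes), one row per
    unlabeled sample, as produced by any probabilistic classifier.
    """
    p = np.asarray(probabilities, dtype=float)
    confidence = p.max(axis=1)                       # confidence in the predicted class
    return np.argsort(confidence, kind="stable")[:n_select]
```

    In an active learning loop, the selected samples would be sent to an annotator, added to the training set, and the classifier retrained before the next selection round.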

  7. Thermodynamic optimization of the (Na2O + SiO2 + NaF + SiF4) reciprocal system using the Modified Quasichemical Model in the Quadruplet Approximation

    International Nuclear Information System (INIS)

    Lambotte, Guillaume; Chartrand, Patrice

    2011-01-01

    Highlights: → We model the Na2O-SiO2-NaF-SiF4 reciprocal system based on a comprehensive review of all available experimental data. → The assessment includes the Na2O-SiO2 and NaF-SiF4 binary systems. → Improvements to the Modified Quasichemical Model in the Quadruplet Approximation are presented. → The very strong short-range ordering among first-nearest and second-nearest neighbors in this system is reproduced. → This work constitutes the first assessment for all compositions and temperatures of a reciprocal oxyfluoride system. - Abstract: All available thermodynamic and phase diagram data for the condensed phases of the ternary reciprocal system (NaF + SiF4 + Na2O + SiO2) have been critically assessed. Model parameters for the unary (SiF4), the binary systems and the ternary reciprocal system have been found, which make it possible to reproduce the most reliable experimental data. The Modified Quasichemical Model in the Quadruplet Approximation was used for the oxyfluoride liquid solution, which exhibits strong first-nearest-neighbor and second-nearest-neighbor short-range ordering. This thermodynamic model takes into account both types of short-range ordering as well as the coupling between them. Model parameters have been estimated for the hypothetical high-temperature liquid SiF4.

  8. Automated diagnosis of dry eye using infrared thermography images

    Science.gov (United States)

    Acharya, U. Rajendra; Tan, Jen Hong; Koh, Joel E. W.; Sudarshan, Vidya K.; Yeo, Sharon; Too, Cheah Loon; Chua, Chua Kuang; Ng, E. Y. K.; Tong, Louis

    2015-07-01

    Dry Eye (DE) is a condition of either decreased tear production or increased tear film evaporation. Prolonged DE damages the cornea, causing corneal scarring, thinning and perforation. There is no single uniform diagnosis test available to date; combinations of diagnostic tests are to be performed to diagnose DE. The current diagnostic methods available are subjective, uncomfortable and invasive. Hence in this paper, we have developed an efficient, fast and non-invasive technique for the automated identification of normal and DE classes using infrared thermography images. The features are extracted using a nonlinear method called Higher Order Spectra (HOS). Features are ranked using a t-test ranking strategy. These ranked features are fed to various classifiers, namely K-Nearest Neighbor (KNN), Naive Bayesian Classifier (NBC), Decision Tree (DT), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM), to select the best classifier using the minimum number of features. Our proposed system is able to identify the DE and normal classes automatically with classification accuracy of 99.8%, sensitivity of 99.8%, and specificity of 99.8% for the left eye using PNN and KNN classifiers. We have also reported classification accuracy of 99.8%, sensitivity of 99.9%, and specificity of 99.4% for the right eye using the SVM classifier with a polynomial kernel of order 2.

  9. Distributed Classification of Localization Attacks in Sensor Networks Using Exchange-Based Feature Extraction and Classifier

    Directory of Open Access Journals (Sweden)

    Su-Zhe Wang

    2016-01-01

    Secure localization under different forms of attack has become an essential task in wireless sensor networks. Despite the significant research efforts in detecting the malicious nodes, the problem of localization attack type recognition has not yet been well addressed. Motivated by this concern, we propose a novel exchange-based attack classification algorithm. This is achieved by a distributed expectation maximization extractor integrated with the PECPR-MKSVM classifier. First, the mixed distribution features based on the probabilistic modeling are extracted using a distributed expectation maximization algorithm. After feature extraction, by introducing the theory from support vector machine, an extensive contractive Peaceman-Rachford splitting method is derived to build the distributed classifier that diffuses the iteration calculation among neighbor sensors. To verify the efficiency of the distributed recognition scheme, four groups of experiments were carried out under various conditions. The average success rate of the proposed classification algorithm obtained in the presented experiments for external attacks is excellent and has reached about 93.9% in some cases. These testing results demonstrate that the proposed algorithm can produce a much greater recognition rate, and that it can also be more robust and efficient even in excessively malicious scenarios.

  10. Modeling the effect of neighboring grains on twin growth in HCP polycrystals

    Science.gov (United States)

    Kumar, M. Arul; Beyerlein, I. J.; Lebensohn, R. A.; Tomé, C. N.

    2017-09-01

    In this paper, we study the dependence of neighboring grain orientation on the local stress state around a deformation twin in a hexagonal close packed (HCP) crystal and its effects on the resistance against twin thickening. We use a recently developed, full-field elasto-visco-plastic formulation based on fast Fourier transforms that accounts for the twinning shear transformation imposed by the twin lamella. The study is applied to Mg, Zr and Ti, since these HCP metals tend to deform by activation of different types of slip modes. The analysis shows that the local stresses along the twin boundary are strongly controlled by the relative orientation of the easiest deformation modes in the neighboring grain with respect to the twin lamella in the parent grain. A geometric expression that captures this parent-neighbor relationship is proposed and incorporated into a larger scale, mean-field visco-plastic self-consistent model to simulate the role of neighboring grain orientation on twin thickening. We demonstrate that the approach improves the prediction of twin area fraction distribution when compared with experimental observations.

  11. Band nesting, massive Dirac fermions, and valley Landé and Zeeman effects in transition metal dichalcogenides: A tight-binding model

    Science.gov (United States)

    Bieniek, Maciej; Korkusiński, Marek; Szulakowska, Ludmiła; Potasz, Paweł; Ozfidan, Isil; Hawrylak, Paweł

    2018-02-01

    We present here the minimal tight-binding model for a single layer of transition metal dichalcogenides (TMDCs) MX2 (M, metal; X, chalcogen) which illuminates the physics and captures band nesting, massive Dirac fermions, and valley Landé and Zeeman magnetic field effects. TMDCs share the hexagonal lattice with graphene but their electronic bands require much more complex atomic orbitals. Using symmetry arguments, a minimal basis consisting of three metal d orbitals and three chalcogen dimer p orbitals is constructed. The tunneling matrix elements between nearest-neighbor metal and chalcogen orbitals are explicitly derived at K, -K, and Γ points of the Brillouin zone. The nearest-neighbor tunneling matrix elements connect specific metal and sulfur orbitals yielding an effective 6 × 6 Hamiltonian giving correct composition of metal and chalcogen orbitals but not the direct gap at K points. The direct gap at K, correct masses, and conduction band minima at Q points responsible for band nesting are obtained by inclusion of next-neighbor Mo-Mo tunneling. The parameters of the next-nearest-neighbor model are successfully fitted to MX2 (M = Mo; X = S) density functional ab initio calculations of the highest valence and lowest conduction band dispersion along K-Γ line in the Brillouin zone. The effective two-band massive Dirac Hamiltonian for MoS2, Landé g factors, and valley Zeeman splitting are obtained.

  12. Hybrid classifiers methods of data, knowledge, and classifier combination

    CERN Document Server

    Wozniak, Michal

    2014-01-01

    This book delivers definite and compact knowledge on how hybridization can help improve the quality of computer classification systems. To give readers a clear understanding of hybridization, the book primarily focuses on introducing the different levels of hybridization and illuminating the problems faced when dealing with such projects. It first considers the data and knowledge incorporated in hybridization, and then a still-growing area of classifier systems known as combined classifiers. The book comprises the aforementioned state-of-the-art topics and the latest research results of the author and his team from the Department of Systems and Computer Networks, Wroclaw University of Technology, including classifiers based on feature space splitting, one-class classification, imbalanced data, and data stream classification.

  13. Dispersion of a layered electron gas with nearest neighbour-tunneling

    International Nuclear Information System (INIS)

    Miesenboeck, H.M.

    1988-09-01

    The dispersion of the first plasmon band is calculated within the Random Phase Approximation for a superlattice of two-dimensional electron-gases, mutually interacting, and with nearest neighbour hopping between the planes. It is further shown that the deviations of this dispersion from the one in systems with zero interplane motion are very small in commonly realized experimental situations and that they are expected to be observable only in samples with plane distances of 100 Å and less. (author). 15 refs, 3 figs, 1 tab

  14. SibRank: Signed bipartite network analysis for neighbor-based collaborative ranking

    Science.gov (United States)

    Shams, Bita; Haratizadeh, Saman

    2016-09-01

    Collaborative ranking is an emerging field of recommender systems that utilizes users' preference data rather than rating values. Unfortunately, neighbor-based collaborative ranking has gained little attention despite its greater flexibility and justifiability. This paper proposes a novel framework, called SibRank, that seeks to improve the state-of-the-art neighbor-based collaborative ranking methods. SibRank represents users' preferences as a signed bipartite network, and finds similar users through a novel personalized ranking algorithm in signed networks.

  15. ENTROPY CHARACTERISTICS IN MODELS FOR COORDINATION OF NEIGHBORING ROAD SECTIONS

    Directory of Open Access Journals (Sweden)

    N. I. Kulbashnaya

    2016-01-01

    The paper considers the application of entropy characteristics as criteria for coordinating traffic conditions at neighboring road sections. It has been shown that entropy characteristics are widely used in methods that take into account the informational influence of the environment on drivers and in mechanisms that create traffic conditions which preserve the optimal level of the driver’s emotional tension during the drive. The solution of this problem is considered in terms of coordinating traffic conditions at neighboring road sections, which in turn is aimed at excluding any transitional processes for the driver. The methodology for coordinating traffic conditions at neighboring road sections is based on E. V. Gavrilov’s concept of coordinating some parameters of road sections, which can be expressed in entropy characteristics. The paper proposes to select coordination criteria according to accident rates, because traffic conditions change drastically while moving along neighboring road sections, which can result in an accident situation. The relative organization of the driver’s perception field and of the driver’s interaction with the traffic environment have been selected as entropy characteristics. Therefore, the given characteristics are made conditional on the road accident rate. The investigation results have revealed a strong correlation between the relative organization of the driver’s perception field, the relative organization of the driver’s interaction with the traffic environment, and the accident rate. Results of the executed experiment have proved an influence of the accident rate on the investigated entropy characteristics.

  16. Do alcohol compliance checks decrease underage sales at neighboring establishments?

    Science.gov (United States)

    Erickson, Darin J; Smolenski, Derek J; Toomey, Traci L; Carlin, Bradley P; Wagenaar, Alexander C

    2013-11-01

    Underage alcohol compliance checks conducted by law enforcement agencies can reduce the likelihood of illegal alcohol sales at checked alcohol establishments, and theory suggests that an alcohol establishment that is checked may warn nearby establishments that compliance checks are being conducted in the area. In this study, we examined whether the effects of compliance checks diffuse to neighboring establishments. We used data from the Complying with the Minimum Drinking Age trial, which included more than 2,000 compliance checks conducted at more than 900 alcohol establishments. The primary outcome was the sale of alcohol to a pseudo-underage buyer without the need for age identification. A multilevel logistic regression was used to model the effect of a compliance check at each establishment as well as the effect of compliance checks at neighboring establishments within 500 m (stratified into four equal-radius concentric rings), after buyer, license, establishment, and community-level variables were controlled for. We observed a decrease in the likelihood of establishments selling alcohol to underage youth after they had been checked by law enforcement, but these effects quickly decayed over time. Establishments that had a close neighbor (within 125 m) checked in the past 90 days were also less likely to sell alcohol to young-appearing buyers. The spatial effect of compliance checks on other establishments decayed rapidly with increasing distance. Results confirm the hypothesis that the effects of police compliance checks do spill over to neighboring establishments. These findings have implications for the development of an optimal schedule of police compliance checks.
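
    The spatial stratification mentioned above can be pictured with a small helper that assigns a neighboring establishment to one of four concentric rings within 500 m. The sketch assumes that "equal-radius" means rings of equal width (125 m each, consistent with the "within 125 m" close-neighbor band); the function name and the handling of the outer boundary are choices made here.

```python
def ring_index(distance_m, max_radius_m=500.0, n_rings=4):
    """Return the 0-based ring containing a neighbor, or None if beyond max_radius_m.

    Rings are assumed to have equal width: 0-125 m, 125-250 m, 250-375 m, 375-500 m.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m > max_radius_m:
        return None
    width = max_radius_m / n_rings
    # A neighbor exactly on the outer boundary is placed in the last ring.
    return min(int(distance_m // width), n_rings - 1)
```

    Counting recent checks per ring for each establishment would then give one predictor per ring for a regression model of the kind the abstract describes.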

  17. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  18. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    Science.gov (United States)

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.

  19. Spin-waves in Antiferromagnetic Single-crystal LiFePO4

    International Nuclear Information System (INIS)

    Li, Jiying; Garlea, Vasile O.; Zarestky, Jarel; Vaknin, D.

    2006-01-01

    Spin-wave dispersions in the antiferromagnetic state of single-crystal LiFePO4 were determined by inelastic neutron scattering measurements. The dispersion curves measured from the (0,1,0) reflection along both a* and b* reciprocal-space directions reflect the anisotropic coupling of the layered Fe2+ (S = 2) spin system. The spin-wave dispersion curves were theoretically modeled using linear spin-wave theory by including in the spin Hamiltonian in-plane nearest- and next-nearest-neighbor interactions (J1 and J2), inter-plane nearest-neighbor interactions (J⊥) and a single-ion anisotropy (D). A weak (0,1,0) magnetic peak was observed in elastic neutron scattering studies of the same crystal, indicating that the ground state of the staggered iron moments is not along the (0,1,0) direction, as previously reported from polycrystalline sample studies, but slightly rotated away from this axis.

  20. Plant Clonal Integration Mediates the Horizontal Redistribution of Soil Resources, Benefiting Neighboring Plants.

    Science.gov (United States)

    Ye, Xue-Hua; Zhang, Ya-Lin; Liu, Zhi-Lan; Gao, Shu-Qin; Song, Yao-Bin; Liu, Feng-Hong; Dong, Ming

    2016-01-01

    Resources such as water taken up by plants can be released into soils through hydraulic redistribution and can also be translocated by clonal integration within a plant clonal network. We hypothesized that the resources from one (donor) microsite could be translocated within a clonal network, released into different (recipient) microsites and subsequently used by neighbor plants in the recipient microsite. To test these hypotheses, we conducted two experiments in which connected and disconnected ramet pairs of Potentilla anserina were grown under both homogeneous and heterogeneous water regimes, with seedlings of Artemisia ordosica as neighbors. The isotopes ¹⁵N and deuterium were used to trace the translocation of nitrogen and water, respectively, within the clonal network. The water and nitrogen taken up by P. anserina ramets in the donor microsite were translocated into the connected ramets in the recipient microsites. Most notably, portions of the translocated water and nitrogen were released into the recipient microsite and were used by the neighboring A. ordosica, which increased growth of the neighboring A. ordosica significantly. Therefore, our hypotheses were supported, and plant clonal integration mediated the horizontal hydraulic redistribution of resources, thus benefiting neighboring plants. Such a plant clonal integration-mediated resource redistribution in horizontal space may have substantial effects on the interspecific relations and composition of the community and consequently on ecosystem processes.

  1. DL-ADR: a novel deep learning model for classifying genomic variants into adverse drug reactions.

    Science.gov (United States)

    Liang, Zhaohui; Huang, Jimmy Xiangji; Zeng, Xing; Zhang, Gang

    2016-08-10

    chain. A least square loss (LASSO) algorithm and a k-Nearest Neighbors (kNN) algorithm are used as the baselines for comparison and to evaluate the performance of our proposed deep learning model. There are 53 adverse reactions reported during the observation. They are assigned to 14 categories. In the comparison of classification accuracy, the deep learning model shows superiority over the LASSO and kNN model with a rate over 80 %. In the comparison of reliability, the deep learning model shows the best stability among the three models. Machine learning provides a new method to explore the complex associations among genomic variations and multiple events in pharmacogenomics studies. The new deep learning algorithm is capable of classifying various SNPs to the corresponding adverse reactions. We expect that as more genomic variations are added as features and more observations are made, the deep learning model can improve its performance and can act as a black-box but reliable verifier for other GWAS studies.

  2. Fault Diagnosis of Three Phase Induction Motor Using Current Signal, MSAF-Ratio15 and Selected Classifiers

    Directory of Open Access Journals (Sweden)

    Glowacz A.

    2017-12-01

    Degradation of metallurgical equipment is a normal, time-dependent process. Factors such as the operation process, friction and high temperature can accelerate it. In this paper the authors analyzed three phase induction motors. These motors are commonly used in the metallurgy industry, for example in conveyor belts. The diagnostics of such motors is essential. Early detection of faults prevents financial loss and downtime. The authors proposed a technique of fault diagnosis based on recognition of currents. The authors analyzed 4 states of a three phase induction motor: healthy three phase induction motor, three phase induction motor with 1 faulty rotor bar, three phase induction motor with 2 faulty rotor bars, and three phase induction motor with faulty ring of squirrel-cage. An analysis was carried out for an original method of feature extraction called MSAF-RATIO15 (Method of Selection of Amplitudes of Frequencies – Ratio 15% of maximum of amplitude). Classification of feature vectors was performed by a Bayes classifier, Linear Discriminant Analysis (LDA) and a Nearest Neighbour classifier. The proposed technique of fault diagnosis can be used for protection of three phase induction motors and other rotating electrical machines. In the near future the authors will analyze other motors and faults. There is also an idea to use thermal, acoustic, electrical and vibration signals together.

  3. Rapid corn and soybean mapping in US Corn Belt and neighboring areas

    Science.gov (United States)

    Zhong, Liheng; Yu, Le; Li, Xuecao; Hu, Lina; Gong, Peng

    2016-11-01

    The goal of this study was to promptly map the extent of corn and soybeans early in the growing season. A classification experiment was conducted for the US Corn Belt and neighboring states, which is the most important production area of corn and soybeans in the world. To improve the timeliness of the classification algorithm, training was completely based on reference data and images from other years, circumventing the need to finish reference data collection in the current season. To account for interannual variability in crop development in the cross-year classification scenario, several innovative strategies were used. A random forest classifier was used in all tests, and MODIS surface reflectance products from the years 2008-2014 were used for training and cross-year validation. It is concluded that the fuzzy classification approach is necessary to achieve satisfactory results with R-squared ~0.9 (compared with the USDA Cropland Data Layer). The year of training data is an important factor, and it is recommended to select a year with similar crop phenology as the mapping year. With this phenology-based and cross-year-training method, in 2015 we mapped the cropping proportion of corn and soybeans around mid-August, when the two crops just reached peak growth.

  4. Modelo digital do terreno através de diferentes interpolações do programa Surfer 12 | Digital terrain model through different interpolations in the surfer 12 software

    Directory of Open Access Journals (Sweden)

    José Machado

    2016-04-01

    the MDT interpolation of measured points is required. The use of TDM, 3D surfaces and contours in computer programs is fast but can create some problems, such as the type of interpolation used. This work aims to analyze the interpolation methods on elevation points of an irregular geometric figure generated by the Surfer program. The 12 available interpolations were used (Data Metrics, Inverse Distance, Kriging, Local Polynomial, Minimum Curvature, Modified Shepard Method, Moving Average, Natural Neighbor, Nearest Neighbor, Polynomial Regression, Radial Function and Triangulation with Linear Interpolation) and the generated topographic maps were analyzed. The graphical representation of the relief was generated via the MDT. The representations of relief were graded as excellent, great, good, average or bad and discussed according to how well they represent the given geometric image. The results were: Data Metrics, Polynomial Regression, Moving Average and Local Polynomial (bad); Moving Average and Modified Shepard Method (regular); Nearest Neighbor (average); Inverse Distance (good); Kriging and Radial Function (great); and Triangulation with Linear Interpolation and Natural Neighbor (excellent) conditions for representing the presented data.

  5. Neighbor Discovery Algorithm in Wireless Local Area Networks Using Multi-beam Directional Antennas

    Science.gov (United States)

    Wang, Jin; Peng, Wei; Liu, Song

    2017-10-01

    Neighbor discovery is an important step for Wireless Local Area Networks (WLAN), and the use of multi-beam directional antennas can greatly improve network performance. However, most neighbor discovery algorithms in WLAN based on multi-beam directional antennas work effectively only in synchronous systems, not in asynchronous ones, and collisions at the AP remain a bottleneck for neighbor discovery. In this paper, we propose two asynchronous neighbor discovery algorithms: asynchronous hierarchical scanning (AHS) and asynchronous directional scanning (ADS). Both are based on a three-way handshaking mechanism. AHS and ADS reduce collisions at the AP, in a hierarchical way and a directional way respectively, to achieve good performance. The performance of AHS and ADS is tested on OMNeT++. Moreover, different application scenarios and the factors that affect the performance of these algorithms are analyzed. The simulation results show that AHS is suitable for scenes densely populated around the AP, while ADS is suitable when most of the neighboring nodes are far from the AP.

  6. Is a reduction in distance to nearest supermarket associated with BMI change among type 2 diabetes patients?

    Science.gov (United States)

    Zhang, Y Tara; Laraia, Barbara A; Mujahid, Mahasin S; Blanchard, Samuel D; Warton, E Margaret; Moffet, Howard H; Karter, Andrew J

    2016-07-01

    We examined whether residing within 2 miles of a new supermarket opening was longitudinally associated with a change in body mass index (BMI). We identified 12 new supermarkets that opened between 2009 and 2010 in 8 neighborhoods. Using the Kaiser Permanente Northern California Diabetes Registry, we identified members with type 2 diabetes residing continuously in any of these neighborhoods 12 months prior to the first supermarket opening until 10 months following the opening of the last supermarket. Exposure was defined as a reduction (yes/no) in travel distance to the nearest supermarket as a result of a new supermarket opening. First difference regression models were used to estimate the impact of reduced supermarket distance on BMI, adjusting for longitudinal changes in patient and neighborhood characteristics. Among patients in the exposed group, new supermarket openings reduced travel distance to the nearest supermarket by 0.7 miles on average. However, reduced distance to nearest supermarket was not associated with BMI changes. Overall, we found no evidence that reduced supermarket distance was associated with reduced levels of obesity for residents with type 2 diabetes. Published by Elsevier Ltd.

  7. Embedded vision equipment of industrial robot for inline detection of product errors by clustering–classification algorithms

    Directory of Open Access Journals (Sweden)

    Kamil Zidek

    2016-10-01

    The article deals with the design of embedded vision equipment of industrial robots for inline diagnosis of product error during manipulation process. The vision equipment can be attached to the end effector of robots or manipulators, and it provides an image snapshot of part surface before grasp, searches for error during manipulation, and separates products with error from the next operation of manufacturing. The new approach is a methodology based on machine teaching for the automated identification, localization, and diagnosis of systematic errors in products of high-volume production. To achieve this, we used two main data mining algorithms: clustering for accumulation of similar errors and classification methods for the prediction of any new error to proposed class. The presented methodology consists of three separate processing levels: image acquisition for fail parameterization, data clustering for categorizing errors to separate classes, and new pattern prediction with a proposed class model. We chose main representatives of clustering algorithms, for example, K-means from quantization of vectors, fast library for approximate nearest neighbor from hierarchical clustering, and density-based spatial clustering of applications with noise from algorithms based on the density of the data. For machine learning, we selected six major algorithms of classification: support vector machines, normal Bayesian classifier, K-nearest neighbor, gradient boosted trees, random trees, and neural networks. The selected algorithms were compared for speed and reliability and tested on two platforms: desktop-based computer system and embedded system based on System on Chip (SoC) with vision equipment.

  8. Using recurrent neural network models for early detection of heart failure onset.

    Science.gov (United States)

    Choi, Edward; Schuetz, Andy; Stewart, Walter F; Sun, Jimeng

    2017-03-01

    We explored whether use of deep learning to model temporal relations among events in electronic health records (EHRs) would improve model performance in predicting initial diagnosis of heart failure (HF) compared to conventional methods that ignore temporality. Data were from a health system's EHR on 3884 incident HF cases and 28 903 controls, identified as primary care patients, between May 16, 2000, and May 23, 2013. Recurrent neural network (RNN) models using gated recurrent units (GRUs) were adapted to detect relations among time-stamped events (eg, disease diagnosis, medication orders, procedure orders, etc.) with a 12- to 18-month observation window of cases and controls. Model performance metrics were compared to regularized logistic regression, neural network, support vector machine, and K-nearest neighbor classifier approaches. Using a 12-month observation window, the area under the curve (AUC) for the RNN model was 0.777, compared to AUCs for logistic regression (0.747), multilayer perceptron (MLP) with 1 hidden layer (0.765), support vector machine (SVM) (0.743), and K-nearest neighbor (KNN) (0.730). When using an 18-month observation window, the AUC for the RNN model increased to 0.883 and was significantly higher than the 0.834 AUC for the best of the baseline methods (MLP). Deep learning models adapted to leverage temporal relations appear to improve performance of models for detection of incident heart failure with a short observation window of 12-18 months. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  9. Bacterial genomes lacking long-range correlations may not be modeled by low-order Markov chains: the role of mixing statistics and frame shift of neighboring genes.

    Science.gov (United States)

    Cocho, Germinal; Miramontes, Pedro; Mansilla, Ricardo; Li, Wentian

    2014-12-01

    We examine the relationship between exponential correlation functions and Markov models in a bacterial genome in detail. Despite the well known fact that Markov models generate sequences with correlation function that decays exponentially, simply constructed Markov models based on nearest-neighbor dimer (first-order), trimer (second-order), up to hexamer (fifth-order), and treating the DNA sequence as being homogeneous all fail to predict the value of exponential decay rate. Even reading-frame-specific Markov models (both first- and fifth-order) could not explain the fact that the exponential decay is very slow. Starting with the in-phase coding-DNA-sequence (CDS), we investigated correlation within a fixed-codon-position subsequence, and in artificially constructed sequences by packing CDSs with out-of-phase spacers, as well as altering CDS length distribution by imposing an upper limit. From these targeted analyses, we conclude that the correlation in the bacterial genomic sequence is mainly due to a mixing of heterogeneous statistics at different codon positions, and the decay of correlation is due to the possible out-of-phase between neighboring CDSs. There are also small contributions to the correlation from bases at the same codon position, as well as by non-coding sequences. These show that the seemingly simple exponential correlation functions in bacterial genome hide a complexity in correlation structure which is not suitable for a modeling by Markov chain in a homogeneous sequence. Other results include: use of the (absolute value) second largest eigenvalue to represent the 16 correlation functions and the prediction of a 10-11 base periodicity from the hexamer frequencies. Copyright © 2014 Elsevier Ltd. All rights reserved.
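
    The link between Markov models and exponentially decaying correlations mentioned above can be checked numerically on a toy case. The sketch below uses a hypothetical two-state chain rather than the four-letter DNA alphabet of the paper; the transition probabilities `a` and `b` and the function names are assumptions for illustration. For such a chain the second eigenvalue of the transition matrix is 1 - a - b, and the autocovariance at lag k shrinks by that factor at every step.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def autocovariance(a, b, lag):
    """Autocovariance of the indicator of state 1 in the stationary chain
    with transition matrix [[1 - a, a], [b, 1 - b]]."""
    P = [[1 - a, a], [b, 1 - b]]
    pi1 = a / (a + b)                 # stationary probability of state 1
    Pk = [[1.0, 0.0], [0.0, 1.0]]     # identity, becomes P**lag
    for _ in range(lag):
        Pk = mat_mul(Pk, P)
    return pi1 * Pk[1][1] - pi1 * pi1

def second_eigenvalue(a, b):
    """Eigenvalues of the 2x2 transition matrix are 1 and 1 - a - b."""
    return 1 - a - b

a, b = 0.1, 0.3
for lag in range(1, 5):
    ratio = autocovariance(a, b, lag + 1) / autocovariance(a, b, lag)
    print(lag, ratio, second_eigenvalue(a, b))
```

    The printed ratio stays at the second eigenvalue for every lag, which is the sense in which a homogeneous Markov chain fixes the decay rate; the abstract reports that rates predicted this way do not match the slow decay observed in the genome.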

  10. Approximate and exact hybrid algorithms for private nearest-neighbor queries with database protection

    KAUST Repository

    Ghinita, Gabriel; Kalnis, Panos; Kantarcioǧlu, Murat; Bertino, Elisa

    2010-01-01

    Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. To protect user privacy, it is important not to disclose exact user coordinates to un-trusted entities that provide location-based services. Currently, there are two main approaches to protect the location privacy of users: (i) hiding locations inside cloaking regions (CRs) and (ii) encrypting location data using private information retrieval (PIR) protocols. Previous work focused on finding good trade-offs between privacy and performance of user protection techniques, but disregarded the important issue of protecting the POI dataset D. For instance, location cloaking requires large-sized CRs, leading to excessive disclosure of POIs (O(|D|) in the worst case). PIR, on the other hand, reduces this bound to O(√|D|), but at the expense of high processing and communication overhead. We propose hybrid, two-step approaches for private location-based queries which provide protection for both the users and the database. In the first step, user locations are generalized to coarse-grained CRs which provide strong privacy. Next, a PIR protocol is applied with respect to the obtained query CR. To protect against excessive disclosure of POI locations, we devise two cryptographic protocols that privately evaluate whether a point is enclosed inside a rectangular region or a convex polygon. We also introduce algorithms to efficiently support PIR on dynamic POI sub-sets. We provide solutions for both approximate and exact NN queries. In the approximate case, our method discloses O(1) POI, orders of magnitude fewer than CR- or PIR-based techniques. For the exact case, we obtain optimal disclosure of a single POI, although with slightly higher computational overhead. Experimental results show that the hybrid approaches are scalable in practice, and outperform the pure-PIR approach in terms of computational and communication overhead. © 2010 Springer Science+Business Media, LLC.
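
    For orientation, the two geometric predicates that the protocols above evaluate privately can be written in the clear as follows. This Python sketch offers no privacy at all and is not the authors' cryptographic construction; it only shows the plaintext tests (point in an axis-aligned rectangle, point in a convex polygon). The function names and the counter-clockwise vertex convention are assumptions made for this example.

```python
def in_rectangle(p, lo, hi):
    """True if point p lies in the axis-aligned rectangle with corners lo, hi."""
    return lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]

def in_convex_polygon(p, vertices):
    """True if p lies inside or on a convex polygon whose vertices are
    listed in counter-clockwise order (assumed convention)."""
    n = len(vertices)
    for i in range(n):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % n]
        # Cross product sign tells on which side of edge a->b the point lies.
        cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
        if cross < 0:
            return False  # p is to the right of this edge, hence outside
    return True

triangle = [(0, 0), (4, 0), (0, 4)]
print(in_rectangle((1, 1), (0, 0), (2, 2)))
print(in_convex_polygon((1, 1), triangle))
print(in_convex_polygon((3, 3), triangle))
```

    Both tests reduce to a handful of comparisons of sums and products of coordinates, which is presumably the kind of arithmetic a privacy-preserving protocol has to carry out without revealing the inputs.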

  11. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents.

    OpenAIRE

    Glickman, B W; Horsfall, M J; Gordon, A J; Burns, P A

    1987-01-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference ...

  12. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents.

    Science.gov (United States)

    Glickman, B W; Horsfall, M J; Gordon, A J; Burns, P A

    1987-01-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference was not seen following treatment with ethylmethanesulfonate. The observed preference for 5'-PuG-3' sites over 5'-PyG-3' sites corresponds well with alterations observed in the Ha-ras oncogene recovered after treatment with NMU. This indicates that the mutations recovered in the oncogenes are likely the direct consequence of the alkylation treatment and that the local sequence effects seen in E. coli also appear to occur in mammalian cells. PMID:3329097

  13. Nearest neighbor affects G:C to A:T transitions induced by alkylating agents

    Energy Technology Data Exchange (ETDEWEB)

    Glickman, B.W.; Horsfall, M.J.; Gordon, A.J.E.; Burns, P.A.

    1987-12-01

    The influence of local DNA sequence on the distribution of G:C to A:T transitions induced in the lacI gene of E. coli by a series of alkylating agents has been analyzed. In the case of nitrosoguanidine, two nitrosoureas and a nitrosamine, a strong preference for mutation at sites preceded 5' by a purine base was noted. This preference was observed with both methyl and ethyl donors where the predicted common ultimate alkylating species is the alkyl diazonium ion. In contrast, this preference was not seen following treatment with ethylmethanesulfonate. The observed preference for 5'-PuG-3' sites over 5'-PyG-3' sites corresponds well with alterations observed in the Ha-ras oncogene recovered after treatment with NMU. This indicates that the mutations recovered in the oncogenes are likely the direct consequence of the alkylation treatment and that the local sequence effects seen in E. coli also appear to occur in mammalian cells.

  14. Approximate and exact hybrid algorithms for private nearest-neighbor queries with database protection

    KAUST Repository

    Ghinita, Gabriel

    2010-12-15

    Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. To protect user privacy, it is important not to disclose exact user coordinates to un-trusted entities that provide location-based services. Currently, there are two main approaches to protect the location privacy of users: (i) hiding locations inside cloaking regions (CRs) and (ii) encrypting location data using private information retrieval (PIR) protocols. Previous work focused on finding good trade-offs between privacy and performance of user protection techniques, but disregarded the important issue of protecting the POI dataset D. For instance, location cloaking requires large-sized CRs, leading to excessive disclosure of POIs (O(|D|) in the worst case). PIR, on the other hand, reduces this bound to O(√|D|), but at the expense of high processing and communication overhead. We propose hybrid, two-step approaches for private location-based queries which provide protection for both the users and the database. In the first step, user locations are generalized to coarse-grained CRs which provide strong privacy. Next, a PIR protocol is applied with respect to the obtained query CR. To protect against excessive disclosure of POI locations, we devise two cryptographic protocols that privately evaluate whether a point is enclosed inside a rectangular region or a convex polygon. We also introduce algorithms to efficiently support PIR on dynamic POI sub-sets. We provide solutions for both approximate and exact NN queries. In the approximate case, our method discloses O(1) POI, orders of magnitude fewer than CR- or PIR-based techniques. For the exact case, we obtain optimal disclosure of a single POI, although with slightly higher computational overhead. Experimental results show that the hybrid approaches are scalable in practice, and outperform the pure-PIR approach in terms of computational and communication overhead. © 2010 Springer Science+Business Media, LLC.

  15. Quasi-phases and pseudo-transitions in one-dimensional models with nearest neighbor interactions

    Science.gov (United States)

    de Souza, S. M.; Rojas, Onofre

    2018-01-01

    There are some particular one-dimensional models, such as the Ising-Heisenberg spin models with a variety of chain structures, which exhibit unexpected behaviors quite similar to first- and second-order phase transitions and could naively be confused with an authentic phase transition. In the first derivatives of the free energy, such as entropy, magnetization, and internal energy, a "sudden" jump occurs at finite temperature that closely resembles a first-order phase transition. However, the second derivatives of the free energy, such as specific heat and magnetic susceptibility, behave at finite temperature quite similarly to a second-order phase transition, exhibiting an astonishingly sharp and fine peak. The correlation length also confirms the evidence of this pseudo-transition, showing a sharp peak at the pseudo-critical temperature. We also present the necessary conditions for the emergence of these quasi-phases and pseudo-transitions.

  16. THE SOLAR NEIGHBORHOOD XXIX: THE HABITABLE REAL ESTATE OF OUR NEAREST STELLAR NEIGHBORS

    Energy Technology Data Exchange (ETDEWEB)

    Cantrell, Justin R.; Henry, Todd J.; White, Russel J., E-mail: cantrell@chara.gsu.edu, E-mail: thenry@chara.gsu.edu, E-mail: white@chara.gsu.edu [Georgia State University, Atlanta, GA 30302-4106 (United States)

    2013-10-01

    We use the sample of known stars and brown dwarfs within 5 pc of the Sun, supplemented with AFGK stars within 10 pc, to determine which stellar spectral types provide the most habitable real estate, defined as locations where liquid water could be present on Earth-like planets. Stellar temperatures and radii are determined by fitting model spectra to spatially resolved broadband photometric energy distributions for stars in the sample. Using these values, the locations of the habitable zones are calculated using an empirical formula for planetary surface temperature and assuming the condition of liquid water, called here the empirical habitable zone (EHZ). Systems that have dynamically disruptive companions are considered not habitable. We consider companions to be disruptive if the separation ratio of the companion to the habitable zone is less than 5:1. We use the results of these calculations to derive a simple formula for predicting the location of the EHZ for main sequence stars based on V – K color. We consider EHZ widths as more useful measures of the habitable real estate around stars than areas because multiple planets are not expected to orbit stars at identical stellar distances. This EHZ provides a qualitative guide on where to expect the largest population of planets in the habitable zones of main sequence stars. Because of their large numbers and lower frequency of short-period companions, M stars provide more EHZ real estate than other spectral types, possessing 36.5% of the habitable real estate en masse. K stars are second with 21.5%, while A, F, and G stars offer 18.5%, 6.9%, and 16.6%, respectively. Our calculations show that three M dwarfs within 10 pc harbor planets in their EHZs: GJ 581 may have two planets (d with m sin i = 6.1 M_⊕; g with m sin i = 3.1 M_⊕), GJ 667 C has one (c with m sin i = 4.5 M_⊕), and GJ 876 has two (b with m sin i = 1.89 M_Jup and c with m sin i = 0.56 M_Jup). If Earth-like planets are as common around low-mass stars as recent Kepler results suggest, M stars will harbor more Earth-like planets in habitable zones than any other stellar spectral type.

  17. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.

    Science.gov (United States)

    Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

    2015-12-01

    Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length. Copyright © 2015 Elsevier Inc. All rights reserved.
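
    To give a feel for the lagged nearest-neighbor idea, here is a deliberately simplified Python sketch. It is not FLk-NN: it handles a single variable, uses one fixed lag, and has no Fourier component. The function name, the `lag` and `k` parameters, and the toy series are assumptions for illustration only. A missing value at time t is estimated from the k past situations whose lagged value most closely matches the value observed at t - lag.

```python
def lagged_knn_impute(series, lag=1, k=2):
    """Fill None entries using the k observed points whose lagged
    predecessor is closest to the lagged predecessor of the gap."""
    out = list(series)
    # Pairs (value at s - lag, value at s) where both are observed.
    candidates = [(series[s - lag], series[s])
                  for s in range(lag, len(series))
                  if series[s] is not None and series[s - lag] is not None]
    for t, v in enumerate(series):
        # Skip observed points and gaps with no usable lagged predecessor.
        if v is not None or t < lag or series[t - lag] is None or not candidates:
            continue
        query = series[t - lag]
        nearest = sorted(candidates, key=lambda c: abs(c[0] - query))[:k]
        out[t] = sum(c[1] for c in nearest) / len(nearest)
    return out

# Hypothetical readings with one missing value.
readings = [1.0, 2.0, 3.0, 2.0, None, 2.0, 3.0]
print(lagged_knn_impute(readings))
```

    Gaps whose lagged predecessor is itself missing are left unfilled in this sketch, which is one of the situations the abstract says the combined method is designed to handle.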

  18. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data

    OpenAIRE

    Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

    2015-01-01

    Most clinical and biomedical data contain missing values. A patient’s record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across tim...

  19. THE SOLAR NEIGHBORHOOD XXIX: THE HABITABLE REAL ESTATE OF OUR NEAREST STELLAR NEIGHBORS

    International Nuclear Information System (INIS)

    Cantrell, Justin R.; Henry, Todd J.; White, Russel J.

    2013-01-01

    We use the sample of known stars and brown dwarfs within 5 pc of the Sun, supplemented with AFGK stars within 10 pc, to determine which stellar spectral types provide the most habitable real estate, defined as locations where liquid water could be present on Earth-like planets. Stellar temperatures and radii are determined by fitting model spectra to spatially resolved broadband photometric energy distributions for stars in the sample. Using these values, the locations of the habitable zones are calculated using an empirical formula for planetary surface temperature and assuming the condition of liquid water, called here the empirical habitable zone (EHZ). Systems that have dynamically disruptive companions are considered not habitable. We consider companions to be disruptive if the separation ratio of the companion to the habitable zone is less than 5:1. We use the results of these calculations to derive a simple formula for predicting the location of the EHZ for main sequence stars based on V – K color. We consider EHZ widths as more useful measures of the habitable real estate around stars than areas because multiple planets are not expected to orbit stars at identical stellar distances. This EHZ provides a qualitative guide on where to expect the largest population of planets in the habitable zones of main sequence stars. Because of their large numbers and lower frequency of short-period companions, M stars provide more EHZ real estate than other spectral types, possessing 36.5% of the habitable real estate en masse. K stars are second with 21.5%, while A, F, and G stars offer 18.5%, 6.9%, and 16.6%, respectively. Our calculations show that three M dwarfs within 10 pc harbor planets in their EHZs: GJ 581 may have two planets (d with m sin i = 6.1 M_⊕; g with m sin i = 3.1 M_⊕), GJ 667 C has one (c with m sin i = 4.5 M_⊕), and GJ 876 has two (b with m sin i = 1.89 M_Jup and c with m sin i = 0.56 M_Jup). If Earth-like planets are as common around low-mass stars as recent Kepler results suggest, M stars will harbor more Earth-like planets in habitable zones than any other stellar spectral type.

  20. Hurricane-Induced Stage-Frequency Relationships for the Territory of American Samoa

    National Research Council Canada - National Science Library

    Militello, Adele

    1998-01-01

    .... The statistical approach taken to calculate frequency-of-occurrence relationships was the Empirical Simulation Technique, which applies historical wave information and a nearest neighbor technique...